[HN Gopher] Show HN: Autotab Instruct - Claude Computer Use with...
___________________________________________________________________
Show HN: Autotab Instruct - Claude Computer Use with Guardrails for
Reliability
Hi HN, we've built a desktop app for creating highly reliable AI
agents that use a computer with mouse and keyboard. Until last
week, we had tried many different approaches to open-ended agentic
features, but none of them had met our reliability bar. With
Anthropic's Computer Use this finally changed, and we just shipped
a feature we're calling Instruct. Instruct lets users create
agentic blocks as part of a larger Autotab skill, which provides
the structured logical flow that keeps the automation on track.

If you haven't had a chance to try Computer Use yet, it is an
impressive leap from the last generation of vision models (e.g.
GPT-4o struggles with relative positions, let alone coordinates).
At the same time, it is still not good enough to be given a prompt
and let loose.

One of the big surprises to us early on was just how much intent
specification is required for most real-world workflows to run
reliably. What looks at first like a simple form-filling task
usually turns out to have dozens of edge cases and super specific,
hidden rules. Even human employees need to be shown how to perform
these tasks, and they then improve by asking questions and getting
feedback over time. We wanted to build a tool for specifying
intent and iterating with the model to make it reliable enough for
real work.

- Automations run on top of an action scaffold, which works kind of
  like a very fuzzy programming language with strict types. This
  gives the model a high level plan that guides execution, and makes
  it easy to break out discrete steps to get the reliability you
  need. (Interestingly, this has also proven useful not just for the
  agent, but also for the human trying to create, verify and edit
  the automation.)
- When the model is unsure it asks for clarification. For example,
  if you are in editing mode and the model thinks that an element
  looks meaningfully different than before, it will ask you to
  verify that it is the same element.
- The agent has access to a memory system that lets it recall
  information from past runs as well as instructions and feedback
  from the user.

Here's a short video of Autotab Instruct in action:
https://www.loom.com/share/ccf4e9d8c798450da3324a6cff024971?....
There are a few more demos at
https://twitter.com/autotabai/status/1852393973165199425
We'd love to hear what you think!
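
For readers who want something concrete, the three ideas above (a
scaffold of fuzzy steps with strict types, asking for clarification
when unsure, and a memory of user feedback) can be sketched roughly
as follows. This is an illustration only, not Autotab's
implementation: Step, AgentResult, run_skill, and the stub agent
are all hypothetical names, and a real agent would drive a screen
rather than return canned values.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Step:
    """One scaffold step: fuzzy intent, strict result type (hypothetical)."""
    instruction: str   # natural-language intent for the model
    output_type: type  # type the step's result must satisfy


@dataclass
class AgentResult:
    value: Any
    confident: bool = True  # False means the agent wants clarification


def run_skill(
    steps: List[Step],
    agent: Callable[[str, List[str]], AgentResult],
    ask_user: Callable[[str], str],
) -> List[Any]:
    """Run steps in order, pausing for clarification and type-checking results."""
    memory: List[str] = []  # user feedback the agent can recall on later calls
    results: List[Any] = []
    for step in steps:
        result = agent(step.instruction, memory)
        if not result.confident:
            # Ask the human, remember the answer, and retry the step once.
            memory.append(ask_user(f"Unsure about: {step.instruction!r}"))
            result = agent(step.instruction, memory)
        if not isinstance(result.value, step.output_type):
            raise TypeError(
                f"{step.instruction!r} returned {type(result.value).__name__}, "
                f"expected {step.output_type.__name__}"
            )
        results.append(result.value)
    return results


def stub_agent(instruction: str, memory: List[str]) -> AgentResult:
    """Stand-in for a model: unsure about totals until it has user feedback."""
    if "total" in instruction:
        return AgentResult(value=42.5, confident=bool(memory))
    return AgentResult(value="ACME Corp")


invoice_skill = [
    Step("Read the vendor name from the invoice", str),
    Step("Read the invoice total", float),
]
```

Calling run_skill(invoice_skill, stub_agent, ask_user=input) would
pause on the second step to ask for help, then continue once the
answer is in memory. The strict output type on each step is what
lets a bad result fail loudly at that step instead of propagating
through the rest of the automation.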
Author : jonasnelle
Score : 11 points
Date : 2024-11-01 16:56 UTC (6 hours ago)
___________________________________________________________________
(page generated 2024-11-01 23:01 UTC)