hngopher.com

       [HN Gopher] Show HN: Autotab Instruct - Claude Computer Use with...
       ___________________________________________________________________
        
       Show HN: Autotab Instruct - Claude Computer Use with Guardrails for
       Reliability
        
       Hi HN,

       We've built a desktop app to create highly reliable AI agents that
       use a computer with mouse and keyboard.

       Until last week, we had tried many different approaches to
       open-ended agentic features but none of them had met our
       reliability bar. With Anthropic's Computer Use this finally
       changed, and we just shipped a feature we're calling Instruct.
       Instruct allows users to create agentic blocks as part of a larger
       Autotab skill that provides the structured logical flow to keep the
       automation on track.

       If you
       haven't had a chance to try Computer Use yet, it is an impressive
       leap from the last generation of vision models (e.g. GPT-4o
       struggles with relative positions, let alone coordinates). At the
       same time, it is still not good enough to be given a prompt and let
       loose.

       One of the big surprises to us early on was just how much
       intent specification is required for most real world workflows to
       run reliably. What looks at first like a simple form filling task
       usually turns out to have dozens of edge cases and super specific,
       hidden rules. Even human employees need to be shown how to perform
       these tasks, and then refine their approach through questions and
       feedback over time.

       We wanted to build a tool for specifying intent and iterating with
       the model to make it reliable enough for real work:

       - Automations run on top of an action scaffold, which works kind of
       like a very fuzzy programming language with strict types. This
       gives the model a high level plan that guides execution, and makes
       it easy to break out discrete steps to get the reliability you
       need. (Interestingly, this has also proven useful not just for the
       agent, but also for the human trying to create, verify and edit the
       automation.)
       - When the model is unsure it asks for clarification. For example,
       if you are in editing mode and the model thinks that an element
       looks meaningfully different than before, it will ask you to verify
       that it is the same element.
       - The agent has access to a memory system that lets it recall
       information from past runs as well as instructions and feedback
       from the user.

       Here's a short video of Autotab Instruct in action:
       https://www.loom.com/share/ccf4e9d8c798450da3324a6cff024971?....

       There are a few more demos at
       https://twitter.com/autotabai/status/1852393973165199425

       We'd love to hear what you think!
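
       To make the scaffold idea in the post more concrete, here is a
       minimal Python sketch. It is not Autotab's implementation, and
       nothing in it is an Autotab or Anthropic API: Step, Memory,
       run_scaffold, the agent and ask_user callables, and the 0.8
       confidence threshold are all hypothetical names chosen for
       illustration. It assumes the agent can report a confidence score,
       and shows typed steps, a clarification request when the agent is
       unsure, and a memory of user feedback that is reused afterwards.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Step:
    """One discrete block of the plan (hypothetical structure).

    The instruction is free-form ("fuzzy"); the output type is strict.
    """
    name: str
    instruction: str
    output_type: type


@dataclass
class Memory:
    """Feedback and instructions remembered across steps and runs."""
    notes: List[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.notes.append(note)


# An agent takes a step plus remembered notes and returns (value, confidence).
Agent = Callable[[Step, List[str]], Tuple[Any, float]]


def run_scaffold(
    steps: List[Step],
    agent: Agent,
    ask_user: Callable[[str], str],
    memory: Memory,
    threshold: float = 0.8,  # assumed cutoff, purely illustrative
) -> Dict[str, Any]:
    """Run steps in order, asking the user whenever the agent is unsure."""
    results: Dict[str, Any] = {}
    for step in steps:
        value, confidence = agent(step, memory.notes)
        if confidence < threshold:
            # The agent is unsure: ask for clarification, store the answer,
            # and retry the same step once with the new note available.
            answer = ask_user(f"Unsure about step '{step.name}': {step.instruction}")
            memory.remember(f"{step.name}: {answer}")
            value, confidence = agent(step, memory.notes)
        if not isinstance(value, step.output_type):
            # Strict typing: a step that yields the wrong type fails loudly.
            raise TypeError(
                f"step '{step.name}' expected {step.output_type.__name__}, "
                f"got {type(value).__name__}"
            )
        results[step.name] = value
    return results
```

       Splitting the work into typed steps means a failure is pinned to
       one named step rather than to the run as a whole, which is the
       kind of reliability boundary the post describes.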
        
       Author : jonasnelle
       Score  : 11 points
       Date   : 2024-11-01 16:56 UTC (6 hours ago)
        
       ___________________________________________________________________
       (page generated 2024-11-01 23:01 UTC)