[HN Gopher] Show HN: Autotab - Programmable AI browser for turni...
___________________________________________________________________
Show HN: Autotab - Programmable AI browser for turning web tasks
into APIs
Hey HN, we're Alexi and Jonas, the co-founders of Autotab
(https://autotab.com). Autotab is a Chrome-based browser you can
teach to do complex tasks, with a simple API for running them from
your app or backend. Here is a walkthrough of how it works:
https://youtu.be/63co74JHy1k, and you can try it for free at
https://autotab.com by downloading the app.

Why a dedicated editor? The number one blocker we've found in
building more flexible, agentic automations is performance quality,
BY FAR
(https://www.langchain.com/stateofaiagents#barriers-and-chall...).
For all the talk of cost, latency, and safety, the fact is most
people are still just struggling to get agents to work. The keys to
solving reliability are better models, yes, but also intent
specification. Even humans don't zero-shot these tasks from a
prompt. They need to be shown how to perform them, and then refined
with question-asking + feedback over time. It is also quite
difficult to formulate complete requirements on the spot from
memory. The editor makes it easy to build the specification up as
you step through your workflow, while generating successful task
trajectories for the model. This is the only way we've been able to
get the reliability we need for production use cases.

But why build a browser? Autotab started as a Chrome extension (with a
Show HN post! https://news.ycombinator.com/item?id=37943931). As we
iterated with users, we realized that we needed to focus on
creating the control surface for intent specification, and that
being stuck in a Chrome side panel wasn't going to work. We also
knew that we needed a level of control for the model that we
couldn't get without owning the browser. In Autotab, the browser
becomes a canvas on which the user and the model are taking turns
showing and explaining the task.

Key features:

1. Self-healing automations that don't break when sites change
2. Dedicated authoring tool that builds memory for the model while
   defining steps for the automation
3. Control flows and deep configurability to keep automations on
   track, even when navigating complex reasoning tasks
4. Works with any website (no site-specific APIs needed)
5. Runs securely in the cloud or locally
6. Simple REST API + client libraries for Python, Node

We'd love to get any early feedback from the HN community, ideas for
where you'd like the product to go, or experiences in this space. We
will be in the comments for the next few hours to respond!
Author : jonasnelle
Score : 25 points
Date : 2024-11-20 20:22 UTC (2 hours ago)
| MattDaEskimo wrote:
| Very neat in theory but I'm failing to find any technical
| details.
|
| At which layer is the automation happening? Inside, using dev
| tools? Multiple?
|
| What is the self-healing mechanic? I'm guessing invoking an LLM
| to find what happened and fix it?
|
| I guess what I'm wondering is: is this some sort of hybrid
| between computer use and dev tools usage?
| jonasnelle wrote:
| Autotab is definitely a hybrid approach, because when it comes
| to deciding where on the page to take an action, Autotab has to
| be fast & cheap (humans are both of those) while also being
| robust to changes. The solution we use is a "ladder of compute"
| where Autotab uses everything from really fast heuristics and
| local models up to the biggest frontier models, depending on
| how difficult the task is.
|
| For instance, if Autotab is trying to click the "submit" button
| on a sparse page that looks like previous versions of that
| page, that click might take a few hundred milliseconds. But if
| the page is very noisy, and Autotab has to scroll, and the
| button says "next" on it because the flow has an additional
| step added to it, Autotab will probably escalate to a bigger
| model to help it find the right answer with enough certainty to
| proceed.
|
| There is a certain cutoff in that hierarchy of compute that we
| decided to call "self-healing" because latency is high enough
| that we wanted to let users know it might take a bit longer for
| Autotab to proceed to the next step.
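
The "ladder of compute" described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Autotab's implementation: the `Rung` and `Guess` types, the 0.9 confidence threshold, and the two toy locator functions are all hypothetical stand-ins for real heuristics and models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Guess:
    target: str        # element the rung believes it should act on
    confidence: float  # 0.0 to 1.0, how sure the rung is


@dataclass
class Rung:
    name: str
    locate: Callable[[str, str], Optional[Guess]]  # (page, goal) -> guess
    slow: bool = False  # True for rungs one might surface as "self-healing"


def find_target(page: str, goal: str, ladder: Sequence[Rung],
                threshold: float = 0.9) -> tuple[str, str, bool]:
    """Try rungs from cheapest to most expensive and stop at the first
    guess whose confidence clears the threshold."""
    for rung in ladder:
        guess = rung.locate(page, goal)
        if guess is not None and guess.confidence >= threshold:
            return guess.target, rung.name, rung.slow
    raise LookupError(f"no rung was confident enough for goal {goal!r}")


# Toy rungs (hypothetical): a fast exact-text match and a "big model".
def exact_text(page: str, goal: str) -> Optional[Guess]:
    return Guess(goal, 0.99) if f"[{goal}]" in page else None


def big_model(page: str, goal: str) -> Optional[Guess]:
    # Pretend a large model works out that "next" replaced "submit".
    synonyms = {"submit": "next"}
    alt = synonyms.get(goal)
    return Guess(alt, 0.95) if alt and f"[{alt}]" in page else None


LADDER = [
    Rung("heuristic", exact_text),
    Rung("frontier-model", big_model, slow=True),
]
```

With this toy ladder, a page that still contains `[submit]` resolves on the cheap rung, while a page where the button now reads `[next]` falls through to the slow rung, which is the point at which a UI could tell the user the step may take longer.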
| Carrok wrote:
| You say "try it for free" but your website has no pricing
| information at all. Is this free for just a while? Free forever?
| What is your monetization strategy?
|
| Can I point it at my own LLM or am I locked into using OpenAI?
| alexirobbins wrote:
| We have unlimited free editing, so you can fully try everything
| out and know your skill will work before we ask you to
| subscribe. You also get 5 minutes of free runtime. Subscriptions start
| at $39/month with 300 minutes of runtime included.
|
| Right now we do not let you BYO LLM, but it's something we
| would love to provide an option for where possible!
| Carrok wrote:
| 5 minutes seems like barely enough time to complete any given
| task, let alone actually try it out. $40/mo for a capped plan
| seems steep, but maybe I'm not your target customer. Best of
| luck!
| alexirobbins wrote:
| The free edit mode has all of the features of run mode, and
| lets you fully test the skill. The only difference is that
| inside of a loop it will ask you to click to continue.
|
| A lot of AI tools promise the world and don't deliver. We
| explicitly don't want anyone to pay us until they're sure
| Autotab can do their task, even though the model costs
| during editing are actually much higher than during
| runtime.
| jonasnelle wrote:
| Good point, will add pricing information to our website ASAP,
| had skipped that one in the push to launch (it is only
| available in the app at the moment)
| handfuloflight wrote:
| I see it's able to perform data extraction, but what if you
| wanted to enter in data from another system, or generated by an
| LLM during the workflow?
| jonasnelle wrote:
| Data from external systems can be provided to Autotab in the
| form of CSV files or string inputs, which can be passed to the
| API to parametrize skills. However, in most cases, ingesting
| data into Autotab is easiest by just having Autotab navigate to
| the website where the data is present.
|
| Autotab has a structured type system underlying the workflows,
| so any data processed in the course of an automation can be
| referenced in later steps. It's a bit like a fuzzy programming
| language for automation, and the model generates schemas to
| ensure data flows reliably through the series of steps.
|
| For example, users often start by collecting information in one
| system (using an extract step as you mentioned), then
| cross-reference it in another, and then submit some data by
| having Autotab type it into a third system. In Autotab, you can
| just type @ to reference a variable; each step has access to
| data from previous steps.
|
| At the end, you can get a dump of all of Autotab's data from a
| run as a JSON file, or turn specific arrays of data into CSV
| files using a table step.
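
As a rough illustration of the last point, here is how one array from a JSON run dump could be turned into CSV with the Python standard library. The dump layout (a top-level object holding a `leads` array of flat records) is an assumption made for this sketch; the thread does not describe the actual schema.

```python
import csv
import io
import json


def array_to_csv(run_dump: str, key: str) -> str:
    """Turn one array of flat objects from a JSON run dump into CSV text."""
    rows = json.loads(run_dump)[key]
    # Union of keys in first-seen order, so sparse rows still line up.
    columns = list(dict.fromkeys(k for row in rows for k in row))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty cells
    return out.getvalue()


# Hypothetical dump: field names and values are made up for the example.
dump = json.dumps({"leads": [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "phone": "555-0100"},
]})
csv_text = array_to_csv(dump, "leads")
```

Taking the union of keys rather than the first row's keys keeps records with optional fields from raising an error in `DictWriter`.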
| smashah wrote:
| If this was an OSS project automating a specific service, many
| HN-ers would come and bleat about TOS violations & being
| scared/wary of C&Ds.
|
| How does this not violate TOS? Do you have legal protection set
| up from megacorps trying to bully you with legal threats?
|
| Automation despite TOS via Adversarial Interop should be a
| Digital Human Right. Godspeed.
| jonasnelle wrote:
| This has been much less of an issue than I would have expected
| - Autotab is optimized for reasoning-heavy tasks in core
| systems that require high reliability over being really fast at
| doing giant scrapes. More automating leads in Salesforce,
| tickets in Jira and data in Airtable than hawking tickets.
| pacifi30 wrote:
| Pretty slick. I recorded a session for ordering from a restaurant
| website, and it did repeat the entire workflow. It had some
| issues with a modal that popped up, but all in all well done! We
| have been trying to robotify the task of ordering from
| restaurants for our clients, and it seems like your solution can
| work well for us. I am guessing that you want your users to use
| the Autotab browser, so what is the use for the API?
| jonasnelle wrote:
| Thanks! We think of the browser as an authoring tool where you
| create, test and refine skills.
|
| After you've done that, the API is great for cases where you
| want to incorporate Autotab into a larger data flow or product.
|
| For instance, say Company A has taught Autotab to migrate their
| customers' data - so their customers just see a sync button in
| the Company A product, which kicks off an Autotab run via API.
| Same for restaurant booking, if you'd want that to happen
| programmatically.
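
A backend "sync button" handler of the kind described above might prepare its request along these lines. Everything specific here is a placeholder: the thread gives no base URL, path, payload fields, or auth scheme, so `API_BASE`, `/runs`, `skill_id`, `inputs`, and the bearer token are hypothetical. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Placeholder host: the real endpoint is not given in this thread.
API_BASE = "https://api.example.invalid"


def build_run_request(skill_id: str, inputs: dict,
                      api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST that would start one run of a skill."""
    # Hypothetical payload shape: a skill identifier plus string inputs.
    body = json.dumps({"skill_id": skill_id, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/runs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_run_request("sync-customer-data", {"customer": "acme"}, "test-key")
```

In a real integration the request would be sent with `urllib.request.urlopen(req)` or one of the client libraries mentioned in the post, using whatever endpoint and fields the actual API documents.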
| jonasnelle wrote:
| Also for the modal popup - this is the kind of issue that goes
| away in run mode because Autotab will escalate to bigger models
| to self-heal.
|
| If the modal pops up frequently you can also record a click to
| dismiss it and make that click optional so Autotab knows to
| move on if the modal does not pop up sometimes.
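
The optional-click idea can be sketched as a tiny step runner. This is a toy model, not how Autotab represents skills: the `Step` type, the `optional` flag, and modeling a page as a set of element names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[set], bool]  # returns False if its target is absent
    optional: bool = False


def click(target: str) -> Callable[[set], bool]:
    """Toy click: succeeds only if the target is present on the page."""
    def action(page: set) -> bool:
        return target in page
    return action


def run(steps: list[Step], page: set) -> list[str]:
    """Run steps in order, skipping optional steps whose target is missing."""
    log = []
    for step in steps:
        if step.action(page):
            log.append(f"done: {step.name}")
        elif step.optional:
            log.append(f"skipped: {step.name}")
        else:
            raise RuntimeError(f"required step failed: {step.name}")
    return log


STEPS = [
    Step("dismiss modal", click("modal-close"), optional=True),
    Step("place order", click("order-button")),
]
```

Running `STEPS` against a page without the modal skips the first step and still places the order, whereas a missing required target stops the run instead of silently continuing.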
___________________________________________________________________
(page generated 2024-11-20 23:00 UTC)