[HN Gopher] Show HN: Blast - Fast, multi-threaded serving engine...
___________________________________________________________________
Show HN: Blast - Fast, multi-threaded serving engine for web
browsing AI agents
Hi HN! BLAST is a high-performance serving engine for browser-
augmented LLMs, designed to make deploying web-browsing AI easy,
fast, and cost-manageable. The goal with BLAST is to ultimately
achieve Google-search-level latencies for tasks that currently
require a lot of typing and clicking around inside a browser. We're
starting off with automatic parallelism, prefix caching, budgeting
(memory and LLM cost), and an OpenAI-compatible API, but we have a
ton of ideas in the pipeline! Website & Docs: https://blastproject.org/
https://docs.blastproject.org/ MIT-Licensed Open-Source:
https://github.com/stanford-mast/blast Hope some folks here find
this useful! Please let me know what you think in the comments or
ping me on Discord. -- Caleb (PhD student @ Stanford CS)
Author : calebhwin
Score : 81 points
Date : 2025-05-02 17:42 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
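Since BLAST exposes an OpenAI-compatible API, an existing client just needs a different base URL; the request body is the standard chat-completions shape. A minimal sketch of building such a request (the model name here is a placeholder, not BLAST's actual default; check the docs for the real values):

```python
import json

def browsing_request(task: str, model: str = "blast-default"):
    """Build a standard chat-completions payload; any
    OpenAI-compatible server (e.g. a local BLAST instance)
    accepts this body as-is."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": task}],
        "stream": True,  # stream steps back as the agent browses
    }

body = json.dumps(browsing_request("find me the cheapest rental"))
```

The point of the compatibility layer is exactly this: no bespoke SDK, just a payload your existing OpenAI tooling already knows how to produce.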
| debo_ wrote:
| I know it's impossible to avoid name collisions at this stage of
| the game, but BLAST is basically the Google of biological
| sequence alignment / search:
|
| https://blast.ncbi.nlm.nih.gov/Blast.cgi
| calebhwin wrote:
| Right, I figured there isn't a huge overlap between the interested
| communities, so hopefully it won't be a point of confusion. I
| guess that could change!
| esafak wrote:
| The bigger collision is https://withblast.com/ (found via
| https://kagi.com/search?q=blast+llm )
| diggan wrote:
| What measures are you using to make sure you're not bombarding
| websites with a ton of requests, since you seem to automatically
| "scale up" the concurrency to create even more requests/second?
| Does it read any of the rate-limit headers from the responses, or
| does it do something else to back off in case whatever it's trying
| to visit suddenly goes offline or starts responding more slowly?
|
| Slightly broader question: do you feel like there are any ethical
| considerations one should think about before using something like
| this?
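For reference, the usual client-side pattern here is to honor an explicit `Retry-After` header when the server sends one, and otherwise fall back to capped exponential backoff with jitter. A minimal sketch (the base delay and cap are illustrative, not anything BLAST implements today):

```python
import random

def backoff_delay(attempt: int, headers: dict,
                  base: float = 0.5, cap: float = 60.0) -> float:
    """Seconds to wait before retrying a throttled request.

    Honors a Retry-After header (seconds form) when present;
    otherwise uses capped exponential backoff with full jitter
    so many clients don't retry in lockstep."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return float(retry_after)
        except ValueError:
            pass  # HTTP-date form; fall through to exponential backoff
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```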
| calebhwin wrote:
| The main sort of parallelism we exploit is across distinct
| websites. For example "find me the cheapest rental" spawning
| tasks to look at many different websites. There is another
| level of parallelism that could be exploited within a web
| site/app. And yes we would have to make our planner rate limit
| aware for that.
|
| Absolutely agree there are ethical considerations with web
| browsing AI in general. (And the whole ongoing shift from using
| websites to using ChatGPT/Perplexity.)
| diggan wrote:
| > Absolutely agree there are ethical considerations with web
| browsing AI in general.
|
| I'm personally not sure there are, but I'm curious to hear
| what those are for you :)
| calebhwin wrote:
| Maybe more of a legal than ethical consideration but web
| browsing AI makes scraping trivial. You could use that for
| surveillance, profiling (get a full picture of a user's
| whole online life before they even hit Sign Up), cutting
| egress cost in certain cases. Right now CAPTCHA is actually
| holding up pretty well against web browsing AI for sites
| that really want to protect their IP but it will be
| interesting to see if that devolves into yet another
| instance of an AI vs AI "arms race".
| rollcat wrote:
| > There is another level of parallelism that could be
| exploited within a web site/app. And yes we would have to
| make our planner rate limit aware for that.
|
| People are already deploying tools like Anubis[1] or go-
| away[2] to cope with the insane load that bots put on their
| server infrastructure. This is an arms race. In the end, the
| users lose.
|
| [1]: https://anubis.techaro.lol
|
| [2]: https://git.gammaspectra.live/git/go-away
| calebhwin wrote:
| IMO it depends on how this tech is deployed. One way I see
| this being extremely useful is for developers to quickly
| build AI automation for their own sites.
|
| E.g. if I'm the developer of a workforce management app
| (e.g. https://WhenIWork.com) I could deploy BLAST to
| quickly provide automation for users of my app.
| rollcat wrote:
| That's my point. You can use a knife to slice bread or to
| stab your neighbor. We're seeing an unprecedented amount
| of stabbings. People are getting away with murder,
| there's no accountability. Refining the stilettos doesn't
| help the problem.
| TheTaytay wrote:
| Cool!
|
| I read through the docs and want to try this. I couldn't figure
| out what you were using under the covers for the actual webpage
| "use". I did see: "What we're not focusing on is building a
| better Browser-Use, Notte, Steel, or other vision LLM. Our focus
| is serving these systems in a way that is optimized under
| constraints."
|
| That makes sense! But I was still curious what your default
| AI-driven browser-use library was.
|
| If I were to use your library right now on my MacBook, is it
| using "browser-use" under the covers by default? (I should poke
| around the source more. I just thought it might be helpful to ask
| here in case I misunderstand or in case others had the same
| question)
| calebhwin wrote:
| Yes! And browser-use is great, though I'm hoping at some point
| we can swap it out for something leaner; maybe one day it'll
| just be a vision language model. All we'll have to do within
| BLAST is implement a new Executor and all the
| scheduling/planning/resource management stays the same.
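The Executor idea described here (scheduler only sees a narrow interface; the backend is swappable) can be sketched in a few lines. These names are hypothetical, just to illustrate the shape of the abstraction, not BLAST's actual interface:

```python
from abc import ABC, abstractmethod

class Executor(ABC):
    """Hypothetical executor interface: the planner/scheduler only
    calls run(), so the backend (browser-use today, maybe a vision
    language model later) can be swapped without touching any
    scheduling or resource-management code."""

    @abstractmethod
    def run(self, task: str) -> str:
        """Execute one browsing task and return its result."""

class EchoExecutor(Executor):
    """Stand-in backend used here only to show the swap works."""
    def run(self, task: str) -> str:
        return f"done: {task}"
```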
| anxman wrote:
| I was a little unclear at first, but after looking at the source
| code it looks like BLAST uses Browser Use, which uses your
| local browser (in dev) under the hood.
| badmonster wrote:
| How does BLAST handle browser instance isolation and resource
| contention under high concurrency?
| ivape wrote:
| _resource contention under high concurrency_
|
| A queue? What else can you really do? Your server is at the
| mercy of OpenAI, so all you can do is queue up everyone's
| requests. I don't know how many parallel requests you can send
| out to OpenAI (infinite?), so that bottleneck is probably just
| dependent on your server stack (how many threads).
|
| There's a lot of language being thrown out here, and I'm trying
| to see if we're using too much language to discuss basic
| concepts.
| calebhwin wrote:
| There are definitely opportunities to parallelize. BLAST
| exploits these with an LLM-planner and tool calls to
| dynamically spawn/join subtasks (there's also data
| parallelism and request hedging which further reduce
| latency).
|
| Now you are right that at some point you'll get throttled
| either by LLM rate limits or a set budget for browser memory
| usage or LLM cost. BLAST's scheduler is aware of these
| constraints and uses them to effectively map tasks to
| resources (resource=browser+LLM).
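The spawn/join pattern described above (one task fanning out across distinct sites, then joining results) can be sketched with a thread pool; the toy price table stands in for the actual browser/LLM work, and `max_browsers` stands in for the budget the scheduler would enforce:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

PRICES = {"a.com": 30, "b.com": 12, "c.com": 25}  # toy stand-in data

def find_cheapest(sites, check_price, max_browsers=4):
    """Fan one task out across distinct sites in parallel, join
    the results, and return the best site plus all quotes.
    max_browsers models a resource budget (browser memory /
    LLM cost) capping concurrency."""
    with ThreadPoolExecutor(max_workers=max_browsers) as pool:
        futures = {pool.submit(check_price, s): s for s in sites}
        results = {futures[f]: f.result() for f in as_completed(futures)}
    return min(results, key=results.get), results

best, quotes = find_cheapest(list(PRICES), PRICES.get)
```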
| joshstrange wrote:
| This looks really cool but wouldn't this be better as an MCP
| server? It feels like it's mixing too many concepts and can't be
| plugged into another system. What if I want to extend my agent to
| use this but I already have MCP servers tied in or I'm going
| through another OpenAI proxy-type thing? I wouldn't want to stack
| proxies.
| calebhwin wrote:
| Great point, we are working on an MCP server implementation
| which should address this. The main benefit of having a serving
| engine here is to abstract away browser-LLM specific
| optimizations like parallelism, caching, browser memory
| management, etc. It's closer to vLLM, but I agree an MCP server
| implementation will make integration easier.
|
| Though ultimately I think the web needs something better than
| MCP and we're actively working on that as well.
| barbazoo wrote:
| Looking forward to hearing more about that MCP successor
| you're working on.
| lgiordano_notte wrote:
| Looks really cool. Curious how you're handling action
| abstraction? We've found that semantically parsing the DOM to
| extract high-level intents (like "click 'Continue'" instead of
| "click div#xyz") helps reduce hallucination and makes agent
| planning more robust.
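A stdlib-only toy version of that idea, turning clickable DOM nodes into human-readable intents rather than selectors (real systems would also look at roles, ARIA labels, etc.):

```python
from html.parser import HTMLParser

class IntentParser(HTMLParser):
    """Collect high-level click intents like "click 'Continue'"
    instead of low-level selectors like "click div#xyz"."""
    CLICKABLE = {"button", "a"}

    def __init__(self):
        super().__init__()
        self._stack = []   # currently open clickable tags
        self.intents = []  # extracted high-level actions

    def handle_starttag(self, tag, attrs):
        if tag in self.CLICKABLE:
            self._stack.append(tag)

    def handle_data(self, data):
        if self._stack and data.strip():
            self.intents.append(f"click '{data.strip()}'")

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

p = IntentParser()
p.feed('<div id="xyz"><button>Continue</button></div><a href="/">Home</a>')
```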
| mtrovo wrote:
| I don't work closely with LLM APIs, so I'm not sure what exactly
| the use case is here. Is it something that could be adapted to
| work as a deep-research feature on a custom product?
| xena wrote:
| How do I block your service? Do you read robots.txt and have an
| identifiable user agent?
| calebhwin wrote:
| Good point, we should probably integrate that. Feel free to
| submit a PR!
|
| BLAST can also be used to add automation to your own site/app
| FWIW.
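For anyone wanting to wire up the robots.txt side of this, Python's stdlib already handles the parsing; the "BLAST" user-agent string below is hypothetical (the project doesn't set an identifiable one yet, per the thread):

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Check a fetched robots.txt body against a user agent.
    A real crawler would fetch https://<host>/robots.txt first
    and cache the parsed rules per host."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

rules = "User-agent: BLAST\nDisallow: /private/\n"
```

With an identifiable agent string, site owners can target exactly this crawler without blocking everyone else.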
| pal9000i wrote:
| The whole point of AI browser automation is mimicking human
| behaviour and fighting anti-bot detection systems. If the point
| were just interacting with systems, we'd be using APIs.
| diggan wrote:
| Seems Blast uses browser-use (https://github.com/browser-
| use/browser-use), which seems to be a client specifically for
| AIs to connect to and drive browser runtimes.
|
| Unfortunately, it seems like browser-use tries to hide that
| it's controlled by an AI, and uses a typical browser user-
| agent: https://github.com/browser-use/browser-
| use/blob/d8c4d03d9ea9...
|
| I'm guessing because of the amount of flags, you could probably
| come up with a unique fingerprint for browser-use, based on
| available features, screen/canvas size and so on, that could be
| reused for blocking everyone using Blast/browser-use.
|
| If calebhwin wanted to make Blast easier to identify, they
| could set a custom user-agent for browser-use that makes it
| clear it's Blast doing the browsing for the user.
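The fingerprinting idea sketched here is essentially hashing whatever feature set the automation stack exposes; a toy version (the feature names are illustrative, not browser-use's actual observable surface):

```python
import hashlib
import json

def fingerprint(features: dict) -> str:
    """Stable hash over observable browser features (launch flags,
    screen size, available APIs, ...). Visitors running identical
    automation stacks collapse to the same fingerprint, which is
    what makes blocklisting a whole framework feasible."""
    canonical = json.dumps(features, sort_keys=True)  # order-independent
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

fp = fingerprint({"screen": "1280x720", "webdriver": True,
                  "flags": ["--disable-gpu"]})
```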
| ATechGuy wrote:
| Can browser-use be blocked using Anubis or other anti-bot
| measures?
| pal9000i wrote:
| Great work! I just tried it and Google immediately CAPTCHA'd me
| on the first attempt. Is it using Playwright or Patchright?
| Patchright using Chrome rather than Chromium is more robust.
| pal9000i wrote:
| Also, any plans to add a remote browser control feature? For
| human-in-the-loop tasks, for example advanced CAPTCHA bypassing
| and troubleshooting tasks that get stuck.
| grahamgooch wrote:
| Interesting. Could I use this to automate testing of massive web
| applications (100s of screens)? And potentially load test?
| diggan wrote:
| > And potentially load test?
|
| You wanna load test the local DOM rendering, or what? Otherwise,
| whatever endpoint is serving the HTML, you'd configure your load
| tests to hit that, if anything. But you'd just be doing the same
| testing your HTTP server is probably already doing before
| releases; usually you want to load test your underlying APIs or
| similar instead.
___________________________________________________________________
(page generated 2025-05-02 23:00 UTC)