[HN Gopher] Show HN: Blast - Fast, multi-threaded serving engine...
       ___________________________________________________________________
        
       Show HN: Blast - Fast, multi-threaded serving engine for web
       browsing AI agents
        
       Hi HN!  BLAST is a high-performance serving engine for browser-
       augmented LLMs, designed to make deploying web-browsing AI easy,
       fast, and cost-manageable.  The goal with BLAST is to ultimately
        achieve Google-Search-level latencies for tasks that currently
        require a lot of typing and clicking around inside a browser. We're
        starting off with automatic parallelism, prefix caching, budgeting
        (memory and LLM cost), and an OpenAI-compatible API, but have a ton
       of ideas in the pipe!  Website & Docs: https://blastproject.org/
       https://docs.blastproject.org/  MIT-Licensed Open-Source:
       https://github.com/stanford-mast/blast  Hope some folks here find
       this useful! Please let me know what you think in the comments or
       ping me on Discord.  -- Caleb (PhD student @ Stanford CS)
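
A minimal sketch of what a request to that OpenAI-compatible API could look like. The endpoint path, port, and the "not-needed" model name are assumptions, not confirmed defaults; check the BLAST docs for the actual serve command and settings.

```python
import json

# Build an OpenAI-style chat-completions payload describing a browsing
# task. BLAST-specific details (port, model name) are assumptions here.
def build_task_request(task, model="not-needed"):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": task}],
        "stream": True,  # stream intermediate browser steps as they arrive
    }
    return json.dumps(payload)

body = build_task_request("Find the cheapest rental in Lake Tahoe this weekend")
# POST this body to something like http://127.0.0.1:8000/v1/chat/completions
```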
        
       Author : calebhwin
       Score  : 81 points
       Date   : 2025-05-02 17:42 UTC (5 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | debo_ wrote:
       | I know it's impossible to avoid name collisions at this stage of
       | the game, but BLAST is basically the Google of biological
       | sequence alignment / search:
       | 
       | https://blast.ncbi.nlm.nih.gov/Blast.cgi
        
         | calebhwin wrote:
          | Right, I figured there isn't a huge overlap between the
          | interested communities, so hopefully it's not a point of
          | confusion. I guess that could change!
        
           | esafak wrote:
           | The bigger collision is https://withblast.com/ (found via
           | https://kagi.com/search?q=blast+llm )
        
       | diggan wrote:
        | What measures are you using to make sure you're not bombarding
        | websites with a ton of requests, since you seem to automatically
        | "scale up" the concurrency to create even more requests/second?
        | Does it read any of the rate-limit headers from the responses, or
        | do something else to back off in case whatever it's trying to
        | visit suddenly goes offline or starts responding more slowly?
        | 
        | Slightly broader question: do you feel like there are any ethical
        | considerations one should think about before using something like
        | this?
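
For context, client-side back-off of the kind the question asks about usually looks something like this generic sketch (a hypothetical helper, not anything BLAST is stated to implement): honor a numeric Retry-After header when present, otherwise fall back to capped exponential backoff with jitter.

```python
import random

# Compute the delay before retry number `attempt` (0-based).
def next_delay(attempt, retry_after=None):
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # seconds form of Retry-After
        except ValueError:
            pass  # HTTP-date form of the header is not handled in this sketch
    base = min(60.0, 2.0 ** attempt)               # cap exponential growth at 60s
    return base / 2 + random.uniform(0, base / 2)  # "equal jitter"

# next_delay(3, "7") == 7.0; next_delay(0) falls in [0.5, 1.0]
```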
        
         | calebhwin wrote:
          | The main sort of parallelism we exploit is across distinct
          | websites. For example, "find me the cheapest rental" spawns
          | subtasks to look at many different websites. There is another
          | level of parallelism that could be exploited within a single
          | site/app, and yes, we would have to make our planner rate-limit
          | aware for that.
         | 
         | Absolutely agree there are ethical considerations with web
         | browsing AI in general. (And the whole general ongoing shift
         | from using websites to using chatgpt/perplexity)
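
The cross-website fan-out described above can be pictured with plain asyncio (the names here are hypothetical, not BLAST's actual API): one browsing subtask per site, run concurrently and joined at the end.

```python
import asyncio

async def check_site(site, query):
    # Stand-in for driving a real browser session against one site.
    await asyncio.sleep(0)
    return site, f"result for {query!r} on {site}"

async def fan_out(query, sites):
    # Spawn one subtask per site and join them all.
    return await asyncio.gather(*(check_site(s, query) for s in sites))

results = asyncio.run(
    fan_out("cheapest rental", ["airbnb.com", "vrbo.com", "booking.com"]))
```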
        
           | diggan wrote:
           | > Absolutely agree there are ethical considerations with web
           | browsing AI in general.
           | 
           | I'm personally not sure there are, but I'm curious to hear
           | what those are for you :)
        
             | calebhwin wrote:
             | Maybe more of a legal than ethical consideration but web
             | browsing AI makes scraping trivial. You could use that for
             | surveillance, profiling (get a full picture of a user's
             | whole online life before they even hit Sign Up), cutting
             | egress cost in certain cases. Right now CAPTCHA is actually
             | holding up pretty well against web browsing AI for sites
             | that really want to protect their IP but it will be
             | interesting to see if that devolves into yet another
             | instance of an AI vs AI "arms race".
        
           | rollcat wrote:
           | > There is another level of parallelism that could be
           | exploited within a web site/app. And yes we would have to
           | make our planner rate limit aware for that.
           | 
           | People are already deploying tools like Anubis[1] or go-
           | away[2] to cope with the insane load that bots put on their
           | server infrastructure. This is an arms race. In the end, the
           | users lose.
           | 
           | [1]: https://anubis.techaro.lol
           | 
           | [2]: https://git.gammaspectra.live/git/go-away
        
             | calebhwin wrote:
             | IMO it depends on how this tech is deployed. One way I see
             | this being extremely useful is for developers to quickly
             | build AI automation for their own sites.
             | 
             | E.g. if I'm the developer of a workforce management app
             | (e.g. https://WhenIWork.com) I could deploy BLAST to
             | quickly provide automation for users of my app.
        
               | rollcat wrote:
               | That's my point. You can use a knife to slice bread or to
               | stab your neighbor. We're seeing an unprecedented amount
               | of stabbings. People are getting away with murder,
               | there's no accountability. Refining the stilettos doesn't
               | help the problem.
        
       | TheTaytay wrote:
       | Cool!
       | 
        | I read through the docs and want to try this. I couldn't figure
        | out what you were using under the covers for the actual webpage
        | "use". I did see: "What we're not focusing on is building a
        | better Browser-Use, Notte, Steel, or other vision LLM. Our focus
        | is serving these systems in a way that is optimized under
        | constraints"
        | 
        | Cool! That makes sense! But I was still curious what your default
        | AI-driven browser-use library was.
       | 
       | If I were to use your library right now on my MacBook, is it
       | using "browser-use" under the covers by default? (I should poke
       | around the source more. I just thought it might be helpful to ask
       | here in case I misunderstand or in case others had the same
       | question)
        
         | calebhwin wrote:
         | Yes! And browser-use is great though I'm hoping at some point
         | we can swap it out for something leaner, maybe one day it'll
         | just be a vision language model. All we'll have to do within
         | BLAST is implement a new Executor and all the
         | scheduling/planning/resource management stays the same.
        
           | anxman wrote:
            | I was a little unclear at first, but after looking at the
            | source code it looks like Blast uses Browser Use, which uses
            | your local browser (in dev) under the hood.
        
       | badmonster wrote:
       | How does BLAST handle browser instance isolation and resource
       | contention under high concurrency?
        
         | ivape wrote:
         | _resource contention under high concurrency_
         | 
         | A queue? What else can you really do. Your server is at the
         | mercy of OpenAI, so all you can do is queue up everyone's
         | requests. I don't know how many parallel requests you can send
         | out to OpenAI (infinite?), so that bottleneck is probably just
         | dependent on your server stack (how many threads).
         | 
         | There's a lot of language being thrown out here, and I'm trying
         | to see if we're using too much language to discuss basic
         | concepts.
        
           | calebhwin wrote:
            | There are definitely opportunities to parallelize. BLAST
           | exploits these with an LLM-planner and tool calls to
           | dynamically spawn/join subtasks (there's also data
           | parallelism and request hedging which further reduce
           | latency).
           | 
           | Now you are right that at some point you'll get throttled
           | either by LLM rate limits or a set budget for browser memory
           | usage or LLM cost. BLAST's scheduler is aware of these
           | constraints and uses them to effectively map tasks to
           | resources (resource=browser+LLM).
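
A toy illustration of that kind of constraint-aware scheduling (hypothetical code, not BLAST's actual scheduler): a semaphore caps concurrent browser sessions, standing in for a memory budget, and tasks are refused once an LLM spend budget would be exceeded.

```python
import asyncio

class BudgetScheduler:
    def __init__(self, max_browsers, llm_budget_usd):
        self._sem = asyncio.Semaphore(max_browsers)  # browser-memory stand-in
        self.remaining = llm_budget_usd              # LLM cost budget

    async def run(self, task, est_cost):
        if est_cost > self.remaining:
            raise RuntimeError("LLM budget exhausted")
        async with self._sem:                        # bound concurrent browsers
            self.remaining -= est_cost
            return await task()

async def demo():
    sched = BudgetScheduler(max_browsers=2, llm_budget_usd=0.05)

    async def fake_task():
        await asyncio.sleep(0)                       # stand-in for browsing work
        return "done"

    return await asyncio.gather(*(sched.run(fake_task, 0.01) for _ in range(3)))

results = asyncio.run(demo())
```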
        
       | joshstrange wrote:
       | This looks really cool but wouldn't this be better as an MCP
       | server? It feels like it's mixing too many concepts and can't be
       | plugged into another system. What if I want to extend my agent to
       | use this but I already have MCP servers tied in or I'm going
       | through another OpenAI proxy-type thing? I wouldn't want to stack
       | proxies.
        
         | calebhwin wrote:
         | Great point, we are working on an MCP server implementation
         | which should address this. The main benefit of having a serving
         | engine here is to abstract away browser-LLM specific
         | optimizations like parallelism, caching, browser memory
         | management, etc. It's closer to vllm but I agree an MCP server
         | implementation will make integration easier.
         | 
         | Though ultimately I think the web needs something better than
         | MCP and we're actively working on that as well.
        
           | barbazoo wrote:
           | Looking forward to hearing more about that MCP successor
           | you're working on.
        
       | lgiordano_notte wrote:
        | Looks really cool. Curious how you're handling action
        | abstraction? We've found that semantically parsing the DOM to
        | extract high-level intents, like "click 'Continue'" instead of
        | "click div#xyz", helps reduce hallucination and makes agent
        | planning more robust.
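
The idea in that comment can be sketched with the stdlib HTML parser: collect the visible text of clickable elements so an agent can be told "click 'Continue'" rather than handed a raw selector.

```python
from html.parser import HTMLParser

class IntentExtractor(HTMLParser):
    """Map visible button/link labels to (tag, id) targets."""
    def __init__(self):
        super().__init__()
        self.actions = {}
        self._open = None

    def handle_starttag(self, tag, attrs):
        if tag in ("button", "a"):
            self._open = (tag, dict(attrs).get("id", ""))

    def handle_data(self, data):
        if self._open and data.strip():
            self.actions[data.strip()] = self._open
            self._open = None

    def handle_endtag(self, tag):
        if tag in ("button", "a"):
            self._open = None

p = IntentExtractor()
p.feed('<div id="xyz"><button id="b1">Continue</button><a id="l1">Cancel</a></div>')
# p.actions == {"Continue": ("button", "b1"), "Cancel": ("a", "l1")}
```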
        
       | mtrovo wrote:
        | I don't work close to LLM APIs, so I'm not sure what exactly the
        | use case is here. Is it something that could be adapted to work
        | as a deep-research feature on a custom product?
        
       | xena wrote:
       | How do I block your service? Do you read robots.txt and have an
       | identifiable user agent?
        
         | calebhwin wrote:
         | Good point, we should probably integrate that. Feel free to
         | submit a PR!
         | 
         | BLAST can also be used to add automation to your own site/app
         | FWIW.
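
For reference, the robots.txt check suggested in this subthread is only a few lines with the stdlib (the "BLAST" user-agent token is an assumption; the project hasn't announced one):

```python
from urllib import robotparser

def allowed(robots_txt, agent, url):
    # Parse an already-fetched robots.txt body and check one URL against it.
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

rules = "User-agent: BLAST\nDisallow: /private/\n"
ok_public = allowed(rules, "BLAST", "https://example.com/public/")     # True
ok_private = allowed(rules, "BLAST", "https://example.com/private/x")  # False
```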
        
         | pal9000i wrote:
          | The whole point of AI browser automation is mimicking human
          | behaviour and fighting anti-bot detection systems. If the point
          | were just interacting with systems, we'd be using APIs.
        
         | diggan wrote:
         | Seems Blast uses browser-use (https://github.com/browser-
         | use/browser-use) which seems to be some client specifically for
         | AIs to connect to/run browser runtimes.
         | 
         | Unfortunately, it seems like browser-use tries to hide that
         | it's controlled by an AI, and uses a typical browser user-
         | agent: https://github.com/browser-use/browser-
         | use/blob/d8c4d03d9ea9...
         | 
          | I'm guessing that, because of the number of flags, you could
          | probably come up with a unique fingerprint for browser-use,
          | based on available features, screen/canvas size, and so on,
          | that could be reused to block everyone using Blast/browser-use.
         | 
         | If calebhwin wanted to make Blast easier to identify, they
         | could set a custom user-agent for browser-use that makes it
         | clear it's Blast doing the browsing for the user.
        
           | ATechGuy wrote:
           | Can browser-use be blocked using Anubis or other anti-bot
           | measures?
        
       | pal9000i wrote:
        | Great work! I just tried it and Google immediately captcha'd me
        | on the first attempt. Is it using Playwright or Patchright?
        | Patchright using Chrome rather than Chromium is more robust.
        
         | pal9000i wrote:
          | Also, any plans to add a remote browser control feature? For
          | human-in-the-loop tasks, for example advanced captcha solving
          | and troubleshooting tasks that get stuck.
        
       | grahamgooch wrote:
        | Interesting. Could I use this to automate testing of massive web
        | applications (100s of screens)? And potentially load test?
        
         | diggan wrote:
         | > And potentially load test?
         | 
          | You wanna load test the local DOM rendering, or what?
          | Otherwise, whatever endpoint is serving the HTML, you configure
          | your load tests to hit that, if anything. Although you'd just
          | be doing the same testing your HTTP server team probably
          | already does before releases; usually you wanna load test your
          | underlying APIs or similar instead.
        
       ___________________________________________________________________
       (page generated 2025-05-02 23:00 UTC)