[HN Gopher] Biomni: A General-Purpose Biomedical AI Agent
       ___________________________________________________________________
        
       Biomni: A General-Purpose Biomedical AI Agent
        
       Author : GavCo
       Score  : 102 points
       Date   : 2025-07-09 19:20 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | freedomben wrote:
       | Awesome! This is the type of stuff I'm most excited about with AI
       | - improvements to medical research and capabilities. AI can be
       | awesome at identifying patterns in data that humans can't, and
       | there has to be troves of data out there full of patterns that we
       | aren't catching.
       | 
       | Of course there's also the possibility of engineering new
       | drugs/treatments and things, which is also super exciting.
        
       | AIorNot wrote:
       | Very cool - passed on to my friend who is working in a CRISPR lab.
        
       | Edmond wrote:
       | This is nice, a lot of possibilities regarding AI use for
       | scientific research.
       | 
       | There is also the possibility of building intelligent workspaces
       | that could prove useful in aiding scientific research:
       | 
       | https://news.ycombinator.com/item?id=44509078
        
       | SalmoShalazar wrote:
       | Not to take away from this or its usefulness (not my intent), but
       | it is wild to me how many pieces of software of this type are
       | being developed. We're seeing endless waves of specialized
       | wrappers around LLM API calls. There's very little innovation
       | happening beyond specializing around particular niches and
       | invoking LLMs in slightly different ways with carefully directed
       | context and prompts.
        
         | gronky_ wrote:
         | I see it a bit differently - LLMs are an incredible innovation
         | but it's hard to do anything useful with them without the right
         | wrapper.
         | 
         | A good wrapper has deep domain knowledge baked into it,
         | combined with automation and expert use of the LLM.
         | 
         | It maybe isn't super innovative, but it's a bit of an art form
         | and unlocks the utility of the underlying LLM.
        
           | mrlongroots wrote:
           | Exactly.
           | 
           | To present a potential use case: there's a ridiculous and
           | massive backlog in the Indian judicial system. LLMs can be
           | let loose on the entire workflow: triage cases (simple,
           | complicated, intractable, grouped by legal principles or
           | parties), pull up related caselaw, provide recommendations,
           | throw more LLMs and more reasoning at unclear problems. Now
           | you can't do this with just a desktop and ChatGPT; you need a
           | systemic pipeline of LLM-driven workflows, but doing that
           | unlocks potentially billions of dollars of value that is
           | otherwise elusive.
        
             | lawlessone wrote:
             | >pull up related caselaw
             | 
             | Or just make some up...
        
               | mrlongroots wrote:
               | At the token layer an LLM can make things up, but not as
               | part of a structured pipeline that validates an invariant
               | that all suggestions are valid entities in the database.
               | 
               | Can Google Search hallucinate webpages?
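
A minimal sketch of the invariant described in the comment above, assuming a hypothetical in-memory case database and a hypothetical `Suggestion` record (neither comes from the thread). Any citation the model proposes is kept only if its ID resolves to a real entry; everything else is rejected before it reaches a user.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a case-law database (illustrative entries only).
CASE_DB = {
    "CASE-001": "Example v. State (illustrative entry)",
    "CASE-002": "Sample v. Union (illustrative entry)",
}


@dataclass
class Suggestion:
    case_id: str    # identifier the model claims to be citing
    rationale: str  # free-text reason produced by the model


def validate_suggestions(suggestions, db):
    """Split suggestions into those whose case_id exists in db and those that do not."""
    valid, rejected = [], []
    for s in suggestions:
        (valid if s.case_id in db else rejected).append(s)
    return valid, rejected


# Pretend this list came from an LLM; the second ID does not exist in CASE_DB.
llm_output = [
    Suggestion("CASE-001", "similar facts"),
    Suggestion("CASE-999", "made-up citation"),
]

valid, rejected = validate_suggestions(llm_output, CASE_DB)
print([s.case_id for s in valid])     # ['CASE-001']
print([s.case_id for s in rejected])  # ['CASE-999']
```

The check only guarantees that a cited case exists, not that it is relevant or correctly summarized; that would need a separate validation step.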
        
         | okdood64 wrote:
         | > We're seeing endless waves of specialized wrappers around LLM
         | API calls.
         | 
         | AFAIK, doing proper RAG is much, much more than this.
         | 
         | What's your technical background if you don't mind me asking?
        
           | SalmoShalazar wrote:
           | I'm a software engineer in the biotech space. I haven't
           | worked with RAG though, maybe I'm underestimating the
           | complexity.
        
           | agpagpws wrote:
           | I work at a top three lab. RAG is just Mumbai magic.
           | Throwaway. Hi dang.
        
             | jjtheblunt wrote:
             | What is a top three lab?
        
               | zachthewf wrote:
               | We know they don't work at OpenAI or Anthropic, but
               | beyond that have no information
        
         | epistasis wrote:
         | The application of a new technology to new fields always looks
         | like this. SQL databases become widespread, there's a wave of
         | specialized software development for business practices. The
         | internet becomes widespread, and there's a wave of SaaS solving
         | specialized use cases.
         | 
         | We are going to see the same for anything that Claude or
         | similar can't handle out of the box.
        
         | mlboss wrote:
         | By that argument every SaaS is a db wrapper
        
       | andy99 wrote:
       | I'm sure they've thought of this but curious how it fared on
       | evaluations for supporting biological threats, i.e. elevating
       | threat actor capabilities with respect to making biological
       | weapons.
       | 
       | I'm personally sceptical that LLMs can currently do this (and
       | it's based on Claude, which does test for this) but still
       | interesting to see.
        
       | deepdarkforest wrote:
       | Interesting. It's just an agent loop with access to Python exec
       | and web search as standard, BUT with 150 premade, curated tools
       | like analyze_circular_dichroism_spectra, with very specific
       | params, that each just execute a hardcoded Python function. Also
       | with easy-to-load databases that conform to the tools' standards.
       | 
       | The argument is that if you just ask Claude Code to do niche
       | biomed tasks, it will not have the knowledge to do them just by
       | searching PubMed and doing RAG on the fly, which is fair given
       | the current gen of LLMs. It's an interesting approach. They show
       | some generalization in the paper (with well-known tidy
       | datasets), but real-life data is messier, and the approach here
       | (correct me if I'm wrong) is to identify the correct tool for a
       | task, then use the generic Python exec tool to shape the data
       | into the acceptable format if needed, try the tool, and go
       | again.
       | 
       | It would be useful to use the tools just as guidance to inform
       | a generic code agent imo, but executing the "verified" hardcoded
       | tools narrows the error scope: as long as you can check your data
       | is shaped correctly, the analysis will be correct. Not sure how
       | much of an advantage this is in the long term for working with
       | proprietary datasets, but it's an interesting direction.
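
A rough sketch of the "check the shape, reshape if needed, then run the hardcoded tool" loop described in the comment above. This is not Biomni's code: the tool `summarize_absorbance`, the registry `TOOLS`, and the helper `reshape_to_float_list` (standing in for the generic Python exec step) are all hypothetical names chosen for illustration.

```python
import statistics


def summarize_absorbance(readings):
    """Hypothetical 'verified' tool: expects a non-empty list of floats."""
    return {"mean": statistics.fmean(readings), "n": len(readings)}


def is_float_list(data):
    """Shape check paired with the tool above."""
    return (
        isinstance(data, list)
        and len(data) > 0
        and all(isinstance(x, float) for x in data)
    )


# Registry mapping a tool name to (hardcoded function, input shape check).
TOOLS = {"summarize_absorbance": (summarize_absorbance, is_float_list)}


def reshape_to_float_list(raw):
    """Stand-in for the generic exec step: coerce messy input into floats."""
    if isinstance(raw, str):
        raw = raw.replace(";", ",").split(",")
    return [float(x) for x in raw if str(x).strip()]


def run_tool(name, data, max_attempts=2):
    """Run a tool only once its input passes the shape check; reshape between tries."""
    func, shape_ok = TOOLS[name]
    for _ in range(max_attempts):
        if shape_ok(data):
            return func(data)
        data = reshape_to_float_list(data)
    raise ValueError(f"could not shape input for tool {name!r}")


result = run_tool("summarize_absorbance", "0.5; 1.5,2.5")
print(result)  # {'mean': 1.5, 'n': 3}
```

The point of the design is that the analysis itself never changes; only the reshaping step is improvised, so errors are confined to whether the data was shaped correctly.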
        
       | epistasis wrote:
       | This is great, I've been on the waitlist for their website for a
       | while and am now excited to be able to try it out!
        
       | teenvan_1995 wrote:
       | I wonder if giving 150+ tools is really a good idea considering
       | context limitations. Need to check out if this works IRL.
        
         | Herring wrote:
         | There's an inner ToolRetriever, which is an LLM call that
         | selects the most relevant tools to reduce context size.
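
A toy illustration of the retrieval idea in the reply above, under a loud assumption: the real ToolRetriever is described as an LLM call, whereas this sketch swaps in plain keyword overlap so it can run offline. Apart from `analyze_circular_dichroism_spectra`, which is named elsewhere in the thread, the tool names and all descriptions are made up.

```python
def score(query, description):
    """Count words shared by the query and a tool description (stand-in for an LLM judgment)."""
    return len(set(query.lower().split()) & set(description.lower().split()))


def retrieve_tools(query, tools, top_k=2):
    """Return the names of the top_k tools whose descriptions best match the query."""
    ranked = sorted(tools, key=lambda name: score(query, tools[name]), reverse=True)
    return ranked[:top_k]


# Hypothetical registry: tool name -> short description.
TOOLS = {
    "analyze_circular_dichroism_spectra": "analyze circular dichroism spectra of a protein sample",
    "design_crispr_guides": "design crispr guide rnas for a target gene",
    "align_sequences": "align dna or protein sequences",
}

selected = retrieve_tools("design crispr guide rnas for gene knockout", TOOLS, top_k=1)
print(selected)  # ['design_crispr_guides']
```

Only the selected tools' descriptions would then be placed in the agent's prompt, which is how the context stays small even with 150+ tools registered.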
        
       | dmezzetti wrote:
       | Very interesting work!
       | 
       | If biomedical research and paper analysis are of interest to
       | you, I've been working on a set of open-source projects that
       | enable RAG over medical literature for a while.
       | 
       | PaperAI: https://github.com/neuml/paperai
       | 
       | PaperETL: https://github.com/neuml/paperetl
       | 
       | There is also this tool that annotates papers inline.
       | 
       | AnnotateAI: https://github.com/neuml/annotateai
        
       ___________________________________________________________________
       (page generated 2025-07-09 23:00 UTC)