[HN Gopher] SWE-Grep and SWE-Grep-Mini: RL for Fast Multi-Turn C...
       ___________________________________________________________________
        
       SWE-Grep and SWE-Grep-Mini: RL for Fast Multi-Turn Context
       Retrieval
        
       Author : meetpateltech
       Score  : 70 points
       Date   : 2025-10-16 16:59 UTC (6 hours ago)
        
 (HTM) web link (cognition.ai)
 (TXT) w3m dump (cognition.ai)
        
       | marstall wrote:
        | SWE-1 has been getting booped up to me by WindSurf lately and
        | I've been impressed - often (enough?) getting me the same
        | answers as GPT5 etc., but almost instantly. Gotta say, speed is
        | nice.
        
         | swyx wrote:
         | nice, what does booped up mean? is this gen z lingo?
        
           | marstall wrote:
           | ha more like how i talk to my two year old. WindSurf's
           | Cascade sidebar tool (which i use in RubyMine) has a stable
           | of LLMs and it somewhat randomly switches the active one out
           | from time to time. So I get a taste of what different ones
           | are like, it's kind of cool.
        
       | tifa2up wrote:
       | Searched for 'hi' and it took 166s to return a response using
       | this model: https://pasteboard.co/oB4VqVC5FGkl.png
       | 
       | Claude Code took 0.1s, Cursor CLI 19s
        
         | mgambati wrote:
         | If you ask a real question, then you might get real results.
        
       | silasalberti wrote:
       | hey I'm from the SWE-grep team - feel free to ask me any
       | questions :)
        
         | daralthus wrote:
         | this would be useful outside of coding. could you release a
         | benchmark so we can have more models tuned for this?
        
       | swyx wrote:
       | (coauthor) main charts/evals here
       | https://x.com/cognition/status/1978867021669413252
       | 
        | you can try it at https://playground.cognition.ai/
        | 
        | i wrote a longer explainer at
        | https://x.com/swyx/status/1978874342743343254 but i'll save you
        | the click:
       | 
       | this was a perspective cut from the blogpost, but let me explain
       | why subagents kill long context
       | 
        | Like, you could spend $500m building 100M-token-context models,
        | and they would be 1) slow, 2) expensive to use, and 3) have
        | huge context rot. O(n) is the lower bound.
       | 
        | Cog's approach is something you learn on day 1 of CS50 - divide
        | and parallelize. Embeddings are too dumb; Agentic Search is too
        | slow. So: train fast (2800 tok/s), limited-agency (max 4 turns)
        | subagents with natively parallel tool calling (avg parallelism
        | of 7-8, custom toolset) to give the performance of Agentic
        | Search within an acceptable "Flow Window" that feels
        | immaterially slower than Embeddings.
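        | 
        | Here's a minimal sketch of that loop (illustration only, not
        | Cognition's actual code - run_tool and propose_calls are
        | hypothetical stand-ins for the custom toolset and the model's
        | tool-call proposals):
        | 
        |     import asyncio
        | 
        |     MAX_TURNS = 4     # limited agency: hard cap on turns
        |     MAX_PARALLEL = 8  # parallel tool calls per turn
        | 
        |     async def run_tool(call: str) -> str:
        |         # hypothetical: run one grep/read against the repo
        |         await asyncio.sleep(0)  # stand-in for real I/O
        |         return f"result of {call}"
        | 
        |     def propose_calls(query: str, ctx: list[str]) -> list[str]:
        |         # hypothetical: the subagent proposes the next batch
        |         # of searches given the query and the context so far
        |         return [] if ctx else [f"grep {query}", f"read {query}"]
        | 
        |     async def fast_context(query: str) -> list[str]:
        |         ctx: list[str] = []
        |         for _ in range(MAX_TURNS):
        |             calls = propose_calls(query, ctx)[:MAX_PARALLEL]
        |             if not calls:  # the subagent decides it's done
        |                 break
        |             # fire the whole batch concurrently instead of
        |             # paying one round trip per tool call
        |             ctx += await asyncio.gather(*map(run_tool, calls))
        |         return ctx  # "clean" context for the main agent
        | 
        |     # e.g. asyncio.run(fast_context("auth middleware"))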
       | 
       | The benefit of this is threefold:
       | 
        | - 8^4 = 4096 possible toolcall paths cover a very large code
        | search space. can compound subagent calls if more coverage is
        | needed.
       | 
       | - predictable cost & end to end latency
       | 
        | - subagents output "clean" contexts, free of context failure
        | modes like context poisoning and context rot
       | 
       | we originally called this Rapid Agentic Search, to contrast with
       | RAG. but Fast Context rolls off the tongue better.
       | 
       | -- Second perspective --
       | 
       | The Fundamental Equation of Coding Agents is:
       | 
       | Coding Agent Performance = Ability to Read the Right Files *
       | Ability to Generate the Right Diffs
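        | 
        | (Illustrative arithmetic, not measured numbers: an agent that
        | reads the right files 80% of the time and generates the right
        | diff 90% of the time lands at ~0.8 * 0.9 = 72% end to end -
        | since the terms multiply, the weaker one caps the product.)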
       | 
        | Fast Context is Cognition's first solution for the Read. As
        | codebases get larger and tasks get more complex, Reads get more
        | important. The first query on an average production codebase in
        | Cascade is >60% just searching and reading files.
       | 
        | But if this were just about speed, it might not be that
        | exciting. I think there are underappreciated effects on
        | performance as well when you have very good context. In other
        | words:
       | 
       | Context Engineering is Actually Very Important. Too important for
       | humans and hardcoded rules.
       | 
        | The SWE-greps are the first dedicated context-engineer agent
        | models.
        
         | vessenes wrote:
         | Thanks for the summary. I noticed from the announcement you
         | trained on parallel tool calling to save on serial round
         | tripping. This is awesome.
         | 
          | Most LLM coding today is so slow that you're permanently out
          | of flow state and stuck in 'manager' state instead - I'm
          | interested in a future where you've got enough fast, low-TTFT
          | support that an
         | engineer could maintain flow state and have sort of super power
         | type productivity at the same time, and this tool makes me
         | think of that.
         | 
         | That is, it looks fast enough to be used as a sort of sidebar
         | info tool, as in "what you're coding might need / refer to
         | these other parts of the codebase" -- effectively increasing an
         | engineer's working memory. Super cool. And obviously useful for
         | an AI engineer as well. Thanks for the writeup!
        
       | ntntnt wrote:
       | lol dead thread, cognition begging to grab some traction in this
       | space.
        
       | kburman wrote:
       | I thought https://playground.cognition.ai/ was just returning
       | some cached query results, but no, they're actually spinning up
       | real VMs and running live queries without any authentication or
       | restrictions. That must be costing them a fortune.
        
         | groby_b wrote:
         | Currently, all queries are returning "We're under load and
         | processing too many requests. Please try again later."
         | 
         | So that's how that is going ;)
        
       | awsanswers wrote:
        | LLM product managers: show me what's in the context, right
        | where I am prompting. Knowing and editing the precise context
        | between requests will likely remain a user task for a long
        | time.
        
       | breadislove wrote:
       | guys please release the benchmark or the benchmark code. like
       | this is just "trust me bro"
        
         | swyx wrote:
          | well that's what the playground is for! playground.cognition.ai
        
           | breadislove wrote:
            | yeah but if people want to double-check the results it
            | would be nice to have the actual benchmark. especially
            | given that your playground is broken...
           | 
           | "We ran into an error processing your request. Please try
           | again"
        
       ___________________________________________________________________
       (page generated 2025-10-16 23:00 UTC)