[HN Gopher] Reverse Engineering Cursor's LLM Client
       ___________________________________________________________________
        
       Reverse Engineering Cursor's LLM Client
        
       Author : paulwarren
       Score  : 138 points
       Date   : 2025-06-07 02:59 UTC (1 day ago)
        
 (HTM) web link (www.tensorzero.com)
 (TXT) w3m dump (www.tensorzero.com)
        
       | CafeRacer wrote:
       | Soooo.... wireshark is no longer available or something?
        
         | Maxious wrote:
          | The article literally says at the end that this was just the
          | first post, about observing, before getting into actually
          | changing the responses.
         | 
         | (that being said, mitmproxy has gotten pretty good for just
         | looking lately
         | https://docs.mitmproxy.org/stable/concepts/modes/#local-capt...
         | )
        
           | spmurrayzzz wrote:
           | Yea the proxying/observability is without question the
           | simplest part of this whole problem space. Once you get into
           | the weeds of automating all the eval and prompt optimizing,
           | you realize how irrelevant wireshark actually is in the
           | feedback loop.
           | 
            | But like you, I also landed on mitmproxy after starting
            | with tcpdump/wireshark. I recently started building a tiny
            | streaming textual-gradient-based optimizer (similar to what
            | AdalFlow is doing) by parsing the mitmproxy outputs in real
            | time. Having a turnkey solution for this sort of thing will
            | definitely be valuable, at least in the near to mid term.
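            | 
            | For the realtime parsing piece, mitmproxy's addon API is
            | enough on its own -- a minimal sketch (the hosts list and
            | what you do with each request/response pair are up to you):
            | 
            |   # llm_tap.py -- run with: mitmdump -s llm_tap.py
            |   # Prints completed LLM request/response pairs.
            |   import json
            |   from mitmproxy import http
            | 
            |   LLM_HOSTS = {"api.openai.com", "api.anthropic.com"}
            | 
            |   def response(flow: http.HTTPFlow) -> None:
            |       if flow.request.host not in LLM_HOSTS:
            |           return
            |       try:
            |           req = json.loads(flow.request.get_text())
            |           resp = json.loads(flow.response.get_text())
            |       except (json.JSONDecodeError, TypeError):
            |           return  # skip SSE/streaming bodies in this sketch
            |       # hand the pair to your optimizer here instead
            |       print(req.get("model"), resp.get("usage"))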
        
             | vrm wrote:
             | if you haven't check out our repo -- it's free, fully self-
             | hosted, production-grade, and designed for precisely this
             | application :)
             | 
             | https://github.com/TensorZero/tensorzero
        
               | spmurrayzzz wrote:
                | Looks very buttoned up. My local project has some
                | features tuned for my explicit agent flows, however
                | (built directly into my inference engine), so I can't
                | really jump ship just yet.
               | 
               | Looking great so far though!
        
         | vrm wrote:
          | Wireshark would work for seeing the requests from the desktop
          | app to Cursor's servers (which make the actual LLM requests).
          | But if you're interested in what the requests to the LLMs look
          | like from Cursor's servers, you have to set something like
          | this up. Plus, this lets us modify the requests and A/B test
          | variations!
        
           | stavros wrote:
           | Sorry, can you explain this a bit more? Either you're putting
            | something between your desktop and the server (in which case
           | Wireshark would work) or you're putting something between
           | Cursor's infrastructure and their LLM provider, in which
           | case, how?
        
             | vrm wrote:
              | we're doing the latter! Cursor lets you configure the
              | OpenAI base URL, so we were able to have Cursor's servers
              | call Ngrok -> Nginx (for auth) -> TensorZero -> LLMs. We
              | explain it in detail in the blog post.
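              | 
              | The same trick works from any OpenAI-compatible client,
              | if you want to poke at the chain yourself -- a minimal
              | sketch (URL, header name, and secret are placeholders):
              | 
              |   # Point an OpenAI-compatible client at your gateway.
              |   from openai import OpenAI
              | 
              |   client = OpenAI(
              |       base_url="https://example.ngrok.app/openai/v1",
              |       api_key="unused",  # Nginx checks its own header
              |       default_headers={"X-Proxy-Auth": "shared-secret"},
              |   )
              |   resp = client.chat.completions.create(
              |       model="gpt-4o",
              |       messages=[{"role": "user", "content": "ping"}],
              |   )
              |   print(resp.choices[0].message.content)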
        
               | stavros wrote:
               | Ah OK, I saw that, but I thought that was the desktop
               | client hitting the endpoint, not the server. Thanks!
        
       | robkop wrote:
        | There is much missing from this prompt; tool call descriptors
        | are the most obvious. See for yourself using even a year-old
        | jailbreak [1]. There are some great ideas in how they've set up
        | other pieces, such as Cursor rules.
       | 
       | [1]:
       | https://gist.github.com/lucasmrdt/4215e483257e1d81e44842eddb...
        
         | ericrallen wrote:
         | Maybe there is some optimization logic that only appends tool
         | details that are required for the user's query?
         | 
         | I'm sure they are trying to slash tokens where they can, and
         | removing potentially irrelevant tool descriptors seems like
         | low-hanging fruit to reduce token consumption.
        
           | vrm wrote:
            | I definitely see different prompts based on what I'm doing in
            | the app. As we mentioned, there are different prompts
            | depending on whether you're asking questions, doing Cmd-K
            | edits, working in the shell, etc. I'd also imagine that they
            | customize the prompt by model (unobserved here, but we can
            | also customize per-model using TensorZero and A/B test).
        
           | joshmlewis wrote:
            | Yes, this is one of the techniques apps can use. You
            | vectorize the tool descriptions and then do a lookup based
            | on the user's query to select the most relevant tools; this
            | is called pre-computed semantic profiles. You can even hash
            | the queries themselves, cache the tools that were used, and
            | then do similarity lookups by query.
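            | 
            | A minimal sketch of the first idea, with a toy hashed
            | bag-of-words embed() standing in for a real embedding
            | model:
            | 
            |   import numpy as np
            | 
            |   def embed(text: str, dim: int = 256) -> np.ndarray:
            |       # Toy stand-in for a real embedding model.
            |       v = np.zeros(dim)
            |       for tok in text.lower().split():
            |           v[hash(tok) % dim] += 1.0
            |       n = np.linalg.norm(v)
            |       return v / n if n else v
            | 
            |   TOOLS = {
            |       "read_file": "Read the contents of a file",
            |       "run_shell": "Execute a shell command",
            |       "edit_file": "Apply an edit to a file",
            |   }
            |   # Pre-compute one profile per tool description.
            |   profiles = {t: embed(d) for t, d in TOOLS.items()}
            | 
            |   def select_tools(query: str, k: int = 2):
            |       q = embed(query)
            |       ranked = sorted(profiles,
            |                       key=lambda t: -(profiles[t] @ q))
            |       return ranked[:k]
            | 
            |   print(select_tools("show me the file at this path"))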
        
             | tough wrote:
             | cool stuff
        
         | GabrielBianconi wrote:
         | They use different prompts depending on the action you're
         | taking. We provided just a sample because our ultimate goal
         | here is to start A/B testing models, optimizing prompts +
         | models, etc. We provide the code to reproduce our work so you
         | can see other prompts!
         | 
         | The Gist you shared is a good resource too though!
        
         | cloudking wrote:
         | https://github.com/elder-plinius/CL4R1T4S/blob/main/CURSOR/C...
        
       | notpushkin wrote:
       | Hmm, now that we have the prompts, would it be possible to
       | reimplement Cursor servers and have a fully local ( _ahem_
       | pirated) version?
        
         | deadbabe wrote:
         | Absolutely
        
         | handfuloflight wrote:
          | Were you really waiting for the prompts before embarking on
         | this adventure?
        
         | tomr75 wrote:
         | presumably their apply model is run on their servers
         | 
          | I wonder how hard it would be to build a local apply model --
          | surely that would be faster on a MacBook
        
           | notpushkin wrote:
            | It's possible, but they allow you to specify your own API
            | base URL (that's how they got the prompts in this article).
        
         | smcleod wrote:
          | Or you could just use Cline / Roo Code, which are better for
         | agentic coding and open source anyway...
        
           | tmikaeld wrote:
           | But extremely expensive in comparison
        
       | bredren wrote:
       | Cursor and other IDE modality solutions are interesting but train
       | sloppy use of context.
       | 
       | From the extracted prompting Cursor is using:
       | 
       | > Each time the USER sends a message, we may automatically attach
       | some information about their current state...edit history in
       | their session so far, linter errors, and more. This information
       | may or may not be relevant to the coding task, it is up for you
       | to decide.
       | 
       | This is the context bloat that limits effectiveness of LLMs in
       | solving very hard problems.
       | 
        | This particular .env example illustrates the low-stakes type of
        | problem Cursor is great at solving, but it also lacks the
        | complexity that will keep SWEs employed.
       | 
        | Instead, I suggest folks working with AI start at the chat
        | interface and work on editing conversations to keep contexts
        | clean as they explore a truly challenging problem.
       | 
        | This often includes meeting and Slack transcripts, internal
        | docs, external content, and code.
       | 
       | I've built a tool for surgical use of code called FileKitty:
       | https://github.com/banagale/FileKitty and more recently
       | slackprep: https://github.com/banagale/slackprep
       | 
        | These let a person be more intentional about the problem they
        | are trying to solve by including only information relevant to
        | it.
        
         | jacob019 wrote:
         | I had this thought as well and find it a bit surprising. For my
         | own agentic applications, I have found it necessary to
         | carefully curate the context. Instead of including an
         | instruction that we "may automatically attach", only include an
          | instruction WHEN something is attached. Instead of "may or may
          | not be relevant to the coding task, it is up for you to
          | decide", provide explicit instructions to consider the
          | relevance and what to do when it is relevant and when it is
          | not.
         | When the context is short, it doesn't matter as much, but when
          | there is a difficult problem with a long context, fine-tuned
          | instructions make all the difference. Cursor may be
         | keeping instructions more generic to take advantage of cached
         | token pricing, but the phrasing does seem rather sloppy. This
         | is all still relatively new, I'm sure both the models and the
         | prompts will see a lot more change before things settle down.
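          | 
          | Concretely, something like this (a minimal sketch; the names
          | are made up):
          | 
          |   # The attachment instruction only exists when an
          |   # attachment actually exists.
          |   def build_system_prompt(task, linter_errors):
          |       parts = ["You are a coding assistant. Task: " + task]
          |       if linter_errors:
          |           parts.append(
          |               "Linter errors from the current file are "
          |               "attached. Fix any caused by your edits; "
          |               "ignore any unrelated to the task."
          |           )
          |           parts.append("Linter errors:\n"
          |                        + "\n".join(linter_errors))
          |       return "\n\n".join(parts)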
        
       | lyjackal wrote:
        | I've been curious to see the process for selecting relevant
        | context from a long conversation. Has anyone reverse engineered
        | what that looks like? How is the conversation history pruned,
        | and how is the latest state of a file represented?
        
         | GabrielBianconi wrote:
         | We didn't look into that workflow closely, but you can
         | reproduce our work (code in GitHub) and potentially find some
         | insights!
         | 
         | We plan to continue investigating how it works (+ optimize the
         | models and prompts using TensorZero).
        
       | serf wrote:
       | Cursor is the only product that I have cancelled in 20+ years due
       | to a lack of customer service response.
       | 
       | Emailed them multiple times over _weeks_ about billing questions
        | -- not a single response. These weren't like VS Code questions,
        | either -- they needed Cursor staff intervention.
       | 
       | No problem getting promo emails though!
       | 
       | The quicker their 'value' can be spread to other services the
       | better, imo. Maybe the next group will answer emails.
        
         | jjani wrote:
         | Similarly: https://github.com/getcursor/cursor/issues/1052
        
       | SbNn6uJO5wzPww wrote:
        | Doing the same with mitmproxy:
       | https://news.ycombinator.com/item?id=44213073
        
       | Rastonbury wrote:
       | Another similar look at their prompting:
       | https://news.ycombinator.com/item?id=44154962
        
       ___________________________________________________________________
       (page generated 2025-06-08 23:02 UTC)