[HN Gopher] Managing context on the Claude Developer Platform
       ___________________________________________________________________
        
       Managing context on the Claude Developer Platform
        
       Author : benzguo
       Score  : 173 points
       Date   : 2025-10-05 05:20 UTC (17 hours ago)
        
 (HTM) web link (www.anthropic.com)
 (TXT) w3m dump (www.anthropic.com)
        
       | mingtianzhang wrote:
       | Edited version:
       | 
        | We're trying to solve a similar problem: putting long documents
        | in context. We built an MCP server for Claude that lets you put
        | long PDFs in your context window, beyond the context limits:
        | https://pageindex.ai/mcp.
        
         | Szpadel wrote:
          | this is about a different thing: maintaining knowledge during
          | long execution. What is IMO exciting is long-term memory in
          | things like Claude Code, where the model could learn your
          | preferences as you collaborate. (There is already some
          | hard-disabled implementation of this in CC.)
        
         | derleyici wrote:
         | Just a heads-up: HN folks value transparency, so mentioning if
         | it's yours usually builds more trust.
        
           | mingtianzhang wrote:
           | Thanks for the reminder, I have edited the comment.
        
       | siva7 wrote:
       | so this is what claude code 2 uses under the hood? at least i got
       | the impression it stays much better on track than the old version
        
         | npace12 wrote:
          | it doesn't use it yet
        
       | 0wis wrote:
        | That's powerful. Most of the differences I can see between
        | AI-generated output and human output come from the "broad but
        | specific" context of the task: company culture, organization
        | rules and politics, the larger team's focus and way of working.
        | It may take time to build the required knowledge bases, but it
        | must be worth it.
        
       | andrewstuart wrote:
       | Hopefully one day Anthropic will allow zipfile uploads like
       | ChatGPT and Gemini have allowed for ages.
        
       | crvdgc wrote:
       | Nice. When using OpenAI Codex CLI, I find the /compact command
       | very useful for large tasks. In a way it's similar to the context
       | editing tool. Maybe I can ask it to use a dedicated directory to
       | simulate the memory tool.
        
         | EnPissant wrote:
         | Claude Code already compacts automatically.
        
           | crvdgc wrote:
           | I believe Codex CLI also auto compacts when the context limit
           | is met, but in addition to that, you can manually issue a
           | /compact command at any time.
        
             | brulard wrote:
              | Claude Code has had this /compact command for a long time;
              | you can even specify your preferences for compaction after
              | the slash command. But it is quite limited, and to get the
              | best results out of your agent you need to do more than
              | rely on how the tool decides to prune your context. I ask
              | it explicitly to write down the important parts of our
              | conversation into an md file, and I review and iterate on
              | the doc until I'm happy with it. Then I /clear the context
              | and give it instructions to continue based on the MD doc.
        
           | mritchie712 wrote:
           | CC also has the same `/compact` command if you want to force
           | it
        
             | _joel wrote:
             | /clear too
        
             | danielbln wrote:
             | /compact accepts parameters, so you can tell it to focus on
             | something specific when compacting.
        
       | _pdp_ wrote:
       | Interestingly we rolled out a similar feature recently.
        
         | visarga wrote:
         | I am working on a World Atlas based approach to computer use
         | agents. If the task and app environment are reused, building an
         | atlas of states and policies might be better than observe-plan-
         | execute. We don't rediscover from scratch how an app works
         | every time we use it.
        
           | _pdp_ wrote:
           | Do you have any links where I can read more about your
           | approach?
        
       | deepdarkforest wrote:
        | Context editing is interesting because most agents work on the
        | assumption that the KV cache is the most important thing to
        | optimise, and are very hesitant to remove parts of the context
        | during work. Removing parts also sometimes introduces
        | hallucinations, because the rest of the context was written
        | assuming that e.g. the tool results are still there, but they're
        | not. Manus describes this [0]. E.g., read file A, make changes
        | to A, then prompt for some more changes. If you now remove the
        | "read file A" tool results, not only do you break the cache, but
        | in my own agent implementations (on GPT-5 at least) the model
        | can now hallucinate, since my prompts all naturally point to the
        | tool content still being there.
        | 
        | Plus, the model was trained and RLed with a continuous context,
        | unless they now also tune it with edited contexts.
       | 
       | https://manus.im/blog/Context-Engineering-for-AI-Agents-Less...
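        | 
        | A minimal sketch of the kind of pruning discussed above, over
        | plain Anthropic-style message dicts (the stub text and cutoff
        | are arbitrary, not anything Manus or Anthropic prescribe); note
        | that the prompt-cache prefix is only valid up to the earliest
        | block you touch:
        | 
        |     STUB = "[tool result removed to save context]"
        | 
        |     def prune_tool_results(messages, keep_last=3):
        |         """Stub out all but the last keep_last tool results
        |         (mutates the message list in place)."""
        |         results = [
        |             block
        |             for msg in messages if msg["role"] == "user"
        |             for block in msg.get("content", [])
        |             if isinstance(block, dict)
        |             and block.get("type") == "tool_result"
        |         ]
        |         for block in results[:-keep_last]:
        |             # everything from the earliest edited block onward
        |             # falls out of the prompt-cache prefix
        |             block["content"] = STUB
        |         return messages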
        
         | javier2 wrote:
         | We often talk about "hallucinations" like it is its own thing,
         | but is there really anything different about it from the LLM's
         | normal output?
        
           | veidr wrote:
           | AFAICT, no. I think it just means "bad, unhelpful output" but
           | isn't fundamentally different in any meaningful way from
           | their super-helpful top-1% outputs.
           | 
           | It's kind of qualitatively different from the human
           | perspective, so not a useless concept, but I think that is
           | mainly because we can't help anthropomorphizing these things.
        
         | blixt wrote:
          | Yes, we had the same issue with our coding agent. We found
          | that instead of replacing large tool results in the context,
          | it was sometimes better to have two agents: one long-lived
          | agent that only sees small tool results, produced by another
          | short-lived agent that actually reads and edits the large
          | chunks. The downside is that you always have to manage which
          | agent gets what context, and you also add a bit of latency and
          | cost (slightly less reuse of the prompt cache).
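          | 
          | A rough sketch of that split, assuming a hypothetical
          | run_agent(system, user) -> str helper rather than any
          | specific SDK:
          | 
          |     def read_with_subagent(path, question):
          |         """This sub-agent sees the whole file; only its
          |         short answer reaches the main agent's context."""
          |         big_blob = open(path).read()  # huge tool result
          |         prompt = (f"File {path}:\n{big_blob}\n\n"
          |                   f"Question: {question}")
          |         # run_agent() is an assumed helper, not a real SDK
          |         return run_agent(system="Answer tersely.",
          |                          user=prompt)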
        
           | swader999 wrote:
              | I found that having sub-agents just for running and
              | writing unit tests solved over 90% of my context woes.
        
             | brulard wrote:
              | This sounds like a good approach, I need to try it. I had
              | good results using context7 in a specialized docs agent. I
              | wasn't able to figure out how to limit an MCP to a
              | subagent; it's likely not supported.
        
             | daxfohl wrote:
             | Seems like that could be a job local LLMs do fairly well
             | soon; not a ton of reasoning, just a basic ability to
             | understand functions and write fairly boilerplate code, but
             | it involves a ton of tokens, especially if you have lots of
             | verbose output from a test run. So doing it locally could
             | end up being a huge cost savings as well.
        
       | yunohn wrote:
       | Why are both this new Memory API and the Filesystem as (evolving)
       | Context releases only for the Developer API - but not integrated
       | into Claude Code?
        
         | kylegalbraith wrote:
         | I had this same thought. I'm not entirely following how I'm to
         | differentiate between these two things. I guess the API is to
         | create my own CC type agent. But I've heard of folks creating
         | agents that are CC based as well.
        
         | ec109685 wrote:
         | How do you know it's not integrated into Claude code yet?
        
       | simianwords wrote:
        | I'm trying to understand which part of this we could not have
        | hacked together already as clients. Maybe the new Sonnet is
        | RL'ed to be able to use these memories in a better way?
        
       | olliem36 wrote:
        | At Zenning AI we're building a generalist AI designed to replace
        | entire jobs with just prompts. Our agents typically run
        | autonomously for hours, so effective context management is
        | critical. I'd say we invest most of our engineering effort into
        | what is ultimately context management, such as:
       | 
        | 1. Multi-agent orchestration
        | 
        | 2. Summarising and chunking large tool and agent responses
        | 
        | 3. Passing large context objects by reference between agents
        | and tools (sketched below)
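        | 
        | A minimal sketch of item 3, with made-up store and helper names
        | (in practice the store might be files, S3 or a database):
        | 
        |     import uuid
        | 
        |     BLOB_STORE: dict[str, str] = {}
        | 
        |     def put_blob(text: str) -> str:
        |         """Store a large object; agents and tools pass
        |         around only the small reference string."""
        |         ref = f"blob://{uuid.uuid4().hex[:8]}"
        |         BLOB_STORE[ref] = text
        |         return ref
        | 
        |     def get_blob(ref: str, max_chars: int = 2000) -> str:
        |         """A tool resolves the reference only when needed,
        |         and can truncate before it hits the context."""
        |         return BLOB_STORE[ref][:max_chars]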
       | 
       | Two things to note that might be interesting to the community:
       | 
        | Firstly, when managing context, I recommend adding some evals to
        | your context management flow, so you can measure effectiveness
        | as you make improvements and changes.
        | 
        | For example, our evals will measure the impact of using
        | Anthropic's memory over time, allowing our team to make
        | better-informed decisions on which tools to use with our agents.
       | 
       | Secondly, there's a tradeoff not mentioned in this article: speed
       | vs. accuracy. Faster summarisation (or 'compaction') comes at a
       | cost of accuracy. If you want good compaction, it can be slow.
       | Depending on the use case, you should adjust your compaction
        | strategy accordingly. For example (forgive the major
        | generalisation), for consumer-facing products speed is usually
        | preferred over a bump in accuracy, whereas in business settings
        | accuracy is generally preferred over speed.
        
         | _joel wrote:
         | lol, good luck with that
        
       | iamblessed_ wrote:
       | I want to really get into anthropic.
       | 
       | For context: I have background in CV and ML in general. Currently
       | reviewing and revising RL.
       | 
       | Any idea how I can get into RL?
       | 
       | I have 3 years of industry/research experience.
       | 
        | Whenever I see a post like this, it triggers massive FOMO,
        | creating a sense of urgency that I should be working on these
        | problems.
        | 
        | Not being able to work there is making me anxious.
       | 
        | What does it take for someone in a non-US/non-EU region to get
        | into big labs such as these?
        | 
        | Do I really have to pursue a PhD? I am already old enough that
        | pursuing a PhD is a huge burden I can't afford.
        
         | barrenko wrote:
          | The pace is so fast that if you have FOMO you've most probably
          | already missed out. If you're interested in LLM-flavored RL,
          | I'd suggest the prime-rl community (and their Discord), the
          | Hugging Face RL courses with smol (you'll need Pro and to burn
          | a couple of bucks), etc. etc.
        
           | iamblessed_ wrote:
            | Willing to burn a few bucks here and there for the projects.
            | 
            | I really need to get my hands dirty here. I remember taking
            | an RL course on Coursera during COVID in 2020, but I didn't
            | have the chance to apply it to the problems I worked on
            | post-COVID.
            | 
            | But I really want to start doing RL again. I'm interested in
            | world models and simulation for RL.
        
             | barrenko wrote:
              | If you're good on theory (meaning able to read theory) but
              | interested in non-LLM stuff, pufferlib and its programmer
              | also seem quite accessible (if you haven't come across it
              | already).
        
         | siva7 wrote:
         | I'll address something else: Fomo is usually a symptom of an
         | underlying deeper issue. Long before Anthropic/Openai, we had
         | these same posts about people desperately wanting to get into
          | Google. They got unhealthily obsessed with this goal, prepping
          | for months, even years, documenting their journey on blogs,
          | only to get rejected by someone who spent 2 seconds on their
          | application. Getting in is more about luck than most people
          | realize. And once you're in, having held for so long a skewed,
          | romanticized fantasy of what it's like to work at this
          | mythical company (Anthropic isn't a startup by any stretch
          | anymore), you crash hard and realize it is a classic corporate
          | environment.
        
         | Orochikaku wrote:
         | It may be harsh but the reality is, if you have to ask you're
         | likely not a viable candidate.
         | 
         | The leading AI companies have the kind of capital to be able to
          | hire the very best in the industry. If you're only just
          | starting now, and worse yet need your hand held to do so,
          | you're totally out of the running...
        
           | iamblessed_ wrote:
            | So what would be the wise thing to do in this situation?
            | Give up hope and surrender, or start over completely and
            | work in a nascent field?
        
       | RamtinJ95 wrote:
        | I don't get it. I have been doing something similar for a month
        | with opencode. Is the new thing that the new Sonnet model is
        | fine-tuned to call these tools "better", or simply that they
        | have improved the devex for accomplishing these things?
        
         | the_mitsuhiko wrote:
          | > I don't get it. I have been doing something similar for a
          | > month with opencode. Is the new thing that the new Sonnet
          | > model is fine-tuned to call these tools "better", or simply
          | > that they have improved the devex for accomplishing these
          | > things?
         | 
          | They fine-tuned 4.5 to understand `clear_tool_uses` marker
          | tools without regressing the quality of future responses. You
          | will however still pay the cache-invalidation hit, so it would
          | take some evaluation to see how much this actually helps.
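          | 
          | For reference, a minimal sketch of what opting into this looks
          | like with the Python SDK. The beta flag, field names and model
          | id are as I recall them from the announcement and may be
          | wrong; check the current docs:
          | 
          |     import anthropic
          | 
          |     client = anthropic.Anthropic()
          | 
          |     # strategy / field names roughly as announced
          |     ctx_mgmt = {
          |         "edits": [{
          |             "type": "clear_tool_uses_20250919",
          |             # keep the most recent few tool results
          |             "keep": {"type": "tool_uses", "value": 3},
          |         }]
          |     }
          | 
          |     resp = client.beta.messages.create(
          |         model="claude-sonnet-4-5",  # assumed model id
          |         max_tokens=1024,
          |         messages=messages,  # your running conversation
          |         tools=tools,
          |         betas=["context-management-2025-06-27"],
          |         extra_body={"context_management": ctx_mgmt},
          |     )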
        
       | Razengan wrote:
       | I just want Anthropic to release a way to remove my payment
       | method from Claude before the eventual data breach.
        
         | Aeolun wrote:
         | Don't they use Stripe? You can remove your payment method at
         | any time. But Anthropic doesn't really have your info in the
         | first place.
        
           | Razengan wrote:
           | How? First I wanted to just use iOS In-App-Purchases like
           | ChatGPT and Grok etc all support, but Anthropic has this
           | "We're too special" syndrome and don't offer IAP so I had to
           | sign in on the website (and guess what, iOS app supports Sign
           | In With Apple but the website doesn't) and add my card... but
           | no way to remove the card later.
           | 
           | Steam and others have figured it out, but Anthropic/Discord
           | (who just had a breach like yesterday) still don't let you
           | remove your payment info.
        
             | mh- wrote:
             | My Claude subscription is through iOS IAPs.
        
               | Razengan wrote:
               | I could swear I didn't even see the option for an IAP on
               | iOS... maybe it's region-locked?
        
               | mh- wrote:
               | Maybe, but I wouldn't suggest doing it even if you figure
               | it out. It has some really annoying side effects on
                | subscription management for Claude, and additionally
                | they mark up the plan cost by 20%, which I didn't care
                | about on the $20 Pro plan but do with the $200 plans.
               | 
               | I haven't figured out* how to switch to a direct
               | subscription other than cancelling and resubscribing, and
               | I'm afraid of messing up my account access.
               | 
               | * Caveat that I haven't spent more than 20 minutes trying
               | to solve this.
        
       | blixt wrote:
        | From what I can tell, the new context editing and memory APIs
        | are essentially formalizations of common patterns:
        | 
        | Context editing: replace tool call results in message history
        | (i.e. replace a file output with an indicator that it's no
        | longer available).
        | 
        | Memory: give the LLM access to read and write .md files, like a
        | virtual file system.
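        | 
        | A minimal file-backed sketch of that memory pattern; the
        | command names here are illustrative, not necessarily
        | Anthropic's exact tool schema:
        | 
        |     from pathlib import Path
        | 
        |     MEMORY_ROOT = Path("./memories")  # agent's scratch space
        | 
        |     def handle_memory_command(cmd: dict) -> str:
        |         """Client-side handler for a memory tool call."""
        |         path = MEMORY_ROOT / cmd["path"].lstrip("/")
        |         if cmd["command"] == "view":
        |             return path.read_text() if path.exists() else ""
        |         if cmd["command"] == "create":
        |             path.parent.mkdir(parents=True, exist_ok=True)
        |             path.write_text(cmd["file_text"])
        |             return f"wrote {path}"
        |         if cmd["command"] == "delete":
        |             path.unlink(missing_ok=True)
        |             return f"deleted {path}"
        |         return f"unsupported command: {cmd['command']}"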
       | 
        | I feel like these formalizations of tools are on the path
        | towards managing message history on the server, which means
        | stronger vendor lock-in but not necessarily a big boon to the
        | user of the API (well, bandwidth and latency will improve). I
        | see the ChatGPT Responses API going down a similar path, and
        | together these changes will make it harder to swap transparently
        | between providers, something I enjoy having the ability to do.
        
         | mkagenius wrote:
         | > managing message history on the server, which means better
         | vendor lock in
         | 
         | I feel that managing context should be doable with a non-SOTA
         | model even locally. Just need a way to select/deselect messages
         | from the context manually say in Claude-CLI.
        
       | qwertox wrote:
        | I wish every instruction and response had an enable/disable
        | checkbox, so that I could disable parts of the conversation and
        | have them excluded from the context.
       | 
       | Let's say I submit or let it create a piece of code, and we're
        | working on improving it. At some point I consider the piece of
        | code to be significantly better than what I had initially, so
        | all those earlier interactions containing old code could be
        | removed from the context.
       | 
       | I like how Google AI Studio allows one to delete sections and
        | they are then no longer part of the context. That's not possible
        | in Claude, ChatGPT or Gemini; I think there you can only delete
        | the last response.
       | 
       | Maybe even AI could suggest which parts to disable.
        
         | diggan wrote:
          | I kind of do this semi-manually when using the web chat UIs
          | (which happens less and less). I basically never let a
          | conversation go above two messages in total (one message from
          | me + one reply, since the quality of responses goes down so
          | damn quickly), and if anything is wrong, I restart the
          | conversation and fix the initial prompt so it gets it right.
          | And rather than manually writing my prompts in the web UIs, I
          | manage them with http://github.com/victorb/prompta, which
          | makes it trivial to edit the prompts as I work out the best
          | way of getting the response I want, together with some simple
          | shell integrations to automatically include logs, source
          | code, docs and whatnot.
        
           | tortilla wrote:
           | I work similarly. I keep message rounds short (1-3) and clear
           | often. If I have to steer the conversation too much, I start
           | over.
           | 
           | I built a terminal tui to manage my contexts/prompts:
           | https://github.com/pluqqy/pluqqy-terminal
        
         | Jowsey wrote:
         | Related, it feels like AI Studio is the only mainstream LLM
         | frontend that treats you like an adult. Choose your own safety
         | boundaries, modify the context & system prompt as you please,
         | clear rate limits and pricing, etc. It's something you come to
         | appreciate a lot, even if we _are_ in the part of the cycle
          | where Google's models aren't particularly SOTA rn
        
           | mirsadm wrote:
            | How are they not SOTA? They're all very similar, with
            | ChatGPT being the worst (for my use case anyway), like
            | adding lambdas and random C++ function calls into my Vulkan
            | shaders.
        
             | oezi wrote:
              | Gemini 2.5 Pro is the most capable for my use case in
              | PyTorch as well. Large context and much better instruction
             | following for code edits make a big difference.
        
             | hendersoon wrote:
             | Gemini 2.5 pro is generally non-competitive with
             | GPT-5-medium or Sonnet 4.5.
             | 
             | But never fear, Gemini 3.0 is rumored to be coming out
             | Tuesday.
        
               | dingnuts wrote:
               | based on what? LLM benchmarks are all bullshit, so this
               | is based on... your gut?
               | 
               | Gemini outputs what I want with a similar regularity as
               | the other bots.
               | 
               | I'm so tired of the religious thinking around these
               | models. show me a measurement.
        
               | samtheprogram wrote:
               | > LLM benchmarks are all bullshit
               | 
               | > show me a measurement
               | 
               | Your comment encapsulates why we have religious thinking
               | around models.
        
               | NuclearPM wrote:
               | Please tell me this comment is a joke.
        
               | kingstnap wrote:
               | The random people tweets I've seen said Oct 9th which is
               | Thursday. I suppose we will know when we know.
        
           | conception wrote:
           | Not sure if msty counts as mainstream but it has so many
           | quality of life enhancements it's bonkers.
        
         | nurettin wrote:
         | They introduced removing from the end of the stack but not the
         | beginning
        
         | james_marks wrote:
         | I exit and restart CC all the time to get a "Fresh perspective
         | on the universe as it now is".
        
           | brulard wrote:
           | Isn't /clear enough to do that? I know some permissions
           | survive from previous sessions, but it served me well
        
             | james_marks wrote:
             | The one time I tried I felt like /clear may have dropped
             | all my .claude files as well, but I didn't look at it
             | closely.
        
           | rapind wrote:
           | I do as well, with Codex though, but OP is asking for more
           | fine grained control of what's in context and what can be
           | thrown away.
           | 
           | You can simulate this of course by doing the reverse and
            | maintaining explicit memory via markdown files or w/e of
           | what you want to keep in context. I could see wanting both,
           | since a lot of the time it would be easier to just say
           | "forget that last exploration we did" while still having it
           | remember everything from before that. Think of it like an
           | exploratory twig on a branch that you don't want to keep.
           | 
           | Ultimately I just adapt by making my tasks smaller, using git
           | branches and committing often, writing plans to markdown,
           | etc.
        
         | rmsaksida wrote:
         | > I like how Google AI Studio allows one to delete sections and
         | they are then no longer part of the context. Not possible in
         | Claude, ChatGPT or Gemini, I think there one can only delete
         | the last response.
         | 
         | I have the same peeve. My assumption is the ability to freely
         | edit context is seen as not intuitive for most users - LLM
         | products want to keep the illusion of a classic chat UI where
         | that kind of editing doesn't make sense. I do wish ChatGPT & co
         | had a pro or advanced mode that was more similar to Google AI
         | Studio.
        
           | Curzel wrote:
           | /compact does most of that, for me at least
           | 
           | /compact we will now work on x, discard y, keep z
        
             | bdangubic wrote:
             | the trouble with compact is that no one _really_ knows how
             | it works and what it does. hence, for me at least, there is
             | just no way I would _ever_ allow my context to get there.
                | you should seriously reconsider ever using compact (I
                | mean this literally) - the quality of CC at that point
                | is _an order of magnitude worse_, and you are doing
                | yourself a significant disservice.
        
               | staticautomatic wrote:
               | You mean you never stay in a CC session long enough to
               | even see the auto compaction warning?
        
               | bdangubic wrote:
               | 100% - read this -
               | https://blog.nilenso.com/blog/2025/09/15/ai-unit-of-work/
        
         | _boffin_ wrote:
         | I'll be releasing something shortly that does this, plus more.
        
         | wrs wrote:
         | The SolveIt tool [0] has a simple but brilliant feature I now
         | want in all LLM tools: a fully editable transcript. In
         | particular, you can edit the previous LLM responses. This lets
         | you fix the lingering effect of a bad response without having
         | to back up and redo the whole interaction.
         | 
         | [0] https://news.ycombinator.com/item?id=45455719
        
       | mdrzn wrote:
        | I noticed this in the Claude Code interface: I reached "8%
        | context left", but after giving it a huge prompt the warning
        | disappeared, and it kept working for another 20 minutes before
        | again reaching "10% context left", and it never had to compact
        | the conversation history. 10/10, great feature.
        
       | CuriouslyC wrote:
        | > Enable longer conversations by automatically removing stale
        | > tool results from context
        | 
        | > Boost accuracy by saving critical information to memory--and
        | > bring that learning across successive agentic sessions
       | 
       | Funny, I was just talking about my personal use of these
        | techniques recently (tool output summarization/abliteration with
        | a memory backend). This isn't something that needs to be Claude
        | Code specific though; you can 100% implement this with tool
       | wrappers.
       | 
        | I've been doing this for a bit; dropping summarized old tool
       | output from context is a big win, but it's still level ~0 context
       | engineering. It'll be interesting to see which of my tricks they
       | figure out next.
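        | 
        | A bare-bones sketch of such a tool wrapper, assuming some
        | summarize(text) helper (an LLM call or plain truncation) and
        | made-up paths:
        | 
        |     import hashlib
        |     from pathlib import Path
        | 
        |     ARCHIVE = Path("./tool_output_archive")
        |     ARCHIVE.mkdir(exist_ok=True)
        | 
        |     def wrap_tool(tool_fn, max_chars=1500):
        |         """Archive full output; hand the agent a summary
        |         plus a key it can use to fetch the original."""
        |         def wrapped(*args, **kwargs):
        |             out = tool_fn(*args, **kwargs)
        |             if len(out) <= max_chars:
        |                 return out
        |             key = hashlib.sha1(out.encode()).hexdigest()[:12]
        |             (ARCHIVE / f"{key}.txt").write_text(out)
        |             # summarize() is an assumed helper
        |             return (f"[full output archived as {key}]\n"
        |                     + summarize(out))
        |         return wrapped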
        
       | hendersoon wrote:
        | I wish Claude Code supported the new memory tool. The difference
        | is that CLAUDE.md is always in your active context, while the
        | new memory stuff is essentially local RAG.
        
       | airocker wrote:
        | Would this mean Cursor and Cline don't have to do context
        | management? Is their value much more just in the UI now?
        
       | ChicagoDave wrote:
       | This is hilarious. It's like they took my usage pattern and made
       | it native. Love it.
        
       | gtsop wrote:
        | I'll let them (the AI companies) figure it out while I keep
        | coding manually.
        | 
        | I'll wait for the day they release a JS library to
        | programmatically do all this LLM context juggling, instead of
        | using a UI, and then I will adopt it by doing what I do now:
        | writing code.
        | 
        | I will write code that orchestrates LLMs to write code.
       | 
       | Edit: This is obviously a joke... but is it really a joke?
        
       ___________________________________________________________________
       (page generated 2025-10-05 23:00 UTC)