[HN Gopher] Managing context on the Claude Developer Platform
___________________________________________________________________
Managing context on the Claude Developer Platform
Author : benzguo
Score : 173 points
Date : 2025-10-05 05:20 UTC (17 hours ago)
(HTM) web link (www.anthropic.com)
(TXT) w3m dump (www.anthropic.com)
| mingtianzhang wrote:
| Edited version:
|
| We're trying to solve a similar problem: putting long documents
| in context. We built an MCP for Claude that lets you work with
| long PDFs that go beyond the context window limits:
| https://pageindex.ai/mcp.
| Szpadel wrote:
| this is about a different thing: maintaining knowledge during
| long execution. What is IMO exciting is long-term memory in
| things like Claude Code, where the model could learn your
| preferences as you collaborate. (there is already some
| hard-disabled implementation in CC)
| derleyici wrote:
| Just a heads-up: HN folks value transparency, so mentioning if
| it's yours usually builds more trust.
| mingtianzhang wrote:
| Thanks for the reminder, I have edited the comment.
| siva7 wrote:
| so this is what claude code 2 uses under the hood? at least i got
| the impression it stays much better on track than the old version
| npace12 wrote:
| it doesn't use it yet
| 0wis wrote:
| That's powerful. Most of the differences I can see between AI-
| generated output and human output come from the "broad but
| specific" context of the task: company culture, organization
| rules and politics, larger team focus and way of working. It
| may take time to build the required knowledge bases, but it
| must be worth it.
| andrewstuart wrote:
| Hopefully one day Anthropic will allow zipfile uploads like
| ChatGPT and Gemini have allowed for ages.
| crvdgc wrote:
| Nice. When using OpenAI Codex CLI, I find the /compact command
| very useful for large tasks. In a way it's similar to the context
| editing tool. Maybe I can ask it to use a dedicated directory to
| simulate the memory tool.
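|
| A minimal sketch of that simulation (hypothetical helpers, not
| Codex CLI's actual memory tool): one markdown file per memory
| entry in a dedicated directory.
|
|     import pathlib
|
|     MEMORY_DIR = pathlib.Path(".agent-memory")  # hypothetical location
|     MEMORY_DIR.mkdir(exist_ok=True)
|
|     def memory_write(name: str, content: str) -> None:
|         # One markdown file per remembered fact or preference.
|         (MEMORY_DIR / f"{name}.md").write_text(content)
|
|     def memory_read(name: str) -> str:
|         path = MEMORY_DIR / f"{name}.md"
|         return path.read_text() if path.exists() else ""
|
|     def memory_list() -> list[str]:
|         # Let the agent discover what it has written before.
|         return sorted(p.stem for p in MEMORY_DIR.glob("*.md"))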
| EnPissant wrote:
| Claude Code already compacts automatically.
| crvdgc wrote:
| I believe Codex CLI also auto compacts when the context limit
| is met, but in addition to that, you can manually issue a
| /compact command at any time.
| brulard wrote:
| Claude Code has had this /compact command for a long time; you
| can even specify your preferences for compaction after the
| slash command. But this is quite limited, and to get the best
| results out of your agent you need more than relying on how
| the tool decides to prune your context. I ask it explicitly to
| write down the important parts of our conversation into an md
| file, and I review and iterate on the doc until I'm happy with
| it. Then I /clear the context and give it instructions to
| continue based on the md doc.
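|
| Illustratively, the flow looks something like this (the file
| name is arbitrary):
|
|     > Write the key decisions and open tasks from this
|       conversation into HANDOFF.md for my review.
|     (review and iterate on HANDOFF.md)
|     /clear
|     > Read HANDOFF.md and continue from where we left off.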
| mritchie712 wrote:
| CC also has the same `/compact` command if you want to force
| it
| _joel wrote:
| /clear too
| danielbln wrote:
| /compact accepts parameters, so you can tell it to focus on
| something specific when compacting.
| _pdp_ wrote:
| Interestingly we rolled out a similar feature recently.
| visarga wrote:
| I am working on a World Atlas based approach to computer use
| agents. If the task and app environment are reused, building an
| atlas of states and policies might be better than observe-plan-
| execute. We don't rediscover from scratch how an app works
| every time we use it.
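|
| A toy sketch of the idea (a hypothetical structure, just to
| illustrate): fingerprint each app state and cache the action
| that previously worked there, falling back to planning only on
| unknown states.
|
|     from dataclasses import dataclass, field
|     from typing import Callable
|
|     @dataclass
|     class AppAtlas:
|         # state fingerprint -> action that previously succeeded
|         policies: dict[str, str] = field(default_factory=dict)
|
|         def act(self, state_fp: str, plan: Callable[[str], str]) -> str:
|             if state_fp in self.policies:
|                 return self.policies[state_fp]   # reuse, no re-planning
|             action = plan(state_fp)              # observe-plan-execute fallback
|             self.policies[state_fp] = action
|             return action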
| _pdp_ wrote:
| Do you have any links where I can read more about your
| approach?
| deepdarkforest wrote:
| Context editing is interesting because most agents work on the
| assumption that KV cache is the most important thing to
| optimise, and are very hesitant to remove parts of the context
| during work. It can also introduce hallucinations, because
| later parts of the context are written on the assumption that
| e.g. tool results are still there, but they're not; Manus has
| written about this [0]. E.g., read file A, make changes to A,
| then prompt for some more changes. If you now remove the "read
| file A" tool result, not only do you break the cache, but in
| my own agent implementations (on GPT-5 at least) the model can
| now hallucinate, since my prompt etc. all naturally point to
| the tool content still being there.
|
| Plus, the model was trained and RLed with a continuous
| context, unless they now tune it on edited contexts as well.
|
| https://manus.im/blog/Context-Engineering-for-AI-Agents-Less...
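|
| One mitigation (a sketch, not Manus's or Anthropic's actual
| implementation) is to replace cleared results with an explicit
| placeholder instead of dropping them silently, so later turns
| are told the content is gone and must be re-fetched:
|
|     def clear_old_tool_results(messages: list[dict], keep_last: int = 3) -> list[dict]:
|         # Indices of tool-result messages, oldest first.
|         tool_idx = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
|         for i in (tool_idx[:-keep_last] if keep_last else tool_idx):
|             messages[i] = {
|                 "role": "tool",
|                 "content": "[tool result cleared to save context; "
|                            "re-run the tool if this output is needed]",
|             }
|         return messages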
| javier2 wrote:
| We often talk about "hallucinations" like it is its own thing,
| but is there really anything different about it from the LLM's
| normal output?
| veidr wrote:
| AFAICT, no. I think it just means "bad, unhelpful output" but
| isn't fundamentally different in any meaningful way from
| their super-helpful top-1% outputs.
|
| It's kind of qualitatively different from the human
| perspective, so not a useless concept, but I think that is
| mainly because we can't help anthropomorphizing these things.
| blixt wrote:
| Yes, we had the same issue with our coding agent. We found
| that instead of replacing large tool results in the context,
| it was sometimes better to have two agents: a long-lived one
| that only sees small tool results, and a short-lived one that
| actually reads and edits the large chunks and hands back a
| condensed result. The downside is that you always have to
| manage the balance of which agent gets what context, and you
| also increase latency and cost a bit (slightly less reuse of
| the prompt cache).
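|
| Roughly (a sketch, with run_agent passed in as a stand-in for
| whatever spawns the short-lived agent):
|
|     from typing import Callable
|
|     def delegate_read(question: str, big_content: str,
|                       run_agent: Callable[[str], str]) -> str:
|         # The short-lived agent sees the large content in a fresh
|         # context; only its condensed answer reaches the long-lived
|         # agent, keeping that context (and prompt cache) small.
|         worker_prompt = (
|             f"Answer concisely for another agent: {question}\n\n"
|             f"CONTENT:\n{big_content}"
|         )
|         return run_agent(worker_prompt)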
| swader999 wrote:
| I found that having sub agents just for running and writing
| unit tests got me over 90% of my context woes
| brulard wrote:
| This sounds like a good approach, I need to try it. I had good
| results using context7 in a specialized docs agent. I wasn't
| able to figure out how to limit an MCP to a subagent; likely
| it's not supported.
| daxfohl wrote:
| Seems like that's a job local LLMs could do fairly well soon:
| not a ton of reasoning, just a basic ability to understand
| functions and write fairly boilerplate code. But it involves a
| ton of tokens, especially if you have lots of verbose output
| from a test run, so doing it locally could end up being a huge
| cost saving as well.
| yunohn wrote:
| Why are both this new Memory API and the Filesystem as (evolving)
| Context releases only for the Developer API - but not integrated
| into Claude Code?
| kylegalbraith wrote:
| I had this same thought. I'm not entirely following how I'm to
| differentiate between these two things. I guess the API is to
| create my own CC type agent. But I've heard of folks creating
| agents that are CC based as well.
| ec109685 wrote:
| How do you know it's not integrated into Claude Code yet?
| simianwords wrote:
| I'm trying to understand: what part of this is something we
| could not have hacked together already as clients? Maybe the
| new Sonnet is RLed to be able to use these memories in a
| better way?
| olliem36 wrote:
| I work on Zenning AI, a generalist AI designed to replace
| entire jobs with just prompts. Our agents typically run
| autonomously for hours, so effective context management is
| critical. I'd say that we invest most of our engineering
| effort into what is ultimately context management, such as:
|
| 1. Multi-agent orchestration
|
| 2. Summarising and chunking large tool and agent responses
|
| 3. Passing large context objects by reference between agents
| and tools (see the sketch below)
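|
| A sketch of point 3 (an in-memory stand-in for a real object
| store): large outputs are parked under a short reference, and
| only the reference travels through the agents' contexts.
|
|     import uuid
|
|     BLOBS: dict[str, str] = {}  # stand-in for a real blob/object store
|
|     def put_context(obj: str) -> str:
|         ref = f"ctx://{uuid.uuid4().hex[:8]}"
|         BLOBS[ref] = obj
|         return ref                    # agents pass this, not the object
|
|     def get_context(ref: str) -> str:
|         return BLOBS[ref]             # resolved only by the tool that needs it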
|
| Two things to note that might be interesting to the community:
|
| Firstly, when managing context, I recommend adding some evals
| to your context management flow, so you can measure
| effectiveness as you make improvements and changes.
|
| For example, our evals will measure the impact of using
| Anthropic's memory over time, allowing our team to make
| better-informed decisions on which tools to use with our
| agents.
|
| Secondly, there's a tradeoff not mentioned in this article:
| speed vs. accuracy. Faster summarisation (or "compaction")
| comes at a cost in accuracy; if you want good compaction, it
| can be slow. Depending on the use case, you should adjust your
| compaction strategy accordingly. For example (forgive my major
| generalisation), for consumer-facing products speed is usually
| preferred over a bump in accuracy, whereas in business
| settings accuracy is generally preferred over speed.
| _joel wrote:
| lol, good luck with that
| iamblessed_ wrote:
| I really want to get into Anthropic.
|
| For context: I have a background in CV and ML in general, and
| am currently reviewing and revising RL.
|
| Any idea how I can get into RL?
|
| I have 3 years of industry/research experience.
|
| Whenever I see a post like this, it triggers massive FOMO,
| creating a sense of urgency that I should be working on these
| problems.
|
| Not being able to work there is making me anxious.
|
| What does it take for someone in a non-US/non-EU region to get
| into big labs such as these?
|
| Do I really have to pursue a PhD? I am already old enough that
| pursuing a PhD is a huge burden I can't afford.
| barrenko wrote:
| The pace is so fast that if you have FOMO you've most probably
| already missed out. If you're interested in LLM-flavored RL,
| I'd suggest the prime-rl community (and their Discord), the
| Hugging Face RL courses with smol (you'll need Pro and to burn
| a couple of bucks), etc. etc.
| iamblessed_ wrote:
| Willing to burn a few bucks here and there for the projects.
|
| I really need to get my hands dirty here. I remember taking an
| RL course on Coursera during COVID in 2020, but I didn't have
| the chance to apply it to the problems I worked on post-COVID.
|
| But I really want to start doing RL again. I'm interested in
| world models and simulation for RL.
| barrenko wrote:
| If you're good on theory (meaning able to read theory) but
| interested in non-LLM stuff, pufferlib and its author also
| seem quite accessible (if you haven't come across it already).
| siva7 wrote:
| I'll address something else: FOMO is usually a symptom of a
| deeper underlying issue. Long before Anthropic/OpenAI, we had
| these same posts from people desperately wanting to get into
| Google. They got so unhealthily obsessed with this goal that
| they started prepping for months, even years, documenting
| their journey on blogs, only to get rejected by someone who
| spent 2 seconds on their application. Getting in is more about
| luck than most people realize. And by the time you're in,
| you've held a skewed, romanticized fantasy of what it's like
| to work at this mythical company for so long (Anthropic isn't
| a startup by any stretch anymore) that you crash hard and
| realize it is a classic corporate environment.
| Orochikaku wrote:
| It may be harsh but the reality is, if you have to ask you're
| likely not a viable candidate.
|
| The leading AI companies have the kind of capital to be able to
| hire the very best of the industry. If you're only just
| starting now, and worse yet need your hand held to do so,
| you're totally out of the running...
| iamblessed_ wrote:
| So what would be the wise thing to do in this situation? Give
| up hope, or start over completely and work in a nascent field?
| RamtinJ95 wrote:
| I don't get it. I have been doing something similar for a
| month with opencode. Is the new thing that the new Sonnet
| model is fine-tuned to call these tools "better", or simply
| that they have improved the devex to accomplish these things?
| the_mitsuhiko wrote:
| > I don't get it. I have been doing something similar for a
| month with opencode. Is the new thing that the new Sonnet
| model is fine-tuned to call these tools "better", or simply
| that they have improved the devex to accomplish these things?
|
| They fine-tuned 4.5 to have `clear_tool_uses` marker tools
| that it understands without regressing the quality of future
| responses. You will however still pay for the cache
| invalidation hit, so it would need some evaluation to see how
| much this helps.
| Razengan wrote:
| I just want Anthropic to release a way to remove my payment
| method from Claude before the eventual data breach.
| Aeolun wrote:
| Don't they use Stripe? You can remove your payment method at
| any time. But Anthropic doesn't really have your info in the
| first place.
| Razengan wrote:
| How? First I wanted to just use iOS In-App-Purchases like
| ChatGPT and Grok etc all support, but Anthropic has this
| "We're too special" syndrome and don't offer IAP so I had to
| sign in on the website (and guess what, iOS app supports Sign
| In With Apple but the website doesn't) and add my card... but
| no way to remove the card later.
|
| Steam and others have figured it out, but Anthropic/Discord
| (who just had a breach like yesterday) still don't let you
| remove your payment info.
| mh- wrote:
| My Claude subscription is through iOS IAPs.
| Razengan wrote:
| I could swear I didn't even see the option for an IAP on
| iOS... maybe it's region-locked?
| mh- wrote:
| Maybe, but I wouldn't suggest doing it even if you figure
| it out. It has some really annoying side effects on
| subscription management for Claude, and additionally they
| mark up the plan cost by 20%, which I didn't care about
| on the $20 Pro plan but do now with the $200 plans.
|
| I haven't figured out* how to switch to a direct
| subscription other than cancelling and resubscribing, and
| I'm afraid of messing up my account access.
|
| * Caveat that I haven't spent more than 20 minutes trying
| to solve this.
| blixt wrote:
| From what I can tell, the new context editing and memory APIs
| are essentially formalizations of common patterns:
|
| Context editing: replace tool call results in message history
| (i.e. replace a file output with an indicator that it's no
| longer available).
|
| Memory: give the LLM access to read and write .md files in a
| virtual file system.
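|
| A minimal sketch of that second pattern (an assumed tool
| shape, not necessarily Anthropic's exact schema):
|
|     import pathlib
|
|     ROOT = pathlib.Path("memories")  # the LLM's "virtual file system"
|
|     def handle_memory_tool(command: str, path: str, text: str = "") -> str:
|         # Real code would also sanitize `path` against traversal.
|         target = ROOT / path
|         if command == "write":
|             target.parent.mkdir(parents=True, exist_ok=True)
|             target.write_text(text)
|             return "ok"
|         if command == "read":
|             return target.read_text() if target.exists() else "(not found)"
|         if command == "list":
|             return "\n".join(str(p.relative_to(ROOT)) for p in ROOT.rglob("*.md"))
|         return f"unknown command: {command}"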
|
| I feel like these formalizations of tools are on the path
| towards managing message history on the server, which means
| better vendor lock-in but not necessarily a big boon to the
| user of the API (well, bandwidth and latency will improve). I
| see the ChatGPT Responses API going down a similar path, and
| together these changes will make it harder to swap
| transparently between providers, something I enjoy being able
| to do.
| mkagenius wrote:
| > managing message history on the server, which means better
| vendor lock in
|
| I feel that managing context should be doable with a non-SOTA
| model, even locally. You'd just need a way to manually
| select/deselect messages from the context, say in Claude-CLI.
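|
| The mechanical part is trivial (a sketch; the hard part is
| deciding, or letting a small local model decide, what to
| keep):
|
|     def build_context(messages: list[dict], enabled: list[bool]) -> list[dict]:
|         # Send only the turns the user left checked.
|         return [m for m, on in zip(messages, enabled) if on]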
| qwertox wrote:
| I wish every instruction and response had an enable/disable
| checkbox, so that I could disable parts of the conversation in
| such a way that they are excluded from the context.
|
| Let's say I submit or let it create a piece of code, and we're
| working on improving it. At some point the piece of code is
| significantly better than what I had initially, so all those
| initial interactions containing old code could be removed from
| the context.
|
| I like how Google AI Studio allows one to delete sections so
| they are no longer part of the context. Not possible in
| Claude, ChatGPT or Gemini; I think there one can only delete
| the last response.
|
| Maybe even AI could suggest which parts to disable.
| diggan wrote:
| I kind of do this, semi-manually when using the web chat UIs
| (which happens less and less). I basically never let the
| conversations go above two messages in total (one message from
| me + one reply, since the quality of responses goes down so
| damn quick), and if anything is wrong, I restart the
| conversation and fix the initial prompt so it gets it right.
| And rather than manually writing my prompts in the web UIs, I
| manage prompts with http://github.com/victorb/prompta which
| makes it trivial to edit the prompts as I find out the best way
| of getting the response I want, together with some simple shell
| integrations to automatically include logs, source code, docs
| and what not.
| tortilla wrote:
| I work similarly. I keep message rounds short (1-3) and clear
| often. If I have to steer the conversation too much, I start
| over.
|
| I built a terminal TUI to manage my contexts/prompts:
| https://github.com/pluqqy/pluqqy-terminal
| Jowsey wrote:
| Related, it feels like AI Studio is the only mainstream LLM
| frontend that treats you like an adult. Choose your own safety
| boundaries, modify the context & system prompt as you please,
| clear rate limits and pricing, etc. It's something you come to
| appreciate a lot, even if we _are_ in the part of the cycle
| where Google's models aren't particularly SOTA rn
| mirsadm wrote:
| How are they not SOTA? They're all very similar, with ChatGPT
| being the worst (for my use case anyway), like adding lambdas
| and random C++ function calls into my Vulkan shaders.
| oezi wrote:
| Gemini 2.5 Pro is the most capable for my use case in
| PyTorch as well. The large context and much better
| instruction following for code edits make a big difference.
| hendersoon wrote:
| Gemini 2.5 pro is generally non-competitive with
| GPT-5-medium or Sonnet 4.5.
|
| But never fear, Gemini 3.0 is rumored to be coming out
| Tuesday.
| dingnuts wrote:
| based on what? LLM benchmarks are all bullshit, so this
| is based on... your gut?
|
| Gemini outputs what I want with a similar regularity as
| the other bots.
|
| I'm so tired of the religious thinking around these
| models. show me a measurement.
| samtheprogram wrote:
| > LLM benchmarks are all bullshit
|
| > show me a measurement
|
| Your comment encapsulates why we have religious thinking
| around models.
| NuclearPM wrote:
| Please tell me this comment is a joke.
| kingstnap wrote:
| The random tweets I've seen said Oct 9th, which is
| Thursday. I suppose we will know when we know.
| conception wrote:
| Not sure if msty counts as mainstream but it has so many
| quality of life enhancements it's bonkers.
| nurettin wrote:
| They introduced removing from the end of the stack but not the
| beginning
| james_marks wrote:
| I exit and restart CC all the time to get a "Fresh perspective
| on the universe as it now is".
| brulard wrote:
| Isn't /clear enough to do that? I know some permissions
| survive from previous sessions, but it served me well
| james_marks wrote:
| The one time I tried I felt like /clear may have dropped
| all my .claude files as well, but I didn't look at it
| closely.
| rapind wrote:
| I do as well, with Codex though, but OP is asking for more
| fine grained control of what's in context and what can be
| thrown away.
|
| You can simulate this of course by doing the reverse and
| maintaining explicit memory, via markdown files or w/e, of
| what you want to keep in context. I could see wanting both,
| since a lot of the time it would be easier to just say
| "forget that last exploration we did" while still having it
| remember everything from before that. Think of it like an
| exploratory twig on a branch that you don't want to keep.
|
| Ultimately I just adapt by making my tasks smaller, using git
| branches and committing often, writing plans to markdown,
| etc.
| rmsaksida wrote:
| > I like how Google AI Studio allows one to delete sections and
| they are then no longer part of the context. Not possible in
| Claude, ChatGPT or Gemini, I think there one can only delete
| the last response.
|
| I have the same peeve. My assumption is the ability to freely
| edit context is seen as not intuitive for most users - LLM
| products want to keep the illusion of a classic chat UI where
| that kind of editing doesn't make sense. I do wish ChatGPT & co
| had a pro or advanced mode that was more similar to Google AI
| Studio.
| Curzel wrote:
| /compact does most of that, for me at least
|
| /compact we will now work on x, discard y, keep z
| bdangubic wrote:
| The trouble with compact is that no one _really_ knows how
| it works and what it does. Hence, for me at least, there is
| just no way I would _ever_ allow my context to get there.
| You should seriously reconsider ever using compact (I mean
| this literally): the quality of CC at that point is an
| _order of magnitude_ worse, so you are doing yourself a
| significant disservice.
| staticautomatic wrote:
| You mean you never stay in a CC session long enough to
| even see the auto compaction warning?
| bdangubic wrote:
| 100% - read this -
| https://blog.nilenso.com/blog/2025/09/15/ai-unit-of-work/
| _boffin_ wrote:
| I'll be releasing something shortly that does this, plus more.
| wrs wrote:
| The SolveIt tool [0] has a simple but brilliant feature I now
| want in all LLM tools: a fully editable transcript. In
| particular, you can edit the previous LLM responses. This lets
| you fix the lingering effect of a bad response without having
| to back up and redo the whole interaction.
|
| [0] https://news.ycombinator.com/item?id=45455719
| mdrzn wrote:
| I noticed this in the Claude Code interface: I reached "8%
| context left", but after giving it a huge prompt the warning
| disappeared, and it kept on working for another 20 minutes
| before again reaching "10% context left". It never had to
| compact the history of the conversation. 10/10, great feature.
| CuriouslyC wrote:
| > Enable longer conversations by automatically removing stale
| tool results from context
|
| > Boost accuracy by saving critical information to memory--and
| bring that learning across successive agentic sessions
|
| Funny, I was just talking about my personal use of these
| techniques recently (tool output summarization/abliteration
| with a memory backend). This isn't something that needs to be
| Claude Code specific though; you can 100% implement it with
| tool wrappers.
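|
| A bare-bones version of such a wrapper (a sketch; summarize()
| stands in for whatever cheap model or heuristic you use):
|
|     from typing import Callable
|
|     def wrap_tool(tool_fn: Callable[..., str], memory: dict,
|                   summarize: Callable[[str], str], max_chars: int = 2000):
|         def wrapped(*args, **kwargs) -> str:
|             out = tool_fn(*args, **kwargs)
|             if len(out) <= max_chars:
|                 return out
|             key = f"tool-output-{len(memory)}"
|             memory[key] = out        # full output kept in the memory backend
|             return f"[summarized; full text stored at {key}]\n{summarize(out)}"
|         return wrapped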
|
| I've been doing this for a while; dropping summarized old tool
| output from the context is a big win, but it's still level ~0
| context engineering. It'll be interesting to see which of my
| tricks they figure out next.
| hendersoon wrote:
| I wish claude code supported the new memory tool. The difference
| is CLAUDE.md is always in your active context while the new
| memory stuff is essentially local RAG.
| airocker wrote:
| Would this mean Cursor and Cline don't have to do context
| management? Is their value much more just in the UI now?
| ChicagoDave wrote:
| This is hilarious. It's like they took my usage pattern and made
| it native. Love it.
| gtsop wrote:
| I'll let them (AI companies) figure it out while I keep coding
| manually.
|
| I'll wait for the day they release a JS library to
| programmatically do all this LLM context juggling, instead of
| using a UI, and then I will adopt it by doing what I do now:
| writing code.
|
| I will write code that orchestrates LLMs to write code.
|
| Edit: This is obviously a joke... but is it really a joke?
___________________________________________________________________
(page generated 2025-10-05 23:00 UTC)