[HN Gopher] Writing a good Claude.md
       ___________________________________________________________________
        
       Writing a good Claude.md
        
       Author : objcts
       Score  : 696 points
       Date   : 2025-11-30 17:56 UTC (1 day ago)
        
 (HTM) web link (www.humanlayer.dev)
 (TXT) w3m dump (www.humanlayer.dev)
        
       | eric-burel wrote:
       | "You can investigate this yourself by putting a logging proxy
       | between the claude code CLI and the Anthropic API using
       | ANTHROPIC_BASE_URL" I'd be eager to read a tutorial about that I
       | never know which tool to favour for doing that when you're not a
       | system or network expert.
        
         | fishmicrowaver wrote:
         | Have you considered just asking claude? I'd wager you'd get up
         | and running in <10 minutes.
        
           | dhorthy wrote:
           | agree - i've had claude one-shot this for me at least 10
           | times at this point cause i'm too lazy to lug whatever code
           | around. literally made a new one this morning
        
           | eric-burel wrote:
           | AI is good for discovery but not validation, I wanted
           | experienced human feedback here
        
         | 0xblacklight wrote:
         | Hi, post author here
         | 
          | We used Cloudflare's AI Gateway, which is pretty simple. Set one
          | up, get the proxy URL, and set it through the env var - very
          | plug-and-play.
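          | 
          | Roughly, it's just pointing the env var at the gateway endpoint
          | (a sketch - the account and gateway IDs are placeholders from
          | your own gateway, so treat the exact URL shape as an
          | assumption):
          | 
          |     export ANTHROPIC_BASE_URL="https://gateway.ai.cloudflare.com/v1/<account_id>/<gateway_id>/anthropic"
          |     claude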
        
           | eric-burel wrote:
           | Smart, thanks for the tip
        
         | Havoc wrote:
         | Just install mitmproxy. Takes like 5 mins to figure out. 2 with
         | Claude.
         | 
          | On phone, else I'd post the exact commands, but the rough shape
          | is below.
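          | 
          | (A sketch - the listen port is arbitrary; mitmproxy's reverse
          | mode forwards to the real API while logging each request.)
          | 
          |     # terminal 1: logging reverse proxy in front of the Anthropic API
          |     mitmdump --mode reverse:https://api.anthropic.com --listen-port 8080
          | 
          |     # terminal 2: point claude code at the proxy
          |     ANTHROPIC_BASE_URL=http://localhost:8080 claude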
        
       | jasonjmcghee wrote:
       | Interesting selection of models for the "instruction count vs.
       | accuracy" plot. Curious when that was done and why they chose
       | those models. How well does ChatGPT 5/5.1 (and codex/mini/nano
       | variants), Gemini 3, Claude Haiku/Sonnet/Opus 4.5, recent grok
       | models, Kimi 2 Thinking etc (this generation of models) do?
        
         | alansaber wrote:
          | Guessing they included some smaller models just to show how
          | their accuracy drops at smaller context sizes
        
           | jasonjmcghee wrote:
           | Sure - I was more commenting that they are all > 6 months
           | old, which sounds silly, but things have been changing fast,
           | and instruction following is definitely an area that has been
           | developing a lot recently. I would be surprised if accuracy
           | drops off that hard still.
        
             | 0xblacklight wrote:
             | I imagine it's highly-correlated to parameter count, but
             | the research is a few months old and frontier model
             | architecture is pretty opaque so hard to draw too too many
             | conclusions about newer models that aren't in the study
             | besides what I wrote in the post
        
       | vladsh wrote:
       | What is a good Claude.md?
        
         | testdelacc1 wrote:
         | Claude.md - A markdown file you add to your code repository to
         | explain how things work to Claude.
         | 
         | A good Claude.md - I don't know, presumably the article
         | explains.
        
       | andersco wrote:
        | I have found that letting the codebase itself be the "Claude.md"
        | is most effective. In other words, set up effective automated
        | checks for linting, type checking, unit tests etc. and tell Claude
        | to always run these before completing a task. If the agent keeps
        | doing something you don't like, then a linting update or an
        | additional test is often more effective than trying to tinker
        | with the Claude.md file. Also, ensure docs in the codebase are up
        | to date, tell Claude to read the relevant parts when working on a
        | task, and of course update the docs for each new task. YMMV but
        | this has worked for me.
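        | 
        | As a rough illustration, the CLAUDE.md side of this can stay tiny
        | (the commands here are placeholders - use whatever your project
        | already runs):
        | 
        |     ## Before finishing any task
        | 
        |     - Run `npm run lint`, `npm run typecheck` and `npm test`
        |     - A task is not done until all three pass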
        
         | Aeolun wrote:
         | > Also, ensure docs on the codebase are up to date and tell
         | Claude to read relevant parts when working on a task
         | 
         | Yeah, if you do this every time it works fine. If you add what
         | you tell it every time to CLAUDE.md, it also works fine, but
         | you don't have to tell it any more ;)
        
         | Havoc wrote:
         | > Claude.md
         | 
         | It's case sensitive btw. CLAUDE.md - Might explain your mixed
         | results with it
        
       | prettyblocks wrote:
       | The advice here seems to assume a single .md file with
       | instructions for the whole project, but the AGENTS.md methodology
       | as supported by agents like github copilot is to break out more
       | specific AGENTS.md files in the subdirectories in your code base.
       | I wonder how and if the tips shared change assuming a flow with a
       | bunch of focused AGENTS.md files throughout the code.
        
         | 0xblacklight wrote:
         | Hi, post author here :)
         | 
         | I didn't dive into that because in a lot of cases it's not
         | necessary and I wanted to keep the post short, but for large
         | monorepos it's a good idea
        
       | btbuildem wrote:
       | It seems overall a good set of guidelines. I appreciate some of
       | the observations being backed up by data.
       | 
       | What I find most interesting is how a hierarchical / recursive
       | context construct begins to emerge. The authors' note of "root"
       | claude.md as well as the opening comments on LLMs being stateless
       | ring to me like a bell. I think soon we will start seeing
       | stateful LLMs, via clever manipulation of scope and context.
       | Something akin to memory, as we humans perceive it.
        
       | _pdp_ wrote:
        | There is a far easier way to do this, and one that is perfectly
        | aligned with how these tools work.
       | 
       | It is called documenting your code!
       | 
       | Just write what this file is supposed to do in a clear concise
       | way. It acts as a prompt, it provides much needed context
       | specific to the file and it is used only when necessary.
       | 
       | Another tip is to add README.md files where possible and where it
       | helps. What is this folder for? Nobody knows! Write a README.md
        | file. It is not rocket science.
       | 
       | What people often forget about LLMs is that they are largely
       | trained on public information which means that nothing new needs
       | to be invented.
       | 
       | You don't have to "prompt it just the right way".
       | 
        | What you have to do is use the same good old best practices.
        
         | dhorthy wrote:
         | For the record I do think the AI community tries to
         | unnecessarily reinvent the wheel on crap all the time.
         | 
          | sure, readme.md is a great place to put content. But there are
          | things I'd put in a readme that I'd never put in a claude.md if
          | we want to squeeze the most out of these models.
         | 
         | Further, claude/agents.md have special quality-of-life
         | mechanics with the coding agent harnesses like e.g. `injecting
         | this file into the context window whenever an agent touches
         | this directory, no matter whether the model wants to read it or
         | not`
         | 
         | > What people often forget about LLMs is that they are largely
         | trained on public information which means that nothing new
         | needs to be invented.
         | 
         | I don't think this is relevant at all - when you're working
         | with coding agents, the more you can finesse and manage every
          | token that goes into your model and how it's presented, the
         | better results you can get. And the public data that goes into
         | the models is near useless if you're working in a complex
         | codebase, compared to the results you can get if you invest
         | time into how context is collected and presented to your agent.
        
           | theshrike79 wrote:
           | > For the record I do think the AI community tries to
           | unnecessarily reinvent the wheel on crap all the time.
           | 
           | On Reddit's LLM subreddits people are rediscovering the very
           | basics of software project management as some massive
           | insights daily or very least weekly.
           | 
            | Who would've guessed that proper planning, accessible and
            | up-to-date documentation, and splitting tasks into manageable,
            | testable chunks produces good code? Amazing!
           | 
           | Then they write a massive blog post or even some MCP
            | monstrosity for it and post it everywhere as a new discovery
           | =)
        
             | dkubb wrote:
             | I can totally understand where you are coming from with
             | this comment. It does feel a bit frustrating that people
             | are rediscovering things that were written in books
             | 30/40/50 years ago.
             | 
             | However, I think this is awesome for the industry. People
             | are rediscovering basic things, but if they didn't know
             | about the existing literature this is a perfect opportunity
             | to refer them to it. And if they were aware, but maybe not
             | practicing it, this is a great time for the ideas to be
             | reinforced.
             | 
             | A lot of people, myself included, never really understand
             | which practices are important or not until we were forced
             | to work on a system that was most definitely not written
             | with any good practices in mind.
             | 
             | My current view of agentic coding is that it's forcing an
             | entire generation of devs to learn software project
              | management _or_ drown under the mountain of debt an LLM
             | can produce. Previously it took much longer to feel the
             | weight of bad decisions in a project but an LLM allows you
             | to speed-run this process in a few weeks or months.
        
         | bastawhiz wrote:
         | This is missing the point. If I want to instruct Claude to
         | never write a database query that doesn't hit a preexisting
         | index, where exactly am I supposed to document that? You can
         | either choose:
         | 
         | 1. A centralized location, like a README (congrats, you've just
         | invented CLAUDE.md)
         | 
         | 2. You add a docs folder (congrats, you've just done exactly
         | what the author suggests under Progressive Disclosure)
         | 
         | Moreover, you can't just do it all in a README, for the exact
         | reasons that the author lays out under "CLAUDE.md file length &
         | applicability".
         | 
         | CLAUDE.md simply isn't about telling Claude what all the parts
         | of your code are and how they work. You're right, that's what
         | documenting your code is for. But even if you have READMEs
         | everywhere, Claude has no idea where to put code when it starts
         | a new task. If it has to read all your documentation every time
         | it starts a new task, you're needlessly burning tokens. The
         | whole point is to give Claude important information up front
          | _so it doesn't have to_ read all your docs and fill up its
         | context window searching for the right information on every
         | task.
         | 
         | Think of it this way: incredibly well documented code has
         | everything a new engineer needs to get started on a task, yes.
         | But this engineer has amnesia and forgets everything it's
         | learned after every task. Do you want them to have to reonboard
         | from scratch every time? No! You structure your docs in a way
         | so they don't have to start from scratch every time. This is an
         | accommodation: humans don't need this, for the most part,
         | because we don't reonboard to the same codebase over and over.
         | And so yes, you do need to go above and beyond the "same old
         | good best practices".
        
           | _pdp_ wrote:
           | You put a warning where it is most likely to be seen by a
           | human coder.
           | 
           | Besides, no amount of prompting will prevent this situation.
           | 
           | If it is a concern then you put a linter or unit tests to
           | prevent it altogether, or make a wrapper around the tricky
           | function with some warning in its doc strings.
           | 
           | I don't see how this is any different from how you typically
           | approach making your code more resilient to accidental
           | mistakes.
        
             | mvkel wrote:
             | Documenting for AI exactly like you would document for a
             | human is ignoring how these tools work
        
               | anonzzzies wrote:
               | But they are right, claude routinely ignores stuff from
               | CLAUDE.md, even with warning bells etc. You need a linter
               | preventing things. Like drizzle sql` templates: it just
               | loves them.
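                | 
                | For example, something along these lines in the ESLint
                | config (a sketch - selector and message are hypothetical,
                | adjust to your setup):
                | 
                |     // .eslintrc.js, under "rules"
                |     'no-restricted-syntax': ['error', {
                |       selector: "TaggedTemplateExpression[tag.name='sql']",
                |       message: 'Use the query builder, not raw sql`` templates.',
                |     }],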
        
               | CuriouslyC wrote:
               | You can make affordances for agent abilities without
               | deviating from what humans find to be good documentation.
               | Use hyperlinks, organize information, document in layers,
               | use examples, be concise. It's not either/or unless
               | you're being lazy.
        
               | notachatbot123 wrote:
               | Sounds like we should call them tools, not AI!
        
               | theshrike79 wrote:
               | Agentic AI is LLMs using tools in a loop to achieve a
               | goal.
               | 
                | Needs a better term than "AI", I agree, but it's 99%
                | marketing; the tech will stay the same.
        
             | bastawhiz wrote:
             | > no amount of prompting will prevent this situation.
             | 
             | Again, missing the point. If you don't prompt for it and
             | you document it in a place where the tool won't look first,
              | the tool simply won't do it. "No amount of prompting"
              | couldn't be more wrong; it works for me and all my
             | coworkers.
             | 
             | > If it is a concern then you put a linter or unit tests to
             | prevent it altogether
             | 
              | Sure, and then it'll always do things its own way, run the
             | tests, and have to correct itself. Needlessly burning
             | tokens. But if you want to pay for it to waste its time and
             | yours, go for it.
             | 
             | > I don't see how this is any different from how you
             | typically approach making your code more resilient to
             | accidental mistakes.
             | 
             | It's not about avoiding mistakes! It's about having it
             | follow the norms of your codebase.
             | 
             | - My codebase at work is slowly transitioning from Mocha to
             | Jest. I can't write a linter to ban new mocha tests, and it
             | would be a pain to keep a list of legacy mocha test suites.
             | The solution is to simply have a bullet point in the
             | CLAUDE.md file that says "don't write new Mocha test
             | suites, only write new test suites in Jest". A more robust
             | solution isn't necessary and doesn't avoid mistakes, it
             | avoids the extra step of telling the LLM to rewrite the
             | tests.
             | 
             | - We have a bunch of terraform modules for convenience when
             | defining new S3 buckets. No amount of documenting the
             | modules will have Claude magically know they exist. You
             | tell it that there are convenience modules and to consider
             | using them.
             | 
             | - Our ORM has findOne that returns one record or null. We
             | have a convenience function getOne that returns a record or
             | throws a NotFoundError to return a 404 error. There's no
             | way to exhaustively detect with a linter that you used
             | findOne and checked the result for null and threw a
             | NotFoundError. And the hassle of maybe catching some
             | instances isn't necessary, because avoiding it is just one
             | line in CLAUDE.md.
             | 
             | It's really not that hard.
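              | 
              | (For the curious, such a wrapper is tiny - a hypothetical
              | sketch, not the actual code:)
              | 
              |     // Assumed: NotFoundError maps to a 404 somewhere upstream.
              |     class NotFoundError extends Error {}
              | 
              |     // getOne: like findOne, but throws instead of returning null.
              |     async function getOne<T>(find: () => Promise<T | null>): Promise<T> {
              |       const record = await find();
              |       if (record === null) throw new NotFoundError();
              |       return record;
              |     }
              | 
              |     // usage: const user = await getOne(() => users.findOne({ id }));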
        
               | girvo wrote:
               | > There's no way to exhaustively detect with a linter
               | that you used findOne and checked the result for null and
               | threw a NotFoundError
               | 
               | Yes there is? Though this is usually better served with a
               | type checker, it's still totally feasible with a linter
               | too if that's your bag
               | 
               | > because avoiding it is just one line in CLAUDE.md.
               | 
               | Except no, it isn't, because these tools still ignore
               | that line sometimes so I _still_ have to check for it
               | myself.
        
           | gitgud wrote:
           | > _1. A centralized location, like a README (congrats, you
           | 've just invented CLAUDE.md)_
           | 
           | README files are not a new concept, and have been used in
           | software for like 5 decades now, whereas CLAUDE.md files were
           | invented 12 months ago...
        
           | callc wrote:
           | This CLAUDE.md dance feels like herding cats. Except we're
           | herding a really good autocorrect encyclopedic parrot. Sans
           | intelligence
           | 
           | Relating / personifying LLM to an engineer doesn't work out
           | 
            | Maybe the best thought model currently is just "good way to
           | automate trivial text modifications" and "encyclopedic
           | ramblings"
        
             | saturatedfat wrote:
             | unfair characterization.
             | 
             | think about how this thing is interacting with your
             | codebase. it can read one file at a time. sections of
             | files.
             | 
             | in this UX, is it ergonomic to go hunting for patterns and
             | conventions? if u have to linearly process every single
             | thing u look at every time you do something, how are you
             | supposed to have "peripheral vision"? if you have amnesia,
             | how do you continue to do good work in a codebase given
             | you're a skilled engineer?
             | 
                | it is different from you. that is OK. it doesn't mean it's
                | stupid. it means it needs different accommodations to
                | perform as well as you do. accommodations IRL exist for a
             | reason, different people work differently and have
             | different strengths and weaknesses. just like humans, you
             | get the most out of them if you meet and work with them
             | from where they're at.
        
           | victorbuilds wrote:
           | Learned this the hard way. Asked Claude Code to run a
           | database migration. It deleted my production database
           | instead, then immediately apologised and started panicking
           | trying to restore it.
           | 
           | Thankfully Azure keeps deleted SQL databases recoverable, so
           | I got it back in under an hour. But yeah - no amount of
           | CLAUDE.md instructions would have prevented that. It no
           | longer gets prod credentials.
        
           | theshrike79 wrote:
            | 1. Create a tool that can check if a query hits a preexisting
           | index
           | 
           | In step 2 either force Claude to use it (hooks) or suggest it
           | (CLAUDE.md)
           | 
           | 3. Profit!
           | 
           | As for "where stuff is", for anything more complex I have a
           | tree-style graph in CLAUDE.md that shows the rough categories
           | of where stuff is. Like the handler for letterboxd is in
           | cmd/handlerletterboxd/ and internal modules are in internal/
           | 
           | Now it doesn't need to go in blind but can narrow down
           | searches when I tell it to "add director and writer to the
           | letterboxd handler output".
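            | 
            | Something like this (a rough shape, not the actual file):
            | 
            |     ## Where things live
            | 
            |     - cmd/handlerletterboxd/  - the letterboxd handler
            |     - internal/               - shared internal modules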
        
         | johnfn wrote:
         | So how exactly does one "write what this file is supposed to do
         | in a clear concise way" in a way that is quickly comprehensible
         | to AI? The gist of the article is that when your audience
         | changes from "human" to "AI" the manner in which you write
         | documentation changes. The article is fairly high quality, and
         | presents excellent evidence that simply "documenting your code"
         | won't get you as far as the guidelines it provides.
         | 
         | Your comment comes off as if you're dispensing common-sense
         | advice, but I don't think it actually applies here.
        
         | 0xblacklight wrote:
         | I think you're missing that CLAUDE.md is deterministically
         | injected into the model's context window
         | 
         | This means that instead of behaving like a file the LLM reads,
         | it effectively lets you customize the model's prompt
         | 
         | I also didn't write that you have to "prompt it just the right
         | way", I think you're missing the point entirely
        
         | datacynic wrote:
         | Writing documentation for LLMs is strangely pleasing because
         | you have very linear returns for every bit of effort you spend
         | on improving its quality and the feedback loop is very tight.
         | When writing for humans, especially internal documentation,
         | I've found that these returns are quickly diminishing or even
         | negative as it's difficult to know if people even read it or if
         | they didn't understand it or if it was incomplete.
        
         | avereveard wrote:
          | Well, no. You run pretty fast into the context limit (or
          | attention limit for long-context models). And the model
          | understands pretty well what code does without documentation.
          | 
          | There's also a question of process: how to format code, what
          | style of catching to use, and how to run the tests - things
          | humans keep in the back of their head after reading them once
          | or twice, but which need a constant reminder for an LLM whose
          | knowledge lifespan is session-limited.
        
           | uncletaco wrote:
           | I'm pretty sure Claude would not work well in my code base if
           | I hadn't meticulously added docstrings, type hints, and
           | module level documentation. Even if you're stubbing out code
           | for later implementation, it helps to go ahead and document
           | it so that a code assistant will get a hint of what to do
           | next.
        
       | candiddevmike wrote:
       | None of this should be necessary if these tools did what they say
       | on the tin, and most of this advice will probably age like milk.
       | 
       | Write readmes for humans, not LLMs. That's where the ball is
       | going.
        
         | 0xblacklight wrote:
         | Hi, post author here :)
         | 
         | Yes README.md should still be written for humans and isn't
         | going away anytime soon.
         | 
         | CLAUDE.md is a convention used by claude code, and AGENTS.md is
         | used by other coding agents. Both are intended to be
         | supplemental to the README and are deterministically injected
         | into the agent's context.
         | 
         | It's a configuration point for the harness, it's not intended
         | to replace the README.
         | 
         | Some of the advice in here will undoubtedly age poorly as
         | harnesses change and models improve, but some of the generic
         | principles will stay the same - e.g. that you shouldn't use an
         | LLM to do a linter &formatter's job, or that LLMs are stateless
         | and need to be onboarded into the codebase, and having some
         | deterministically-injected instructions to achieve that is
         | useful instead of relying on the agent to non-deterministically
         | derive all that info by reading config and package files
         | 
         | The post isn't really intended to be super forward-looking as
         | much as "here's how to use this coding agent harness
         | configuration point as best as we know how to right now"
        
           | teiferer wrote:
           | > you shouldn't use an LLM to do a linter &formatter's job,
           | 
           | Why is that good advice? If that thing is eventually supposed
           | to do the most tricky coding tasks, and already a year ago
           | could have won a medal at the informatics olympics, then why
           | wouldn't it eventually be able to tell if I'm using 2 or 4
           | spaces and format my code accordingly? Either it's going to
           | change the world, then this is a trivial task, or it's all
           | vaporware, then what are we even discussing..
           | 
           | > or that LLMs are stateless and need to be onboarded into
           | the codebase
           | 
           | What? Why would that be a reasonable assumption/prediction
           | for even near term agent capabilities? Providing it with some
           | kind of local memory to dump its learned-so-far state of the
           | world shouldn't be too hard. Isn't it supposed to already be
           | treated like a junior dev? All junior devs I'm working with
           | remember what I told them 2 weeks ago. Surely a coding agent
           | can eventually support that too.
           | 
           | This whole CLAUDE.md thing seems a temporary kludge until
           | such basic features are sorted out, and I'm seriously
           | surprised how much time folks are spending to make that early
           | broken state less painful to work with. All that precious
           | knowledge y'all are building will be worthless a year or two
           | from now.
        
             | cruffle_duffle wrote:
             | The stateless nature of Claude code is what annoys me so
             | much. Like it has to spend so much time doing repetitious
             | bootstraps. And how much it "picks up and propagates"
             | random shit it finds in some document it wrote. It will
             | echo back something it wrote that "stood out" and I'll
             | forget where it got that and ask "find where you found that
             | info so we can remove it." And it will do so but somehow
             | mysteriously pick it up again and it will be because of
             | some git commit message or something. It's like a tune
             | stuck in its head or something only it's sticky for LLMs
             | not humans.
             | 
             | And that describes the issues I had with "automatic
             | memories" features things like ChatGPT had. Turns out it is
             | an awful judge of things to remember. Like it would make
             | memories like "cruffle is trying to make pepper soup with
             | chicken stock"! Which it would then parrot back to me at
             | some point 4 months later and I'd be like "WTF I figured it
             | out". The "# remember this" is much more powerful because
             | know how sticky this stuff gets and id rather have it over
             | index on my own forceful memories than random shit it
             | decided.
             | 
             | I dunno. All I'm saying is you are right. The future is in
             | having these things do a better job of remembering. And I
             | don't know if LLMs are the right tool for that. Keyword
             | search isn't either though. And vector search might not be
             | either--I think it suffers from the same kinds of "catchy
             | tune attack" an LLM might.
             | 
             | Somebody will figure it out somehow.
        
             | lijok wrote:
             | > All junior devs I'm working with remember what I told
             | them 2 weeks ago
             | 
             | That's why they're junior
        
             | alwillis wrote:
             | > Then why wouldn't it eventually be able to tell if I'm
             | using 2 or 4 spaces and format my code accordingly?
             | 
             | It's not that an agent doesn't know if you're using 2 or 4
             | spaces in your code; it comes down to:
             | 
              | - there are many ways to ensure your code is formatted
              | correctly; that's what .editorconfig [1] is for (a minimal
              | example follows this list).
             | 
             | - in a halfway serious project, incorrectly formatted code
             | shouldn't reach the LLM in the first place
             | 
             | - tokens are relatively cheap but they're not free on a
             | paid plan; why spend tokens on something linters and
             | formatters can do deterministically and for free?
             | 
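              | A minimal .editorconfig, for illustration (the values here
              | are arbitrary):
              | 
              |     root = true
              | 
              |     [*]
              |     indent_style = space
              |     indent_size = 4
              |     insert_final_newline = true
              | 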
             | If you wanted Claude Code to handle linting automatically,
             | you're better off taking that out of CLAUDE.md and creating
             | a Skill [2].
             | 
             | > What? Why would that be a reasonable
             | assumption/prediction for even near-term agent
             | capabilities? Providing it with some kind of local memory
             | to dump its learned-so-far state of the world shouldn't be
             | too hard. Isn't it supposed to already be treated like a
             | junior dev? All junior devs I'm working with remember what
             | I told them 2 weeks ago. Surely a coding agent can
             | eventually support that too.
             | 
             | It wasn't mentioned in the article, but Claude Code, for
             | example, does save each chat session by default. You can
             | come back to a project and type `claude --resume` and
             | you'll get a list of past Claude Code sessions that you can
             | pick up from where you left off.
             | 
             | [1]: https://editorconfig.org
             | 
             | [2]: https://code.claude.com/docs/en/skills
        
             | Zerot wrote:
             | > Why is that good advice? If that thing is eventually
             | supposed to do the most tricky coding tasks, and already a
             | year ago could have won a medal at the informatics
             | olympics, then why wouldn't it eventually be able to tell
             | if I'm using 2 or 4 spaces and format my code accordingly?
             | Either it's going to change the world, then this is a
             | trivial task, or it's all vaporware, then what are we even
             | discussing..
             | 
             | This is the exact reason for the advice: The LLM already is
             | able to follow coding conventions by just looking at the
             | surrounding code which was already included in the context.
             | So by adding your coding conventions to the claude.md, you
             | are just using more context for no gain.
             | 
             | And another reason to not use an agent for
              | linting/formatting (i.e. prompting to "format this code for
             | me") is that dedicated linters/formatters are faster and
             | only take maybe a single cent of electricity to run whereas
             | using an LLM to do that job will cost multiple dollars if
             | not more.
        
       | rootusrootus wrote:
       | Ha, I just tell Claude to write it. My results have been
       | generally fine, but I only use Claude on a simple codebase that
       | is well documented already. Maybe I will hand-edit it to see if I
       | can see any improvements.
        
       | serial_dev wrote:
       | I'm sure I'm just working like a caveman, but I simply highlight
       | the relevant code, add it to the chat, and talk to these tools as
       | if they were my colleagues and I'm getting pretty good results.
       | 
        | Between about 12 and 6 months ago this was not the case (with or
        | without .md files) - I was getting mainly subpar results, so I'm
        | assuming that the models have improved a lot.
       | 
        | Basically, I found that they don't make that much of a difference;
        | the model is either good enough or not...
       | 
       | I know (or at least I suppose) that these markdown files could
       | bring some marginal improvements, but at this point, I don't
       | really care.
       | 
       | I assume this is an unpopular take because I see so many people
        | treat these files as if they were black magic or a silver bullet
       | that 100x their already 1000x productivity.
        
         | vanviegen wrote:
         | > I simply highlight the relevant code, add it to the chat, and
         | talk to these tools
         | 
         | Different use case. I assume the discussion is about having the
         | agent implement whole features or research and fix bugs without
         | much guidance.
        
           | 0xblacklight wrote:
           | Yep it is opinionated for how to get coding agents to solve
           | hard problems in complex brownfield codebases which is what
           | we are focused on at humanlayer :)
        
         | rmnclmnt wrote:
          | Matches my experience also. Bothered only once to set up a
          | proper CLAUDE.md file, and now never do it. Simply referring to
          | the context properly for surgical recommendations and edits
          | works relatively well.
         | 
         | It feels a lot like bikeshedding to me, maybe I'm wrong
        
         | wredcoll wrote:
         | How about a list of existing database tables/columns so you
         | don't need to repeat it each time?
        
           | anonzzzies wrote:
           | Claude code figures that out at startup every time. Never had
           | issues with it.
        
             | theshrike79 wrote:
             | You can save some precious context by having it somewhere
             | without it having to figure it out from scratch every time.
        
           | HDThoreaun wrote:
           | Do you not use a model file for your orm?
        
             | wredcoll wrote:
             | ORMs are generally a bad idea, so.. hopefully not?
        
               | girvo wrote:
               | Even without the explicit magic ORMs, with data mapper
               | style query builders like Kysely and similar, I still
               | find I need to marshall selected rows into objects to,
               | yknow, do things with them in a lot of cases.
               | 
               | Perhaps a function of GraphQL though.
        
               | wredcoll wrote:
               | Sure, but that's not the same thing. For example, whether
               | or not you have to redeclare your entire database schema
               | in a custom ORM language in a different repo.
        
               | mattmanser wrote:
               | This isn't the 00s any more.
        
           | girvo wrote:
           | I gave it a tool to execute to get that info if required, but
           | it mostly doesn't need to due to Kysely migration files and
           | the database type definition being enough.
        
         | jwpapi wrote:
         | === myExperience
        
       | gonzalohm wrote:
       | Probably a lot of people here disagree with this feeling. But my
       | take is that if setting up all the AI infrastructure and
       | onboarding to my code is going to take this amount of effort,
        | then I might as well code the damn thing myself, which is what I'm
        | getting paid to do (and enjoy doing anyway)
        
         | vanviegen wrote:
         | Perhaps. But keep in mind that the setup work is typically
         | mostly delegated to LLMs as well.
        
         | fragmede wrote:
         | Whether it's setting up AI infrastructure or configuring
         | Emacs/vim/VSCode, the important distinction to make is if the
         | cost has to be paid continually, or if it's a one
         | time/intermittent cost. If I had to configure my shell/git
         | aliases every time I booted my computer, I wouldn't use them,
         | but seeing as how they're saved in config files, they're pretty
         | heavily customized by this point.
         | 
         | Don't use AI if you don't want to, but "it takes too much
         | effort to set up" is an excuse printf debuggers use to avoid
         | setting up a debugger. Which is a whole other debate though.
        
           | bird0861 wrote:
           | I fully agree with this POV but for one detail; there is a
           | problem with sunsetting frontier models. As we begin to adopt
           | these tools and build workflows with them, they become pieces
           | of our toolkit. We depend on them. We take them for granted
            | even. And then the model changes (new checkpoints, maybe
            | alignment gets fiddled with) and all of a sudden prompts no
            | longer yield the same results we expected from them after
            | working on them for quite some time. I think the term for
            | this is "prompt instability". I felt this with Gemini 3 (and
            | some people had a less pronounced but similar experience with
            | Sonnet releases after 3.7), which for certain tasks that
            | 2.5 Pro excelled at... it's just unusable now. I was
           | already a local model advocate before this but now I'm a
           | local model zealot. I've stopped using Gemini 3 over this.
           | Last night I used Qwen3 VL on my 4090 and although it was not
           | perfect (sycophancy, overuse of certain cliches...nothing I
           | can't get rid of later with some custom promptsets and a few
           | hours in Heretic) it did a decent enough job of helping me
           | work through my blindspots in the UI/UX for a project that I
           | got what I needed.
           | 
           | If we have to perform tuning on our prompts ("skills",
           | agents.md/claude.md, all of the stuff a coding assistant
           | packs context with) every model release then I see new model
           | releases becoming a liability more than a boon.
        
         | kissgyorgy wrote:
         | I strongly disagree with the author not using /init. It takes a
         | minute to run and Claude provides surprisingly good results.
        
           | alwillis wrote:
            | /init has evolved since the early days; it's more concise than
           | it used to be.
        
           | 0xblacklight wrote:
           | If you find it works for you, then that's great! This post is
           | mostly from our learnings from getting it to solve hard
           | problems in complex brownfield codebases where auto
           | generation is almost never sufficient.
        
         | nvarsj wrote:
         | It really doesn't take that much effort. Like any tool, people
         | can over-optimise on the setup rather than just use it.
        
         | nichochar wrote:
         | The effort described in the article is maybe a couple hours of
         | work.
         | 
         | I understand the "enjoy doing anyway" part and it resonates,
         | but not using AI is simply less productive.
        
           | TheRoque wrote:
           | > but not using AI is simply less productive
           | 
            | Some studies show the opposite for experienced devs. And they
            | also show that developers are delusional about said
            | productivity gains:
           | https://metr.org/blog/2025-07-10-early-2025-ai-
           | experienced-o...
           | 
           | If you have a counter-study (for experienced devs, not
           | juniors), I'd be curious to see. My experience also has been
           | that using AI as part of your main way to produce code, is
           | not faster when you factor in everything.
        
             | ares623 wrote:
             | Curious why there hasn't been a rebuttal study to that one
             | yet (or if there is I haven't seen it come up). There must
             | be near infinite funding available to debunk that study
             | right?
        
             | bird0861 wrote:
             | That study is garbo and I suspect you didn't even read the
             | abstract. Am I right?
        
               | gravypod wrote:
                | I've heard this mentioned a few times. Here is a
                | summarized version of the abstract:
                | 
                |     > ... We conduct a randomized controlled trial (RCT)
                |     > ... AI tools ... affect the productivity of
                |     > experienced open-source developers. 16 developers
                |     > with moderate AI experience complete 246 tasks in
                |     > mature projects on which they have an average of 5
                |     > years of prior experience. Each task is randomly
                |     > assigned to allow or disallow usage of early-2025
                |     > AI tools. ... developers primarily use Cursor Pro
                |     > ... and Claude 3.5/3.7 Sonnet. Before starting
                |     > tasks, developers forecast that allowing AI will
                |     > reduce completion time by 24%. After completing the
                |     > study, developers estimate that allowing AI reduced
                |     > completion time by 20%. Surprisingly, we find that
                |     > allowing AI actually increases completion time by
                |     > 19%--AI tooling slowed developers down. This
                |     > slowdown also contradicts predictions from experts
                |     > in economics (39% shorter) and ML (38% shorter). To
                |     > understand this result, we collect and evaluate
                |     > evidence for 21 properties of our setting that a
                |     > priori could contribute to the observed slowdown
                |     > effect--for example, the size and quality standards
                |     > of projects, or prior developer experience with AI
                |     > tooling. Although the influence of experimental
                |     > artifacts cannot be entirely ruled out, the
                |     > robustness of the slowdown effect across our
                |     > analyses suggests it is unlikely to primarily be a
                |     > function of our experimental design.
               | 
               | So what we can gather:
               | 
               | 1. 16 people were randomly given tasks to do
               | 
               | 2. They knew the codebase they worked on pretty well
               | 
               | 3. They said AI would help them work 24% faster (before
               | starting tasks)
               | 
               | 4. They said AI made them ~20% faster (after completion
               | of tasks)
               | 
               | 5. ML Experts claim that they think programmers will be
               | ~38% faster
               | 
               | 6. Economists say ~39% faster.
               | 
               | 7. We measured that people were actually 19% slower
               | 
               | This seems to be done on Cursor, with big models, on
               | codebases people know. There are definitely problems with
               | industry-wide statements like this but I feel like the
               | biggest area AI tools help me is if I'm working on
               | something I know nothing about. For example: I am really
               | bad at web development so CSS / HTML is easier to edit
               | through prompts. I don't have trouble believing that I
               | would be slower trying to make an edit to code that I
               | already know how to make.
               | 
               | Maybe they would see the speedups by allowing the
               | engineer to select when to use the AI assistance and when
               | not to.
        
               | saturatedfat wrote:
                | it doesn't control for skill using models/experience using
               | models. this looks VERY different at hour 1000 and hour
               | 5000 than hour 100.
        
               | brumar wrote:
                | Lazy of me not to check whether I remember this right, but
                | the dev that got productivity gains was a regular user of
                | Cursor.
        
           | svachalek wrote:
           | Minutes really, despite what the article says you can get 90%
           | of the way there by telling Claude how you want the project
           | documentation structured and just let it do it. Up to you if
           | you really want to tune the last 10% manually, I don't. I
           | have been using basically the same system and when I tell
           | Claude to update docs it doesn't revert to one big Claude.md,
           | it maintains it in a structure like this.
        
           | globular-toast wrote:
           | It's a couple of hours right now, then another couple of
           | hours "correcting" the AI when it still goes wrong, another
           | couple of hours tweaking the file again, another couple of
           | hours to update when the model changes, another couple of
           | hours when someone writes a new blog post with another method
           | etc.
           | 
           | There's a huge difference between investing time into a
           | deterministic tool like a text editor or programming language
           | and a moving target like "AI".
           | 
           | The difference between programming in Notepad in a language
           | you don't know and using "AI" will be huge. But the
           | difference between being fluent in a language and having a
           | powerful editor/IDE? Minimal at best. I actually think
           | productivity is worse because it tricks you into wasting time
           | via the "just one more roll" (ie. gambling) mentality. Not to
           | mention you're not building that fluency or toolkit for
           | yourself, making you barely more valuable than the "AI"
           | itself.
        
             | fragmede wrote:
             | You say that as if tech hasn't always been a moving target
             | anyway. The skills I spent months learning a specific
             | language and IDE became obsolete with the next job and the
             | next paradigm shift. That's been one of the few consistent
             | themes throughout my career. Hours here and there, spread
             | across months and years, just learning whatever was new.
             | Sometimes, like with Linux, it really paid off. Other
             | times, like PHP, it did, and then fizzled out.
             | 
             | --
             | 
             | The other thing is, this need for determinism bewilders me.
             | I mean, I get where it comes from, we want nice,
             | predictable reliable machines. But how deterministic does
             | it need to be? If today, it decides to generate code and
             | the variable is called fileName, and tomorrow it's
             | filePath, as long as it's passing tests, what do I care
             | that it's not totally deterministic and the names of the
             | variables it generates are different? as long as it's
              | consistent with existing code, and it passes tests, what's
             | the importance of it being deterministic to a computer
             | science level of rigor? It reminds me about the travelling
             | salesman problem, or the knapsack problem. Both NP hard,
             | but users don't care about that. They just want the
             | computer to tell them something good enough for them to go
             | on about their day. So if a customer comes up to you and
             | offers you a pile of money to solve either one of those
             | problems, do I laugh in their face, knowing damn well I
             | won't be the one to prove that NP = P, or do I explain to
             | them the situation, and build them software that will do
             | the best it can, with however much compute resources
             | they're willing to pay for?
        
         | Havoc wrote:
         | A lot of the style stuff you can write once and reuse. I
         | started splitting mine into overall and project specific files
         | for this reason
         | 
         | Universal has stuff I always want (use uv instead of pip etc)
         | while the other describes what tech choice for this project
        
       | ctoth wrote:
       | I've gotten quite a bit of utility out of my current setup[0]:
       | 
       | Some explicit things I found helpful: Have the agent address you
       | as something specific! This way you know if the agent is paying
       | attention to your detailed instructions.
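        | 
        | (e.g. a single CLAUDE.md line like the following - the wording is
        | obviously arbitrary:)
        | 
        |     Always address me as "Commander" so I know you've read this file.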
       | 
        | Rationality, as in the stuff practiced on early Less Wrong, gives
        | a great language for constraining the agent, and since it's read
        | The Sequences and everything else, you can include pointers - and
        | the more you do, the more it will nudge it into that mode of
        | thought.
       | 
       | The explicit "This is what I'm doing, this is what I expect"
       | pattern has been hugely useful for both me monitoring it/coming
       | back to see what it did, and it itself. It makes it more likely
       | to recover when it goes down a bad path.
       | 
       | The system reminder this article mentions is definitely there but
       | I have not noticed it messing much with adherence. I wish there
       | were some sort of power user mode to turn it off though!
       | 
       | Also, this is probably too long! But I have been experimenting
       | and iterating for a while, and this is what is working best
       | currently. Not that I've been able to hold any other part
       | constant -- Opus 4.5 really is remarkable.
       | 
       | [0]:
       | https://gist.github.com/ctoth/d8e629209ff1d9748185b9830fa4e7...
        
       | johnfn wrote:
       | I was expecting the traditional AI-written slop about AI, but
       | this is actually really good. In particular, the "As instruction
       | count increases, instruction-following quality decreases
       | uniformly" section and associated graph is truly fantastic! To my
       | mind, the ability to follow long lists of rules is one of the
       | most obvious ways that virtually all AI models fail today. That's
       | why I think that graph is so useful -- I've never seen someone go
       | and systematically measure it before!
       | 
       | I would love to see it extended to show Codex, which to my mind
       | is by far the best at rule-following. (I'd also be curious to see
       | how Gemini 3 performs.)
        
         | 0xblacklight wrote:
         | I looked when I wrote the post but the paper hasn't been
         | revisited with newer models :/
        
       | boredtofears wrote:
       | It would be nice to see an actual example of what a good
       | claude.md that implements all of these recommendations looks
       | like.
        
       | huqedato wrote:
       | Looking for a similar GEMINI.md
        
         | 0xblacklight wrote:
         | It might support AGENTS.md, you could check the site and see if
         | it's there
        
       | vunderba wrote:
       | From the article:
       | 
       |  _> We recommend keeping task-specific instructions in separate
       | markdown files with self-descriptive names somewhere in your
       | project. Then, in your CLAUDE.md file, you can include a list of
       | these files with a brief description of each, and instruct Claude
       | to decide which (if any) are relevant and to read them before it
       | starts working._
       | 
        | I've been doing this since the early days of agentic coding,
        | though I've always personally referred to it as the _Table-of-
        | Contents approach_ to keep the context window relatively
        | streamlined. Here's a snippet of my CLAUDE.md file that
        | demonstrates this approach:
        | 
        |     # Documentation References
        | 
        |     - When adding CSS, refer to: docs/ADDING_CSS.md
        |     - When adding assets, refer to: docs/ADDING_ASSETS.md
        |     - When working with user data, refer to: docs/STORAGE_MANAGER.md
       | 
       | Full CLAUDE.md file for reference:
       | 
       | https://gist.github.com/scpedicini/179626cfb022452bb39eff10b...
        
         | sothatsit wrote:
         | I have also done this, but my results are very hit or miss.
         | Claude rarely actually reads the other documentation files I
         | point it to.
        
           | dhorthy wrote:
           | I think the key here is "if X then Y syntax" - this seems to
           | be quite effective at piercing through the "probably ignore
           | this" system message by highlighting WHEN a given instruction
           | is "highly relevant"
        
             | throwaway314155 wrote:
             | What?
        
               | xpe wrote:
               | It helps when questions intended to resolve ambiguity are
               | not themselves hopelessly ambiguous.
               | 
               | See also: "Help me help you" -
               | https://en.wikipedia.org/wiki/Jerry_Maguire
        
           | Sammi wrote:
           | Yeah I don't trust any agent to follow document references
           | consistently. I just manually add the relevant files to
           | context every single time.
           | 
           | Though I know some people who have built an mcp that does
           | exactly this: https://www.usable.dev/
           | 
           | It's basically a chat-bot frontend to your markdown files,
           | with both rag and graph db indexes.
        
           | wry_discontent wrote:
           | That makes sense given that it's trained on real world
           | developers.
        
         | dimitri-vs wrote:
          | Correct me if I'm wrong but I think the new "skills" are
          | exactly this, but better.
        
           | vunderba wrote:
           | Yeah I think "Skills" are just a more codified folder based
           | approach to this TOC system. The main reason I haven't
           | migrated yet is that the TOC approach lends itself better to
           | the more generic AGENTS.md style - allowing me to swap over
           | to alternative LLMs (such as Gemini) relatively easily.
        
           | stpedgwdgfhgdd wrote:
           | Indeed, the article links to the skill documentation which
           | says:
           | 
           | Skills are modular capabilities that extend Claude's
           | functionality through organized folders containing
           | instructions, scripts, and resources.
           | 
           | And
           | 
           | Extend Claude's capabilities for your specific workflows
           | 
           | E.g. building your project is definitely a workflow.
           | 
            | It also makes sense to put as much as you can into a skill, as
            | this is an optimized mechanism for claude code to retrieve
            | relevant information based on the skill's frontmatter.
        
         | Zarathruster wrote:
         | I've done this too. The nice side-benefit of this approach is
         | that it also serves as good documentation for other humans
         | (including your future self) when trying to wrap their heads
         | around what was done and why. In general I find it helpful to
         | write docs that help both humans and agents to understand the
         | structure and purpose of my codebase.
        
       | tietjens wrote:
       | I think this could work really well for infrastructure/ops style
       | work where the LLM will not be able to grasp the full context of,
       | say, the network from just a few files that you have open.
       | 
       | But as others are saying this is just basic documentation that
       | should be done anyway.
        
       | acedTrex wrote:
       | "Here's how to use the slop machine better" is such a ridiculous
       | pretense for a blog or article. You simply write a sentence and
       | it approximates it. That is hardly worth any literature being
       | written as it is so self obvious.
        
         | 0xblacklight wrote:
         | This is an excellent point - LLMs are autoregressive next-token
         | predictors, and output token quality is a function of input
         | token quality
         | 
         | Consider that if the only code you get out of the
         | autoregressive token prediction machine is slop, that this
         | indicates more about the quality of your code than the quality
         | of the autoregressive token prediction machine
        
           | acedTrex wrote:
           | > that this indicates more about the quality of your code
           | 
           | Considering that the "input" to these models is essentially
           | all public code in existence, the direct context input is a
           | drop in the bucket.
        
       | johnsmith1840 wrote:
       | I don't get the point. Point it at your relevant files, ask it to
       | review and discuss the update, refine its understanding, and then
       | tell it to go.
       | 
       | I have found that more context, comments, and info damage quality
       | on hard problems.
       | 
       | I actually for a long time now have two views for my code.
       | 
       | 1. The raw code with no empty space or comments. 2. Code with
       | comments
       | 
       | I never give the second to my LLM. The more context you give, the
       | lower its upper end of quality becomes. This is just a habit I've
       | picked up using LLMs every day, hours a day, since GPT-3.5; it
       | allows me to reach farther into extreme complexity.
       | 
       | I suppose I don't know what most people are using LLMs for, but
       | the higher the complexity your work entails, the less noise you
       | should inject into it. It's tempting to add massive amounts of
       | context, but I've routinely found that fails on the higher levels
       | of coding complexity and uniqueness. It was more apparent in
       | earlier models; newer ones will handle tons of context, you just
       | won't be able to get those upper ends of quality.
       | 
       | The compute-to-information ratio is all that matters. Compute is
       | capped.
        
         | ra wrote:
         | This is exactly right. Attention is all you need. It's all
         | about attention. Attention is finite.
         | 
         | The more data you load into context, the more you dilute
         | attention.
        
           | throwuxiytayq wrote:
           | people who criticize LLMs for merely regurgitating
           | statistically related token sequences have very clearly never
           | read a single HN comment
        
         | nightski wrote:
         | IMO within the documentation .md files the information density
         | should be very high. Higher than trying to shove the entire
         | codebase into context that is for sure.
        
           | johnsmith1840 wrote:
            | You definitely don't just push the entire code base. Previous
            | models required you to be meticulous about your input. A
            | function here, a class there.
           | 
           | Even now if I am working on REALLY hard problems I will still
           | manually copy and paste code sections out for discussion and
           | algorithm designs. Depends on complexity.
           | 
            | This is why I still believe OpenAI's o1-pro was the best
            | model I've ever seen. The amount of compute you could throw
            | at a problem was absurd.
        
         | senshan wrote:
         | > I never give the second to my LLM.
         | 
         | How do you practically achieve this? Honest question. Thanks
        
           | johnsmith1840 wrote:
           | Custom scripts.
           | 
           | 1. Turn off 2. Code 3. Turn on 4. Commit
           | 
            | I also delete all LLM comments; they 100% poison your
            | codebase.
        
             | senshan wrote:
             | >> 1. The raw code with no empty space or comments. 2. Code
             | with comments
             | 
             | > 1. Turn off 2. Code 3. Turn on 4. Commit
             | 
             | What does it mean "turn off" / "turn on"?
             | 
             | Do you have a script to strip comments?
             | 
             | Okay, after the comments were stripped, does this become
             | the common base for 3-way merge?
             | 
             | After modification of the code stripped of the comments, do
             | you apply 3-way merge to reconcile the changes and the
             | comments?
             | 
              | This seems like a lot of work. What is the benefit? I mean
              | a demonstrable benefit.
             | 
             | How does it compare to instructing through AGENTS.md to
             | ignore all comments?
        
               | johnsmith1840 wrote:
                | Telling an AI to ignore comments != having no comments;
                | that's pretty fundamental to my point.
        
               | senshan wrote:
               | >> 1. The raw code with no empty space or comments. 2.
               | Code with comments
               | 
               | > 1. Turn off 2. Code 3. Turn on 4. Commit
               | 
               | So can you describe your "turn off" / "turn on" process
               | in practical terms?
               | 
               | Asking simply because saying "Custom scripts" is similar
               | to saying "magic".
        
         | Mtinie wrote:
         | > 1. The raw code with no empty space or comments. 2. Code with
         | comments
         | 
         | I like the sound of this but what technique do you use to
         | maintain consistency across both views? Do you have a post-
         | modification script which will strip comments and extraneous
         | empty space after code has been modified?
        
           | wormpilled wrote:
            | Curious, if that is the case, how would you put comments back
           | too? Seems like a mess.
        
             | Mtinie wrote:
             | As I think more on how this could work, I'd treat the fully
             | commented code as the source of truth (SOT).
             | 
              | 1. Run the SOT through a processor to strip comments and
              | extra spaces. Publish to a feature branch.
             | 
             | 2. Point Claude at feature branch. Prompt for whatever
             | changes you need. This runs against the minimalist feature
             | branch. These changes will be committed with comments and
             | readable spacing for the new code.
             | 
             | 3. Verify code changes meet expectations.
             | 
             | 4. Diff the changes from minimal version, and merge only
             | that code into SOT.
             | 
             | Repeat.
        
               | johnsmith1840 wrote:
               | Just test it, maybe you won't get a boost.
               | 
               | 1. Run into a problem you and AI can't solve. 2. Drop all
               | comments 3. Restart debug/design session 4. Solve it and
               | save results 5. Revert code to have comments and put
               | update in
               | 
               | If that still doesn't work: Step 2.5 drop all unrelated
               | code from context
        
           | johnsmith1840 wrote:
           | Custom scripts and basic merge logic but manual still happens
           | around modifications. Forces me to update stale comments
           | around changes anyhow.
           | 
           | I first "discovered" it because I repeatedly found LLM
           | comments poisoned my code base over time and linited it's
           | upper end of ability.
           | 
           | Easy to try just drop comments around a problem and see the
           | difference. I was previously doing that and then manually
           | updating the original.
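            | 
            | For anyone who wants to try this, a minimal sketch of the
            | "comments off" pass might look like the following
            | (illustrative only, assuming Python sources and the standard
            | tokenize module; the actual scripts described above could
            | differ):
            | 
            |     #!/usr/bin/env python3
            |     """Print a comment-free, blank-line-free view of Python files."""
            |     import io
            |     import sys
            |     import tokenize
            | 
            |     def strip_comments(source: str) -> str:
            |         # Drop COMMENT tokens, then re-render the remaining tokens.
            |         toks = [
            |             t for t in tokenize.generate_tokens(
            |                 io.StringIO(source).readline)
            |             if t.type != tokenize.COMMENT
            |         ]
            |         stripped = tokenize.untokenize(toks)
            |         # Drop lines that are now empty so the view stays compact.
            |         return "\n".join(
            |             l for l in stripped.splitlines() if l.strip())
            | 
            |     if __name__ == "__main__":
            |         for path in sys.argv[1:]:
            |             with open(path) as f:
            |                 print(strip_comments(f.read()))
            | 
            | The "turn on" half (merging changes back into the commented
            | originals) still needs the merge logic described above.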
        
         | Aurornis wrote:
         | > I have found that more context comments and info damage
         | quality on hard problems.
         | 
         | There can be diminishing returns, but every time I've used
         | Claude Code for a real project I've found myself repeating
         | certain things over and over again and interrupting tool usage
         | until I put it in the Claude notes file.
         | 
         | You shouldn't try to put everything in there all the time, but
         | putting key info in there has been very high ROI for me.
         | 
         | Disclaimer: I'm a casual user, not a hardcore vibe coder.
         | Claude seems much more capable when you follow the happy path
         | of common projects, but gets constantly turned around when you
         | try to use new frameworks and tools and such.
        
           | lostdog wrote:
           | Agreed, I don't love the CLAUDE.md that gets autogenerated.
           | It's too wordy for me to understand and for the model to
           | follow consistently.
           | 
           | I like to write my CLAUDE.md directly, with just a couple
           | paragraphs describing the codebase at a high level, and then
           | I add details as I see the model making mistakes.
        
           | MarkMarine wrote:
            | Setting hooks has been super helpful for me; you can reject
            | certain uses of tools (don't touch my tests for this session)
            | with just simple scripting code.
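            | 
            | A rough sketch of such a hook (illustrative only; the path is
            | made up, and the stdin payload fields and exit-code behavior
            | are my reading of the Claude Code hooks docs, so double-check
            | them for your version):
            | 
            |     #!/usr/bin/env python3
            |     # .claude/hooks/protect_tests.py (hypothetical path)
            |     # Assumes the hook receives the tool call as JSON on stdin
            |     # and that a non-zero exit (2, per the docs I've seen)
            |     # blocks it, with stderr shown back to Claude.
            |     import json
            |     import sys
            | 
            |     payload = json.load(sys.stdin)
            |     tool = payload.get("tool_name", "")
            |     path = payload.get("tool_input", {}).get("file_path", "")
            | 
            |     # Reject any Edit/Write that touches something under tests/.
            |     if tool in ("Edit", "Write") and "/tests/" in path:
            |         print("Blocked: tests are off limits this session.",
            |               file=sys.stderr)
            |         sys.exit(2)
            | 
            |     sys.exit(0)
            | 
            | Wiring it up is then just a matter of registering the script
            | as a PreToolUse hook in your Claude Code settings.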
        
             | brianwawok wrote:
             | Git lint hook has been key. No matter how many times I told
              | it, it lints randomly. Sometimes not at all. Sometimes
              | before running tests (but not after fixing test failures).
        
         | schrodinger wrote:
         | Genuinely curious -- how did you isolate the effect of
         | comments/context on model performance from all the other
         | variables that change between sessions (prompt phrasing, model
         | variance, etc)? In other words, how did you validate the
         | hypothesis that "turning off the comments" (assuming you mean
         | stripping them temporarily...) resulted in an objectively
         | superior experience?
         | 
         | What did your comparison process look like? It feels
         | intuitively accurate and validates my anecdotal impression but
         | I'd love to hear the rigor behind your conclusions!
        
           | johnsmith1840 wrote:
            | I was already in the habit of copy-pasting relevant code
            | sections to maximize reasoning performance and squeeze more
            | out of earlier, weaker models on stubborn problems. (Still do
            | this on really nasty ones.)
           | 
           | It's also easy to notice LLMs create garbage comments that
           | get worse over time. I started deleting all comments manually
           | alongside manual snippet selection to get max performance.
           | 
            | Then I started routinely deleting all comments before big
            | problem-solving sessions. Was doing it enough to build some
            | automation.
           | 
           | Maybe high quality human comments improve ability? Hard to
           | test in a hybrid code base.
        
         | saturatedfat wrote:
          | Could you share some more intuition as to why you started
          | believing that? Are there ANY comments that are useful?
        
         | stpedgwdgfhgdd wrote:
          | The comments are what make the model understand your code much
          | better.
          | 
          | Think of it like a human: the comments are there to speed up
          | understanding of the code.
        
         | xpe wrote:
         | > I have found that more context comments and info damage
         | quality on hard problems.
         | 
         | I'm skeptical this a valid generalization over what was
         | directly observed. [1] We would learn more if they wrote a more
         | detailed account of their observations. [2]
         | 
         | I'd like to draw a parallel to another area of study possibly
         | unfamiliar to many of us. Anthropology faced similar issues
         | until Geertz's 1970s reform emphasized "thick description" [3]
         | meaning detailed contextual observations instead of thin
         | generalization.
         | 
         | [1]: I would not draw this generalization. I've found that
         | adding guidelines (on the order of 10k tokens) to my CLAUDE.md
         | has been beneficial across all my conversations. At the same
         | time, I have not constructed anything close to _study_ of
         | variations of my approach. And the underlying models are a
         | moving target. I will admit that some of my guidelines were
         | added to address issues I saw over a year ago and may be
          | nothing more than vestigial appendages nowadays. This is why
          | I'm reluctant to generalize.
         | 
         | [2]: What kind of "hard problems"? What is meant by "more"
         | exactly? (Going from 250 to 500 tokens? 1000 to 2000? 2500 to
         | 5000? &c) How much overlap exists between the CLAUDE.md content
         | items? How much ambiguity? How much contradiction?
         | 
         | [3]: https://en.wikipedia.org/wiki/Thick_description
        
       | malshe wrote:
       | I have been using Claude.md to stuff way too many instructions so
       | this article was an eye opener. Btw, any tips for Claude.md when
       | one uses subagents?
        
       | 0xcb0 wrote:
       | Here is my take on writing a good claude.md. I had very good
       | results with my 3 file approach. And it has also been inspired by
       | the great blog posts that Human Layer is publishing from time to
       | time https://github.com/marcuspuchalla/claude-project-management
        
       | mmaunder wrote:
       | That paper the article references is old at this point. No GPT
       | 5.1, no Gemini 3, which both were game changers. I'd love to see
       | their instruction following graphs.
        
         | 0xblacklight wrote:
         | Same!
        
       | grishka wrote:
       | Oh yeah I added a CLAUDE.md to my project the other day:
       | https://github.com/grishka/Smithereen/blob/master/CLAUDE.md
       | 
       | Is it a good one?
        
         | lijok wrote:
         | I copy/pasted it into my codebase to see if it's any good and
         | now Claude is refusing to do any work? I asked Copilot to
         | investigate why Claude is not working but it too is not
         | working. Do you know what happened?
        
         | wizzledonker wrote:
         | Definitely a good one - probably one of the best CLAUDE.md
         | files you can put in any repository if you care about your
         | project at all.
        
       | max-privatevoid wrote:
       | The only good Claude.md is a deleted Claude.md.
        
         | rvz wrote:
         | This is the only correct answer.
        
       | VimEscapeArtist wrote:
       | What's the actual completion rate for Advent of Code? I'd bet the
       | majority of participants drop off before day 25, even among those
       | aiming to complete it.
       | 
       | Is this intentional? Is AoC designed as an elite challenge, or is
       | the journey more important than finishing?
        
         | philipwhiuk wrote:
         | Wrong article.
         | 
         | I rarely get past 18 or so. The stats for last year are here:
         | https://adventofcode.com/2024/stats
        
       | DR_MING wrote:
       | I already forgot about CLAUDE.md; I generate and update it with
       | AI. I prefer to keep design, tasks, and docs folders instead. It
       | is always better to ask it to read some spec docs and the real
       | code first before doing anything.
        
       | nico wrote:
       | > Claude often ignores CLAUDE.md
       | 
       | > The more information you have in the file that's not
       | universally applicable to the tasks you have it working on, the
       | more likely it is that Claude will ignore your instructions in
       | the file
       | 
       | Claude.md files can get pretty long, and many times Claude Code
       | just stops following a lot of the directions specified in the
       | file
       | 
       | A friend of mine tells Claude to always address him as "Mr
       | Tinkleberry", he says he can tell Claude is not paying attention
       | to the instructions on Claude.md, when Claude stops calling him
       | "Mr Tinkleberry" consistently
        
         | stingraycharles wrote:
         | That's hilarious and a great way to test this.
         | 
         | What I'm surprised about is that OP didn't mention having
         | multiple CLAUDE.md files in each directory, specifically
         | describing the current context / files in there. Eg if you have
         | some database layer and want to document some critical things
         | about that, put it in "src/persistence/CLAUDE.md" instead of
         | the main one.
         | 
         | Claude pulls in those files automatically whenever it tries to
         | read a file in that directory.
         | 
         | I find that to be a very effective technique to leverage
         | CLAUDE.md files and be able to put a lot of content in them,
         | but still keep them focused and avoid context bloat.
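          | 
          | For illustration (paths made up), the layout ends up looking
          | something like:
          | 
          |     CLAUDE.md                   # repo-wide basics only
          |     src/persistence/CLAUDE.md   # database layer specifics
          |     tests/CLAUDE.md             # testing conventions
          | 
          | so the persistence notes only cost context when Claude
          | actually touches files under src/persistence/.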
        
           | sroussey wrote:
           | Ummm... sounds like that directory should have a readme. And
           | Claude should read readme files.
        
             | stingraycharles wrote:
             | READMEs are written for people, CLAUDE.mds are written for
             | coding assistants. I don't write "CRITICAL (PRIORITY 0):"
             | in READMEs.
             | 
             | The benefit of CLAUDE.md files is that they're pulled in
             | automatically, eg if Claude wants to read
             | "tests/foo_test.py" it will automatically pull in
             | "tests/CLAUDE.md" (if it exists).
        
               | adastra22 wrote:
               | Is this documented anywhere? This is the first I have
               | ever heard of it.
        
               | grumbelbart wrote:
               | Here: https://www.anthropic.com/engineering/claude-code-
               | best-pract...
               | 
               | claude.md seems to be important enough to be their very
               | first point in that document.
        
               | ffsm8 wrote:
                | Naw man, it's the first point because in April Claude
                | Code didn't really have anything else that somewhat
                | worked.
               | 
                | I tried to use that effectively, I even started a new
                | greenfield project just to make sure to test it under
                | ideal circumstances - and while it somewhat worked, it
                | was always super lackluster, and it was way more
                | effective to explicitly add the context manually via
                | prepared md files you just reference in the prompt.
               | 
               | I'd tell anyone to go for skills first before littering
               | your project with these config files everywhere
        
               | llbeansandrice wrote:
               | If AI is supposed to deliver on this magical no-lift ease
               | of use task flexibility that everyone likes to talk about
               | I think it should be able to work with a README instead
               | of clogging up ALL of my directories with yet another
               | fucking config file.
               | 
               | Also this isn't portable to other potential AI tools. Do
               | I need 3+ md files in every directory?
        
               | stingraycharles wrote:
               | It's not delivering on magical stuff. Getting real
               | productivity improvements out of this requires
               | engineering and planning and it needs to be approached as
               | such.
               | 
               | One of the big mistakes I think is that all these tools
               | are over-promising on the "magic" part of it.
               | 
               | It's not. You need to really learn how to use all these
               | tools effectively. This is not done in days or weeks
                | even; it takes months, in the same way becoming proficient
                | in Emacs or vim or a programming language does.
               | 
               | Once you've done that, though, it can absolutely enhance
               | productivity. Not 10x, but definitely in the area of 2x.
               | Especially for projects / domains you're uncomfortable
               | with.
               | 
               | And of course the most important thing is that you need
               | to enjoy all this stuff as well, which I happen to do. I
               | can totally understand the resistance as it's a shitload
               | of stuff you need to learn, and it may not even be
               | relevant anymore next year.
        
               | giancarlostoro wrote:
                | Yeah, I feel like on average I still spend a similar
                | amount of time developing but drastically less time
                | fixing obscure bugs, because once it codes the feature
                | and I describe the bugs, it fixes them; the rest of my
                | time is spent testing and reviewing code.
        
               | tsimionescu wrote:
               | While I believe you're probably right that getting any
               | productivity gains from these tools requires an
               | investment, I think calling the process "engineering" is
               | really stretching the meaning of the word. It's really
               | closer to ritual magic than any solid engineering
               | practices at this point. People have guesses and
               | practices that may or may not actually work for them
               | (since measuring productivity increases is difficult if
               | not impossible), and they teach others their magic
               | formulas for controlling the demon.
        
               | gtaylor wrote:
               | Most countries don't have a notion of a formally licensed
               | software engineer, anyway. Arguing what is and is not
               | engineering is not useful.
        
               | llbeansandrice wrote:
               | I think it's relevant when people keep using terms like
               | "prompt engineering" to try and beef up this charade of
               | md files that don't even seem to work consistently.
               | 
               | This is a far far cry from even writing yaml for
               | Github/Gitlab CICD pipelines. Folks keep trying to say
               | "engineering" when every AI thread like this seems to
               | push me more towards "snake oil" as an appropriate term.
        
               | tsimionescu wrote:
                | Most countries don't have a notion of a formally licensed
                | physicist either. That doesn't make it right to call
               | astrology physics. And all of the practices around using
               | LLM agents for coding are a lot closer to astrology than
               | they are to astronomy.
               | 
               | I was replying to someone who claimed that getting real
               | productivity gains from this tool requires engineering
               | and needs to be approached as such. It also compared
               | learning to use LLM agents to learning to code in emacs
               | or vim, or learning a programming language - things which
               | are nothing alike to learning to control an inherently
               | stochastic tool that can't even be understood using any
               | of our regular scientific methods.
        
               | fpauser wrote:
               | >> [..] and it may not even be relevant anymore next
               | year.
               | 
               | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               | ^
        
               | nineteen999 wrote:
               | Learning how to equip a local LLM with tools it can use
               | to interact with to extend its capabilities has been a
               | lot of fun for me and is a great educational experience
               | for anyone who is interested. Just another tool for the
               | toolchest.
        
               | llbeansandrice wrote:
               | My issue is not with learning. This "tool" has an
               | incredibly shallow learning curve. My issue is that I'm
               | having to make way for these "tools" that everyone says
               | vastly increases productivity but seems to just churn out
               | tech-debt as quickly as it can write it.
               | 
                | It's a large leap to "requires engineering and planning"
               | when no one even in this thread can seem to agree on the
               | behavior of any of these "tools". Some comments tell
               | anecdotes of not getting the agents to listen until the
               | context of the whole world is laid out in these md files.
               | Others say the only way is to keep the context tight and
               | focused, going so far as to have written _yet more tools_
               | to remove and re-add code comments so they don't "poison"
               | the context.
               | 
               | I am slightly straw-manning, but the tone in this thread
               | has already shifted from a few months ago where these
               | "tools" were going to immediately give huge productivity
               | gains but now you're telling me they need 1) their own
               | special files everywhere (again, this isn't even agreed
               | on) and 2) "engineering and planning...not done in days
                | or weeks even".
               | 
               | The entire economy is propped up on this tech right now
               | and no one can even agree on whether it's effective or
               | how to use it properly? Not to mention the untold damage
               | it is doing to learning outcomes.
        
               | whywhywhywhy wrote:
               | > Do I need 3+ md files in every directory?
               | 
                | Don't worry, as of about 6 weeks ago when they changed
                | the system prompt, Claude will make sure every folder has
                | way more than 3 .md files, seeing as it often writes 2 or
                | more per task, so if you don't clean them up...
        
               | solumunus wrote:
               | Strange. I haven't experienced this a single time and I
               | use it almost all day everyday.
        
               | beardedwizard wrote:
               | That is strange because it's been going on since sonnet
               | 4.5 release.
        
               | mewpmewp2 wrote:
               | Is your logic that unless something is perfect it should
               | not be used even though it is delivering massive
               | productivity gains?
        
               | swiftcoder wrote:
               | > it is delivering massive productivity gains
               | 
               | [citation needed]
               | 
               | Every article I can find about this is citing the
               | valuation of the S&P500 as evidence of the productivity
               | gains, and that feels very circular
        
               | brigandish wrote:
               | I often can't tell the difference between my Readme and
               | Claude files to the point that I cannibalise the Claude
               | file for the Readme.
               | 
               | It's the difference between instructions for a user and
               | instructions for a developer, but in coding projects
               | that's not much different.
        
               | pmarreck wrote:
               | > "CRITICAL (PRIORITY 0):"
               | 
               | There's no need for this level of performative
               | ridiculousness with AGENTS.md (Codex) directives, FYI.
        
         | grayhatter wrote:
         | > A friend of mine tells Claude to always address him as "Mr
         | Tinkleberry", he says he can tell Claude is not paying
         | attention to the instructions on Claude.md, when Claude stops
         | calling him "Mr Tinkleberry" consistently
         | 
         | this is a totally normal thing that everyone does, that no one
         | should view as a signal of a psychotic break from reality...
         | 
         | is your friend in the room with us right now?
         | 
          | I doubt I'll ever understand the lengths AI enjoyers will go
          | through just to avoid any amount of independent thought...
        
           | crystal_revenge wrote:
           | I suspect you're misjudging the friend here. This sounds more
           | like the famous "no brown m&ms" clause in the Van Halen
           | performance contract. As ridiculous as the request is, it
           | being followed provides strong evidence that the rest (and
           | more meaningful) of the requests are.
           | 
           | Sounds like the friend understands quite well how LLMs
           | actually work and has found a clever way to be signaled when
           | it's starting to go off the rails.
        
             | davnicwil wrote:
             | It's also a common tactic for filtering inbound email.
             | 
             | Mention that people may optionally include some word like
             | 'orange' in the subject line to tell you they've come via
             | some place like your blog or whatever it may be, and have
             | read at least carefully enough to notice this.
             | 
             | Of course ironically that trick's probably trivially broken
             | now because of use of LLMs in spam. But the point stands,
             | it's an old trick.
        
               | kiernan wrote:
               | Could try asking for a seahorse emoji in addition...
        
               | mosselman wrote:
               | Apart from the fact that not even every human would read
               | this and add it to the subject, this would still work.
               | 
                | I doubt there is any spam machine out there that quickly
                | tries to find people's personal blogs before sending them
                | viagra mail.
               | 
               | If you are being targeted personally, then of course all
               | bets are off, but that would've been the case with or
               | without the subject-line-trick
        
             | grayhatter wrote:
             | > I suspect you're misjudging the friend here. This sounds
             | more like the famous "no brown m&ms" clause in the Van
             | Halen performance contract. As ridiculous as the request
             | is, it being followed provides strong evidence that the
             | rest (and more meaningful) of the requests are.
             | 
              | I'd argue it's more like you've bought so much into the
              | idea that this is reasonable that you're also willing to go
              | to extreme lengths to retcon and pretend like this is
              | sane.
             | 
             | Imagine two different worlds, one where the tools that
             | engineers use, have a clear, and reasonable way to detect
             | and determine if the generative subsystem is still on the
             | rails provided by the controller.
             | 
             | And another world where the interface is completely devoid
             | of any sort of basic introspection interface, and because
             | it's a problematic mess, all the way down, everyone invents
             | some asinine way that they believe provides some sort of
             | signal as to whether or not the random noise generator has
             | gone off the rails.
             | 
             | > Sounds like the friend understands quite well how LLMs
             | actually work and has found a clever way to be signaled
             | when it's starting to go off the rails.
             | 
              | My point is that while it's a cute hack, if you step back
              | and compare it objectively to what good engineering would
              | look like, it's wild that so many people are just willing
              | to accept this interface as "functional" because it means
              | they don't have to do the thinking required to emit the
              | output the AI is able to, via the specific randomness
              | function used.
             | 
             | Imagine these two worlds actually do exist; and instead of
             | using the real interface that provides a clear bool answer
             | to "the generative system has gone off the rails" they
              | *want* to be called Mr Tinkleberry
             | 
             | Which world do you think this example lives in? You could
             | convince me, Mr Tinkleberry is a cute example of the
             | latter, obviously... but it'd take effort to convince me
              | that this reality is half reasonable or that it's
             | reasonable that people who would want to call themselves
             | engineers should feel proud to be a part of this one.
             | 
             | Before you try to strawman my argument, this isn't a
             | gatekeeping argument. It's only a critical take on the
             | interface options we have to understand something that
             | might as well be magic, because that serves the snakeoil
             | sales much better.
             | 
             | > > Is the magic token machine working?
             | 
             | > Fuck I have no idea dude, ask it to call you a funny
             | name, if it forgets the funny name it's probably broken,
             | and you need to reset it
             | 
             | Yes, I enjoy working with these people and living in this
             | world.
        
               | gyomu wrote:
               | It is kind of wild that not that long ago the general
               | sentiment in software engineering (at least as observed
               | on boards like this one) seemed to be about valuing
               | systems that were understandable, introspectable, with
               | tight feedback loops, within which we could compose
               | layers of abstractions in meaningful and predictable ways
               | (see for example the hugely popular - at the time - works
               | of Chris Granger, Bret Victor, etc).
               | 
               | And now we've made a complete 180 and people are getting
               | excited about proprietary black boxes and "vibe
               | engineering" where you have to pretend like the computer
               | is some amnesic schizophrenic being that you have to
               | coerce into maybe doing your work for you, but you're
               | never really sure whether it's working or not because who
               | wants to read 8000 line code diffs every time you ask
               | them to change something. And never mind if your feedback
               | loops are multiple minutes long because you're waiting on
               | some agent to execute some complex network+GPU bound
               | workflow.
        
               | adastra22 wrote:
               | You don't think people are trying very hard to understand
               | LLMs? We recognize the value of interpretability. It is
               | just not an easy task.
               | 
               | It's not the first time in human history that our ability
               | to create things has exceeded our capacity to understand.
        
               | gyomu wrote:
               | Your comment would be more useful if you could point us
               | to some concrete tooling that's been built out in the
               | last ~3 years that LLM assisted coding has been around to
               | improve interpretability.
        
               | adastra22 wrote:
               | That would be the exact opposite of my claim: it is a
               | very hard problem.
        
               | grayhatter wrote:
               | > You don't think people are trying very hard to
               | understand LLMs? We recognize the value of
               | interpretability. It is just not an easy task.
               | 
               | I think you're arguing against a tangential position to
               | both me, and the person this directly replies to. It can
                | be hard to use and understand something, but if you have
                | a magic box that you can't tell is working, it doesn't
                | belong anywhere near the systems that other humans use.
                | The people that use the code you're about to
               | commit to whatever repo you're generating code for, all
               | deserve better than to be part of your unethical science
               | experiment.
               | 
               | > It's not the first time in human history that our
               | ability to create things has exceeded our capacity to
               | understand.
               | 
               | I don't agree this is a correct interpretation of the
               | current state of generative transformer based AI. But
               | even if you wanted to try to convince me; my point would
               | still be, this belongs in a research lab, not anywhere
               | near prod. And that wouldn't be a controversial idea in
               | the industry.
        
               | nineteen999 wrote:
               | > It doesn't belong anywhere near the systems that other
               | humans use
               | 
               | Really for those of us who _actually_ work in critical
               | systems (emergency services in my case) - of course we
               | 're not going to start patching the core applications
               | with vibe code.
               | 
               | But yeah, that frankenstein reporting script that half a
               | dozen amateur hackers made a mess of over 20 years
               | instead of refactoring and redesigning? That's prime
               | fodder for this stuff. NOBODY wants to clean that stuff
               | up by hand.
        
               | grayhatter wrote:
               | > Really for those of us who actually work in critical
               | systems (emergency services in my case) - of course we're
               | not going to start patching the core applications with
               | vibe code.
               | 
               | I used to believe that no one would seriously consider
               | this too... but I don't believe that this is a safe
               | assumption anymore. You might be the exception, but there
               | are many more people who don't consider the implications
               | of turning over said intellectual control.
               | 
               | > But yeah, that frankenstein reporting script that half
               | a dozen amateur hackers made a mess of over 20 years
               | instead of refactoring and redesigning? That's prime
               | fodder for this stuff. NOBODY wants to clean that stuff
               | up by hand.
               | 
               | It's horrible, no one currently understands it, so let
               | the AI do it, so that still, no one will understand it,
               | but at least this one bug will be harder to trigger.
               | 
               | I don't agree that harder to trigger bugs are better than
               | easy to trigger bugs. And from my view, the argument that
               | "it's currently broken now, and hard to fix!" Isn't
               | exactly an argument I find compelling for leaving it that
               | way.
        
               | adastra22 wrote:
               | We used the steam engine for 100 years before we had a
               | firm understanding of why it worked. We still don't
               | understand how ice skating works. We don't have a
               | physical understanding of semi-fluid flow in grain silos,
               | but we've been using them since prehistory.
               | 
               | I could go on and on. The world around you is full of not
               | well understood technology, as well as non deterministic
               | processes. We know how to engineer around that.
        
               | grayhatter wrote:
               | > We used the steam engine for 100 years before we had a
               | firm understanding of why it worked. We still don't
               | understand how ice skating works. We don't have a
               | physical understanding of semi-fluid flow in grain silos,
               | but we've been using them since prehistory.
               | 
               | I don't think you and I are using the same definition for
               | "firm understanding" or "how it works".
               | 
               | > I could go on and on. The world around you is full of
               | not well understood technology, as well as non
               | deterministic processes. We know how to engineer around
               | that.
               | 
               | Again, you're side stepping my argument so you can
               | restate things that are technically correct, but not
               | really a point in of themselves. I see people who want to
               | call themselves software engineers throw code they
               | clearly don't understand against the wall because the AI
               | said so. There's a significant delta between knowing you
               | can heat water to turn it into a gas with increased
               | pressure that you can use to mechanically turn a wheel,
               | vs, put wet liquid in jar, light fire, get magic spinny
               | thing. If jar doesn't call you a funny name first, that's
               | bad!
        
               | adastra22 wrote:
               | > I don't think you and I are using the same definition
               | for "firm understanding" or "how it works".
               | 
                | I'm standing on firm ground here. Debate me on the
                | details if you like.
               | 
               | You are constructing a strawman.
        
               | adastra22 wrote:
               | It feels like you're blaming the AI engineers here, that
               | they built it this way out of ignorance or something.
               | Look into interpretability research. It is a hard
               | problem!
        
               | grayhatter wrote:
               | I am blaming the developers who use AI because they're
               | willing to sacrifice intellectual control in trade for
               | something that I find has minimal value.
               | 
               | I agree it's likely to be a complex or intractable
               | problem. But I don't enjoy watching my industry revert
                | down the professionalism scale. Professionals don't
                | choose tools whose workings they can't explain. If
                | your solution to knowing whether your tool is still
                | functional is inventing an amusing name and using
                | that as the heuristic, because you have no better way to
                | determine if it's still working correctly, that feels
                | like it might be a problem, no?
        
               | adastra22 wrote:
               | I'm sorry you don't like it. But this has very strong
               | old-man-yells-at-cloud vibes. This train is moving,
               | whether you want it to or not.
               | 
               | Professionals use tools that work, whether they know why
               | it works is of little consequence. It took 100 years to
               | explain the steam engine. That didn't stop us from making
               | factories and railroads.
        
               | grayhatter wrote:
               | > It took 100 years to explain the steam engine. That
               | didn't stop us from making factories and railroads.
               | 
               | You keep saying this, why do you believe it so strongly?
               | Because I don't believe this is true. Why do you?
               | 
               | And then, even assuming it's completely true exactly as
               | stated; shouldn't we have higher standards than that when
               | dealing with things that people interact with? Boiler
               | explosions are bad right? And we should do everything we
               | can to prove stuff works the way we want and expect? Do
               | you think AI, as it's currently commonly used, helps do
               | that?
        
               | adastra22 wrote:
               | Because I'm trained as a physicist and (non-software)
               | engineer and I know my field's history? Here's the first
               | result that comes up on Google. Seems accurate from a
               | quick skim: https://www.ageofinvention.xyz/p/age-of-
               | invention-why-wasnt-...
               | 
               | And yes we should seek to understand new inventions.
               | Which we are doing right now, in the form of
               | interpretability research.
               | 
               | We should not be making Luddite calls to halt progress
               | simply because our analytic capabilities haven't caught
               | up to our progress in engineering.
        
               | grayhatter wrote:
               | Can you cite a section from this very long page that
               | might convince me no one at the time understood how
               | turning water into steam worked to create pressure?
               | 
               | If this is your industry, shouldn't you have a more
               | reputable citation, maybe something published more
               | formally? Something expected to stand up to peer review,
               | instead of just a page on the internet?
               | 
               | > We should not be making Luddite calls to halt progress
               | simply because our analytic capabilities haven't caught
               | up to our progress in engineering.
               | 
                | You've misunderstood my argument. I'm not making a
                | Luddite call to halt progress; I'm objecting to my
                | industry, which should behave as one made up of
                | professionals, willingly sacrificing intellectual control
                | over the things it is responsible for, and advocating that
               | others should do the same. Especially not at the expense
               | of users, which I see happening.
               | 
                | Anything that results in sacrificing the understanding of
                | exactly how the thing you built works is bad and should
                | be avoided. The source, whether AI or something else,
                | doesn't matter as much as the result.
        
               | adastra22 wrote:
               | The steam engine is more than just boiling water. It is a
               | thermodynamic cycle that exploits differences in the
               | pressure curve in the expansion and contraction part of
               | the cycle and the cooling of expanding gas to turn a
               | temperature difference (the steam) into physical force
               | (work).
               | 
               | To really understand WHY a steam engine works, you need
               | to understand the behavior of ideal gasses (1787 - 1834)
               | and entropy (1865). The ideal gas law is enough to
               | perform calculations needed to design a steam engine, but
               | it was seen at the time to be just as inscrutable. It was
               | an empirical observation not derivable from physical
               | principles. At least not until entropy was understood in
               | 1865.
               | 
               | James Watt invented his steam engine in 1765, exactly a
               | hundred years before the theory of statistical mechanics
               | that was required to explain why it worked, and prior to
               | all of the gas laws except Boyle's.
        
               | orbital-decay wrote:
               | This reads like you either have an idealized view of Real
               | Engineering(tm), or used to work in a stable, extremely
               | regulated area (e.g. civil engineering). I used to work
               | in aerospace in the past, and we had a lot of silly Mr
               | Tinkleberry canaries. We didn't strictly rely on them
               | because our job was "extremely regulated" to put it
               | mildly, but they did save us some time.
               | 
               | There's a ton of pretty stable engineering subfields that
               | involve a lot more intuition than rigor. A lot of things
               | in EE are like that. Anything novel as well. That's how
               | steam in 19th century or aeronautics in the early 20th
               | century felt. Or rocketry in 1950s, for that matter.
               | There's no need to be upset with the fact that some
               | people want to hack explosive stuff together before it
               | becomes a predictable glacier of Real Engineering.
        
               | gyomu wrote:
               | Man I hate this kind of HN comment that makes grand
               | sweeping statement like "that's how it was with steam in
               | the 19th century or rocketry in the 1950s", because
               | there's no way to tell whether you're just pulling these
               | things out of your... to get internet points or actually
               | have insightful parallels to make.
               | 
               | Could you please elaborate with concrete examples on how
               | aeronautics in the 20th century felt like having a
               | fictional friend in a text file for the token predictor?
        
               | orbital-decay wrote:
               | We're not going to advance the discussion this way. I
               | also hate this kind of HN comment that makes grand
               | sweeping statement like "LLMs are like having a fictional
               | friend in a text file for the token predictor", because
               | there's no way to tell whether you're just pulling these
               | things out of your... to get internet points or actually
               | have insightful parallels to make.
               | 
               | Yes, during the Wright era aeronautics was absolutely
               | dominated by tinkering, before the aerodynamics was
               | figured out. It wouldn't pass the high standard of Real
               | Engineering.
        
               | grayhatter wrote:
               | > Yes, during the Wright era aeronautics was absolutely
               | dominated by tinkering, before the aerodynamics was
               | figured out. It wouldn't pass the high standard of Real
               | Engineering.
               | 
               | Remind me: did the Wright brothers start selling tickets
               | to individuals telling them it was completely safe? Was
               | step 2 of their research building a large passenger
               | plane?
               | 
               | I originally wanted to avoid that specific flight
               | analogy, because it felt a bit too reductive. But while
               | we're being reductive, how about medicine too; the first
               | smallpox vaccine was absolutely not well understood...
               | would that origin story pass ethical review today? What
               | do you think the pragmatics would be if the medical
               | profession encouraged that specific kind of behavior?
               | 
               | > It wouldn't pass the high standard of Real Engineering.
               | 
               | I disagree, I think it 100% is really engineering.
                | Engineering at its most basic is tricking physics into
               | doing what you want. There's no more perfect example of
               | that than heavier than air flight. But there's a critical
               | difference between engineering research, and
               | experimenting on unwitting people. I don't think users
                | need to know how the sausage is made. That applies
                | equally to planes, bridges, medicine, and code. But the
               | professionals absolutely must. It's disappointing
               | watching the industry I'm a part of willingly eschew
               | understanding to avoid a bit of effort. Such a thing is
               | considered malpractice in "real professions".
               | 
                | Ideally neither of you would wring your hands about the
               | flavor or form of the argument, or poke fun at the
               | gamified comment thread. But if you're gonna complain
               | about adding positively to the discussion, try to add
               | something to it along with the complaints?
        
               | orbital-decay wrote:
               | As a matter of fact, commercial passenger service started
               | almost immediately as the tech was out of the fiction
                | phase. The airships were large, highly experimental,
                | barely controllable, hydrogen-filled death traps that
                | were marketed as luxurious and safe. The first airliners
                | also appeared with big engines and large planes (WWI
                | disrupted this a bit). None of that was built on solid
                | ground. Adoption was only constrained by industrial
                | capacity and cost. Most large aircraft were more or less
                | experimental up until the '50s, and aviation in general
                | was unreliable until about the '80s.
               | 
               | I would say that right from the start everyone was pretty
               | well aware about the unreliability of LLM-assisted coding
               | and nobody was experimenting on unwitting people or
               | forcing them to adopt it.
               | 
                |  _> Engineering at its most basic is tricking physics
               | into doing what you want._
               | 
               | Very well, then Mr Tinkleberry also passes the bar
               | because it's exactly such a trick. That it irks you as a
               | cheap hack that lacks rigor (which it does) is another
               | matter.
        
               | grayhatter wrote:
               | > As a matter of fact, commercial passenger service
               | started almost immediately as the tech was out of the
               | fiction phase. The airship were large, highly
               | experimental, barely controllable, hydrogen-filled death
               | traps that were marketed as luxurious and safe.
               | 
               | And here, you've stumbled onto the exact thing I'm
               | objecting to. I think the Hindenburg disaster was a bad
               | thing, and software engineering shouldn't repeat those
               | mistakes.
               | 
               | > Very well, then Mr Tinkleberry also passes the bar
               | because it's exactly such a trick. That it irks you as a
               | cheap hack that lacks rigor (which it does) is another
               | matter.
               | 
               | Yes, this is what I said.
               | 
               | > there's a critical difference between engineering
               | research, and experimenting on unwitting people.
               | 
                | I object to watching developers do exactly that.
        
               | grayhatter wrote:
               | > There's no need to be upset with the fact that some
               | people want to hack explosive stuff together before it
               | becomes a predictable glacier of Real Engineering.
               | 
               | You misunderstand me. I'm not upset that people are
               | playing with explosives. I'm upset that my industry is
               | playing with explosives that all read, "front: face
               | towards users"
               | 
               | And then, more upset that we're all seemingly ok with
               | that.
               | 
                | The driving force of the enshittification of
                | everything may be external, but the degradation
                | clearly comes from engineers first. These broader
                | industry trends only convince me it's not likely
                | to get better anytime soon, and I don't like how
                | everything is user-hostile.
        
               | pacifika wrote:
               | This could be a very niche standup comedy routine, I
               | approve.
        
               | solumunus wrote:
                | I use agents almost all day and I do way more thinking
                | than I used to; this is why I'm now more productive.
               | There is little thinking required to produce output,
               | typing requires very little thinking. The thinking is all
               | in the planning... If the LLM output is bad in any given
               | file I simply step in and modify it, and obviously this
               | is much faster than typing every character.
               | 
               | I'm spending more time planning and my planning is more
               | comprehensive than it used to be. I'm spending less time
               | producing output, my output is more plentiful and of
               | equal quality. No generated code goes into my commits
               | without me reviewing it. Where exactly is the problem
               | here?
        
           | Alpha_Logic wrote:
           | The 'canary in the coal mine' approach (like the Mr.
           | Tinkleberry trick) is silly but pragmatic. Until we have
           | deterministic introspection for LLMs, engineers will always
           | invent weird heuristics to detect drift. It's not elegant
           | engineering, but it's effective survival tactics in a non-
           | deterministic loop.
        
         | jmathai wrote:
         | I have a /bootstrap command that I run which instructs Claude
         | Code to read all system and project CLAUDE.md files, skills and
         | commands.
         | 
         | Helps me quickly whip it back in line.
        
           | adastra22 wrote:
           | Isn't that what every new session does?
        
             | threecheese wrote:
             | That also clears the context; a command would just append
             | to the context.
        
               | jmathai wrote:
               | This. I've had Claude not start sessions with all of the
               | CLAUDE.md, skills, commands loaded and I've had it lose
               | it mid-session.
        
           | mrasong wrote:
           | Mind sharing it? (As long as it doesn't involve anything
           | private.)
        
         | chickensong wrote:
         | The article explains why that's not a very good test however.
        
           | sydd wrote:
           | Why not? It's relevant for all tasks, and just adds 1 line
        
             | chickensong wrote:
             | I guess I assumed that it's not highly relevant to the
             | task, but I suppose it depends on interpretation. E.g. if
             | someone tells the bus driver to smile while he drives, it's
             | hopefully clear that actually driving the bus is more
             | important than smiling.
             | 
             | Having experimented with similar config, I found that
             | Claude would adhere to the instructions somewhat reliably
              | at the beginning and end of the conversation, but was
              | likely to ignore them during the middle, where the real
              | work is
             | being done. Recent versions also seem to be more context-
             | aware, and tend to start rushing to wrap up as the context
             | is nearing compaction. These behaviors seem to support my
             | assumption, but I have no real proof.
        
             | dncornholio wrote:
             | It will also let the LLM process even more tokens, thus
              | decreasing its accuracy
        
         | globular-toast wrote:
         | It baffles me how people can be happy working like this. "I
         | wrap the hammer in paper so if the paper breaks I know the
         | hammer has turned into a saw."
        
           | fragmede wrote:
           | probably by not thinking in ridiculous analogies that don't
           | help
        
           | easyThrowaway wrote:
           | If you have any experience in 3D modeling, I feel it's quite
           | closer to 3D Unwrapping than software development.
           | 
            | You've got a bitmap atlas ("context") where you have to
            | cram as much information as possible without losing
            | detail, and then you need to massage both your texture
            | and the structure of your model so that your engine
            | doesn't go mental when trying to map your information
            | from a 2D to a 3D space.
           | 
           | Likewise, both operations are rarely blemish-free and your
           | ability resides in being able to contain the intrinsic
           | stochastic nature of the tool.
        
           | mewpmewp2 wrote:
           | You could think of it as art or creativity.
        
           | pacifika wrote:
           | > It Is Difficult to Get a Man to Understand Something When
           | His Salary Depends Upon His Not Understanding It
        
         | isoprophlex wrote:
          | That's smart, but I worry that it works only partially;
          | you'll be filling up the context window with conversation
          | turns where the LLM consistently addresses its user as
          | "Mr. Tinkleberry", thus reinforcing that specific behavior
          | encoded by CLAUDE.md. I'm not convinced that this way of
          | addressing the user implies that it keeps attending to the
          | rest of the file.
        
         | pmarreck wrote:
          | I've found that Codex is much better at instruction-
          | following like that, almost to a fault (for example, when
          | I tell it to "always use TDD", it will try to use TDD even
          | when just fixing already-valid-just-needing-expectation-
          | updates tests!)
        
         | sesm wrote:
         | We are back to color-sorted M&Ms bowls.
        
         | homeonthemtn wrote:
          | The green M&M's trick of AI instructions.
         | 
         | I've used that a couple times, e.g. "Conclude your
         | communications with "Purple fish" at the end"
         | 
         | Claude definitely picks and chooses when purple fish will show
         | up
        
           | nathan_douglas wrote:
           | I tell it to accomplish only half of what it thinks it can,
           | then conclude with a haiku. That seems to help, because 1) I
           | feel like it starts shedding discipline as it starts feeling
           | token pressure, and 2) I feel like it is more likely to
           | complete task n - 1 than it is to complete task n. I have no
           | idea if this is actually true or not, or if I'm
           | hallucinating... all I can say is that this is the impression
           | I get.
        
         | bryanrasmussen wrote:
         | I wonder if there are any benefits, side-effects or downsides
         | of everyone using the same fake name for Claude to call them.
         | 
          | If a lot of people always put "call me Mr. Tinkleberry" in
          | the file, will it start calling people Mr. Tinkleberry
          | even when it loses the context, because so many people
          | seem to want to be called Mr. Tinkleberry?
        
           | seunosewa wrote:
           | Then you switch to another name.
        
         | dkersten wrote:
          | I used to tell it to always start every message with a
          | specific emoji. If the emoji wasn't present, I knew the
          | rules were ignored.
          | 
          | But it's not reliable enough. It can send the emoji or
          | address you correctly while still ignoring more important
          | rules.
         | 
         | Now I find that it's best to have a short and tight rules file
         | that references other files where necessary. And to refresh
         | context often. The longer the context window gets, the more
         | likely it is to forget rules and instructions.
        
         | aqme28 wrote:
         | You could make a hook in Claude to re-inject claude.md. For
         | example, make it say "Mr Tinkleberry" in every response, and
         | failing to do so re-injects the instructions.
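          | 
          | A rough sketch of what that could look like as a Stop
          | hook script (assuming the stdin-JSON payload with a
          | transcript_path field and the exit-code-2 "block and feed
          | stderr back" convention from Anthropic's hooks docs; the
          | transcript field names below are assumptions, not
          | verified against the current format):
          | 
          | ```python
          | #!/usr/bin/env python3
          | # Hypothetical Stop hook: if the last assistant turn
          | # lacks the canary phrase, assume CLAUDE.md fell out of
          | # attention and feed it back via stderr + exit code 2.
          | import json, pathlib, sys
          | 
          | CANARY = "Mr Tinkleberry"
          | 
          | payload = json.load(sys.stdin)   # hook input is JSON
          | transcript = pathlib.Path(payload["transcript_path"])
          | 
          | last_reply = ""
          | for line in transcript.read_text().splitlines():
          |     if not line.strip():
          |         continue
          |     entry = json.loads(line)     # transcript is JSONL
          |     if entry.get("type") != "assistant":
          |         continue
          |     blocks = entry.get("message", {}).get("content", [])
          |     last_reply = " ".join(
          |         b.get("text", "") for b in blocks
          |         if isinstance(b, dict))
          | 
          | if CANARY not in last_reply:
          |     sys.stderr.write(
          |         "Canary missing - re-read these rules:\n"
          |         + pathlib.Path("CLAUDE.md").read_text())
          |     sys.exit(2)  # 2 = block, stderr goes to Claude
          | ```
          | 
          | Registered as a Stop hook in .claude/settings.json, the
          | re-injection would happen automatically instead of being
          | something you have to notice and do by hand.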
        
         | lubujackson wrote:
         | For whatever reason, I can't get into Claude's approach. I like
         | how Cursor handles this, with a directory of files (even
         | subdirectories allowed) where you can define when it should use
         | specific documents.
         | 
          | We are all "context engineering" now, but Claude expects
          | one big file to handle everything? Seems like a dead-end
          | approach.
        
           | piokoch wrote:
            | This is good for the company, since chances are you will
            | eat more tokens. I liked Aider's approach: it wasn't
            | trying to be too clever, it used the files added to the
            | chat, and it asked if it figured out that something more
            | was needed (like, say, settings in the case of a Django
            | application).
            | 
            | Sadly Aider is no longer maintained...
        
           | unshavedyak wrote:
           | I think their skills have the ability to dynamically pull in
            | more data, but so far i've not tested it too much since it
           | seems more tailored towards specific actions. Ie converting a
           | PDF might translate nicely to the Agent pulling in the skill
           | doc, but i'm not sure if it will translate well to it pulling
           | in some rust_testing_patterns.md file when it writes rust
           | tests.
           | 
           | Eg i toyed with the idea of thinning out various CLAUDE.md
           | files in favor of my targeted skill.md files. In doing so my
           | hope was to have less irrelevant data in context.
           | 
           | However the more i thought through this, the more i realized
           | the Agent is doing "everything" i wanted to document each
           | time. Eg i wasn't sure that creating
           | skills/writing_documentation.md and skills/writing_tests.md
           | would actually result in less context usage, since both of
           | those would be in memory most of the time. My CLAUDE.md is
           | already pretty hyper focused.
           | 
           | So yea, anyway my point was that skills _might_ have
           | potential to offload irrelevant context which seems useful.
            | Though in my case i'm not sure it would help.
        
           | jswny wrote:
           | They have an entire feature for this:
           | https://www.claude.com/blog/skills
           | 
           | CLAUDE.md should only be for persistent reminders that are
           | useful in 100% of your sessions
           | 
           | Otherwise, you should use skills, especially if CLAUDE.md
           | gets too long.
           | 
           | Also just as a note, Claude already supports lazy loaded
           | separate CLAUDE.md files that you place in subdirectories. It
           | will read those if it dips into those dirs
        
       | astrostl wrote:
       | I have Claude itself write CLAUDE.md. Once it is informed of its
       | context (e.g., "README.md is for users, CLAUDE.md is for you")
       | you can say things like, "update readme and claudemd" and it will
       | do it. I find this especially useful for prompts like, "update
       | claudemd to make absolutely certain that you check the API docs
       | every single time before making assumptions about its behavior"
       | -- I don't need to know what magick spell will make that happen,
       | just that it does happen.
        
         | dexwiz wrote:
         | Do you have any proof that AI written instructions are better
         | than human ones? I don't see why an AI would have an innate
         | understanding on how best to prompt itself.
        
           | michaelbuckbee wrote:
           | Generally speaking it has a lot of information from things
           | like OP's blog post on how best to structure the file and
           | prompt itself and you can also (from within Claude Code) ask
           | it to look at posts or Anthropic prompting best practices and
            | adapt those to your own file.
        
           | astrostl wrote:
           | Having been through cycles of manual writing with '#' and
           | having it do it itself, it seems to have been a push on
           | efficacy while spending less effort and getting less
           | frustrated. Hard to quantify except to say that I've had
           | great results with it. I appreciate the spirit of OP's,
           | "CLAUDE.md is the highest leverage point of the harness, so
           | avoid auto-generating it" but you can always ask Claude to
           | tighten it up itself too.
        
         | chickensong wrote:
         | This will start to break down after a while unless you have a
         | small project, for reasons being described in the article.
        
       | brcmthrowaway wrote:
       | Is CLAUDE.md required when claude has a --continue option?
        
         | Zerot wrote:
         | I would recommend using it, yeah. You have limited context and
         | it will be compacted/summarized occasionally. The
         | compaction/summary will lose some information and it is easy
         | for it to forget certain instructions you gave it. Afaik
         | claude.md will be loaded into the context on every compaction
         | which allows you to use it for instructions that should always
         | be included in the context.
        
       | bryanhogan wrote:
       | I've been very satisfied with creating a short AGENTS.md file
       | with the project basics, and then also including references to
       | where to find more information / context, like a /context folder
       | that has markdown files such as app-description.md.
        
       | m13rar wrote:
       | I was waiting for someone to build this so that I can chuck it
       | into CLAUDE and tell it how to write good MD.
        
       | foobarbecue wrote:
       | Funny how this is exactly the documentation you'd need to make it
       | easy for a human to work with the codebase. Perhaps this'll be
       | the greatest thing about LLMs -- they force people to write
       | developer guides for their code. Of course, people are going to
       | ask an LLM to write the CLAUDE.md and then it'll just be more
       | slop...
        
         | chickensong wrote:
         | It's not exactly the doc you'd need for a human. There could be
         | overlap, but each side may also have unique requirements that
         | aren't necessarily suitable for the other. E.g. a doc for a
         | human may have considerably more information than you'd want to
         | give to the agent, or, you may want to define agent behavior
         | for workflows that don't apply to a human.
         | 
         | Also, while it may be hip to call any LLM output slop, that
         | really isn't the case. Look at what a poor history we have of
         | developer documentation. LLMs may not be great at everything,
         | but they're actually quite capable when it comes to technical
         | documentation. Even a 1-shot attempt by LLM is often way better
         | than many devs who either can't write very well, or just can't
         | be bothered to.
        
       | AndyNemmity wrote:
        | it's always funny, i think the opposite. I use a massive
        | CLAUDE.md file, but it's targeted towards very specific
        | details of what to do, and what not to do.
        | 
        | I have a full system of agents, hooks, skills, and
        | commands, and it all works for me quite well.
        | 
        | I believe in massive context, but targeted context. It has
        | to be valuable, and important.
       | 
       | My agents are large. My skills are large. Etc etc.
        
       | wowamit wrote:
       | > Regardless of which model you're using, you may notice that
       | Claude frequently ignores your CLAUDE.md file's contents.
       | 
        | This is news to me. And at the same time it isn't. Without
        | knowledge of how the models actually work, most prompting is
        | guesswork at best. You have no control over models via
        | prompts.
        
       | Ozzie_osman wrote:
        | Has anyone had success getting Claude to write its own Claude.md
       | file? It should be able to deduce rules by looking at the code,
       | documentation, and PR comments.
        
         | handoflixue wrote:
         | The main failure state I find is that Claude wants to write an
         | incredibly verbose Claude.md, but if I instruct it "one
         | sentence per topic, be concise" it usually does a good job.
         | 
         | That said, a lot of what it can deduce by looking at the code
         | is exactly what you shouldn't include, since it will usually
         | deduce that stuff just by interacting with the code base.
         | Claude doesn't seem good at that.
         | 
         | An example of both overly-verbose and unnecessary:
         | 
         | ### 1. Identify the Working Directory
         | 
         | When a user asks you to work on something:
         | 
         | 1. *Check which project* they're referring to
         | 
         | 2. *Change to that directory* explicitly if needed
         | 
         | 3. *Stay in that directory* for file operations
         | 
         | ```bash
         | 
         | # Example: Working on ProjectAlpha
         | 
         | cd /home/user/code/ProjectAlpha
         | 
         | ```
         | 
         | (The one sentence version is "Each project has a subfolder; use
         | pwd to make sure you're in the right directory", and the ideal
         | version is probably just letting it occasionally spend 60
         | seconds confused, until it remembers pwd exists)
        
         | chickensong wrote:
         | If you have any substantial codebase, it will write a massive
         | file unless you explicitly tell it not to. It also will try and
         | make updates, including garbage like historical or transitional
         | changes, project status, etc...
         | 
         | I think most people who use Claude regularly have probably come
         | to the same conclusions as the article. A few bits of high-
         | level info, some behavior stuff, and pointers to actual docs.
         | Load docs as-needed, either by prompt or by skill. Work through
         | lists and constantly update status so you can clear context and
         | pick up where you left off. Any other approach eats too much
         | context.
         | 
         | If you have a complex feature that would require ingesting too
         | many large docs, you can ask Claude to determine exactly what
         | it needs to build the appropriate context for that feature and
         | save that to a context doc that you load at the beginning of
         | each session.
        
       | adastra22 wrote:
       | > Claude code injects the following system reminder...
       | 
       | OMG this finally makes sense.
       | 
       | Is there any way to turn off this behavior?
       | 
       | Or better yet is there a way to filter the context that is being
       | sent?
        
       | edf13 wrote:
       | Ah, never knew about this injection...
       | 
       | <system-reminder> IMPORTANT: this context may or may not be
       | relevant to your tasks. You should not respond to this context
       | unless it is highly relevant to your task. </system-reminder>
       | 
        | Perhaps a small proxy between Claude Code and the API to
        | enforce following CLAUDE.md may improve things... I may try
        | this.
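        | 
        | For reference, roughly what such a proxy could look like (a
        | minimal sketch, not anyone's actual setup: it just logs each
        | request body and forwards it to the real API; it buffers
        | responses, so streaming output arrives in one chunk, and any
        | CLAUDE.md-enforcing rewrite logic is left as an exercise):
        | 
        | ```python
        | #!/usr/bin/env python3
        | # Tiny logging proxy. Run it, then start Claude Code with
        | #   ANTHROPIC_BASE_URL=http://localhost:8082 claude
        | import json, urllib.request
        | from http.server import BaseHTTPRequestHandler, HTTPServer
        | 
        | UPSTREAM = "https://api.anthropic.com"
        | 
        | class Proxy(BaseHTTPRequestHandler):
        |     def do_POST(self):
        |         size = int(self.headers.get("Content-Length", 0))
        |         body = self.rfile.read(size)
        |         # Log (or rewrite) the outgoing request body here.
        |         print(json.dumps(json.loads(body), indent=2)[:2000])
        |         req = urllib.request.Request(
        |             UPSTREAM + self.path, data=body, method="POST",
        |             headers={k: v for k, v in self.headers.items()
        |                      if k.lower() != "host"})
        |         with urllib.request.urlopen(req) as resp:
        |             data = resp.read()
        |             status = resp.status
        |             hdrs = resp.getheaders()
        |         self.send_response(status)
        |         for k, v in hdrs:
        |             if k.lower() not in ("transfer-encoding",
        |                                  "content-length"):
        |                 self.send_header(k, v)
        |         self.send_header("Content-Length", str(len(data)))
        |         self.end_headers()
        |         self.wfile.write(data)
        | 
        | HTTPServer(("localhost", 8082), Proxy).serve_forever()
        | ```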
        
       | nurettin wrote:
        | I've been a customer since sonnet 3.5. It's getting to the
        | point where opus 4.5 usually does better than whatever your
        | instructions in claude.md say, just by reading your code and
        | having a general sense of what your preferences are.
       | 
       | I used to instruct about coding style (prefer functions, avoid
       | classes, use structs for complex params and returns, avoid member
       | functions unless needed by shared state, avoid superfluous
       | comments, avoid silly utf8 glyphs, AoS vs SoA, dry, etc)
       | 
       | I removed all my instructions and it basically never violates
       | those points.
        
       | magictux wrote:
        | I think this is an overall good approach and I've got alright
       | results with a similar approach - I still think that this
       | CLAUDE.md experience is too magical and that Anthropic should
       | really focus on it.
       | 
       | Actually having official guidelines in their docs would be a good
       | entrypoint, even though I guess we have this which is the closest
       | available from anything official for now:
       | https://www.claude.com/blog/using-claude-md-files
       | 
       | One interesting thing I also noticed and used recently is that
       | Claude Code ships with a @agent-claude-code-guide. I've used it
       | to review and update my dev workflow / CLAUDE.md file but I've
       | got mixed feelings on the discussion with the subagent.
        
       | toenail wrote:
       | A good Claude.md only needs one line:
       | 
       | Read your instructions from Agents.md
        
       | ilmj8426 wrote:
       | I've recently started using a similar approach for my own
        | projects. Providing a high-level architecture overview in a
       | single markdown file really helps the LLM understand the 'why'
       | behind the code, not just the 'how'. Does anyone have a specific
       | structure or template for Claude.md that works best for frontend-
       | heavy projects (like React/Vite)? I find that's where the context
       | window often gets cluttered.
        
       | asim wrote:
        | That's a good write-up. Very useful to know. I'm sort of on
        | the outside of all this. I've only sort of dabbled and now
        | use copilot quite a lot with claude. What's being said here
        | reminds me a lot of CPU registers. If you think about the
        | limited space in CPU registers, it's astounding how much
        | processing of information we're actually able to do. So we
        | need higher layers of systems and operating systems to help
        | manage all of this. It feels like a lot of what's being said
        | here will inevitably end up being an automated system or
        | compiler or effectively an operating system. Even something
        | basic like a paging system would make a lot of difference.
        
       | aiibe wrote:
        | Writing and updating CLAUDE.md or AGENTS.md feels pointless
       | to me. Humans are the real audience for documentation. The code
       | changes too fast, and LLMs are stateless anyway. What's been
       | working is just letting the LLM explore the relevant part of the
       | code to acquire the context, defining the problem or feature, and
       | asking for a couple of ways to tackle it. All in a one short
       | prompt. That usually gets me solid options to pick and build it
        | out. And I always do one session for one problem. This is my lazy
       | approach to getting useful help from an LLM.
        
         | arnorhs wrote:
         | I agree with you, however your approach results in much longer
         | LLM development runs, increased token usage and a whole lot of
         | repetitive iterations.
        
           | aiibe wrote:
           | I'm definitely interested in reducing token usage techniques.
           | But with one session one problem I've never hit a context
           | limit yet, especially when the problem is small and clearly
           | defined using divide-and-conquer. Also, agentic models are
           | improving at tool use and should require fewer tokens. I'll
           | take as many iterations as needed to ensure the code is
           | correct.
        
         | dncornholio wrote:
          | Because it's stateless doesn't mean it's pointless. Good
          | codebases don't change fast. Stuff gets added, but for the
          | most part, they shouldn't change.
        
           | aiibe wrote:
           | A well-documented codebase lets both developers and agentic
           | models locate relevant code easily. If you treat the model
           | like a teammate, extra docs for LLMs are unnecessary. IMHO.
           | In frontend work, code moves quickly.
        
         | samuelknight wrote:
         | I use .md to tell the model about my development workflow.
         | Along the lines of "here's how you lint", "do this to re-
         | generate the API", "this is how you run unit tests", "The
         | sister repositories are cloned here and this is what they are
         | for".
         | 
         | One may argue that these should go in a README.md, but these
         | markdowns are meant to be more streamlined for context, and
         | it's not appropriate to put a one-liner in the imperative tone
         | to fix model behavior in a top-level file like the README.md
        
           | aiibe wrote:
           | That kind of repetitive process belongs in a script, rather
           | than baked into markdown prompts. Claude has custom hooks for
           | that.
        
         | aqme28 wrote:
         | This is true but sometimes your codebase has unique quirks that
         | you get tired of repeating. "No, Claude, we do it this other
         | way here. Every time."
        
           | aiibe wrote:
           | Quirks are pretty much unavoidable. I tend to get better
           | results using Codex. It sticks to established patterns. Slow,
           | but more deliberate. Claude focuses more on speed.
        
         | xpe wrote:
         | > Humans are the real audience for documentation.
         | 
         | Seeing "real" is a warning flag here that either-or thinking is
         | in play.
         | 
         | Putting aside hopes and norms, we live in a world now where
         | multiple kinds of agents (human and non-human) are contributing
         | to codebases. They do not contribute equally; they work
         | according to different mechanisms, with different strengths and
         | weaknesses, with different economic and cultural costs.
         | 
         | Recall a lesson from Ralph Waldo Emerson: "a foolish
         | consistency is the hobgoblin of little minds" [1]. Don't cling
         | to the past; pay attention to the now, and do what works.
         | Another way of seeing it: don't force a false equivalence
         | between things that warrant different treatment.
         | 
         | If you find yourself thinking thoughts that do more harm than
         | good (e.g. muddle rather than clarify), attempt to reframe them
         | to better make sense of reality (which has texture and
         | complexity).
         | 
         | Here's my reframing: "Documentation serves different purposes
         | to different agents across different contexts. So plan and
         | execute accordingly."
         | 
         | [1]:
         | https://en.wikipedia.org/wiki/Wikipedia:Emerson_and_Wilde_on...
        
       | philipp-gayret wrote:
       | I find writing a good CLAUDE.md is done by running /init, and
       | having the LLM write it. If you need more controls on how it
       | should work, I would highly recommend you implement it in an
       | unavoidable way via hooks and not in a handwritten note to your
       | LLM.
        
       | jankdc wrote:
       | I'm not sure if Claude Code has integrated it in its system
       | prompts or not since it's moving at breakneck speed, but one
       | instruction I like putting on all of my projects is to "Prompt
       | for technical decisions from user when choices are unsure". This
       | would almost always trigger the prompting feature that Claude
       | Code has for me when it's got some uncertainty about the
       | instructions I gave it, giving me options or alternatives on how
       | to approach the problem when planning or executing.
       | 
        | This way, it's got more of a chance of generating something
        | that I wanted, rather than running off on its own.
        
       | saberience wrote:
        | I find the Claude.md file mostly useless. It seems to be 50/50
        | or LESS that Claude even reads/uses this file.
       | 
       | You can easily test this by adding some mandatory instruction
       | into the file. E.g. "Any new method you write must have less than
        | 50 lines of code." Then use Claude for ten minutes and watch it
       | blow through this limit again and again.
       | 
       | I use CC and Codex extensively and I constantly am resetting my
       | context and manually pasting my custom instructions in again and
       | again, because these models DO NOT remember or pay attention to
       | Claude.md or Agents.md etc.
        
       | uncletaco wrote:
       | Honestly I'd rather google get their gemini tool in better shape.
       | I know for a fact it doesn't ignore instructions like Claude code
       | does but it is horrible at editing files.
        
       | rcarmo wrote:
       | PSA: Claude can also use .github/copilot-instructions.md
       | 
       | If you're using VSCode, that is automatically added to context
       | (and I think in Zed that happens as well, although I can't verify
       | right now).
        
       | fpauser wrote:
       | Even better: learn to code yourself.
        
       | _august wrote:
       | I copied this post and gave it to claude code, and had it self-
       | modify CLAUDE.md. It.. worked really well.
        
       | scelerat wrote:
       | > we recommend keeping task-specific instructions in separate
       | markdown files with self-descriptive names somewhere in your
       | project.
       | 
       | Should do this for human developers too. Can't count the number
       | of times I've been thrown onto a project and had to spend a
       | significant amount of time opening and skimming files just to
       | answer simple questions that should be answered in high-level
       | docs like this.
        
         | abustamam wrote:
         | There's a funny joke I heard that they made Claude Code only to
         | force developers to write better documentation.
         | 
         | But in all seriousness, it's working. I write cursor rules
          | religiously and I point other devs to them. It's great.
        
         | minor3 wrote:
         | Yeah I do love how many "best practices" we are only
         | implementing because of LLMs, even though they were massively
         | beneficial for humans prior as well.
        
       | vaer-k wrote:
       | > we recommend keeping task-specific instructions in separate
       | markdown files with self-descriptive names somewhere in your
       | project.
       | 
       | Why should we do this when anthropic specifically recommends
       | creating multiple CLAUDE.md files in various directories where
       | the information is specific and pertinent? It seems to me that
       | anthropic has designed claude to look for claude.md for guidance,
       | and randomly named markdown files may or may not stand out to it
       | as it searches the directory.
       | 
        | You can place CLAUDE.md files in several locations:
        | 
        | > - The root of your repo, or wherever you run claude from
        | >   (the most common usage). Name it CLAUDE.md and check it
        | >   into git so that you can share it across sessions and
        | >   with your team (recommended), or name it CLAUDE.local.md
        | >   and .gitignore it
        | > - Any parent of the directory where you run claude. This
        | >   is most useful for monorepos, where you might run claude
        | >   from root/foo, and have CLAUDE.md files in both
        | >   root/CLAUDE.md and root/foo/CLAUDE.md. Both of these
        | >   will be pulled into context automatically
        | > - Any child of the directory where you run claude. This is
        | >   the inverse of the above, and in this case, Claude will
        | >   pull in CLAUDE.md files on demand when you work with
        | >   files in child directories
        | > - Your home folder (~/.claude/CLAUDE.md), which applies it
        | >   to all your claude sessions
       | 
       | https://www.anthropic.com/engineering/claude-code-best-pract...
        
       | andai wrote:
       | >Frontier thinking LLMs can follow ~ 150-200 instructions with
       | reasonable consistency.
       | 
       | Doesn't that mean that Claude Code's system prompt exhausts that
       | budget before you even get to CLAUDE.md and the user prompt?
       | 
       | Edit: They say Claude Code's system prompt has 50. I might have
       | misjudged then. It seemed pretty verbose to me!
       | 
       | The part about smaller models attending to fewer instructions is
       | interesting too, since most of what was added doesn't seem
       | necessary for the big models. I thought they added them so Haiku
       | could handle the job as well, despite a relative lack of common
       | sense.
        
       | rootnod3 wrote:
        | Here's an idea for LLM makers: allow for a very rigid and
        | structured Claude.md file. One that gives detailed
        | instructions, as void of ambiguity as possible. Then go and
        | refine said language, and maybe allow for more than one file
        | to give it some file structure. Iterate on that for a few
        | years and if you ever need a name for it, you might wanna
        | give it a name describing something that describes a
        | program, or maybe, if you are inclined enough....a
        | programming language.
        | 
        | Have we really reached the low point where we need tutorials
        | on how to coerce an LLM into doing what we want instead of
        | just....writing the god damn code?
        
       | sixothree wrote:
       | I pointed CC to this URL and told it to fix my files in planning
       | mode. It gave me some options and did all of the work.
        
       ___________________________________________________________________
       (page generated 2025-12-01 23:02 UTC)