[HN Gopher] Claude Code: Best practices for agentic coding
___________________________________________________________________
Claude Code: Best practices for agentic coding
Author : sqs
Score : 557 points
Date   : 2025-04-19 10:48 UTC (1 day ago)
(HTM) web link (www.anthropic.com)
(TXT) w3m dump (www.anthropic.com)
| joshstrange wrote:
| The most interesting part of this article for me was:
|
| > Have multiple checkouts of your repo
|
| I don't know why this never occurred to me, probably because it
| feels wrong to have multiple checkouts, but it makes sense so
| that you can keep each AI instance running at full speed. While
| LLMs are fast, this is one of the annoying parts of just waiting
| for an instance of Aider or Claude Code to finish something.
|
| Also, I had never heard of git worktrees, that's pretty
| interesting as well and seems like a good way to accomplish
| effectively having multiple checkouts.
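For anyone else new to worktrees: the basic flow can be sketched end to end in a throwaway repo. Paths and the branch name below are illustrative, and the git commands are driven from Python only for convenience.

```python
# Sketch: `git worktree` gives several working directories that share
# one underlying repository, so multiple agents can each run in their
# own checkout. Paths and the branch name are illustrative.
import os
import subprocess
import tempfile

def run(*args, cwd):
    subprocess.run(args, cwd=cwd, check=True, capture_output=True)

# a throwaway repo with one commit
repo = tempfile.mkdtemp(prefix="repo-")
run("git", "init", cwd=repo)
run("git", "config", "user.email", "dev@example.com", cwd=repo)
run("git", "config", "user.name", "Dev", cwd=repo)
with open(os.path.join(repo, "README"), "w") as f:
    f.write("hello\n")
run("git", "add", "-A", cwd=repo)
run("git", "commit", "-m", "init", cwd=repo)

# second working tree on its own branch, sharing the same .git storage
worktree = tempfile.mkdtemp(prefix="wt-")
os.rmdir(worktree)  # `git worktree add` wants to create the directory
run("git", "worktree", "add", "-b", "agent-task", worktree, cwd=repo)

print(os.path.exists(os.path.join(worktree, "README")))  # True
```

Each worktree is a full checkout on its own branch, so one agent per worktree never steps on another's uncommitted changes.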
| m0rde wrote:
| I've never used Claude Code or other CLI-based agents. I use
| Cursor a lot to pair program, letting the AI do the majority of
| the work but actively guiding.
|
| How do you keep tabs on multiple agents doing multiple things
| in a codebase? Is the end deliverable there a bunch of MRs to
| review later? Or is it a more YOLO approach of trusting the
| agents to write the code and deploy with no human in the loop?
| rfoo wrote:
| The same way you manage a group of brilliant interns.
| mh- wrote:
| Really? My LLMs seem entirely uninterested in free snacks
| and unlimited vacation.
| oxidant wrote:
| Multiple terminal sessions. Well written prompts and
| CLAUDE.md files.
|
| I like to start by describing the problem and having it do
| research into what it should do, writing to a markdown file,
| then get it to implement the changes. You can keep tabs on a
| few different tasks at a time, and you don't need to approve
| YOLO mode for writes, which keeps the cost down and the model
| from going wild.
| remoquete wrote:
| What's the Gemini equivalent of Claude Code and OpenAI's Codex?
| I've found projects like reugn/gemini-cli, but Gemini Code Assist
| seems limited to VS Code?
| peterldowns wrote:
| I would also like to know -- I think people are using
| Cursor/Windsurf/Roo(Cline) for IDEs that let you pick the
| model, but I don't know of a CLI agentic editor that lets you
| use arbitrary models.
| manojlds wrote:
| https://aider.chat/
| peterldowns wrote:
| Thanks! Any others, or any thoughts you can share on it?
| danenania wrote:
| Hey, I'm the creator of Plandex
| (https://github.com/plandex-ai/plandex), which takes a
| more agentic approach than aider, and combines models
| from Anthropic, OpenAI, and Google. You might find it
| interesting.
|
| I did a Show HN for it a few days ago:
| https://news.ycombinator.com/item?id=43710576
| jasir wrote:
| There's Aider, Plandex and Goose, all of which let you choose
| various providers and models. Aider also has a well known
| benchmark[0] that you can check out to help select models.
|
| - Aider - https://aider.chat/ | https://github.com/Aider-AI/aider
|
| - Plandex - https://plandex.ai/ | https://github.com/plandex-ai/plandex
|
| - Goose - https://block.github.io/goose/ | https://github.com/block/goose
|
| [0] https://aider.chat/docs/leaderboards/
| boredtofears wrote:
| I've only used aider (which I like quite a bit more than
| cursor) but I'm curious how it compares to plandex and goose.
| danenania wrote:
| Hi, creator of Plandex here. In case it's helpful, I posted
| a comment listing some of the main differences with aider
| here: https://news.ycombinator.com/item?id=43728977
| vunderba wrote:
| I think the differences between Aider / Plandex are more
| obvious. However I'd love to see a comparison breakdown
| between Plandex and Goose which seem to occupy a very
| similar space.
| jhawk28 wrote:
| Junie from JetBrains was recently released. Not sure what LLM
| it uses.
| nojs wrote:
| Claude
| zomglings wrote:
| If anyone from Anthropic is reading this, your billing for Claude
| Code is hostile to your users.
|
| Why doesn't Claude Code usage count against the same plan that
| usage of Claude.ai and Claude Desktop are billed against?
|
| I upgraded to the $200/month plan because I really like Claude
| Code but then was so annoyed to find that this upgrade didn't
| even apply to my usage of Claude Code.
|
| So now I'm not using Claude Code so much.
| fcoury wrote:
| I totally agree with this. I would rather have some kind of
| predictability than play the Claude Code roulette. I would
| definitely upgrade my plan if I got Claude Code usage included.
| datavirtue wrote:
| I don't know what you guys are on about, but I have been using the
| free GitHub Copilot in VS Code chats to absolutely crank out
| new UI features in Vue. All that stuff that makes you groan
| at the thought of it: more divs, bindings, form validation, a
| whole new widget...churned out in 30 seconds. Try it live.
| Works? Keep.
|
| I'm surprised at the complexity and correctness of what it
| infers from very simple, almost inadequate, prompts.
| cypherpunks01 wrote:
| Claude Pro and other website/desktop subscription plans are
| subject to usage limits that would make it very difficult to
| use for Claude Code.
|
| Claude Code uses the API interface and API pricing, and writes
| and edits code directly on your machine, this is a level past
| simply interacting with a separate chat bot. It seems a little
| disingenuous to say it's "hostile" to users, when the reality
| is yeah, you do pay a bit more for a more reliable usage tier,
| for a task that requires it. It also shows you exactly how much
| it's spent at any point.
| fcoury wrote:
| > ... usage limits that would make it very difficult to use
| for Claude Code.
|
| Genuinely interested: how so?
| cypherpunks01 wrote:
| Well, I think it'd be pretty irritating to see the message
| "3 messages remaining until 6PM" while you are in the
| middle of a complex coding task.
| fcoury wrote:
| No, that's the whole point: predictability. It's
| definitely a trade-off, but if we could save the work as
| is, we could have the option to continue the iteration
| elsewhere, or better yet, be offered the option from that
| point on to fall back to the current API model.
|
| A nice addition would be having something like /cost but
| for checking where you are with regard to limits.
| unshavedyak wrote:
| Conversely I have to manually do this and monitor the
| billing instead.
| zomglings wrote:
| The writing of edits and code directly on my machine is
| something that happens on the client side. I don't see why
| that usage would be subject to anything but one-time billing
| or how it puts any strain on Anthropic's infrastructure.
| ghuntley wrote:
| $200/month isn't that much. Folks I'm hanging around with are
| spending $100 USD to $500 USD daily as the new norm as a cost
| of doing business and remaining competitive. That might seem
| expensive, but it's cheap... https://ghuntley.com/redlining
| m00dy wrote:
| Seriously? That's wild. What kind of CS field could even
| handle that kind of daily spend for a bunch of people?
| ghuntley wrote:
| Consider L5 at Google: outgoings of $377,797 USD per year
| just on salary/stock, before fixed overheads such as
| insurance, leave, issues like ramp-up time and cost of
| their manager. In the hands of a Staff+ engineer, these
| tools replicate Staff+ engineers, and they don't sleep. My
| 2c: the funding for the new norm will come from compressing
| the manager layer, the engineering layer, or both.
| malfist wrote:
| LLMs absolutely don't replicate staff+ engineers.
|
| If your staff engineers are mostly doing things AI can
| do, then you don't need staff. You probably don't even
| need seniors.
| ghuntley wrote:
| That's my point.
|
| - L3 SWE II - $193,712 USD (before overheads)
|
| - L4 SWE III - $297,124 USD (before overheads)
|
| - L5 Senior SWE - $377,797 USD (before overheads)
|
| These tools and foundational models get better every day,
| and right now, they enable Staff+ engineers and
| businesses to have less need for juniors. I suspect there
| will be [short-to-medium-term] compression. See extended
| thoughts at https://ghuntley.com/screwed
| StefanBatory wrote:
| I wonder what will happen first - will companies move to
| LLMs, or to programmers from abroad? Because ultimately,
| it will be cheaper than using LLMs - you've said ~$500
| per day; in Poland ~$1500 will be a good monthly wage,
| and that still makes us expensive! How about moving
| to India, then? Nigeria? LATAM countries?
| ghuntley wrote:
| The industry has tried that, and the problems are well
| known (timezones, unpredictable outcomes in terms of
| quality and delivery dates)...
|
| Delivery via LLMs is predictable, fast, and any concerns
| about outcome [quality] can be programmed away to reject
| bad outcomes. This form of programming the LLMs has a
| one-time cost...
| throwawayb299 wrote:
| > in Poland ~$1500 will be a good monthly wage
|
| The minimum wage in Poland is around USD 1240/month. The
| median wage in Poland is approximately USD 1648/month.
| Tech salaries are considerably higher than the median.
|
| Idk, maybe for an intern software developer it's a good
| salary...
| StefanBatory wrote:
| Minimum wage is ~$930 after taxes, though; I rarely see
| people here talk about salary pre-tax, tbh.
|
| ~$1200 is what I'd get paid here after a few years of
| experience; I have never seen an internship offer in my
| city that paid more than minimum wage (most commonly,
| it's unpaid).
| breckenedge wrote:
| > These [...] get better every day.
|
| They do, but I've seen a huge slowdown in "getting
| better" in the last year. I wonder if it's my perception,
| or reality. Each model does better on benchmarks but I'm
| still experiencing at least a 50% failure rate on _basic_
| task completion, and that number hasn't moved higher in
| many months.
| cpursley wrote:
| Oh but they absolutely do. Have you not used any of this
| llm tooling? It's insanely good once you learn how to
| employ it. I no longer need a front end team, for
| example. It's that good at TypeScript and React. And the
| design is even better.
| mmikeff wrote:
| The kind of field where AI builds more in a day than a team
| or even contract dev does.
| ghuntley wrote:
| Correct; utilised correctly, these tools ship a team's
| worth of output in a single day.
| rudedogg wrote:
| Do you have a link to some of this output? A repo on
| Github of something you've done for fun?
|
| I get a lot of value out of LLMs but when I see people
| make claims like this I know they aren't "in the
| trenches" of software development, or care so little
| about quality that I can't relate to their experience.
|
| Usually they're investors in some bullshit agentic coding
| tool though.
| ghuntley wrote:
| I will shortly; I'm building a serious self-compiling
| compiler rn out of a brand-new esoteric language.
| Meaning the LLM is able to program itself without
| training data about the programming language...
| lostmsu wrote:
| I would hold off on making grand claims until you have
| something grand to show for it.
| ghuntley wrote:
| Honestly, I don't know what to make of it. Stage 2 is
| almost complete, and I'm (right now) conducting per-
| language benchmarks to compare it to the Titans.
|
| Using the proper techniques, Sonnet 3.7 can generate code
| in the custom lexical/stdlib. So, in my eyes, the path to
| Stage 3 is unlocked, but it will chew lots and lots of
| tokens.
| throwawayb299 wrote:
| > a serious self-compiling compiler
|
| Well, virtually every production-grade compiler is
| self-compiling. Since you bring it up explicitly, I'm
| wondering what implications of being self-compiling you
| have in mind?
|
| > Meaning the LLM is able to program itself without
| training data about the programming language...
|
| Could you clarify this sentence a bit? Does it mean the
| LLM will code in this new language without training on it
| beforehand? Or is it going to enable the LLM to program
| itself to gain some new capabilities?
|
| Frankly, with the advent of coding agents, building a new
| compiler sounds about as relevant as introducing a new
| flavor of assembly language; at least a new assembly may
| be justified by a new CPU architecture...
| sbszllr wrote:
| All can be true depending on the business/person:
|
| 1. My company cannot justify this cost at all.
|
| 2. My company can justify this cost but I don't find it
| useful.
|
| 3. My company can justify this cost, and I find it useful.
|
| 4. I find it useful, and I can justify the cost for personal
| use.
|
| 5. I find it useful, and I cannot justify the cost for
| personal use.
|
| That aside -- $200/day/dev for a "nice to have service that
| sometimes makes my work slightly faster" is a lot of money
| in the majority of the world.
| oytis wrote:
| When should we expect to see the amazing products these
| super-competitive businesses are developing?
| zomglings wrote:
| $100/day seems reasonable as an upper-percentile spend per
| programmer. $500/day sounds insane.
|
| A 2.5 hour session with Claude Code costs me somewhere
| between $15 and $20. Taking $20/2.5 hours as the estimate,
| $100 would buy me 12.5 hours of programming.
| bambax wrote:
| Asking very specific questions to Sonnet 3.7 costs a couple
| of tenths of a cent every time, and even if you're doing
| that all day it will never amount to more than maybe a
| dollar at the end of the day.
|
| On average, one line of, say, JavaScript represents around
| 7 tokens, which means there are around 140k lines of JS per
| million tokens.
|
| On Openrouter, Sonnet 3.7 costs are currently:
|
| - $3 / one million input tokens => $100 = 33.3 million
| input tokens = 420k lines of JS code
|
| - $15 / one million output tokens => $100 = 3.6 million
| output tokens = 4.6 million lines of JS code
|
| For one developer? In one day? It seems that one can only
| reach such amounts if the whole codebase is sent again as
| context with each and every interaction (maybe even with
| every keystroke for type completion?) -- and that seems
| incredibly wasteful?
| cma wrote:
| That's how it works, everything is recomputed again every
| additional prompt. But it can cache the state of things
| and restore for a lower fee, and reingesting what was
| formerly output is cheaper than making new output (serial
| bottleneck) so sometimes there is a discount there.
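The recompute-vs-cache economics this comment describes can be sketched with illustrative numbers. The $3/M input price matches the Sonnet 3.7 figure quoted in the parent comment; the cache-read price below is an assumed discount, not a published figure.

```python
# Illustrative arithmetic for "everything is recomputed every prompt".
# $3/M input tokens matches the price quoted in the parent comment;
# CACHE_READ is an assumed discounted re-ingest price, not a quote.
IN_PRICE = 3.0 / 1_000_000      # $ per input token
CACHE_READ = 0.3 / 1_000_000    # assumed discounted re-ingest price

def turn_cost(history_tokens, new_tokens, cached):
    """Cost of one conversational turn: re-send history + new prompt."""
    history_price = CACHE_READ if cached else IN_PRICE
    return history_tokens * history_price + new_tokens * IN_PRICE

# 50k tokens of accumulated context, 1k new tokens this turn:
print(f"{turn_cost(50_000, 1_000, cached=False):.4f}")  # 0.1530
print(f"{turn_cost(50_000, 1_000, cached=True):.4f}")   # 0.0180
```

The point of the sketch: without caching, per-turn cost is dominated by re-reading the ever-growing history, which is how long agentic sessions get expensive.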
| bambax wrote:
| I can't edit the above comment, but there's obviously an
| error in the math! ;-) Doesn't change the point I was
| trying to make, but putting this here for the record.
|
| 33.3 million input tokens / 7 tokens per loc = 4.8
| million locs
|
| 3.6 million output tokens / 7 tokens per loc = 515k locs
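The corrected tokens-to-lines arithmetic can be checked in a couple of lines, using the comment's own estimate of 7 tokens per line of JavaScript:

```python
# Checking the corrected figures above (7 tokens per line of JS).
TOKENS_PER_LOC = 7

def tokens_to_loc(tokens):
    return tokens / TOKENS_PER_LOC

print(round(tokens_to_loc(33_300_000)))  # 4757143, i.e. ~4.8M LOC
print(round(tokens_to_loc(3_600_000)))   # 514286, i.e. ~515k LOC
```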
| ghuntley wrote:
| It sounds insane until you drive full agentic loops/evals.
| I'm currently making a self-compiling compiler; no doubt
| you'll hear/see about it soon. The other night, I fell
| asleep and woke up with interface dynamic dispatch using
| vtables with runtime type information and generic interface
| support implemented...
| UltraSane wrote:
| Do you actually understand the code Claude wrote?
| cpursley wrote:
| Do you understand all of the code in the libraries that
| your applications depend on? Or your coworker for that
| matter?
|
| All of the gatekeeping around LLM code tools is
| amusing. But whatever, I'm shipping 10x and making money
| doing it.
| UltraSane wrote:
| Up until recently I could be sure they were written by a
| human.
|
| But if you are making money by using LLMs to write code
| then all power to you. I just despair at the idea of
| trillions of lines of LLM generated code.
| cpursley wrote:
| Well, you can't just vibe code something useful into
| existence despite all the marketing. You have to be very
| intentional about which libraries it can use, code style
| etc. Make sure it has the proper specifications and
| context. And review the code, of course.
| zomglings wrote:
| Fair enough. That's pretty cool, I haven't gone that far
| in my own work with AI yet, but now I am inspired to try.
|
| The point is to get a pipeline working; cost can be
| optimized down afterward.
| dannersy wrote:
| I'm waiting for the day this AI bubble bursts since as far
| as we can tell almost all these AI "providers" are
| operating at a loss. I wonder if this billing model
| actually makes profit or if it's still just burning cash in
| hopes of AGI being around the corner. We have yet to see a
| product that is useful and affordable enough to justify the
| cost.
| timmytokyo wrote:
| It's burning cash. Lots of it.
|
| [0] https://www.wheresyoured.at/openai-is-a-systemic-
| risk-to-the...
| dannersy wrote:
| Great article, thanks. Mirrors exactly what the JP
| Morgan/Goldman report claimed, though that was quite dated.
| replwoacause wrote:
| Their API billing in general is hostile to users. I switched
| completely to Gemini for this reason and haven't looked back.
| dist-epoch wrote:
| Claude.ai/Desktop is priced based on average user usage. If you
| have 1 power user sending 1000 requests per day, and 99 sending
| 5 (many even none), you can afford a single $10/month
| plan for everyone, to keep things simple.
|
| But every Claude Code user is a 1000 requests per day user, so
| the economics don't work anymore.
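The averaging argument can be made concrete with the hypothetical numbers from the comment:

```python
# Flat-plan economics: one power user amortized across light users.
# Numbers are the hypothetical ones from the comment above.
requests_per_day = [1000] + [5] * 99   # 1 power user, 99 light users
avg = sum(requests_per_day) / len(requests_per_day)
print(avg)  # 14.95 -- cheap to cover with one flat plan

# ...but if every user is a Claude Code-style power user:
all_power = [1000] * 100
print(sum(all_power) / len(all_power))  # 1000.0 -- the averaging breaks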
| fcoury wrote:
| Well, take that into consideration then. Just make it an
| option. Instead of getting 1000 requests per day with code,
| you get 100 on the $10/month plan, and then let users decide
| whether they want to migrate to a higher tier or continue
| using the API model.
|
| I am not saying Claude should stop making money, I'm just
| advocating for giving users the value of getting some Code
| coverage when you migrate from the basic plan to the pro or
| max.
|
| Does that make sense?
| zomglings wrote:
| I would accept a higher-priced plan (which covered both my
| use of Claude.ai/Claude Desktop AND my use of Claude Code).
|
| Anthropic makes it seem like Claude Code is a product
| categorized like Claude Desktop (usage of which gets billed
| against your Claude.ai plan). This is how it signs off all
| its commits: Generated with [Claude Code](https://claude.ai/code)
|
| At the very least, this is misleading. It misled me.
|
| Once I had purchased the $200/month plan, I did some reading
| and quickly realized that I had been too quick to jump to
| conclusions. It still left me feeling like they had pulled a
| fast one on me.
| dist-epoch wrote:
| Maybe you can cancel your subscription or charge back?
|
| I think it's just an oversight on their part. They have
| nothing to gain by making people believe they would get
| Claude Code access through their regular plans, only bad
| word of mouth.
| zomglings wrote:
| To be fair to them, they make it pretty easy to manage
| the subscription, downgrade it, etc.
|
| This is definitely not malicious on their part. Just
| bears pointing out.
| twalkz wrote:
| I've been using codemcp (https://github.com/ezyang/codemcp) to
| get "most" of the functionality of Claude code (I believe it
| uses prompts extracted from Claude Code), but using my existing
| pro plan.
|
| It's less autonomous, since it's based on the Claude chat
| interface, and you need to write "continue" every so often, but
| it's nice to save the $$
| zomglings wrote:
| Thanks, makes sense that an MCP server that edits files is a
| workaround to the problem.
| fcoury wrote:
| Just tried it and it's indeed very good, thanks for
| mentioning it! :-)
| jdance wrote:
| This would put Anthropic in the business of minimizing the
| context to increase profits, same as Cursor and others who
| cheap out on context and try to RAG etc. That would quickly
| make it worse, so I hope they stay on API pricing.
|
| Some base usage included in the plan might be a good balance
| zomglings wrote:
| You know, I wouldn't mind if they just applied the API
| pricing after Claude Code ran through the plan limits.
|
| It would definitely get me to use it more.
| karbon0x wrote:
| Claude Code and Claude.ai are separate products.
| Wowfunhappy wrote:
| But the Claude Pro plan is almost certainly priced under
| the assumption that some users will use it below the usage
| limit.
|
| If everyone used the plan to the limit, the plan would cost
| the same as the API with usage equal to the limit.
| visarga wrote:
| Yeah, I tried it for a couple of minutes, spent $0.31, then
| quickly stopped and moved on.
| m00dy wrote:
| well, the best practice is to use gemini 2.5 pro instead :)
| replwoacause wrote:
| Yep I learned this the hard way after racking up big bills just
| using Sonnet 3.7 in my IDE. Gemini is just as good (and not
| nearly as willing to agree with every dumb thing I say) and
| it's way cheaper.
| xpe wrote:
| > Gemini is ... way cheaper.
|
| Yep. Here are the API pricing numbers for Gemini vs Claude.
| All per 1M tokens.
|
| 1. Gemini 2.5: in: $0.15; out: $0.60 non-thinking or $3.50
| thinking
|
| 2. Claude 3.7: in: $3.00; out: $15
|
| [1] https://ai.google.dev/gemini-api/docs/pricing [2]
| https://www.anthropic.com/pricing#api
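A quick sketch of what those per-1M-token prices mean for a sample session. The workload numbers below are made up, and note the reply's caveat that the Gemini figures quoted may be for Flash rather than Pro.

```python
# What the per-1M-token prices quoted above mean for a sample session.
# Workload numbers are illustrative, not measured.
PRICES = {  # $ per 1M tokens: (input, output)
    "gemini-2.5": (0.15, 3.50),   # using the "thinking" output price
    "claude-3.7": (3.00, 15.00),
}

def session_cost(model, in_tokens, out_tokens):
    p_in, p_out = PRICES[model]
    return (in_tokens * p_in + out_tokens * p_out) / 1_000_000

# e.g. 2M tokens read, 200k tokens generated:
print(f'{session_cost("gemini-2.5", 2_000_000, 200_000):.2f}')  # 1.00
print(f'{session_cost("claude-3.7", 2_000_000, 200_000):.2f}')  # 9.00
```

On these list prices the same workload comes out roughly 9x cheaper on the Gemini tier quoted, before accounting for prompt caching on the Claude side.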
| ryeguy wrote:
| Your gemini pricing is for flash, not pro. Also, claude
| uses prompt caching and gemini currently does not. The
| pricing isn't super straightforward because of that.
| sbszllr wrote:
| The issue with many of these tips is that they require you to
| use Claude Code (or Codex CLI, doesn't matter) and to spend way
| more time in it, feed it more info, generate more outputs -->
| pay more money to the LLM provider.
|
| I find LLM-based tools helpful and use them quite regularly,
| but not at $20+, let alone the $100+ per month that Claude Code
| would require to be used effectively.
| dist-epoch wrote:
| > let alone 100+ per month that claude code would require
|
| I find this argument very bizarre. $100 pays for 1-2 hours of
| developer time. Doesn't it save at least that much time in a
| whole month?
| owebmaster wrote:
| No, it doesn't. If you are still looking for product market
| fit, it is just cost.
|
| After 2 years of GPT4 release, we can safely say that LLMs
| don't make finding PMF that much easier nor improve general
| quality/UX of products, as we still see a general
| enshittification trend.
|
| If this spending was really game-changing, ChatGPT
| frontend/apps wouldn't be so bad after so long.
| mrbombastic wrote:
| Enshittification is the result of shitty incentives in the
| market, not because coding is hard.
| mikeg8 wrote:
| Finding product market fit is a human directional issue,
| and LLMs absolutely can help speed up iteration time here.
| I've built two RoR MVPs for small hobbby projects spending
| ~$75 in Claude code to make something in a day that would
| have previously taken me a month plus. Again, absolutely
| bizarre that people can't see the value here, even as these
| tools are still working through their kinks.
| owebmaster wrote:
| And how much did these two MVPs make in sales?
|
| If they just helped you to ship something valueless, you
| paid $75 for entertainment, like betting.
| dist-epoch wrote:
| You can now do 30 MVPs in a month instead of just one.
| gtirloni wrote:
| Reminds me of https://www.reddit.com/r/comics/comments/d1
| sm26/behold_the_u...
| nrvn wrote:
| what happened to the "$5 is just a cup o' coffee" argument?
| Are we heading towards the everything-for-$100 land?
|
| On a serious note, there is no clear evidence that any of the
| LLM-based code assistants will contribute to saving developer
| time. Depends on the phase of the project you are in and on a
| multitude of factors.
| rsyring wrote:
| I'm a skeptical adopter of new tech. But I cut my teeth on
| LLMs a couple years ago when I was dropped into a project
| using an older framework I wasn't familiar with. Even back
| then, LLMs helped me a ton to get familiar with the project
| and use best practices when I wasn't sure what those were.
|
| And that was just copy & paste into ChatGPT.
|
| I don't know about assistants or project integration. But,
| in my experience, LLMs are a great tool to have and worth
| learning how to use well, for you. And I think that's the
| key part. Some people like heavily integrated IDEs, some
| people prefer a more minimal approach with VS Code or Vim.
|
| I think LLMs are going to be similar. Some people are going
| to want full integration and some are just going to want
| minimal interface, context, and edits. It's going to be up
| to the dev to figure out what works best for him or her.
| fnordpiglet wrote:
| While I agree, I find the early phases to be the least
| productive use of my time, as it's often a lot of
| boilerplate and decisions that require thought but turn
| out to matter very little. Paying $100 to bootstrap to
| midlife on a new idea seems absurdly cheap given my hourly.
| panny wrote:
| Just a few days ago Cursor saved a lot of developer time by
| encouraging all the customers to quit using a product.
|
| https://news.ycombinator.com/item?id=43683012
|
| Developer time "saved" indeed ;-)
| rpastuszak wrote:
| So sad that people are happy to spend $100/day on a tool like
| this, and we're so unlikely (in general) to pay $5 to the
| author of an article/blog post that possibly saved them the
| same amount of time.
|
| (I'm not judging a specific person here, this is more of a
| broad commentary regarding our relationship/sense of
| responsibility/entitlement/lack of empathy when it comes to
| supporting other people's work when it helps us)
| ramoz wrote:
| Interesting, I have $100 days with Claude Code. Beyond
| effective.
| bugglebeetle wrote:
| Claude Code works fairly well, but Anthropic has lost the plot on
| the state of market competition. OpenAI tried to buy Cursor and
| now Windsurf because they know they need to win market share,
| Gemini 2.5 Pro is better at coding than their Sonnet models, has
| huge context, and runs on their TPU stack, but somehow Anthropic
| expects people to pay $200 in API costs per functional PR
| to vibe code. Ok.
| owebmaster wrote:
| > but somehow Anthropic expects people to pay $200 in API
| costs per functional PR to vibe code. Ok.
|
| Reading the thread, somehow people are paying. It is
| mindblowing how in place of getting cheaper, development just
| got more expensive for businesses.
| tylersmith wrote:
| $200 per PR is significantly cheaper development than
| businesses are paying.
| xpe wrote:
| In terms of short-term outlay, perhaps. But don't forget to
| factor in the long-term benefits of having a human team
| involved.
| frainfreeze wrote:
| 3.5 was amazing for code, and topped benchmarks for months.
| It'll take a while for other models to take over that mental
| space.
| zoogeny wrote:
| So I have been using Cursor a lot more in a vibe code way lately
| and I have been coming across what a lot of people report:
| sometimes the model will rewrite perfectly working code that I
| didn't ask it to touch and break it.
|
| In most cases, it is because I am asking the model to do too much
| at once. Which is fine, I am learning the right level of
| abstraction/instruction where the model is effective
| consistently.
|
| But when I read these best practices, I can't help but think of
| the cost. The multiple CLAUDE.md files, the files of context, the
| urls to documentation, the planning steps, the tests. And then
| the iteration on the code until it passes the test, then fixing
| up linter errors, then running an adversarial model as a code
| review, then generating the PR.
|
| It makes me want to find a way to work at Anthropic so I can
| learn to do all of that without spending $100 per PR. Each of the
| steps in that last paragraph is an expensive API call for us
| ISVs, and each requires experimentation to get the right level of
| abstraction/instruction.
|
| I want to advocate to Anthropic for a scholarship program for
| devs (I'd volunteer, lol) where they give credits to Claude in
| exchange for public usage. This would be structured similar to
| creator programs for image/audio/video gen-ai companies (e.g.
| runway, kling, midjourney) where they bring on heavy users that
| also post to social media (e.g. X, TikTok, Twitch) and they get
| heavily discounted (or even free) usage in exchange for promoting
| the product.
| istjohn wrote:
| Why do you think it's supposed to be cheap? Developers are
| expensive. Claude doesn't have to be cheap to make software
| development quicker and cheaper. It just has to be cheaper than
| you.
|
| There are ways to use LLMs cheaply, but it will always be
| expensive to get the most out of them. In fact, the top end
| will only get more and more costly as the lengths of tasks AIs
| can successfully complete grows.
| zoogeny wrote:
| I am not implying in any sense a value judgement on cost. I'm
| stating my emotions at the realization of the cost and how
| that affects my ability to use the available tools in my own
| education.
|
| It would be no different than me saying "it sucks university
| is so expensive, I wish I could afford to go to an expensive
| college but I don't have a scholarship" and someone then
| answers: why should it be cheap.
|
| So, allow me the space to express my feelings and propose
| alternatives, of which scholarships are one example and
| creative programs are another. Another one I didn't mention
| would be the same route as universities force now: I could
| take out a loan. And I could consider it an investment loan
| with the idea it will pay back either in employment prospects
| or through the development of an application that earns me
| money. Other alternatives would be finding employment at a
| company willing to invest that $100/day through me, the limit
| of that alternative being working at an actual foundational
| model company for presumably unlimited usage.
|
| And of course, I could focus my personal education on
| squeezing the most value for the least cost. But I believe
| the balance point between slightly useful and completely
| transformative usages levels is probably at a higher cost
| level than I can reasonably afford as an independent.
| qudat wrote:
| > It just has to be cheaper than you.
|
| Not when you need an SWE in order for it to work
| successfully.
| farzd wrote:
| The general public / CEO / VC consensus is that if it can
| understand English, anyone can do it. Crazy.
| solatic wrote:
| > It just has to be cheaper than you
|
| There's an ocean of B2B SaaS services that would save
| customers money compared to building poor imitations in-
| house. Despite the Joel Test (almost 25 years old! crazy...)
| asking whether you buy your developers the best tools that
| money can buy, because they're almost invariably cheaper than
| developer salaries, the fact remains that most companies
| treat salaries as a fixed cost and everything else threatens
| the limited budget they have.
|
| Anybody who has ever tried to sell developer tooling knows,
| you're competing with free/open-source solutions, and it ain't
| a fair fight.
| k__ wrote:
| That's why I like Aider.
|
| You can protect your files in a non-AI way: by simply not
| giving write access to Aider.
|
| Also, apparently Aider is a bit more economical with tokens
| than other tools.
| zoogeny wrote:
| I haven't used Aider yet, but I see it come up on HN
| frequently (especially in the last couple of days).
|
| I am hesitant because I am paying for Cursor now and I get a
| lot of model usage included within that monthly cost. I'm
| cheap, perhaps to a fault even when I could afford it, and I
| hate the idea of spending twice when spending once is usually
| enough. So while Aider is potentially cheaper than Claude
| Code, it is still more than what I am already paying.
|
| I would appreciate any comments on people who have made the
| switch from Cursor to Aider. Are you paying more/less? If you
| are paying more, do you feel the added value is worth the
| additional cost? If you are paying less, do you feel you are
| getting less, the same or even more?
| Game_Ender wrote:
| With Aider you pay API fees only. You can get simple tasks
| done for a few dollars. I suggest budgeting $20 or so and
| giving it a go.
| alchemist1e9 wrote:
| As an Aider user who has never tried Cursor, I'd also be
| interested in hearing from any Aider users who are using
| Cursor and how it compares.
| Wowfunhappy wrote:
| > So I have been using Cursor a lot more in a vibe code way
| lately and I have been coming across what a lot of people
| report: sometimes the model will rewrite perfectly working code
| that I didn't ask it to touch and break it.
|
| I don't find this particularly problematic because I can
| quickly see the unnecessary changes in git and revert them.
|
| Like, I guess it would be nice if I didn't have to do that, but
| compared to the value I'm getting it's not a big deal.
| zoogeny wrote:
| I agree with this in the general sense but of course I would
| like to minimize the thrash.
|
| I have become obsessive about doing git commits in the way I
| used to obsess over Ctrl-S before the days of source control.
| As soon as I get to a point I am happy, I get the LLM to do a
| check-point check in so I can minimize the cost of doing a
| full directory revert.
|
| But from a time and cost perspective, I could be doing much
| better. I've internalized the idea that when the LLM goes off
| the rails it was my fault. I should have prompted it better.
| So I now consider: how do I get better faster? And the
| answer is I do it as much as I can to learn.
|
| I don't just want to whine about the process. I want to use
| that frustration to help me improve, while avoiding going
| bankrupt.
| vessenes wrote:
| i think this is particularly claude 3.7 behavior - at least
| in my experience, it's ... eager. overeager. smarter than
| 3."6" but still, it has little chill. gemini is better; o3
| better yet. I'm mostly off claude as a daily driver coding
| assistant, but it had a really long run - longest so far.
| imafish wrote:
| I get the same with gemini, though. o3 is kind of the
| opposite, under-eager. I cannot really decide on my
| favorite. So I switch back and forth :)
| jasonjmcghee wrote:
| Surprised that "controlling cost" isn't a section in this post.
| Here's my attempt.
|
| ---
|
| If you get the hang of controlling costs, it's much cheaper.
| If you're exhausting the context window, I would not be
| surprised if you're seeing high costs.
|
| Be aware of the "cache".
|
| Tell it to read specific files (and only those!). If you
| don't, it'll read unnecessary files, or repeatedly read
| sections of files, or even search through files.
|
| Avoid letting it search - even halt it. Find / rg can produce
| thousands of tokens of output depending on the search.
|
| Never edit files manually during a session (that'll bust cache).
| THIS INCLUDES LINT.
|
| The cache also goes away after 5-15 minutes or so (not sure) - so
| avoid leaving sessions open and coming back later.
|
| Never use /compact (that'll bust cache; if you need to, you're
| going back and forth too much or using too many files at once).
|
| Don't let files get too big (it's good hygiene too) to keep the
| context window sizes smaller.
|
| Have a clear goal in mind and keep sessions to as few messages as
| possible.
|
| Write / generate markdown files with needed documentation using
| claude.ai, and save those as files in the repo and tell it to
| read that file as part of a question. I'm at about ~$0.5-0.75 for
| most "tasks" I give it. I'm not a super heavy user, but it
| definitely helps me (it's like having a super focused smart
| intern that makes dumb mistakes).
|
| If i need to feed it a ton of docs etc. for some task, it'll be
| more in the few $, rather than < $1. But I really only do this to
| try some prototype with a library claude doesn't know about (or
| is outdated). For hobby stuff, it adds up - totally.
|
| For a company, massively worth it. Insanely cheap productivity
| boost (if developers are responsible / don't get lazy / don't
| misuse it).
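The cache arithmetic behind these tips can be sketched concretely. A toy estimate in JavaScript, using roughly Anthropic's published per-million-input-token Sonnet rates at the time (normal input $3, cache write 1.25x, cache read 0.1x); treat the numbers as illustrative, not authoritative:

```javascript
// Toy estimate of how prompt caching changes session cost.
// Prices are illustrative: cache write at 1.25x and cache read at
// 0.1x of a ~$3/MTok input rate.
const PRICE = { cacheWrite: 3.75, cacheRead: 0.3 }; // $/MTok

// A session re-sends `prefixTokens` of context on each of `turns`
// turns. `busts` counts cache invalidations (a manual edit, a lint
// pass, /compact), each forcing the prefix to be re-written.
function sessionCost(prefixTokens, turns, busts) {
  const M = 1e6;
  const writes = 1 + busts;     // initial cache write plus one per bust
  const reads = turns - writes; // remaining turns read from the cache
  return (
    (writes * prefixTokens * PRICE.cacheWrite) / M +
    (reads * prefixTokens * PRICE.cacheRead) / M
  );
}

// 50k tokens of context over 10 turns:
console.log(sessionCost(50000, 10, 0).toFixed(2)); // clean session
console.log(sessionCost(50000, 10, 4).toFixed(2)); // cache busted 4 times
```

The clean session comes out around $0.32 versus about $1.01 with four busts, which is the gap the tips above are chasing; output tokens and the un-cached suffix of each turn are ignored for simplicity.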
| bugglebeetle wrote:
| If I have to spend this much time thinking about any of this,
| congratulations, you've designed a product with a terrible UI.
| jasonjmcghee wrote:
| Some tools take more effort to hold properly than others. I'm
| not saying there's not a lot of room for improvement - or
| that the UX couldn't hold the user's hand more to force things
| like this in some "assisted mode" - but at the end of the day,
| it's a thin, useful wrapper around an LLM, and LLMs require
| effort to use effectively.
|
| I definitely get value out of it- more than any other tool
| like it that I've tried.
| sqs wrote:
| It's fundamentally hard. If you have an easy solution, you
| can go make an easy few billion dollars.
| tetha wrote:
| Mh. Like, I'm deeply impressed what these AI assistants can
| do by now. But, the list in the parent comment there is very
| similar to my mental check-list of pair-programming / pair-
| admin'ing with less experienced people.
|
| I guess "context length" in AIs is what I intuitively tracked
| with people already. It can be a struggle to connect the
| Zabbix alert, the ticket and the situation on the system
| already, even if you don't track down all the zabbix code and
| scripts. And then we throw in Ansible configuring the thing,
| and then the business requirements by more, or less
| controlled dev-teams. And then you realize dev is controlled
| by impossible sales-terms.
|
| These are scope -- or I guess context -- expansions that
| cause people to struggle.
| oxidant wrote:
| Think about what you would do in an unfamiliar project with
| no context and the ticket
|
| "please fix the authorization bug in /api/users/:id".
|
| You'd start by grepping the code base and trying to
| understand it.
|
| Compare that to, "fix the permission in
| src/controllers/users.ts in the function `getById`. We need
| to check the user in the JWT is the same user that is being
| requested"
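The narrow prompt above describes essentially a one-line authorization check. A hypothetical sketch of what that `getById` fix might look like (the handler shape and JWT claim are illustrative, not real code from any project; `req.user` is assumed to be populated by upstream JWT-verification middleware):

```javascript
// Hypothetical Express-style handler for the fix described above:
// the user id in the already-verified JWT must match the id being
// requested, instead of any authenticated user reading any record.
function getById(req, res) {
  const requestedId = String(req.params.id);
  const tokenUserId = String(req.user?.sub ?? "");

  if (tokenUserId !== requestedId) {
    return res.status(403).json({ error: "forbidden" });
  }
  // Lookup elided; just echo the id for the sketch.
  return res.json({ id: requestedId });
}
```

The point of the comparison stands either way: a prompt that names the file, function, and invariant lets the agent skip the grep-and-read phase entirely.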
| troupo wrote:
| So, AIs are overeager junior developers at best, and not
| the magical programmer replacements they are advertised as.
| xpe wrote:
| > So, AIs are overeager junior developers at best, and
| not the magical programmer replacements they are
| advertised as.
|
| This may be a quick quip or a rant. But the things we say
| have a way of reinforcing how we think. So I suggest
| refining until what we say cuts to the core of the
| matter. The claim above is a false dichotomy. Let's put
| aside advertisements and hype. Trying to map between AI
| capabilities and human ones is complicated. There is high
| quality writing on this to be found. I recommend reading
| literature reviews on evals.
| lacker wrote:
| Let's split the difference and call them "magical
| overeager junior developer replacements".
| oxidant wrote:
| The grandparent is talking about how to control cost by
| focusing the tool. My response was to a comment about how
| that takes too much thinking.
|
| If you give a junior an overly broad prompt, they are
| going to have to do a ton of searching and reading to
| find out what they need to do. If you give them specific
| instructions, including files, they are more likely to
| get it right.
|
| I never said they were replacements. At best, they're
| tools that are incredibly effective when used on the
| correct type of problem with the right type of prompt.
| oezi wrote:
| As of April 2025. The pace is so fast that it will
| overtake seniors within years maybe months.
| apwell23 wrote:
| overtake ceo by 2026
| jdiff wrote:
| That's been said since at least 2021 (the release date
| for GitHub Copilot). I think you're overestimating the
| pace.
| whywhywhywhy wrote:
| On a shorter timeline than you'd think, working with these
| tools won't look like this at all.
|
| You'll be prompting and evaluating and iterating entirely
| finished pieces of software and be able to see multiple
| attempts at each solve at once, none of this deep in the
| weeds fixing a bug stuff.
|
| We're rapidly approaching a world where a lot of software
| will be being made without an engineer hire at all, maybe
| not the hardest most complex or novel software but a lot
| of software that previously required a team of 3-15 won't
| have a single dev.
|
| My current estimate is mid 2026
| hu3 wrote:
| my current estimate is 2030. because we can barely get a
| JS/TS application to compile after a year of dependency
| updates.
|
| our current popular stack is quicksand.
|
| unless we're talking about .net core, java, Django and
| more of these stable platforms.
| djtango wrote:
| I have been quite skeptical of using AI tools and my
| experiences using them have been frustrating for developing
| software but power tools usually come with a learning curve
| while "good product" with clean simplified interface often
| results in reduced capability.
|
| VIM, Emacs and Excel are obvious power tools which may
| require you to think but often produce unrivalled
| productivity for power users
|
| So I don't think the verdict that the product has a bad UI is
| fair. Natural language interfaces are such a step up from old
| school APIs with countless flags and parameters.
| BeetleB wrote:
| Oh wow. Reading your comment guarantees I'll never use Claude
| Code.
|
| I use Aider. It's awesome. You explicitly specify the files.
| You don't have to do _work_ to _limit_ context.
| boredtofears wrote:
| Yeah, I tried CC out and quickly noticed it was spending $5+
| for simple LLM capable tasks. I rarely break $1-2 a session
| using aider. Aider feels like more of a precision tool. I
| like having the ability to manually specify.
|
| I do find Claude Code to be really good at exploration though
| - like checking out a repository I'm unfamiliar with and then
| asking questions about it.
| Jerry2 wrote:
| >I use Aider. It's awesome.
|
| What do you use for the model? Claude? Gemini? o3?
| m3kw9 wrote:
| Gemini 2.5 pro is my choice
| LeafItAlone wrote:
| Aider is a great tool. I do love it. But I find I have to do
| more with it to get the same output as Claude Code (no matter
| what LLM I used with Aider). Sure it may end up being cheaper
| per run, but not when my time is factored in. The flip side
| is I find Aider much easier to limit.
| Game_Ender wrote:
| What are those extra things you have to do more of? I only
| have experience with Aider so I am curious what I am
| missing here.
| simonw wrote:
| With Claude Code you can at least type "/cost" at any point
| to see how much it's spent, and it will show you when you end
| a session (with Ctrl+C) too.
|
| The output of /cost looks like this:
|
|     > /cost
|     [?] Total cost: $0.1331
|     Total duration (API): 1m 13.1s
|     Total duration (wall): 1m 21.3s
| jjallen wrote:
| Not having to specify files is a humongous feature for me.
| Having to remember which file code is in is half the work
| once you pass a certain codebase size.
| m3kw9 wrote:
| That sometimes works, sometimes doesn't, and takes 10x the
| time. Same with Codex. I would have both and switch between
| them depending on which you feel will get it right better.
| carpo wrote:
| Use /context <prompt> to have aider automatically add the
| files based on the prompt. It's been working well for me.
| datavirtue wrote:
| GitHub copilot follows your context perfectly. I don't have to
| tell it anything about files. I tried this initially and it
| just screwed up the results.
| xpe wrote:
| > GitHub copilot follows your context perfectly. I don't have
| to tell it anything about files. I tried this initially and
| it just screwed up the results.
|
| Just to make sure we're on the same page. There are two
| things in play. First, a language model's ability to know
| what file you are referring to. Second, an assistant's
| ability to make sure the right file is in the context window.
| In your experience, how does Claude Code compare to Copilot
| w.r.t (1) and (2)?
| pclmulqdq wrote:
| It's interesting that this is a problem for people because I
| have never spent more than about $0.50 on a task with Claude
| Code. I have pretty good code hygiene and I tell Claude what to
| do with clear instructions and guidelines, and Claude does it.
| I will usually go through a few revisions and then just change
| anything myself if I find it not quite working. It's exactly
| like having an eager intern.
| kiratp wrote:
| The productivity boost can be so massive that this amount of
| fiddling to control costs is counterproductive.
|
| Developers tend to seriously underestimate the opportunity cost
| of their own time.
|
| Hint - it's many multiples of your total compensation broken
| down to 40 hour work weeks.
| pizza wrote:
| Hard agree. Whether it's 50 cents or 10 dollars per session,
| I'm using it to get work done for the sake of quickly
| completing work that aims to unblock many orders of magnitude
| more value. But in so far as cheaper correct sessions
| correlate with sessions where the problem solving was more
| efficient anyhow, they're fairly solid tips.
| afiodorov wrote:
| I agree but optimisation often reveals implementation
| details helping to understand limits of current tech more.
| It might not be worth the time but part of engineering is
| optimisation and another part is deep understanding of
| tech. It is _sometimes_ worth optimising anyway if you want
| to take the engineering discipline to the next level within
| yourself.
|
| I myself didn't think about not running linters, but it makes
| obvious sense now and gives me insight into how Claude Code
| works, allowing me to use this insight in related engineering
| work.
| Aurornis wrote:
| The cost of the task scales with how long it takes, plus or
| minus.
|
| Substitute "cost" with "time" in the above post and all of
| the same tips are still valuable.
|
| I don't do much agentic LLM coding but the speed (or lack
| thereof) was one of my least favorite parts. Using any tricks
| that narrow scope, prevent reprocessing files over and over
| again, or searching through the codebase are all helpful even
| if you don't care about the dollar amount.
| jillesvangurp wrote:
| Exactly. I've been using the chat gpt desktop app not because
| of the model quality but because of the UX. It basically
| seamlessly integrates with my IDEs (intellij and vs code).
| Mostly I just do stuff like select a few lines, hit
| option+shift+1, and say something like "fix this". Nice short
| prompt and I get the answer relatively quickly.
| Option+shift+1 opens chat gpt with the open file already
| added to the context. It sees what lines are selected. And it
| also sees the output of any test runs on the consoles. So
| just me saying "fix this" now has a rich context that I don't
| need to micromanage.
|
| Mostly I just use the 4o model instead of the newer better
| models because it is faster. It's good enough mostly and I
| prefer getting a good enough answer quickly than the perfect
| answer after a few minutes. Mostly what I ask is not rocket
| science so perfect is the enemy of good here. I rarely have
| to escalate to better models. The reasoning models are
| annoyingly slow. Especially when they go down the wrong
| track, which happens a lot.
|
| And my cost is a predictable $20/month. The downside is that
| the scope of what I can ask is more limited. I'd like it to
| be able to "see" my whole code base instead of just 1 file
| and for me to not have to micro manage what the model looks
| at. Claude can do that if you don't care about money. But if
| you do, you are basically micro managing context. That sounds
| like monkey work that somebody should automate. And it
| shouldn't require an Einstein sized artificial brain to do
| that.
|
| There must be people that are experimenting with using
| locally running more limited AI models to do all the
| micromanaging that then escalate to remote models as needed.
| That's more or less what Apple pitched for Apple AI at some
| point. Sounds like a good path forward. I'd be curious to
| learn about coding tools that do something like that.
|
| In terms of cost, I don't actually think it's unreasonable to
| spend a few hundred dollars per month on this stuff. But I
| question the added value over the $20 I'm spending. I don't
| think the improvement is 20x better, more like 1.5x. And I
| don't like the unpredictability of this and having to think
| about how expensive a question is going to be.
|
| I think a lot of the short term improvement is going to be a
| mix of UX and predictable cost. Currently the tools are still
| very clunky and a bit dumb. The competition is going to be
| about predictable speed, cost and quality. There's a lot of
| room for improvement here.
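The local-model-escalates-to-remote idea in the comment above could be sketched as a tiny router. Everything here is hypothetical: the model functions are stand-ins, and `confidence` assumes the local model can score its own answer:

```javascript
// Hypothetical escalation router: a cheap local model answers first;
// only low-confidence answers escalate to the expensive remote model.
// Both "models" are stand-in async functions returning
// { text, confidence }, not real APIs.
async function answer(prompt, localModel, remoteModel, threshold = 0.7) {
  const draft = await localModel(prompt);
  if (draft.confidence >= threshold) {
    return { text: draft.text, via: "local" };
  }
  const better = await remoteModel(prompt);
  return { text: better.text, via: "remote" };
}

// Stubs for illustration: the local model is confident on short
// prompts and unsure on long ones, so only the latter escalate.
const local = async (p) => ({ text: "quick answer", confidence: p.length < 10 ? 0.9 : 0.2 });
const remote = async (p) => ({ text: "detailed answer", confidence: 0.99 });

answer("hello", local, remote).then((r) => console.log(r.via));
answer("refactor this 500-line module", local, remote).then((r) => console.log(r.via));
```

The hard part, of course, is everything this sketch hides: deciding what context the local model should gather before escalating, which is exactly the "micro managing context" the comment wants automated.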
| charlie0 wrote:
| If this is true, why isn't our compensation scaling with the
| increases in productivity?
| lazzlazzlazz wrote:
| It usually does, just with a time delay and a strict
| condition that the firm you work at can actually
| commercialize your productivity. Apply your systems
| thinking skills to compensation and it will all make sense.
| jjmarr wrote:
| I don't think about controlling cost because I price my time at
| US$40/h and virtually all models are cheaper than that (with
| the exception of o1 or Gemini 2.5 pro).
|
| If I spend $2 instead of $0.50 on a session but I had to spend
| 6 minutes thinking about context, I haven't gained any money.
| owebmaster wrote:
| Important to remind people this is only true if you have a
| profitable product, otherwise you're spending money you
| haven't earned.
| jasonjmcghee wrote:
| If your expectation is to produce the same amount of
| output, you could argue when paying for AI tools, you're
| choosing to spend money to gain free time.
|
| 4 hours coding project X or 3 hours and a short hike with
| your partner / friends etc
| jjmarr wrote:
| If what I'm doing doesn't have a positive expected value,
| the correct move isn't to use inferior dev tooling to save
| money, it's to stop working on it entirely.
| ngruhn wrote:
| Come on, every hobby has negative expected value. You're
| not doing it for the money but it still makes sense to
| save money.
| oezi wrote:
| There might be value but you might not receive any of it.
| Most salaried employees won't see returns.
| jasonjmcghee wrote:
| If you do it a bit, it just becomes habit / no extra time or
| cognitive load.
|
| Correlation or causation aside, the same people I see
| complain about cost, complain about quality.
|
| It might indicate more tightly controlled sessions may also
| produce better results.
|
| Or maybe it's just people that tend to complain about one
| thing, complain about another.
| chewz wrote:
| My attempt is - do not use Claude Code at all; it is a
| terrible tool. It is bad at almost everything, starting with
| making simple edits to files.
|
| And most of all, Claude Code is overeager to start messing
| with your code and run up unnecessary $$ instead of making a
| sensible plan.
|
| This isn't a problem with Claude Sonnet - it is a fundamental
| problem with Claude Code.
| winrid wrote:
| I pretty much one-shotted a scraper from an old Joomla site
| with 200+ articles to a new WP site, including all users and
| assets, and converting all the PDFs to articles. It cost me
| like $3 in tokens.
| hu3 wrote:
| I guess the question then is: can't VS Code Copilot do the
| same for a fixed $20/month? It even has access to all SOTA
| models like Claude 3.7, Gemini 2.5 Pro and GPT o3
| darksaints wrote:
| I would have thought so, but somehow no. I have a cursor
| subscription with access to all of those models, and I
| still consistently get better results from claude code.
| mceachen wrote:
| Vscode's agent mode in copilot (even in the insider's
| nightly) is a bit rough in my experience: lots of 500
| errors, stalls, and outright failures to follow tasks (as
| if there's a mismatch between what the ui says it will
| include in context vs what gets fed to the LLM).
| winrid wrote:
| I haven't tried copilot. Mostly because I don't use
| VSCode, I use jetbrains ides. How do they provide Claude
| 3.7 for $20/mo with unlimited usage?
| oezi wrote:
| By providing a UI bad enough that you don't use it so much.
| KronisLV wrote:
| Copilot has a pretty good plugin for JetBrains IDEs!
|
| Though their own AI Assistant and Junie might be equally
| good choices there too.
| troupo wrote:
| was it a wget call feeding into html2pdf?
| winrid wrote:
| no it's a few hundred lines of python to parse weird and
| inconsistent HTML into json files and CSV files, and then
| a sync script that can call the WP API to create all the
| authors as needed, update the articles, and migrate the
| images
| SoftTalker wrote:
| Plumbing to pipe shit from one sewer to another.
| winrid wrote:
| Yep, don't wanna spend more of my life doing that than I
| have to!
| gundmc wrote:
| Never edit files manually during a session (that'll bust
| cache). THIS INCLUDES LINT
|
| Yesterday I gave up and disabled my format-on-save config
| within VSCode. It was burning way too many tokens with
| unnecessary file reads after failed diffs. The LLMs still have
| a decent number of failed diffs, but it helps a lot.
| irthomasthomas wrote:
| I assume they use a conversation, so if you compress the prompt
| immediately you should only break cache once, and still hit
| cache on subsequent prompts?
|
| So instead of: Write, Hit, Hit, Hit
|
| it's: Write, Write, Hit, Hit, Hit
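That write/hit sequence can be modeled with a toy prefix cache: a turn hits if its prompt extends the currently cached prefix, and any divergence (such as compressing the prompt) forces one extra write. This is a simplification of how provider-side prompt caching actually works, kept just faithful enough to reproduce the sequences above:

```javascript
// Toy prefix-cache model: a turn is a cache Hit if its prompt
// extends the currently cached prefix, and a Write (recaching the
// new prefix) if it diverged, e.g. because the prompt was edited
// or compressed.
function cachePattern(turns) {
  let cached = "";
  return turns.map((prompt) => {
    const hit = cached !== "" && prompt.startsWith(cached);
    cached = prompt; // the provider caches the latest prefix either way
    return hit ? "Hit" : "Write";
  });
}

// No compression: every turn appends to the previous prompt.
console.log(cachePattern(["a", "ab", "abc", "abcd"]).join(" "));
// -> "Write Hit Hit Hit"

// Compressing right after the first turn diverges once, then hits:
console.log(cachePattern(["a", "x", "xb", "xbc", "xbcd"]).join(" "));
// -> "Write Write Hit Hit Hit"
```

So yes, under this model compressing immediately costs exactly one extra cache write, and every later turn still hits.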
| sagarpatil wrote:
| If I have to be so cautious while using a tool, I might as
| well write the code myself, lol. I've used Claude Code
| extensively and it is one of the best AI IDEs. It just gets
| things done. The only downside is the cost. I was averaging
| $35-$40/day. At this cost, I'd rather just use
| Cursor/Windsurf.
| 0x696C6961 wrote:
| I mostly work in neovim, but I'll open cursor to write
| boilerplate code. I'd love to use something cli based like Claude
| Code or Codex, but neither of them implement semantic indexing
| (vector embeddings) the way Cursor does. It should be possible to
| implement an MCP server which does this, but I haven't found a
| good one.
| isaksamsten wrote:
| I use a small plugin I've written myself to interact with
| Claude, Gemini 2.5 Pro, or GPT. I've not really seen the need
| for semantic searching yet. Instead I've given the LLM access
| to LSP symbol search, grep and the ability to add files to the
| conversation. It's been working well for my use cases but I've
| never tried Cursor so I can't comment on how it compares. I'm
| sure it's not as smooth though. I've tried some of the more
| common Neovim plugins and for me it works better, but the
| preference here is very personal. If you want to try it out
| it's here: https://github.com/isaksamsten/sia.nvim
| sqs wrote:
| Tool-calling agents with search tools do very well at
| information retrieval tasks in codebases. They are slower and
| more expensive than good RAG (if you amortize the RAG index
| over many operations), but they're incredibly versatile and
| excel in many cases where RAG would fall down. Why do you think
| you need semantic indexing?
| 0x696C6961 wrote:
| > Why do you think you need semantic indexing?
|
| Unfortunately I can only give an anecdotal answer here, but I
| get better results from Cursor than the alternatives. The
| semantic index is the main difference, so I assume that's
| what's giving it the edge.
| sqs wrote:
| Is it a very large codebase? Anything else distinctive
| about it? Are you often asking high-level/conceptual
| questions? Those are the questions that would help me
| understand why you might be seeing better results with RAG.
| 0x696C6961 wrote:
| I'll ask something like "where does X happen?" But "X"
| isn't mentioned anywhere in the code because the code is
| a complete nightmare.
| xpe wrote:
| Good point. I largely work in Zed -- looks like it had semantic
| search for a while but is working on a redesign
| https://github.com/zed-industries/zed/issues/9564
| Wowfunhappy wrote:
| > Use /clear to keep context focused
|
| The only problem is that this loss is permanent! As far as I can
| tell, there's no way to go back to the old conversation after a
| `/clear`.
|
| I had one session last week where Claude Code seemed to have
| become amazingly capable and was implementing entire new features
| and fixing bugs in one-shot, and then I ran `/clear` (by accident
| no less) and it suddenly became very dumb.
| jasonjmcghee wrote:
| They've worked to improve this with "memories" (hash symbol to
| "permanently" record something - you can edit later if you
| want).
|
| And there's CLAUDE.md. It's like cursorrules. You can also
| have it modify its own CLAUDE.md.
| zomglings wrote:
| You can ask it to store its current context to a file, review
| the file, ask it to emphasize or de-emphasize things based on
| your review, and then use `/clear`.
|
| Then, you can edit the file at your leisure if you want to.
|
| And when you want to load that context back in, ask it to read
| the file.
|
| Works better than `/compact`, and is a lot cheaper.
| Wowfunhappy wrote:
| Neat, thanks, I had no idea!
|
| Edit: It so happens I had a Claude Code session open in my
| Terminal, so I asked it: "Save your current context to a
| file."
|
| Claude produced a 91 line md file... surely that's not the
| whole of its context? This was a reasonably lengthy
| conversation in which the AI implemented a new feature.
| zomglings wrote:
| What is in the file?
| Wowfunhappy wrote:
| An overview of the project and the features implemented.
|
| Edit: Here's the actual file if you want to see it:
| https://gist.github.com/Wowfunhappy/e7e178136c47c2589cfa7e5a...
| zomglings wrote:
| Apologies for the late reply. My kids demanded my
| attention yesterday.
|
| It doesn't seem to have included any points on style or
| workflow in the context. Most of my context documents end
| up including the following information:
|
| 1. I want the agent to treat git commits as checkpoints
| so that we can revert really silly changes it makes.
|
| 2. I want it to keep on running build/tests on the code
| to be sure it isn't just going completely off the rails.
|
| 3. I want it to refrain from adding low signal comments
| to the code. And not use emojis.
|
| 4. I want it to be honest in its dealings with me.
|
| It goes on a bit from there. I suspect the reason that
| the models end up including that information in the
| context documents they dump in our sessions is that I
| give them such strong (and strongly worded) feedback on
| these topics.
|
| As an alternative, I wonder what would happen if you just
| told it what was missing from the context and asked it to
| re-dump the context to file.
| Wowfunhappy wrote:
| But none of this is really Claude Code's internal
| context, right? It's a summary. I could see using it as
| an alternative to /compact but not to undo a /clear.
|
| Whatever the internal state is of Claude Code, it's lost
| as soon as you /clear or close the Terminal window. You
| can't even experiment with a different prompt and then--
| if you don't like the prompt--go back to the original
| conversation, because pressing esc to branch the
| conversation loses the original branch.
| datavirtue wrote:
| Compared to my experience with the free GitHub Copilot in VS
| Code it sounds like you guys are in a horse and buggy.
| shmoogy wrote:
| I'm excited for the improvements they've had recently but I
| have better luck with Cline in regular vs code, as well as
| cursor.
|
| I've tried Claude code this week and I really didn't like
| it - Claude did an okay job but was insistent on deleting
| some shit and hard coding a check instead of an actual
| conditional. It got the feature done in about $3, but I
| didn't really like the user experience and it didn't feel
| any better than using 3.7 in cursor.
| andrewstuart wrote:
| I'm too scared of the cost to use this.
| xpe wrote:
| You can set spend limits
| https://docs.anthropic.com/en/api/rate-limits
| LADev wrote:
| This is so helpful!
| fallinditch wrote:
| I'm wondering how many of the techniques described in this
| blog post can be used in an IDE like Windsurf or Cursor with
| Claude Sonnet.
|
| My 2 cents on value for money and effectiveness of Claude vs
| Gemini for coding:
|
| I've been using Windsurf, VS Code and the new Firebase Studio.
| The Windsurf subscription allowance for $15 per month seems
| adequate for reasonable every day use. I find Claude Sonnet 3.7
| performs better for me than Gemini 2.5 pro experimental.
|
| I still like VS Code and its way of doing things, you can do a
| lot with the standard free plan.
|
| With Firebase Studio, my take is that it should be good for
| building and deploying simple things that don't require much
| developer handholding.
| flashgordon wrote:
| So I feel like a grandpa reading this. I gave Claude code a solid
| shot. Had some wins but costs started blowing up. I switched to
| Gemini AI where I only upload files I want it to work on and make
| sure to refactor often so modularity remains fairly high. It's an
| amazing experience. If this is any measure - I've been averaging
| about 5-6 "small features" per 10k tokens. And I totally suck at
| fe coding!! The other interesting aspect of doing it this way is
| being able to break up problems and concerns. For example in this
| case I _only_ worked on fe without any backend and fleshed it
| out before starting on a backend.
| xpe wrote:
| by fe the poster means FE (front-end)
| flashgordon wrote:
| Sorry yes. I should have clarified that.
| xpe wrote:
| Or uppercase would have cleared it up.
| neodypsis wrote:
| A combination that works nicely to solve bugs is: 1) have
| Gemini analyze the code and the problem, 2) ask it to create a
| prompt for Claude to fix the problem, 3) give Claude the
| markdown prompt and the code, 4) give Gemini the output from
| Claude to review, 5) repeat if necessary
| vessenes wrote:
| If you like this plan, you can do this from the command line:
|
| `aider --model gemini --architect --editor-model claude-3.7`
| and aider will take care of all the fiddly bits including git
| commits for you.
|
| right now `aider --model o3 --architect` has the highest
| rating on the Aider leaderboards, but it costs wayyy more
| than just --model gemini.
| neodypsis wrote:
| I like Gemini for "architect" roles, it has very good code
| recall (almost no hallucinations, or none lately), so it
| can successfully review code edits by Claude. I also find
| it useful to ground it with Google Search.
| flashgordon wrote:
| Damn that's interesting. How much of the code do you provide?
| I'm guessing when modularity is high you can give specific
| files.
| neodypsis wrote:
| Gemini's context is very long, so I can feed it full files.
| I do the same with Claude, but I may need to start from
| scratch various times, so Gemini serves as memory (and is
| also good that Gemini has almost no hallucinations, so it's
| great as a code reviewer for Claude's edits).
| simonw wrote:
| The "ultrathink" thing is pretty funny:
|
| > We recommend using the word "think" to trigger extended
| thinking mode, which gives Claude additional computation time to
| evaluate alternatives more thoroughly. These specific phrases are
| mapped directly to increasing levels of thinking budget in the
| system: "think" < "think hard" < "think harder" < "ultrathink."
| Each level allocates progressively more thinking budget for
| Claude to use.
|
| I had a poke around and it's not a feature of the Claude model,
| it's specific to Claude Code. There's a "megathink" option too -
| it uses code that looks like this:
|
|     let B = W.message.content.toLowerCase();
|     if (
|       B.includes("think harder") ||
|       B.includes("think intensely") ||
|       B.includes("think longer") ||
|       B.includes("think really hard") ||
|       B.includes("think super hard") ||
|       B.includes("think very hard") ||
|       B.includes("ultrathink")
|     )
|       return (
|         l1("tengu_thinking", { tokenCount: 31999, messageId: Z, provider: G }),
|         31999
|       );
|     if (
|       B.includes("think about it") ||
|       B.includes("think a lot") ||
|       B.includes("think deeply") ||
|       B.includes("think hard") ||
|       B.includes("think more") ||
|       B.includes("megathink")
|     )
|       return (
|         l1("tengu_thinking", { tokenCount: 1e4, messageId: Z, provider: G }),
|         1e4
|       );
|
| Notes on how I found that here:
| https://simonwillison.net/2025/Apr/19/claude-code-best-pract...
| orojackson wrote:
| Not gonna lie: the "ultrathink" keyword that Sonnet 3.7 with
| thinking tokens watches for gives me "doubleplusgood" vibes in
| a hilarious but horrifying way.
| 4b11b4 wrote:
| At this point should we get our first knob/slider on a
| language model... THINK
|
| ..as if we're operating this machine as analog synth
| soulofmischief wrote:
| There are already many such adjustable parameters such as
| temperature and top_k
| ljm wrote:
| Maybe a Turbo Think button that toggles between Ultrathink
| and Megathink.
| antonvs wrote:
| If you use any of the more direct API sandbox/studio UIs,
| there are already various sliders, temperature (essentially
| randomness vs. predictability) being the most common.
|
| The consumer-facing chatbot interfaces just hide all that
| because they're aiming for a non-technical audience.
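Conceptually, temperature and top-k are transforms applied to the model's output logits before sampling a token. This is an illustrative plain-Python sketch of that idea (not any particular vendor's API; the function name and signature are made up for the example):

```python
import math
import random

def sample_with_knobs(logits, temperature=1.0, top_k=None):
    """Illustrative sketch: divide logits by temperature, optionally
    keep only the top_k highest-scoring tokens, then sample from the
    renormalized distribution. Lower temperature -> more predictable;
    smaller top_k -> fewer candidate tokens."""
    scaled = [l / temperature for l in logits]
    if top_k is not None:
        # Mask out everything below the top_k-th highest logit.
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [s if s >= cutoff else float("-inf") for s in scaled]
    # Numerically stable softmax over the surviving logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs)[0]
```

With `top_k=1` or a near-zero temperature this collapses to greedy decoding (always the most likely token), which is why those sliders make chatbot output feel more or less "creative".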
| ______ wrote:
| I use a cheap MIDI controller in this manner - there is
| even native browser support. Great to get immediate
| feedback on parameter tweaks
| coffeebeqn wrote:
| A little bit of the old ultrathink with the boys
| stavros wrote:
| _Shot to everyone around a table, thinking furiously over
| their glasses of milk_
| pyfon wrote:
| Weird code to have in a modern AI system!
|
| Also 14 string scans seems a little inefficient!
| Aurornis wrote:
| 14 checks through a string is entirely negligible relative to
| the amount of compute happening. Like a drop of water in the
| ocean.
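A rough back-of-the-envelope sketch supports this (plain Python; the phrase list mirrors the decompiled check quoted upthread, and timings are machine-dependent): a full linear scan of substring checks over a short prompt takes on the order of a microsecond, against seconds of model inference.

```python
import timeit

# Roughly the phrases from the decompiled Claude Code snippet.
PHRASES = [
    "think harder", "think intensely", "think longer",
    "think really hard", "think super hard", "think very hard",
    "ultrathink", "think about it", "think a lot",
    "think deeply", "think hard", "think more", "megathink",
]

def scan(message):
    # Same shape as the decompiled check: a linear pass of
    # substring searches over the lowercased prompt.
    return any(p in message.lower() for p in PHRASES)

prompt = "please ultrathink about refactoring this module " * 4
# 10,000 scans complete in well under a second on typical hardware.
elapsed = timeit.timeit(lambda: scan(prompt), number=10_000)
```

The per-call cost is dwarfed by a single forward pass of the model, so the inefficiency is real but irrelevant here.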
| bombela wrote:
| Everybody says this all the time. But it compounds. And
| then our computers struggle with what should be basic
| websites.
| anotherpaulg wrote:
| In aider, instead of "ultrathink" you would say:
| /thinking-tokens 32k
|
| Or, shorthand: /think 32k
| westoncb wrote:
| That's awesome, and almost certainly an Unreal Tournament
| reference (when you chain enough kills in short time it moves
| through a progression that includes "megakill" and
| "ultrakill").
| orojackson wrote:
| If they did, they left out the best one: "m-m-m-m-
| monsterkill"
|
| Surely Anthropic could do a better job implementing dynamic
| thinking token budgets.
| Quarrel wrote:
| Ultrakill is from Quake :)
| mr-karan wrote:
| What I don't like about Claude Code is why can't they give
| command line flags for this stuff? It's better documented and
| people don't have to discover this the hard way.
|
| Similarly, I do miss an --add command line flag to manual
| specify the context (files) during the session. Right now I
| pretty much end up copy pasting the relative paths from VSCode
| and supply to Claude. Aider has much better semantics for such
| stuff.
| gdudeman wrote:
| Maybe I'm not getting this, but you can tab to autocomplete
| file paths.
|
| You can use English or --add if you want to tell Claude to
| reference them.
| nulld3v wrote:
| Waiting until I can tell it to use "galaxy brain".
| NiloCK wrote:
| Slightly shameless, but easier than typing a longer reply.
|
| https://www.paritybits.me/think-toggles-are-dumb/
|
| https://nilock.github.io/autothink/
|
| LLMs with broad contextual capabilities shouldn't need to be
| guided in this manner. Claude can tell a trivial task from a
| complex one just as easily as I can, and should self-adjust, up
| to thresholds of compute spending, etc.
| adastra22 wrote:
| "think hard with a vengeance"
| panny wrote:
| >Use Claude to interact with git
|
| Are they saying Claude needs to do the git interaction in order
| to work and/or will generate better code if it does?
| sagarpatil wrote:
| It doesn't need to. It's optional.
| panny wrote:
| I don't see how this is a best practice then. It seems like
| they are saying "Spend money on something easy to do, but can
| be catastrophic if the AI screws it up."
| jwr wrote:
| I use Claude Code. I read the discussion here, and given the
| criticism, proceeded to try some of the other solutions that
| people recommended.
|
| After spending a couple of hours trying to get aider and plandex
| to run (and then with Google Gemini 2.5 pro), my conclusion is
| that these tools have a _long_ way to go until they are usable.
| The breakage is all over the place. Sure, there is promise, but
| today I simply can't get them to work reasonably. And my time is
| expensive.
|
| Claude Code just works. I run it (even in a slightly unsupported
| way, in a Docker container on my mac) and it works. It does
| stuff.
|
| PS: what is it with all "modern" tools asking you to "curl
| somewhere.com/somescript.sh | bash". Seriously? Ship it in a
| docker container if you can't manage your dependencies.
| beefnugs wrote:
| Isn't this bad that every model company is making their own
| version of the IDE level tool?
|
| Wasn't it clearly bad when facebook would get real close to
| buying another company... then decide naw, we got developers out
| the ass lets just steal the idea and put them out of business
| bob1029 wrote:
| I've developed a new mental model of the LLM codebase automation
| solutions. These are effectively identical to outsourcing your
| product to someone like Infosys. From an information theory
| perspective, you need to communicate approximately the same
| amount of things in either case.
|
| Tweaking claude.md files until the desired result is achieved is
| similar to a back and forth email chain with the contractor. The
| difference being that the contractor can be held accountable in
| our human legal system and can be made to follow their "prompt"
| very strictly. The LLM has its own advantages, but they seem to
| be a subset since the human contractor can also utilize an LLM.
|
| Those who get a lot of uplift out of the models are almost
| certainly using them in a cybernetic manner wherein the model is
| an integral part of an expert's thinking loop regarding the
| program/problem. Defining a pile of policies and having the LLM
| apply them to a codebase automatically is a significantly less
| impactful use of the technology than having a skilled human
| developer leverage it for immediate questions and code snippets
| as part of their normal iterative development flow.
|
| If you've got so much code that you need to automate eyeballs
| over it, you are probably in a death spiral already. The LLM
| doesn't care about the terrain warnings. It can't "pull up".
| stepbeek wrote:
| This matches well with my experience so far. It's why the chat
| interface has remained my preference over autocomplete in an
| IDE.
| ixaxaar wrote:
| > These are effectively identical to outsourcing your product
| to someone like Infosys.
|
| But in my experience, the user has to be better than an Infosys
| employee to know how to convey the task to the LLM and then
| verify iteratively.
|
| So more like an experienced engineer outsourcing work to a
| service-company engineer.
| anamexis wrote:
| That's exactly what they were saying.
| charlie0 wrote:
| The benefit of doing it like this is that I also get to learn
| from the LLM. It will surprise me from time to time about
| things I didn't know and it gives me a chance to learn and get
| better as well.
| Terretta wrote:
| We, mere humans, communicate our needs poorly, and
| undervisualize until we see concrete results. This is the state
| of us.
|
| Faced with us as a client, the LLM has infinite patience at
| linear but marginal cost (relative to your thinking/design time
| cost, and the value of instant iteration as you realize what
| you meant to picture and say).
|
| With offshoring, telling them they're getting it wrong is not
| just horrifically slow thanks to comms and comprehension
| latency, it makes you a problem client, until soon you'll find
| the do-over cost becomes neither linear nor marginal.
|
| Don't sleep on the power of small fast iterations (not vibes,
| concrete iterations), with an LLM tool that commits as you go
| and can roll back both code and mental model when you're down a
| garden path.
| highfrequency wrote:
| Intriguing perspective! Could you elaborate on this with
| another paragraph or two?
|
| > We humans undervisualize until we see concrete results.
| Terretta wrote:
| > > _We humans undervisualize until we see concrete
| results._
|
| > _Could you elaborate on this with another paragraph or
| two?_
|
| Volunteer as a client-facing PdM at a digital agency for a
| week*, you'll be able to elaborate with a book.
|
| * Well, long enough to try to iterate a client instruction
| based deliverable.
| imafish wrote:
| Why do people use Claude Code over e.g. Cursor or Windsurf?
| submeta wrote:
| I love Claude Code. It just gets the job done where Cursor (even
| with Claude Sonnet 3.7) will get lost in changing files without
| results.
|
| Did anyone have equal results with the "unofficial" fork "Anon
| Kode"? Or with Roo Code with Gemini Pro 2.5?
| kkukshtel wrote:
| I recently wrote a big blog post on my experience spending about
| $200 with Claude Code to "vibecode" some major feature
| enhancements for my image gallery site mood.site
|
| https://kylekukshtel.com/vibecoding-claude-code-cline-sonnet...
|
| Would definitely recommend reading it for some insight into
| hands-on experience with the tool.
___________________________________________________________________
(page generated 2025-04-20 23:02 UTC)