[HN Gopher] Token growth indicates future AI spend per dev
       ___________________________________________________________________
        
       Token growth indicates future AI spend per dev
        
       Author : twapi
       Score  : 161 points
       Date   : 2025-08-11 17:59 UTC (5 hours ago)
        
 (HTM) web link (blog.kilocode.ai)
 (TXT) w3m dump (blog.kilocode.ai)
        
       | senko wrote:
       | tl;dr
       | 
       | > This is driven by two developments: more parallel agents and
       | more work done before human feedback is needed.
        
       | throwanem wrote:
       | "Tokenomics."
        
         | TranquilMarmot wrote:
         | I studied this in college but I think we had a different idea
         | of what "toke" means
        
           | throwanem wrote:
            | Eh. The implicit claim is the same as everywhere, namely
            | that $100k/dev/year of AI opex is an enormous bargain over
           | going up two orders of magnitude in capex to pay for the same
           | output from a year's worth of a team. But now that Section
           | 174's back and clearly set to stay for a good long while, it
           | makes sense to see this line of discourse come along.
        
       | jeanlucas wrote:
        | So convenient that a future AI dev will cost as much as a human
        | developer. Pure coincidence.
        
         | naiv wrote:
          | But it works 24/7, at maybe 20x the output.
        
           | crinkly wrote:
            | Like fuck that's happening. A human dev will spend the
            | entire day gaslighting an electronic moron rather than an
            | outsourced team.
           | 
           | The only argument we have so far is wild extrapolation and
           | faith. The burden of proof is on the proclaimer.
        
           | nicce wrote:
            | Why can't we keep the current jobs but accelerate humanity's
            | development by more than 20x with AI? Everyone is just
            | talking about replacement, without mentioning the potential.
        
             | buzzerbetrayed wrote:
             | I'm not entirely sure I understand exactly what you're
             | suggesting. But I'd imagine it's because a company that
              | doesn't have to pay people will outcompete the company
             | that does.
             | 
             | There could be some scenario where it is advantageous to
             | have humans working with AI. But if that isn't how reality
             | plays out then companies won't be able to afford to pay
             | people.
        
             | hx8 wrote:
             | I don't think there is market demand for 20x more software
             | produced each year. I suspect AI will actively _decrease_
             | demand for several major sectors of software development,
              | as LLMs take over roles that were previously handled by
             | independent applications.
        
               | taftster wrote:
               | Right. This is insightful. It's not so much about
               | replacing developers, per se. It's about replacing
               | applications that developers were previously employed to
               | create/maintain.
               | 
               | We talk about AI replacing a workforce, but your
               | observation that it's more about replacing applications
               | is spot on. That's definitely going to be the trend,
               | especially for traditional back-office processing.
        
               | hx8 wrote:
               | I'm specifically commenting on the double whammy of
               | increased software developer productivity and decreased
               | demand for independent applications.
        
               | nicce wrote:
               | I think it depends on how you view it. With 20x
               | productivity you can start to minimize your supply chain
               | and reduce costs in the long term. No more cloud usage in
               | foreign countries, since you might be able to make the
               | necessary software by yourself. You can start dropping
                | expensive SaaS and make enough for your own internal
                | needs. Heck, this would just increase demand because
                | there is so much potential. Consultants and third-party
                | software houses will likely decrease, unless they are
                | even more efficient.
               | 
                | LLMs act as interfaces to applications which you are
                | now capable of building yourself and running on your own
                | hardware, since you are much more capable.
        
               | LtWorf wrote:
                | > and third-party software houses will likely decrease,
                | unless they are even more efficient.
               | 
                | It's going to be really fun for those of us who love to
                | write Unicode symbols into numeric input boxes and such
                | funny things.
        
               | oblio wrote:
               | That's not how this works:
               | 
               | https://en.m.wikipedia.org/wiki/Jevons_paradox
        
               | hx8 wrote:
                | Jevons paradox isn't a hard rule. There are plenty of
                | situations where a resource becomes cheaper and the
               | overall market decreases in total value.
        
             | dsign wrote:
             | There is great potential. But if humanity can't share a
             | loaf of bread with the needy, nor stop the blood irrigation
             | of the cracked, dusty soil of cursed Canaan[^1], what are
             | the odds that that acceleration will benefit anybody?
             | 
             | ([^1]: They have been at it for a long while now, a few
             | thousand years?)
        
           | croes wrote:
           | And is neither reliable nor liable.
        
           | SpaceNoodled wrote:
           | An LLM by itself has 0% output.
           | 
           | An engineer shackled to an LLM has about 80% output.
        
         | jgalt212 wrote:
          | It's sort of like how high-cost funds net of fees offer the
          | same returns as low-cost ETFs net of fees.
        
           | oblio wrote:
           | I'm not sure I understand this one.
        
         | magicalhippo wrote:
          | Similar to housing in attractive places, no? Price is related to
         | what people can afford, rather than what the actual house/unit
         | is worth in terms of material and labor.
        
           | maratc wrote:
            | Besides "material and labor" there is the additional cost
            | of land.
           | 
           | That is already "related to what people can afford", in
           | attractive places or not.
        
         | thisisit wrote:
          | This is just a ballpark number. It's like the AI dev will cost
          | somewhat less than a human developer. Enough for AI providers
          | to have huge margins and allow CTOs to say: "I replaced all
          | devs and saved so much money".
        
           | mattmanser wrote:
           | And then the CTOs will learn the truth that most product
           | managers are just glorified admin assistants who couldn't
           | write a spec for tic-tac-toe.
           | 
           | And that to write the business analysis that the AI can
           | actually turn into working code requires senior developers.
        
         | eli_gottlieb wrote:
         | I mean, hey, rather than use AI at work, I'll just take the
         | extra $100k/year and be _just that good_.
        
         | insane_dreamer wrote:
         | The full cost of an employee is a fair bit more than just their
         | base salary.
        
           | SoftTalker wrote:
           | Wait until the taxes on AI come, to pay for all the
           | unemployment they are creating.
        
       | boltzmann_ wrote:
        | The author just chose a nice number and gives no argument for
        | it.
        
         | mromanuk wrote:
         | Probably chose $100k/yr as an example of the salary of a
         | developer.
        
       | g42gregory wrote:
       | This is why it's so critical to have open source models.
       | 
       | In a year or so, the open source models will become good enough
       | (in both quality and speed) to run locally.
       | 
       | Arguably, OpenAI OSS 120B is already good enough, in both quality
       | and speed, to run on Mac Studio.
       | 
       | Then $10k, amortized over 3 years, will be enough to run code
       | LLMs 24/7.
       | 
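        | A rough amortization sketch (hardware price assumed, Python):
        | 
        |     hardware = 10_000         # e.g. a Mac Studio class machine, $
        |     months = 3 * 12           # 3-year amortization
        |     print(hardware / months)  # ~$278/month for 24/7 local use
        | 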
       | I hope that's the future.
        
         | okdood64 wrote:
          | What's the performance of running OpenAI OSS 120B on a Mac
          | Studio compared to a paid subscription frontier LLM?
        
           | andrewmcwatters wrote:
            | Chiming in here: an M1 Max MacBook Pro 64GB running
            | gpt-oss:20b via Ollama in Visual Studio Code with GitHub
            | Copilot is unusably slow compared to using Claude Sonnet 4,
            | which requires (I think?) GitHub Copilot Pro.
           | 
           | But I'm happy to pay the subscription vs buying a Mac Studio
           | for now.
        
             | Jimpulse wrote:
             | Ollama's implementation for gpt-oss is poor.
        
           | jermaustin1 wrote:
           | I will answer for the 20B version on my RTX3090 for anyone
           | who is interested (SUPER happy with the quality it outputs,
           | as well). I've had it write a handful of HTML/CSS/JS SPAs
           | already.
           | 
           | With medium and high reasoning, I will see between 60 and 120
           | tokens per second, which is outrageous compared to the LLaMa
           | models I was running before (20-40tps - I'm sure I could have
           | adjusted parameters somewhere in there).
        
             | ivape wrote:
             | Do we know why it's so fast barring hardware?
        
               | mattmanser wrote:
               | Because he's getting crap output. Open source locally on
               | something that under-powered is vastly worse than paid
               | LLMs.
               | 
                | I'm no shill, I'm fairly skeptical about AI, but I've
                | been doing a lot of research and playing around to see
                | what I'm missing.
                | 
                | I haven't bothered running anything locally as the
                | overwhelming consensus is that it's just not good enough
                | yet. And that's from posts and videos in the last two
                | weeks.
               | 
               | I've not seen something so positive about local LLMs
               | anywhere else.
               | 
                | It's simply not there yet, and it definitely isn't there
                | for a 4090.
        
               | ivape wrote:
               | I guess I meant how is a 20b param model simply faster
               | than another 20b model? What techniques are they using?
        
               | medvezhenok wrote:
               | It's a MoE (mixture of experts) architecture, which means
               | that there's only 3.6 billion parameters activated per
               | token (but a total of 20b parameters for the model). So
               | it should run at the same speed that a 3.6b model would
               | run assuming that all of the parameters fit in vRAM.
               | 
               | Generally, 20b MoE will run faster but be less smart than
               | a 20b dense model. In terms of "intelligence" the rule of
               | thumb is the geometric mean between the number of active
               | parameters and the number of total parameters.
               | 
               | So a 20b model with 3.6b active (like the small gpt-oss)
               | should be roughly comparable in terms of output quality
               | to a sqrt(3.6*20) = 8.5b parameter model, but run with
               | the speed of a 3.6b model.
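                | 
                | A quick way to check that rule of thumb (illustrative
                | Python, numbers as above):
                | 
                |     import math
                |     total = 20e9    # total parameters in the MoE model
                |     active = 3.6e9  # parameters activated per token
                |     # geometric mean of active and total params
                |     print(math.sqrt(active * total) / 1e9)  # ~8.5 (B)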
        
               | jermaustin1 wrote:
               | That is a bit harsh. I'm actually quite pleased with the
               | code it is outputting currently.
               | 
               | I'm not saying it is anywhere close to a paid foundation
               | model, but the code it is outputting (albeit simple) has
               | been generally well written and works. I do only get a
               | handful of those high-thought responses before the 50k
               | token window starts to delete stuff, though.
        
         | mockingloris wrote:
          | Most devs where I'm from would struggle to cough up that
          | amount.
          | 
          | More niche use-case models have to be developed for cheaper,
          | energy-optimized hardware.
         | 
         | +-- Dey well
        
           | skybrian wrote:
           | This would be a business expense. Compared to hiring a
           | developer for a year, it would be more reasonable.
           | 
           | For a short-term gig, though, I don't think they would do
           | that.
        
         | asadm wrote:
          | Even if they do get better, the latest closed-source
          | {gemini|anthropic|openai} model will always be insanely good,
          | and it would be dumb to use a local one from 3 years back.
          | 
          | Also, tooling: you can use aider, which is OK. But Claude Code
          | and Gemini CLI will always be superior and will only work
          | correctly with their respective models.
        
           | SparkyMcUnicorn wrote:
           | I use Claude Code with other models sometimes.
           | 
           | For well defined tasks that Claude creates, I'll pass off
           | execution to a locally run model (running in another Claude
           | Code instance) and it works just fine. Not for every task,
           | but more than you might think.
        
           | asgraham wrote:
           | I don't know about your first point: at some point the three-
           | year difference may not be worth the premium, as local models
           | reach "good enough."
           | 
           | But the second point seems even less likely to be true: why
            | will Claude Code and Gemini CLI _always_ be superior? Other
           | than advantageous token prices (which the people willing to
           | pay the aforementioned premium shouldn't even care about),
           | what do they inherently have over third-party tooling?
        
             | nickstinemates wrote:
             | Even using Claude Code vs. something like Crush yields
             | drastically different results. Same model, same prompt,
             | same cost... the agent is a huge differentiator, which
             | surprised me.
        
               | asgraham wrote:
               | I totally agree that the agent is essential, and that
               | right now Claude Code is semi-unanimously the best agent.
               | But agentic tooling is written, not trained (as far as I
               | can tell--someone correct me) so it's not immediately
               | obvious to me that a third-party couldn't eventually do
               | it better.
               | 
               | Maybe to answer my own question, LLM developers have one,
               | potentially two advantages over third-party tooling
               | developers: 1) virtually unlimited tokens, zero rate
               | limiting with which to play around with tooling dev. 2)
               | the opportunity to train the network on their own
               | tooling.
               | 
               | The first advantage is theoretically mitigated by insane
               | VC funding, but will probably always be a problem for
               | OSS.
               | 
                | I'm probably overlooking news that the second advantage
                | is where Anthropic is winning right now; I don't have
                | intuition for how this advantage will change over time.
        
         | hoppp wrote:
          | I am looking forward to the AMD Ryzen AI Max+ 395 PCs coming
          | down in price.
          | 
          | Local inference speed will be acceptable in 5-10 years thanks
          | to those generations of chips, and finally we can have good
          | local AI apps.
        
         | skybrian wrote:
         | Open source models could be run by low-cost cloud providers,
         | too. They could offer discounts for a long term contract and
         | run it on dedicated hardware.
        
           | qingcharles wrote:
           | This. Your local LLM, even if shared between a pool of devs,
           | is probably only going to be working 8 hours a day. Better to
           | use a cloud provider, especially if you can find a way to
           | ensure data security, if that is an issue for you.
        
           | wongarsu wrote:
           | Exactly. There is no shortage of providers hosting open
           | source models with per-token pricing, with a variety of
           | speeds and context sizes at different price points.
            | Competition is strong and barriers to entry low, ensuring
           | that margins stay low and prices fair.
           | 
           | If you want complete control over your data and don't trust
           | anyone's assurances that they keep it private (and why should
           | you) then you have to self-host. But if all you care about is
           | a good price then the free market already provides that for
            | open models.
        
           | hkt wrote:
            | Hetzner and Scaleway already do instances with GPUs, so this
            | kinda already exists.
        
             | hkt wrote:
             | In fact, does anybody want to hire a server with me? I
              | suspect it'll work out cheaper than Claude Max etc.: a
              | server from Hetzner starts at £220ish:
              | https://www.hetzner.com/dedicated-rootserver/matrix-gpu/
             | 
             | It might be fun to work out how to share, too. A whole new
             | breed of shell hosting.
        
         | aydyn wrote:
         | This is unrealistic hopium, and deep down you probably know it.
         | 
         | There's no such thing as models that are "good enough". There
         | are models that are better and models that are worse and OS
         | models will always be worse. Businesses that use better, more
         | expensive models will be more successful.
        
           | hsuduebc2 wrote:
            | I agree. It isn't in the interest of any actor, including
            | OpenAI, to give out their tools for free.
        
           | seabrookmx wrote:
           | Most tech hits a point of diminishing returns.
           | 
           | I don't think we're there yet, but it's reasonable to expect
           | at _some point_ your typical OS model could be 98% of the way
           | to a cutting edge commercial model, and at that point your
           | last sentence probably doesn't hold true.
        
           | ch4s3 wrote:
           | > Businesses that use better, more expensive models will be
           | more successful.
           | 
            | Better back-of-house tech can differentiate you, but startup
            | history is littered with failed companies using the best
            | tech, and they were often beaten by companies using a worse-
            | is-better approach. Anyone here who has been around long
           | enough has seen this play out a number of times.
        
             | freedomben wrote:
              | > _startup history is littered with failed companies using
              | the best tech, and they were often beaten by companies
              | using a worse-is-better approach._
             | 
             | Indeed. In my idealistic youth I bought heavily into the
             | "if you build it, they will come," but that turned out to
              | not at all be reality. Oftentimes the best product loses
              | because of marketing, network effects, or some other
              | reason that has nothing to do with the tech. I wish it
              | weren't that way, but if wishes were fishes we'd all have
              | a fry.
        
           | Fade_Dance wrote:
            | There is a sweet spot, and at $100k per dev per year some
            | businesses may choose lower-priced options.
           | 
           | The business itself will also massively develop in the coming
           | years. For example, there will be dozens of providers for
           | integrating open source models with an in-house AI framework
           | that smoothly works with their stack and deployment solution.
        
         | root_axis wrote:
         | > _In a year or so, the open source models will become good
         | enough (in both quality and speed) to run locally._
         | 
         | "Good enough" for what is the question. You can already run
         | them locally, the problem is that they aren't really practical
         | for the use-cases we see with SOTA models, which are _just_ now
         | becoming passable as semi-reliable autonomous agents. There is
          | no hope of running anything like today's SOTA models locally
         | in the next decade.
        
           | cyanydeez wrote:
           | they might be passable, but there's zero chance they're
           | economical atm.
        
         | holoduke wrote:
          | The problem is that running an LLM locally really eats all
          | your resources. I tried it, but the whole system becomes
          | unresponsive and slow. We need a minimum of 1 TB of memory and
          | dedicated processors to offload to.
        
         | cyanydeez wrote:
          | It's not; capitalism isn't about efficiency, it's about
          | lock-in. You can't lock in open source models. If fascism under
          | Republicans continues, you can bet they'll be shut down due to
          | child safety or whatever excuse the large corporations need to
          | turn off the free efficiency.
        
         | 6thbit wrote:
          | Many of the larger enterprises (retail, manufacturing,
          | insurance, etc.) are consistently becoming cloud-only or have
          | reduced their data center footprint massively over the last 10
          | years.
         | 
         | Do you think these enterprises will begin hosting their own
         | models? I'm not convinced they'll join the capex race to build
         | AI data centers. It would make more sense they just end up
         | consuming existing services.
         | 
          | Then there are the smaller startups that just never had their
          | own data center. Are those going to start self-hosting AI
          | models, and take on all of the related requirements to let,
          | say, a few hundred employees access a local service at once?
          | Network, HA, upgrades, etc. And say you have multiple offices
          | in different countries too, and so on.
        
           | g42gregory wrote:
           | Enterprises (depending on the sector, think semi
           | manufacturing) will have no choice for two reasons:
           | 
           | 1. Protecting their intellectual property, and
           | 
            | 2. Unknown "safety" constraints baked in. Imagine an engineer
            | unable to run some security tests because the LLM thinks
            | they're "unsafe". Meanwhile, the VP of Sales is on the line
            | with the customer.
        
           | nunez wrote:
           | > Do you think these enterprises will begin hosting their own
           | models? I'm not convinced they'll join the capex race to
           | build AI data centers. It would make more sense they just end
           | up consuming existing services.
           | 
            | They already are.
        
           | physicsguy wrote:
            | > manufacturing
           | 
            | They're much less strict about cloud than they used to be,
            | but the security practices are still quite strict. I work in
           | sector and yes, they'll allow cloud, but strong data
           | isolation + segregation, access controls, networking reqs,
           | etc. etc. etc. are very much a thing in the industry still,
           | particularly where the production process is commercially
           | sensitive in itself.
        
         | moritzwarhier wrote:
         | After trying gpt-oss:20b, I'm starting to lose faith in this
         | argument, but I share your hope.
         | 
         | Also, I've never tried really huge local models and especially
         | not RAG with local models.
        
         | coldtea wrote:
         | > _In a year or so, the open source models will become good
         | enough (in both quality and speed) to run locally._
         | 
         | Based on what?
         | 
         | And where? On systems < 48GB?
        
         | habosa wrote:
         | Every business building on LLMs should also have a contingency
         | plan for if they needed to go to an all open-weights model
         | strategy. OpenAI / Anthropic / Google have nothing stopping
         | them from 100x-ing the price or limiting access or dropping old
         | models or outright competing with their customers. Building
         | your whole business on top of them will prove to be as foolish
         | as all of the media companies that built on top of Facebook and
         | got crushed later.
        
           | OfficialTurkey wrote:
           | Couldn't you also make this argument about cloud
           | infrastructure from the standard hyperscaler cloud providers
           | (AWS, GCP, ...)? For that matter, couldn't you make this
            | argument about any dependency your business has which it
            | purchases from other businesses that are competing against
            | each other to provide it?
        
             | empiko wrote:
             | In general, you are right, but AI as a field is pretty
             | volatile still. Token producers are still pivoting and are
             | generally losing money. They will have to change their
             | strategy sooner or later, and there is a good chance that
             | the users will not be happy about it.
        
           | ivape wrote:
           | _OpenAI / Anthropic / Google have nothing stopping them from
           | 100x-ing the price_
           | 
           | There is also nothing stopping this silly world from breaking
           | out into a dispute where chips are embargoed. Then we'll have
           | high API prices and hardware prices (if there's any hardware
            | at all). Even for the individual it's worth having that
            | $2-3k AI machine around, perhaps two.
        
         | DiabloD3 wrote:
          | Why bother mentioning this model? From what I've seen, it only
          | excels at benchmarks. Qwen3 is sorta where it's at right now;
          | Qwen3-Coder is pretty much at "summer intern" level for coding
          | tasks, and it's ahead of the rest.
          | 
          | Shame anyone is actually _paying_ for commercial inference;
          | it's worse than whatever you can do locally.
        
       | typs wrote:
       | This makes sense as long as people continue to value using the
       | best models (which may or may not continue for lots of reasons).
       | 
        | I'm not entirely sure that AI companies like Cursor necessarily
        | miscalculated, though. Notably, the strategies the blog
        | advertises are things already used by tools like Cursor (via
        | auto mode). The bet for them is that they can successfully push
        | users towards their auto mode, use the usage data to improve
        | their routing, and that frontier models don't continue to be so
        | much better AND so expensive that users keep demanding them. I
        | wouldn't hate that bet if I were Cursor, personally.
        
       | sovietmudkipz wrote:
       | What is everyone's favorite parallel agent stack?
       | 
        | I've just become comfortable using GH Copilot in agent mode, but
        | I haven't started letting it work in an isolated way in parallel
        | to me. Any advice on getting started?
        
       | hx8 wrote:
       | How many parallel agents can one developer actively keep up with?
       | Right now, my number seems to be about 3-5 tasks, if I review the
       | output.
       | 
        | If we assume 5 tasks, each running $400/mo of tokens, we reach an
        | annual bill of $24,000. We would have to see a roughly 4x
        | increase in token spend to reach the $100,000/yr mark. This seems
        | possible with increased context sizes. Additionally, larger
        | context sizes might enable longer-running, more complicated
        | tasks, which would increase my number of parallel tasks.
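        | 
        | Back-of-the-envelope (my assumptions, in Python):
        | 
        |     tasks = 5                  # parallel agents I can review
        |     per_task_month = 400       # $/month of tokens per task
        |     annual = tasks * per_task_month * 12
        |     print(annual)              # 24000
        |     print(100_000 / annual)    # ~4.2x growth to hit $100k/yr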
        
       | mockingloris wrote:
        | Doesn't this segue into [We'll need a universal basic income
        | (UBI) in an AI-driven world]?
        | https://news.ycombinator.com/item?id=44866518#44866713
       | 
       | +-- Yarn me
        
         | throaway920181 wrote:
         | What does "Dey well" and "Yarn me" mean at the bottom of your
         | comments?
        
           | mockingloris wrote:
            | They are Nigerian Pidgin English words:
            | 
            |     - Dey well: Be well
            |     - Yarn me: Let's talk
           | 
           | +-- Dey well/Be well
        
             | nmeofthestate wrote:
             | Ok, don't do that.
        
             | SoftTalker wrote:
             | Please don't use signature lines in HN comments.
             | 
             | Edit: Would have sworn that this was in the guidelines but
             | I don't see it just now.
        
       | AtNightWeCode wrote:
        | Don't know about the numbers, but is this not the cloud all over
        | again? Promises of cheap storage that you don't have to maintain
        | turned into maintenance hell, with storage costs steadily rising
        | instead of dropping.
        
       | masterj wrote:
       | Why even stop at 100k/yr? Surely the graph is up-and-to-the-right
       | forever? https://xkcd.com/605/
        
       | jjcm wrote:
        | At some point remote inference becomes more expensive than just
        | buying the hardware locally, even for server-grade components. A
        | GB200 is ~$60-70k and will run for multiple
       | years. If inference costs continue to scale, at some point it
       | just makes more sense to run even the largest models locally.
       | 
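        | Sketch of that crossover (hardware price and lifetime are
        | assumptions):
        | 
        |     gb200 = 65_000             # rough midpoint price, $
        |     years = 3                  # assumed useful life
        |     print(gb200 / years)       # ~$21.7k/yr vs $100k/yr tokens
        | 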
       | OSS models are only ~1 year behind SOTA proprietary, and we're
       | already approaching a point where models are "good enough" for
       | most usage. Where we're seeing advancements is more in tool
       | calling, agentic frameworks, and thinking loops, all of which are
       | independent of the base model. It's very likely that local,
       | continuous thinking on an OSS model is the future.
        
         | tempest_ wrote:
          | Maybe $60-70k nominally, but where can you get one that isn't
          | in its entire rack configuration?
        
           | jjcm wrote:
           | Fair, but even if you budget an additional $30k for a self-
           | contained small-unit order, you've brought yourself to the
           | equivalent proposed spend of 1 year of inference.
           | 
            | My point is that at $100k/yr/eng inference spend, your
            | options widen greatly.
        
       | whateveracct wrote:
       | This is the goal. Create a reason to shave a bunch off the top of
       | SWE salaries. Pay them less because you "have" to pay for AI
       | tools. All so they don't have to do easy rote work - you still
       | get them to do the high level stuff humans must do.
        
       | turnsout wrote:
       | Tools like Cursor rely on the gym model--plenty of people will
       | pay for a tier that they don't fully utilize. The heavy users are
       | subsidized by the majority who may go months without using the
       | tool.
        
       | crestfallen33 wrote:
       | I'm not sure where the author gets the $100k number, but I agree
       | that Cursor and Claude Code have obfuscated the true cost of
       | intelligence. Tools like Cline and its forks (Roo Code, Kilo
       | Code) have shown what unmitigated inference can actually deliver.
       | 
       | The irony is that Kilo itself is playing the same game they're
       | criticizing. They're burning cash on free credits (with expiry
       | dates) and paid marketing to grab market share -- essentially
       | subsidizing inference just like Cursor, just with VC money
       | instead of subscription revenue.
       | 
       | The author is right that the "$20 - $200" subscription model is
       | broken. But Kilo's approach of giving away $100+ in credits isn't
       | sustainable either. Eventually, everyone has to face the same
       | reality: frontier model inference is expensive, and someone has
       | to pay for it.
        
         | fragmede wrote:
         | Also frontier model training is expensive, and at some point,
         | eventually, that bill also needs to get paid, by amortizing
         | over inference pricing.
        
         | cyanydeez wrote:
          | Oh, go one more step: the reality is that these models are
          | more expensive than hiring an intern to do the same thing.
          | 
          | Unless you've got a trove of self-starters with a lot of
          | money, they aren't cost-efficient.
        
         | fercircularbuf wrote:
         | It sounds like Uber
        
         | patothon wrote:
          | That's a good point. However, maybe the difference is that
          | Kilo is not creating a situation for themselves where they
          | either have to reprice or have to throttle.
          | 
          | I believe it's pretty clear when you use these credits that
          | it's temporary (and that it's a marketing strategy), vs
          | Claude/Cursor, where they have to fit their costs into the
          | subscription price and make things opaque to you.
        
       | StratusBen wrote:
       | I started https://www.vantage.sh/ - a cloud cost platform that
       | tracks Infra & AI spend.
       | 
       | The $100k/dev/year figure feels like sticker shock math more than
       | reality. Yes, AI bills are growing fast - but most teams I see
        | are still spending substantially less annually, and that's
       | before applying even basic optimizations like prompt caching,
       | model routing, or splitting work across models.
       | 
       | The real story is the AWS playbook all over again: vendors keep
       | dropping unit costs, customers keep increasing consumption faster
       | than prices fall, and in the end the bills still grow. If you're
       | not measuring it daily, the "marginal cost is trending down"
       | narrative is meaningless - you'll still get blindsided by scale.
       | 
       | I'm biased but the winners will be the ones who treat AI like any
       | other cloud resource: ruthlessly measured, budgeted, and tuned.
        
         | nunez wrote:
          | Dude, thank you for this service. I use ec2instances.info and
         | vantage.sh for Azure all of the time.
        
         | oblio wrote:
         | Ironically, except for Graviton (and that's also plateauing;
         | plus it requires that you're able to use it), basically no old
         | AWS service has been reduced in cost since 2019. EC2, S3, etc.
        
           | StratusBen wrote:
           | Look at the early days of AWS vs recent years. The fact that
           | AWS services have been basically flat since 2019 in a high-
           | inflation environment is actually pretty dang good on a
           | relative basis.
        
       | mockingloris wrote:
        | @g42gregory This would mean that for certain devs, an unfair
        | advantage would be owning a decent on-prem rig running a
        | fine-tuned model that has been optimized for _the user's_
        | specific use case.
        | 
        | A fellow HN user's post I engaged with recently talked about
        | low-hanging fruit.
        | 
        | What that means for me and where I'm from is some sort of
        | dev-loan initiative by NGOs and government grants, where devs
        | have access to these models/hardware and repay with some form
        | of value.
        | 
        | What that value is, I haven't thought that far. Thoughts?
       | 
       | +-- Dey well
        
       | IshKebab wrote:
       | > Both effects together will push costs at the top level to $100k
       | a year. Spending that magnitude of money on software is not
       | without precedent, chip design licenses from Cadence or Synopsys
       | are already $250k a year.
       | 
       | For how many developers? Chip design companies aren't paying
       | Synopsys $250k/year _per developer_. Even when using formal tools
       | which are ludicrously expensive, developers can share licenses.
       | 
       | In any case, the reason chip design companies pay EDA vendors
       | these enormous sums is because there isn't really an alternative.
       | Verilator exists, but ... there's a reason commercial EDA vendors
       | can basically ignore it.
       | 
        | That isn't true for AI. Why on earth would you pay _more than a
        | full-time developer salary_ for AI tokens when you could just
        | hire another person instead? I definitely think AI improves
        | productivity, but it's like 10-20% _maybe_, not 100%.
        
         | cornstalks wrote:
          | > _For how many developers? Chip design companies aren't
          | paying Synopsys $250k/year_ per developer. _Even when using
          | formal tools which are ludicrously expensive, developers can
          | share licenses._
         | 
         | That actually probably is per developer. You might be able to
         | reassign a seat to another developer, but that's still arguably
         | one seat per user.
        
           | IshKebab wrote:
           | I don't think so. The company I worked for until recently had
           | around 200 licenses for our main simulator - at that rate it
           | would cost $50m/year, but our total run rate (including all
           | salaries and EDA licenses) was only about $15m/year.
           | 
            | They're _super_ opaque about pricing but I don't think it's
           | _that_ expensive. Apparently formal tools are _way_ more
           | expensive than simulation though (which makes sense), so we
           | only had a handful of those licenses.
           | 
           | I managed to find a real price that someone posted:
           | 
            | https://www.reddit.com/r/FPGA/comments/c8z1x9/modelsim_and_q...
           | 
           | > Questa Prime licenses for ~$30000 USD.
           | 
           | That sounds way more realistic, and I guess you get decent
           | volume discounts if you want 200 licenses.
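            | 
            | The sanity check (using my numbers above, rounded):
            | 
            |     seats = 200              # simulator licenses we had
            |     price = 250_000          # $/seat/yr per the article
            |     print(seats * price)     # $50M/yr vs ~$15M run rate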
        
       | yieldcrv wrote:
        | I think what this model actually showed is the cyclical aspect
        | of tokens as a commodity.
       | 
        | It is based on supply and demand for GPUs: demand currently
        | outstrips supply, while the 'frontier models' are also much more
        | computationally efficient than last year's models in some ways,
        | using far fewer computational resources to do the same thing.
        | 
        | So now that everyone wants to use frontier models in "agentic
        | mode", with reasoning eating up a ton more tokens before
        | settling on a result, demand is outpacing supply. But it is
        | possible it equalizes yet again, before the cycle begins anew.
        
       | zahlman wrote:
       | > The difference in pay between inference and training engineers
       | is because of their relative impact. You train a model with a
       | handful of people while it is used by millions of people.
       | 
       | Okay, but when did that ever create a comparable effect for any
       | other kind of software dev in history?
        
       | gedy wrote:
        | Maybe this is why companies are hyping the "replacing devs"
        | angle, as "wow, see, we're still cheaper than that engineer!" is
        | going to be the only viable pitch.
        
         | woeirua wrote:
          | It's not viable yet, and at current token spend rates, it's
         | likely not going to be viable for several years.
        
       | austin-cheney wrote:
       | There is nothing new here and the math on this is pretty simple.
       | AI greatly increases automation, but its output is not trusted.
        | All research so far shows AI-assisted development is a zero-sum
        | game regarding time and productivity, because time saved by AI is
       | reinvested back into more thorough code reviews than were
       | otherwise required.
       | 
       | Ultimately, this will become a people problem more than a
       | financial problem. People that lack the confidence to code
       | without AI will cost less to hire and dramatically more to
       | employ, no differently than people reliant on large frameworks.
       | All historical data indicates employers will happily eat that
       | extra cost if it means candidates are easier to identify and
       | select because hiring and firing remain among the most serious
       | considerations for technology selection.
       | 
        | Candidates currently thought of as 10x, who are productive
        | without these helpers, will continue to remain no more or less
        | elusive than they are now. That means employers must choose
        | between higher risks and higher selection costs for a
        | potentially higher return on investment, knowing that the ROI is
        | only realized if these high-performance candidates are allowed
        | to execute with high productivity. Employers will gladly eat
        | increased expenses if they can qualify lower risks in candidate
        | selection.
        
         | jjmarr wrote:
         | You're assuming it's a binary between coding with or without
         | AI.
         | 
         | In my experience, a 10x developer that can code without AI
         | becomes a 100x developer because the menial tasks they'd
         | delegate to less-skilled employees while setting technical
         | direction can now be delegated to an AI instead.
         | 
         | If your _only_ skill is writing boilerplate in a framework, you
            | won't be employed to do that with AI. You will not have a job
         | at all and the 100xer will take your salary.
        
           | oblio wrote:
           | The thing is, the 100x can't be in all the verticals, speak
           | all the languages, be a warm body required by legislation,
           | etc, etc. Plus that 100x just became a 10x (x 10x) bus
           | factor.
           | 
           | This will reduce demand for devs but it's super likely that
           | after a delay, demand for software development will go even
           | higher.
           | 
            | The only thing I don't know is what that demand for software
            | development will look like. It could be included in DevOps
           | work or IT Project Management work or whatever.
           | 
           | I guess we'll see in a few years.
        
           | austin-cheney wrote:
           | Those are some strange guesses.
        
           | LtWorf wrote:
            | Too bad that when people actually tried to measure this, it
            | turned out developers were actually slower.
        
       | zeld4 wrote:
        | Give me a $50k raise and I'll need only $10k/yr.
        | 
        | Seriously, I don't see the AI outcome being worth that much yet.
        | 
        | At the current level of AI tools, the attention you need to
        | manage 10+ async tasks is over the limit for most humans.
        | 
        | In 10 years maybe, but $100k will probably be worth much less by
        | then.
        
       | chiffre01 wrote:
       | Honestly we're in a race to the bottom right now with AI.
       | 
       | It's only going to get cheaper to train and run these models as
        | time goes on. Models running on single consumer-grade PCs today
       | were almost unthinkable four years ago.
        
       | 6thbit wrote:
        | An interesting metric is when token bills per dev exceed the cost
        | of hiring a new dev. But also, if paying another dev's worth in
        | tokens gets you further than 2 devs without AI, would you still
        | pay it?
       | 
       | I wonder how the economics will play out, especially when you add
       | in all the different geographic locations for remote devs and
       | their cost.
        
         | jjmarr wrote:
         | They already do for anything not in Western Europe/North
         | America.
        
       | AstroBen wrote:
       | > charge users $200 while providing at least $400 worth of
       | tokens, essentially operating at -100% gross margin.
       | 
        | Why are we assuming everyone uses the full $400? Margins aren't
        | calculated based on only the heaviest users.
        | 
        | And where are they pulling the $100k number from?
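        | 
        | Blended margin depends on the usage mix. A made-up illustration:
        | 
        |     price = 200                     # $/mo subscription
        |     # (share of users, avg $ of tokens burned per month)
        |     mix = [(0.7, 50), (0.2, 200), (0.1, 400)]
        |     avg_cost = sum(s * c for s, c in mix)  # ~115
        |     print((price - avg_cost) / price)      # ~0.425, not -100%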
        
       | daft_pink wrote:
       | If you are throttled at $200 per month, you should probably just
       | pay another $200 a month for a second subscription, because the
       | value is there. That's my take from using Claude.
        
       | dcre wrote:
       | "The bet was that by the following year, the application
       | inference would cost 90% less, creating a $160 gross profit (+80%
       | gross margins). But this didn't happen, instead of declining the
       | application inference costs actually grew!"
       | 
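        | The math implied by that quote (price and costs as assumed
        | there):
        | 
        |     price, cost = 200, 400      # $/user/mo today
        |     hoped_cost = cost * 0.10    # the predicted 90% decline
        |     profit = price - hoped_cost
        |     print(profit, profit / price)  # 160, 0.8 -> +80% margin
        | 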
       | This doesn't make any sense to me. Why would Cursor et al expect
       | they could pocket the difference if inference costs went down?
       | There's no stickiness to the product; they would compete down to
       | zero margins regardless. If anything, higher total spend is
        | _better_ for them because it's more to skim off of.
        
       | jvanderbot wrote:
       | It's not hard to imagine a future where I license their network
       | for inference on my own machine, and they can focus on training.
        
         | oblio wrote:
          | The problem with this is that the temptation to do more is too
          | big. Nobody wants to be a "dumb pipe", a utility.
        
       | thebigspacefuck wrote:
       | Never heard of kilo before, pretty sure this post is just an ad
        
         | lvl155 wrote:
          | I hadn't heard of them either, but now I am getting ads from
          | them. I guess that was their plan.
        
       | mwkaufma wrote:
       | Title modded without merit.
        
       | lvl155 wrote:
       | What is Kilocode?
        
         | tirumario wrote:
          | An open-source AI coding agent extension for VS Code.
        
       | ankit219 wrote:
        | No justification for the $100k number. At $100k a year, or about
        | $8k a month, you would end up using 1B tokens a month, per
        | person (and that's at a generous blended $8 per million
        | input/output tokens including caching; the real number is lower
        | than that).
       | 
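        | The token math, spelled out (blended rate as assumed above):
        | 
        |     monthly = 100_000 / 12    # ~$8,333 per dev per month
        |     rate = 8                  # $ per million tokens, blended
        |     print(monthly / rate)     # ~1042M tokens, i.e. ~1B/month
        | 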
        | I think there is a case that Claude did not reduce their pricing
        | given that they have the best coding models out there. Their
        | recent fundraise had them disclose their gross margins at 60%
        | (and -30% on usage via Bedrock etc.). This way they can offer
        | 2.5x more tokens at the same price than the vibe-code companies
        | and yet break even. Where the market's assumption did not work
        | out is that we still only have Claude as the model which made
        | vibe coding work and is the most tasteful when it comes to what
        | users want. There are probably models better at thinking and
        | logic, especially o3, but this signals the staying power of
        | Claude - its lock-in, its popularity - and challenges the more
        | fundamental assumption that language models are commodities.
        | 
        | (Speculating) Many companies would want to move away from Claude
        | but can't, because users love the models.
        
       | paulhodge wrote:
        | FYI, Kilocode has low credibility. They've been blasting AI
       | subreddits with lots of clickbaity ads and posts, sometimes
       | claiming things that are outright false.
       | 
        | As far as spend per dev: I can't even manage to use up the limits
       | on my $100 Claude plan. It gets everything done and I run out of
       | things to ask it. Considering that the models will get better and
       | cheaper over time, I'm personally not seeing a future where I
       | will need to spend that much more than $100 a month.
        
       ___________________________________________________________________
       (page generated 2025-08-11 23:01 UTC)