[HN Gopher] Perverse incentives of vibe coding
___________________________________________________________________
Perverse incentives of vibe coding
Author : laurex
Score : 115 points
Date : 2025-05-14 19:29 UTC (3 hours ago)
(HTM) web link (fredbenenson.medium.com)
(TXT) w3m dump (fredbenenson.medium.com)
| comex wrote:
| > There was no standardization of parts in the probe. Two widgets
| intended to do almost the same job could be subtly different or
| wildly different. Braces and mountings seemed hand carved. The
| probe was as much a sculpture as a machine.
|
| > Blaine read that, shook his head, and called Sally. Presently
| she joined him in his cabin.
|
| > "Yes, I wrote that," she said. "It seems to be true. Every nut
| and bolt in that probe was designed separately. It's less
| surprising if you think of the probe as having a religious
| purpose. But that's not all. You know how redundancy works?"
|
| > "In machines? Two gilkickies to do one job. In case one fails."
|
| > "Well, it seems that the Moties work it both ways."
|
| > "Moties?"
|
| > She shrugged. "We had to call them something. The Mote
| engineers made two widgets do one job, all right, but the second
| widget does two other jobs, and some of the supports are also
| bimetallic thermostats and thermoelectric generators all in one.
| Rod, I barely understand the words. Modules: human engineers work
| in modules, don't they?"
|
| > "For a complicated job, of course they do."
|
| > "The Moties don't. It's all one piece, everything working on
| everything else. Rod, there's a fair chance the Moties are
| brighter than we are."
|
| - The Mote in God's Eye, Larry Niven and Jerry Pournelle (1974)
|
| [...too bad that today's LLMs are not brighter than we are, at
| least when it comes to writing correct code...]
| mnky9800n wrote:
| That book is very much fun and also I never understood why
| Larry Niven is so obsessed with techno feudalism and gender
| roles. I think this is my favourite book but I think his best
| book is maybe Ringworld.
| Loughla wrote:
| Ringworld is a great book. The later books have great
| concepts, but could do without so much. . . rishing. Niven
| plainly inserted his furry porn fetish into those books, for
| reasons unclear to any human alive.
| Suppafly wrote:
| >for reasons unclear to any human alive
|
| Given how prevalent furries seem to be, especially in nerd
| adjacent culture, I'd say he was ahead of his time.
| AlexCoventry wrote:
| The zero-sum mentality which leads people to think that way
      | is already clear in _The Mote In God's Eye_. I think the
| point of the book is that despite being superior to humans in
| every way imaginable, the Moties are condemned to repeated
| violent conflict by Malthusian pressures, because they have
| nowhere to expand. One way I interpret the "mote" in God's
| eye is the authors' belief that no matter how good we get,
| we'll always be in potentially violent conflict with each
| other for limited resources. (The "beam" in our own eye is
| then that we're still fighting each other over less pressing
| concerns. :-)
| Suppafly wrote:
| >I think this is my favourite book but I think his best book
| is maybe Ringworld.
|
    | Ringworld is pretty good; the multiple sequels get kind of
    | out there.
| mnky9800n wrote:
      | I never read any of the sequels, just a couple of the short
      | story collections and some of the Man-Kzin Wars. What's
      | wild about them?
| jerf wrote:
| Yeah, I've had that thought too.
|
| I think a lot about Motie engineering versus human engineering.
| Could Motie engineering be practical? Is human engineering a
| fundamentally good idea, or is it just a reflection of our
| working memory of 7 +/- 2? Biology is Motie-esque, but it's
| pretty obvious we are nowhere near a technology level that
| could ever bring a biological system up from scratch.
|
| If Motie engineering is a good idea, it's not a smooth
| gradient. The Motie-est code I've seen is also the worst. It is
| definitely not the case that getting a bit more Motie-esque,
| all else being equal, produces better results. Is there some
| crossover point where it gets better and maybe passes our
| modular designs? If AIs do get better than us at coding, and it
| turns out they do settle on Motie-esque coding, no human will
| ever be able to penetrate it ever again. We'd have to instruct
| our AI coders to deliberately cripple themselves to stay
| comprehensible, and that is... economically a tricky
| proposition.
|
| After all, anyone can write anything into a novel they want to
| and make anything work. It's why I've generally stopped reading
| fiction that is explicitly meant to make ideological or
| political points to the exclusion of all else; anything can
| work on a page. Does Motie engineering correspond to anything
| that could be manifested practically in reality?
|
| Will the AIs be better at modularization than any human? Will
| they actually manifest the Great OO Promise of vast piles of
| amazingly well-crafted, re-usable code once they mature? Or
| will the optimal solution turn out to be bespoke, locally-
| optimized versions of everything everywhere, and the solution
| to combining two systems is to do whatever locally-sensible
| customizations are called for?
|
| (I speak of the final, mature version, however long that may
| be. Today LLMs are kind of the worst of both worlds. That turns
| out to be a big step up from "couldn't play in this space at
| all", so I'm not trying to fashionably slag on AIs here. I'm
| more saying that the one point we have is not yet enough to
| draw so much as a line through, let alone an entire multi-
| dimensional design methodology utility landscape.)
|
| I didn't expect to live to see the answers, but maybe I will.
| fwip wrote:
| For me, "Motie engineering" always brings to mind "The Story
| of Mel." http://www.catb.org/jargon/html/story-of-mel.html
| bradly wrote:
| > it might be difficult for AI companies to prioritize code
| conciseness when their revenue depends on token count.
|
| Would open source, local models keep pressure on AI companies to
| prioritize usable code, since code quality and engineering time
| saved are critical to build vs buy discussions?
| jsheard wrote:
| Depends if open source models can remain relevant once the
| status quo of "company burns a bunch of VC money to train a
| model, open sources it, and generates little if any revenue"
| runs out of steam. That's obviously not sustainable long term.
| Larrikin wrote:
    | Maybe we will get some university-backed SETI-like projects
    | to replace all those personal mining rigs now that that hype
    | is finally fading.
| Workaccount2 wrote:
| Is using the APIs worth the extra cost vs using the web tools? I
| haven't used any API tools, I am not a programmer, but I have
| generated many millions of tokens in the web canvas, something
| that would cost way more than the $20 I spend for them.
| jfim wrote:
  | If you're using Claude Code or Cursor, for example, they can
  | read files automatically instead of needing the user to copy
  | and paste back and forth.
  |
  | Both can generate code, though. I've generated code using the
  | web interface and it works; it's just a bit tedious to copy
  | back and forth.
| thimabi wrote:
| I think the idea that LLMs are incentivized to write verbose
| code fails when one considers non-API usage.
|
| Like you, I've accumulated tons of LLM usage via apps and web
| apps. I can actually see how the models are much more succinct
| there compared to the API interface.
|
  | My uneducated guess is that LLMs try to fit their responses
  | into the "output tokens" limit, which is surely much lower in
  | UIs than what can be set in pay-as-you-go interfaces.
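  |
  | As a sketch of the knob I mean (assuming the Anthropic Python
  | SDK; the model alias and cap here are my own guesses):
  |
  |   import anthropic
  |
  |   client = anthropic.Anthropic()
  |
  |   # API callers pick this cap; chat UIs pick it for you.
  |   response = client.messages.create(
  |       model="claude-3-7-sonnet-latest",
  |       max_tokens=8192,
  |       messages=[{"role": "user",
  |                  "content": "Refactor this to be concise."}],
  |   )
  |   print(response.content[0].text)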
| tippytippytango wrote:
| This article captures a lot of the problem. It's often
| frustrating how the model tries to work around really simple issues
| complex workarounds that don't work at all. I tell it the secret
| simple thing it's missing and it gets it. It always makes me
| think, god help the vibe coders that can't read code. I actually
| feel bad for them.
| r053bud wrote:
| I fear that's going to end up being a significant portion of
| engineers in the future.
| babyent wrote:
| I think we are in the Flash era again lol.
|
| You remember those days right? All those Flash sites.
| iotku wrote:
| There's a pretty big gap between "make it work" and "make it
| good".
|
| I've found with LLMs I can usually convince them to get me at
| least something that mostly works, but each step compounds with
| excessive amounts of extra code, extraneous comments ("This
| loop goes through each..."), and redundant functions.
|
| In the short term it feels good to achieve something 'quickly',
| but there's a lot of debt associated with running a random
| number generator on your codebase.
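  |
  | The flavor of padding I mean, as a made-up toy example:
  |
  |   items = [1, 2, 3]
  |   results = []
  |   # This loop goes through each item in the list
  |   for item in items:
  |       # Append the current item to the results list
  |       results.append(item)
  |   # ...when the whole thing is just:
  |   results = list(items)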
| didgetmaster wrote:
    | In my opinion, the difference between good code and code that
    | simply works (sometimes barely) is that good code will still
    | work (or error out gracefully) when the state and the inputs
    | are not as expected.
|
| Good programs are written by people who anticipate what might
    | go wrong. If the document says 'don't do X', they know a
| tester is likely to try X because a user will eventually do
| it.
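    |
    | A toy sketch of the difference, in Python:
    |
    |   def parse_age(raw: str) -> int:
    |       # Good code: check the input instead of trusting it.
    |       try:
    |           age = int(raw)
    |       except ValueError:
    |           raise ValueError(f"not a number: {raw!r}") from None
    |       if not 0 <= age <= 150:
    |           raise ValueError(f"age out of range: {age}")
    |       return age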
| grufkork wrote:
| Working as an instructor for a project course for first-year
  | university students, I have run into this a couple of times.
| The code required for the project is pretty simple, but there
| are a couple of subtle details that can go wrong. Had one group
| today with bit shifts and other "advanced" operators
| everywhere, but the code was not working as expected. I asked
| them to just `Serial.println()` so they could check what was
| going on, and they were stumped. LLMs are already great tools,
| but if you don't know basic troubleshooting/debugging you're in
| for a bad time when the brick wall arrives.
|
| On the other hand, it shows how much coding is just repetition.
| You don't need to be a good coder to perform serviceable work,
| but you won't create anything new and amazing either, if you
  | don't learn to think and reason - but that might be fine for
  | some purposes. (Worrying for the ability of the general
  | population, however.)
|
| You could ask whether these students would have gotten anything
  | done without generated code. Probably; it's just a momentarily
  | easier alternative to actual understanding. They did, however,
| realise the problem and decided by themselves to write their
| own code in a simpler, more repetitive and "stupid" style, but
| one that they could reason about. So hopefully a good lesson
| and all well in the end!
| martin-t wrote:
| > I tell it the secret simple thing it's missing and it gets
| it.
|
| Anthropomorphizing LLMs is not helpful. It doesn't get
  | anything; you just gave it new tokens, ones which are more
| closely correlated with the correct answer. It also generates
| responses similar to what a human would say in the same
| situation.
|
  | Note I first wrote "it also mimics what a human would say",
| then I realized I am anthropomorphizing a statistical algorithm
| and had to correct myself. It's hard sometimes but language
| shapes how we think (which is ironically why LLMs are a thing
| at all) and using terms which better describe how it really
| works is important.
| ben_w wrote:
| Given that LLMs are trained on humans, who don't respond well
| to being dehumanised, I expect anthropomorphising them to be
| better than the opposite of that.
|
| https://www.microsoft.com/en-us/worklab/why-using-a-
| polite-t...
| Suppafly wrote:
| >Anthropomorphizing LLMs is not helpful
|
| It's a feature of language to describe things in those terms
| even if they aren't accurate.
|
| >using terms which better describe how it really works is
| important
|
    | Sometimes, especially if you're doing something where that
| matters, but abstracting those details away is also useful
| when trying to communicate clearly in other contexts.
| tippytippytango wrote:
| Patronizing much?
| sigmaisaletter wrote:
| In section 4, the author writes "... cheaper than Claude 3.7
| ($0.80 per token vs. $3)".
|
| This is an obvious mistake; the price is per megatoken (a
| million tokens), not per token.
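|
| A quick sanity check: at $3 per million tokens, a 10,000-token
| prompt costs 10,000 x ($3 / 1,000,000) = $0.03; at a literal $3
| per token it would cost $30,000.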
|
| Source: https://www.anthropic.com/pricing
| vanschelven wrote:
| > Its "almost there" quality -- the feeling we're just one prompt
| away from the perfect solution -- is what makes it so addicting.
| Vibe coding operates on the principle of variable-ratio
| reinforcement, a powerful form of operant conditioning where
| rewards come unpredictably. Unlike fixed rewards, this
| intermittent success pattern ("the code works! it's brilliant! it
| just broke! wtf!"), triggers stronger dopamine responses in our
| brain's reward pathways, similar to gambling behaviors.
|
| Though I'm not a "vibe coder" myself I very much recognize this
| as part of the "appeal" of GenAI tools more generally. Trying to
| get Image Generators to do what I want has a very "gambling-like"
| quality to it.
| dingnuts wrote:
  | it's not like gambling, it is gambling. You exchange dollars
  | for chips (tokens -- some casinos even call the chips tokens)
  | and insert them into the machine in exchange for the chance of a
| prize.
|
| if it doesn't work the first time you pull the lever, it might
| the second time, and it might not. Either way, the house wins.
|
| It should be regulated as gambling, because it is. There's no
| metaphor, the only difference from a slot machine is that AI
| will never output cash directly, only the possibility of an
| output that could make money. So if you're lucky with your
| first gamble, it'll give you a second one to try.
|
| Gambling all the way down.
| princealiiiii wrote:
| > It should be regulated as gambling, because it is.
|
| That's wild. Anything with non-deterministic output will have
| this.
| kagevf wrote:
| > "Anything with non-deterministic output will have this.
|
| Anything with non-deterministic output that charges money
| ...
|
| _Edit_ Added words to clarify what I meant.
| GuinansEyebrows wrote:
| i think at least a lot of things (if not most things)
| that i pay for have an agreed-upon result in exchange for
| payment, and a mitigation system that'll help me get what
| i paid for in the event that something else prevents that
| from happening. if you pay for something and you don't
| know what you're going to get, and you have to keep
| paying for it in the hopes that you get what you want out
| of it... that sounds a lot like gambling. not exactly,
| but like.
| 0cf8612b2e1e wrote:
| If I ask an artist to draw a picture, I still have to pay
          | for the service, even if I am unhappy with the result.
| nkrisc wrote:
| Sounds like you should negotiate a better contract next
| time, such as one that allows for revisions.
| cogman10 wrote:
| In the US? No, you actually do not need to pay for the
| service if you deem the quality of the output to be
| substandard. In particular with art, it's pretty standard
| to put in a non-refundable downpayment with the final
| payment due on delivery.
|
| You only lose those rights in the contracts you sign
| (which, in terms of GPT, you've likely clicked through a
            | T&C which waives all rights to dispute or reclaim payment).
|
| If you ask an artist to draw a picture and decide it's
| crap, you can refuse to take it and to pay for it. They
| won't be too happy about it, but they'll own the picture
| and can sell it on the market.
| 0cf8612b2e1e wrote:
| There must be artists working on an hourly contract rate.
|
| Maybe art is special, but there are other professions
| where someone can invest heaps of time and effort without
| delivering the expected result. A trial attorney,
| treasure hunter, oil prospector, app developer. All
| require payment for hours of service, regardless of
| outcome.
| cogman10 wrote:
| It'll mostly depend on the contract you sign with these
| services and the state you live in.
|
                | When it comes to work that requires craftsmanship, it's
| pretty common to be able to not pay them if they do a
| poor job. It may cost you more than you paid them to fix
                | their mistake, but you can generally reclaim the money you
| you paid them if the work they did was egregiously poor.
| GuinansEyebrows wrote:
| maybe more accurately anything with non-deterministic
| output that you have to pay-per-use instead of paying by
| outcome.
| Suppafly wrote:
| >that you have to pay-per-use instead of paying by
| outcome.
|
| That's still not gambling and it's silly to pretend it
          | is. It _feels_ like gambling but that's it.
| martin-t wrote:
| That's incorrect, gambling is about waiting.
|
| Brain scans have revealed that waiting for a potential win
| stimulates the same areas as the win itself. That's the
| "appeal" of gambling. Your brain literally feels like it's
| winning while waiting because it _might_ win.
| squeaky-clean wrote:
| So how exactly does that work for the $25/mo flat fee that I
    | pay OpenAI for ChatGPT? They want me to keep getting the
| wrong output and burning money on their backend without any
| additional payment from me?
| dwringer wrote:
| Something of an aside, but this is sort of equivalent to
| asking "how does that work for the $50 dollars the casino
| gave me to gamble with for free"? I once made 50 dollars
| exactly in that way by taking the casino's free tokens and
| putting them all on black in a single roulette spin. People
| like that are not the ones companies like that make money
| off of.
| kimixa wrote:
| For the amount of money OpenAI burns that $25/mo is
| functionally the same as zero - they're still in the "first
| one is free" phase.
|
| Though you could say the same thing about pretty much any
| VC funded sector in the "Growth" phase. And I probably
| will.
| AlexCoventry wrote:
| Is it really gambling, if the house always loses? :-)
| mystified5016 wrote:
| I run genAI models on my own hardware for free. How does that
| fit into your argument?
| codr7 wrote:
| The fact that you can get your drugs for free doesn't
| exactly make you less of an addict.
| squeaky-clean wrote:
| It does literally make it not gambling though, which is
        | what's being discussed.
|
| It also kind of breaks the whole argument that they're
| designed to be addictive in order to make you spend more
| on tokens.
| codr7 wrote:
| As long as that argument makes you happy, go for it :)
| latentsea wrote:
| I used to run GenAI image generators on my own hardware,
| and I 200% agree with your stance. Literally wound up
| selling my RTX 4090 to get the dealer to move out of the
| house. I'm better off now, but can't ever really own a
| GPU again without opening myself back up to that. Sigh...
| NathanKP wrote:
| This only makes sense if you have an all or nothing concept
| of the value of output from AI.
|
| Every prompt and answer is contributing value toward your
| progress toward the final solution, even if that value is
| just narrowing the latent space of potential outputs by
| keeping track of failed paths in the context window, so that
| it can avoid that path in a future answer after you provide
| followup feedback.
|
| The vast majority of slot machine pulls produce no value to
| the player. Every single prompt into an LLM tool produces
| some form of value. I have never once had an entirely wasted
| prompt unless you count the AI service literally crashing and
| returning a "Service Unavailable" type error.
|
| One of the stupidest takes about AI is that a partial
| hallucination or a single bug destroys the value of the tool.
| If a response is 90% of the way there and I have to fix the
| 10% of it that doesn't meet my expectations, then I still got
| 90% value from that answer.
| NegativeLatency wrote:
| > Every prompt and answer is contributing value toward your
| progress toward the final solution
|
| This has not been my experience, maybe sometimes, but
| certainly not always.
|
| As an example: asking chatgpt/gemini about how to
| accomplish some sql data transformation set me back in
| finding the right answer because the answer it did give me
| was so plausible but also super duper not correct in the
| end. Would've been better off not using it in that case.
|
| Brings to mind "You can't build a ladder to the moon"
| secabeen wrote:
| > One of the stupidest takes about AI is that a partial
| hallucination or a single bug destroys the value of the
| tool. If a response is 90% of the way there and I have to
| fix the 10% of it that doesn't meet my expectations, then I
| still got 90% value from that answer.
|
| That assumes that the value of a solution is linear with
| the amount completed. If the Pareto Principle holds (80% of
| effects come from 20% of causes), then not getting that
| critical 10+% likely has an outsized effect on the value of
| the solution. If I have to do the 20% of the work that's
| hard and important after taking what the LLM did for the
| remainder, I haven't gained as much because I still have to
| build the state machine in my head to understand the
| problem-space well enough to do that coding.
| PaulDavisThe1st wrote:
| This assumes you can easily and reliably identify the 10%
| you need to fix.
| rapind wrote:
| By this logic:
|
| - I buy stock that doesn't perform how I expected.
|
| - I hire someone to produce art.
|
| - I pay a lawyer to represent me in court.
|
| - I pay a registration fee to play a sport expecting to win.
|
| - I buy a gift for someone expecting friendship.
|
| Are all gambas.
|
| You aren't paying for the result (the win), you are paying
| for the service that _may_ produce the desired result, and in
    | some cases one of many possible desirable results.
| rjbwork wrote:
| >I buy stock that doesn't perform how I expected.
|
| Hence the adage "sir, this is a casino"
| nkrisc wrote:
| None of those are a games of chance, except the first.
| Suppafly wrote:
| >None of those are a games of chance, except the first.
|
| Neither is GenAI, the grandparent comment is dumb.
| abletonlive wrote:
| Yikes. The reactionary reach for more regulation from a
| certain group is just so tiresome. This is the real mind
| virus that I wish would be contained in Europe.
|
| I almost can't believe this idea is being seriously
| considered by anybody. By that logic buying any CPU is
| gambling because it's not deterministic how far you can
| overclock it.
|
| Just so you know, not every llm use case requires paying for
| tokens. You can even run a local LLM and use cline w/ it for
| all your coding needs. Pull that slot machine lever as many
| times as you like without spending a dollar.
| slurpyb wrote:
| Do you understand what electricity is?
| csallen wrote:
    | Books are not like gambling, they _are_ gambling. You
| exchange dollars for chips (money -- some libraries even give
| you digital credits for "tokens") and spend them on a book
| in exchange for the chance of getting something good out of
| it.
|
| If you don't get something good the first time you buy a
| book, you might with the next book, or you might not. Either
| way, the house wins.
|
| It should be regulated as gambling, because it is. There's no
| metaphor -- the only difference from a slot machine is that
| books will never output cash directly, only the possibility
| of an insight or idea that could make money. So if you're
| lucky with your first gamble, you'll want to try another.
|
| Gambling all the way down.
| yewW0tm8 wrote:
| Same with anything though? Startups, marriages, kids.
|
| All those laid off coders gambled on a career that didn't pan
| out.
|
| Want more certainty in life, gonna have to get political.
|
| And even then there is no guarantee the future give a crap.
| Society may well collapse in 30 years, or 100...
|
    | This is all just role play to satisfy the prior generations'
    | story-driven illusions.
| Suppafly wrote:
| >Trying to get Image Generators to do what I want has a very
| "gambling-like" quality to it.
|
| Especially when you try to get them to generate something they
| explicitly tell you they won't, like nudity. It feels akin to
| hacking.
| gitroom wrote:
| man, pricing everywhere is getting nuts. makes me wonder if most
| stuff just gets harder to use over time or im just old now - you
| ever hit a point where you stop caring about new tools because it
| feels like too much work?
| biker142541 wrote:
| Can we please stop using 'vibe coding' to mean 'ai assisted
| coding'?? (best breakdown, imo:
| https://simonwillison.net/2025/Mar/19/vibe-coding/)
|
| Is it really vibe coding if you are building a detailed coding
| plan, conducting "git-based experimentation with ruthless
| pruning", and essentially reviewing the code incrementally for
| correctness and conciseness? Sure, it's a process dependent on
| AI, but it's very far from nearly "forget[ting] that the code even
| exists".
|
| That all said, I do think the article captures some of the
| current cost/quality dilemmas. I wouldn't jump to conclusions
| that these incentives are actually driving most current training
| decisions, but it's an interesting area to highlight.
| Animats wrote:
| "Vibe coding" is a trend.[1]
|
| [1]
| https://trends.google.com/trends/explore?geo=US&q=%22vibe%20...
| Ancapistani wrote:
| There should be a distinction, but I don't think it's really
| clear where it is yet.
|
| In my own usage, I tend to alternate between tiny, well-defined
| tasks and larger-scale, planned architectural changes or new
| features. Things in between those levels are hit and miss.
|
| It also depends on what I'm building and why. If it's a quick-
| and-dirty script for my own use, I'll often write up - or speak
| - a prompt and let it do its thing in the background while I
| work on other things. I care much less about code quality in
| those instances.
| codr7 wrote:
  | It's still gambling; you're trading learning/reinforcing for
| efficiency, which in the long run means losing skills.
| parliament32 wrote:
  | This reads like "is it really gambling when I have a many-step
  | _system_ for predicting roulette outcomes?"
| samtp wrote:
| I've pretty clearly seen the critical thinking ability of
| coworkers who depend on AI too much sharply decline over the past
| year. Instead of taking 30 seconds to break down the problem and
| work through assumptions, they immediately copy/paste into an LLM
| and spit back what it tells them.
|
| This has led to their abilities stalling while their output
| seemingly goes up. But when you look at the quality of their
| output, and their ability to get projects over the last 10% or
| make adjustments to an already completed project without breaking
| things, it's pretty horrendous.
| Etheryte wrote:
| My observations align with this pretty closely. I have a number
| of colleagues who I wager are largely using LLM-s, both by
| changes in coding style and how much they suddenly add
| comments, and I can't help but feel a noticeable drop in the
| quality of the output. Issues that should clearly have no
| business making it to code review are now regularly left for
| others to catch, it often feels like they don't even look at
| their own diffs. What to make of it, I'm not entirely sure. I
| do think there are ways LLM-s can help us work in better ways,
| but they can also lead to considerably worse outcomes.
| jimbokun wrote:
| Just replace your colleagues with the LLMs they are using.
| You will reduce costs with no decrease in the quality of
| work.
| andy99 wrote:
| I think lack of critical thinking is the root cause, not a
| symptom. I think pretty much everyone uses LLMs these days, but
| you can tell who sees the output and considers it "done" vs who
| uses LLM output as an input to their own process.
| mystified5016 wrote:
| I mean, I can tell that I'm having this problem and my
| critical thinking skills are otherwise typically quite sharp.
|
| At work I've inherited a Kotlin project and I've never
| touched Kotlin or android before, though I'm an experienced
| programmer in other domains. ChatGPT has been guiding me
| through what needs to be done. The problem I'm having is that
| it's just too damn easy to follow its advice without
| checking. I might save a few minutes over reading the docs
| myself, but I don't get the context the docs would have given
| me.
|
| I'm a 'Real Programmer' and I can tell that the code is
| logically sound and self-consistent. The code works and it's
| usually rewritten so much as to be distinctly _my_ code and
    | style. But still it's largely magical. If I'm doing things
| the less-correct way, I wouldn't really know because this
| whole process has led me to some pretty lazy thinking.
|
| On the other hand, I _very much_ do not care about this
    | project. I'm very sure that it will be used just a few times
| and never see the light of day again. I don't expect to ever
| do android development again after this, either. I think lazy
| thinking and farming the involved thinking out to ChatGPT is
| acceptable here, but it's clear how easily this could become
| a _very_ bad habit.
|
| I am making a modest effort to understand what I'm doing. I'm
| also completely rewriting or ignoring the code the AI gives
| me, it's more of an API reference and example. I can
| definitely see how a less-seasoned programmer might get
| suckered into blindly accepting AI code and iterating prompts
| until the code works. It's pretty scary to think about how
| the coming generations of programmers are going to experience
| and conceptualize programming.
| jobs_throwaway wrote:
| As someone who vibe codes at times (and is a professional
  | programmer), I'm curious how y'all go about resisting this? Just
| avoid LLMs entirely and do everything by hand? Very rigorously
| go over any LLM-generated code before committing?
|
| It certainly is hard when I'm say writing unit tests to avoid
| the temptation to throw it into Cursor and prompt until it
| works.
| breckenedge wrote:
| Set a budget. Get rate limited. Let the experience remind you
| how much time you're actually wasting letting the model write
| good looking but buggy code, versus just writing code
| responsibly.
| charcircuit wrote:
| This article ignores the enormous demand for AI coding paired with
| competition between providers. Reducing the price of tokens means
| that people can afford to generate more tokens. A code provider
| being cheaper on average to operate than another is a competitive
| advantage.
| chaboud wrote:
| 1. Yes. I've spent several late nights nudging Cline and Claude
| (and other systems) to the right answers. And being able to use
| AWS Bedrock to do this has been great (note: I work at Amazon).
|
| 2. I've had good fortunes keeping the agents to constrained
| areas, working on functions, or objects, with clearly defined (by
| me) boundaries. If the measure of a junior engineer is that you
| correct them once a day, an engineer once a week, a senior once a
| month, a principal once a quarter... Treat these agents like
| hyper-energetic interns. Nudge frequently.
|
| 3. Standard org management coding practices apply. Force the
| agents to show work, plan, unit test, investigate.
|
| And, basically, what I've described is that we're becoming Software
| Development Managers with teams of on-demand low-quality interns.
| That's an incredibly powerful tool, but don't expect hyper-
| elegant and compact code from them. Keep that for the senior
| engineering staff (humans) for now.
|
| (Note: The AlphaEvolve announcement makes me wonder if I'm going
| to have hyper-energetic applied science interns next...)
| xianshou wrote:
| Amusingly, about 90% of my rat's-nest problems with Sonnet 3.7
| are solved by simply appending a few words to the end of the
| prompt:
|
| "write minimum code required"
|
| It's not even that sensitive to the wording - "be terse" or "make
| minimal changes" amount to the same thing - but the resulting
| code will often be at least 50% shorter than the un-guided
| version.
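|
| If you script your prompts, the suffix is a one-line habit (a
| toy sketch; the helper is my own invention):
|
|   SUFFIX = "\n\nwrite minimum code required"
|
|   def terse(prompt: str) -> str:
|       # "be terse" or "make minimal changes" work about as
|       # well
|       return prompt + SUFFIX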
| panstromek wrote:
| Well, the article mentions that this reduces accuracy. Do you
| hit that problem often then?
| andy99 wrote:
| I wish more had been written about the first assertion that using
| an LLM to code is like gambling and you're always hoping that
| just one more prompt will get you what you want.
|
| It really captures how little control one has over the process,
| while simultaneously having the illusion of control.
|
| I don't really believe that code is being made verbose to make
| more profits. There's probably some element of model providers
| not prioritizing concise code, but if conciseness while
| maintaining "quality" was possible is would give one model a
| sufficient edge over others that I suspect providers would do it.
| techpineapple wrote:
| Something I caught about Andrej Karpathy's original tweet, was
| he said "give into the vibes", and I wonder if he meant that
| about outcomes too.
| andy99 wrote:
| I still think the original tweet was tongue-in-cheek and not
| really meant to be a serious description of how to do things.
| Pxtl wrote:
| I can _feel_ how the extreme autocomplete of AI is a drug.
|
| Half of my job is fighting the "copy/paste/change one thing"
| garbage that developers generate. Keeping code DRY. The
| autocompletes do an amazing job of automating the repeated
| boilerplate. "Oh you're doing this little snippet for the first
| and second property? Obviously you want to do that for every
| property! Let me just expand that out for you!"
|
| And I'm like "oooh, that's nice and convenient".
|
| ...
|
| But I also should be looking at that with the stink-eye... part
| of that code is now duplicated a dozen times. Is there any way to
| reduce that duplication to the bare minimum? At least so it's
| only one duplicated declaration or call and all of the rest is
| per-thingy?
|
| Or any way to directly/automatically wrap the thing without going
| property-by-property?
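|
| (A toy sketch of the before/after I mean, in Python:)
|
|   from types import SimpleNamespace
|
|   MAX = 100
|   widget = SimpleNamespace(width=150, height=-5, depth=42)
|
|   def clamp(v, lo, hi):
|       return max(lo, min(v, hi))
|
|   # What the autocomplete cheerfully expands, property by
|   # property:
|   widget.width = clamp(widget.width, 0, MAX)
|   widget.height = clamp(widget.height, 0, MAX)
|   widget.depth = clamp(widget.depth, 0, MAX)
|   # ...and nine more of these...
|
|   # One duplicated call; the rest is per-thingy data:
|   for name in ("width", "height", "depth"):
|       setattr(widget, name, clamp(getattr(widget, name), 0, MAX))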
|
| Normally I'd be asking myself these questions by the 3rd line.
| But this just made a dozen of those in an instant. And it's so
| tempting and addictive to just say "this is fine" and move on.
|
| That kind of code is not fine.
| Ancapistani wrote:
| > That kind of code is not fine.
|
| I agree, but I'm also challenging that position within myself.
|
  | _Why_ isn't it OK? If your primary concern is readability,
| then perhaps LLMs can better understand generated code relative
| to clean, human-readable code. Also, if you're not directly
| interacting with it, who cares?
|
| As for duplication introducing inconsistencies, that's another
| issue entirely :)
| Suppafly wrote:
| >That kind of code is not fine.
|
| Depends on your definition of fine. Is it less readable because
  | it's doing the straightforward thing several times instead of
  | wrapping it into a loop or a method, or is it more readable
  | because of that?
|
| Is it not fine because it's slower, or does it all just compile
| down to the same thing anyway?
|
| Or is it not fine because you actually should be doing
  | different things for the different properties, but assumed you
  | didn't because you let the AI do the thinking for you?
| andrewstuart wrote:
| Claude was last week.
|
| The author should try Gemini; it's _much_ better.
| martin-t wrote:
| Honestly can't tell if satire or not.
| jazoom wrote:
| It's not satire. Gemini is much better for coding, at least
| for me.
|
| Just to illustrate, I asked both about a browser automation
| script this morning. Claude used Selenium. Gemini used
| Playwright.
|
| I think the main reasons Gemini is much better are:
|
| 1. It gets my whole code base as context. Claude can't take
| that many tokens. I also include documentation for newer
| versions of libraries (e.g. Svelte 5) that the LLM is not so
| familiar with.
|
| 2. Gemini has a more recent knowledge cutoff.
|
| 3. Gemini 2.5 Pro is a thinking model.
|
| 4. It's free to use through the web UI.
| neilv wrote:
| I would seriously consider banning "vibe coding" right now,
| because:
|
| 1. Poor solutions.
|
| 2. Solutions not understood by the person who prompted them.
|
| 3. Development team being made dumber.
|
| 4. Legal and ethical concerns about laundering open source
| copyrights.
|
| 5. I'm suspicious of the name "vibe coding", like someone is
| intentionally marketing it to people who don't care to be good at
| their jobs.
|
| 6. I only want to hire people who can do holistically _better_
| work than current "AI". (Not churn code for a growth startup's
| Potemkin Village, nor to only nominally satisfy a client's
| requirements while shipping them piles of counterproductive
| garbage.)
|
| 7. Publicizing that you are a no-AI-slop company might scare away
| the majority of the bad prospective employees, while
| disproportionately attracting the especially good ones. (Not that
| everyone who uses "AI" is bad, but they've put themselves in the
| bucket with all the people who are bad, and that's a vastly
| better filter for the art of hiring than whether someone has
| spent months memorizing LeetCode answers solely for interviews.)
| YossarianFrPrez wrote:
| There are two sets of perverse incentives at play. The main one
| the author focuses on is that LLM companies are incentivized to
| produce verbose answers, so that when you task an LLM on
| extending an already verbose project, the tokens used and
| therefore cost increases.
|
| The second one is more intra/interpersonal: under pressure to
| produce, it's very easy to rely on LLMs to get one 80% of the way
| there and polish the remaining 20%. I'm in a new domain that
| requires learning a new language. So something I've started doing
| is asking ChatGPT to come up with exercises / coding etudes /
| homework for me based on past interactions.
| neonate wrote:
| https://archive.ph/EzbNK
| Vox_Leone wrote:
| Noted -- but honestly, that's somewhat expected. Vibe-style
| coding often lacks structure, patterns, and architectural
| discipline. That means the developer must do more heavy lifting:
| decide what they want, and be explicit -- whether that's 'avoid
| verbosity,' 'use classes,' 'encapsulate logic,' or 'handle errors
| properly.'
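|
| For instance, an explicit prompt might look like this (my own
| illustration):
|
|   PROMPT = """
|   Implement a CSV-to-JSON converter.
|   Constraints:
|   - avoid verbosity; write the minimum code required
|   - encapsulate parsing logic in a single class
|   - handle malformed rows with explicit errors, don't guess
|   """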
| johnea wrote:
| I generally agree with the concerns of this article, and wonder
| about the theory of the LLM having an innate inclination to
| generate bloated code.
|
| Even in this article though, I feel like there is a lot of
| anthropomorphization of LLMs.
|
| > LLMs and their limitations when reasoning about abstract logic
| problems
|
| As I understand them, LLMs don't "reason" about anything. It's
| purely a statistical sequencing of words (or other tokens) as
| determined by the training set and the prompt. Please correct me
| if I'm wrong.
|
| Also, regarding this theory that the models may be biased to
| produce bloated code: I've reposted this once already, and no one
| has replied yet, and I still wonder:
|
| ----------
|
| To me, this represents one of the most serious issues with LLM
| tools: the opacity of the model itself. The code (if provided)
| can be audited for issues, but the model, even if examined, is an
| opaque statistical amalgamation of everything it was trained on.
|
| There is no way (that I've read of) for identifying biases, or
| intentional manipulations of the model that would cause the tool
| to yield certain intended results.
|
| There are examples of DeepSeek generating results that refuse to
| acknowledge Tiananmen Square, etc. These serve as examples of how
| the generated output can intentionally be biased, without the
| ability to readily predict this general class of bias by
| analyzing the model data.
|
| ----------
|
| I'm still looking for confirmation or denial on both of these
| questions...
| sherburt3 wrote:
| Really makes you wonder where this is all going. What is going to
| be the thing where we say "Maybe we took this a little too far."
| I'm sure whatever bloated react apps we see today are nothing in
| comparison to the monstrosities we have in store for us in the
| future.
| deadbabe wrote:
| The future should be less bloat. We don't need frameworks
| anymore, we can produce output to straight html pages with
| vanilla JavaScript. Could be good.
| coolcase wrote:
| Dopamine? That sort of thing triggers cortisol for me if
| anything!
| erulabs wrote:
| These perverse incentives run at the heart of almost all
| Developer Software as a Service tooling. Using someone else's
| hosted model incentivizes increasing token usage, but there's
| nothing special about AI here.
|
| Consider Database-as-a-service companies: They're not
| incentivized to optimize on CPU usage, they charge per cpu.
| They're not incentivized to improve disk compression, they charge
| for disk-usage. There are several DB vendors who explicitly
| disable disk compression and happily charge for storage capacity.
|
| When you run the software yourself, or the model yourself, the
| incentives are aligned: use less power, use less memory, use less
| disk, etc.
| lubujackson wrote:
| I feel like "vibe coding" as a "no look" sort of way to produce
| anything is bad and will probably remain bad for some time.
|
| However... "vibe architecting" is likely going to be the way
| forward. I have had success with generating/tuning an
| architecture plan with AI, having it create stub files/functions
| then filling them out individually. I can get pretty much the
| whole way without typing code, but it does require a fair bit
| more architectural thinking than usual and a good bit of reading
| code (then telling the AI to "do better").
|
| I think of it like the analogy of blind men describing an
| elephant when they can only feel a single part. AI is decent at
| high level architecture and decent at low level production but
| you need a human to understand the big picture and how the pieces
| fit (and which ones are missing).
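|
| For instance, the stub pass might leave signatures like these
| (a made-up example) for later prompts to fill out one by one:
|
|   def load_catalog(path: str) -> list[dict]:
|       """Read product records from a CSV file."""
|       raise NotImplementedError  # filled in by a later prompt
|
|   def dedupe(records: list[dict]) -> list[dict]:
|       """Drop records sharing an id, keeping the newest."""
|       raise NotImplementedError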
| croes wrote:
| If I do the same with a human developer instead of an AI it's
  | called ordering, not vibe coding.
|
| What's the difference?
| ramoz wrote:
| I disagree with the idea that LLM providers are deliberately
| designing solutions to consume more tokens. We're in the early
| days of agentic coding, and the landscape is intensely
| competitive. Providers are focused on building highly capable
| systems to drive adoption, especially with open-source
| alternatives just a git clone away.
|
| Yes, Claude Code can be token-heavy, but that's often a trade-off
| for their current level of capability compared to other options.
| Additionally, Claude Code has built-in levers for cost (I prefer
| they continue to focus on advanced capability, let pricing
| accessibility catch up).
|
| "early days" means:
|
| - Prompt engineering is still very much a required skill for
| better code and lower pricing
|
| - Same with still needing to be an engineer for the same reasons,
| and:
|
| - Devs need to actively guide these agents. This includes
| detailed planning, progress tracking, and careful context
| management - which, as the author notes, is more involved than
| many realize. I've personally found success using Gemini to
| create structured plans for Claude Code to execute, which helps
| manage its verbosity and keep it focused on "thoughtful" execution
| (as guided by Gemini). I drop entire codebases into Gemini (for
| free).
| mecredis wrote:
| Hi! Author here. I don't actually think they're deliberately
| doing this, hence my choice of "perverse incentives" vs.
| something more accusatory. The issue is that they don't have a
| ton of incentive to fix it.
|
| Agree with you on all the rest, and I think writing a post like
| this was very much intended as a gut-check on things since the
| early days are hopefully the times when things can get fixed
| up.
| ramoz wrote:
| My speculation is that these companies have significant
| reason to prioritize lowering the amount of tokens produced
| as well as cost of tokens.
|
| The leaked Claude Code codebase was riddled with "concise",
| "do not add comments", "mimic codestyle", even an explicit
| "You should minimize output tokens as much as possible" etc.
| Btw, Claude Code uses a custom system prompt, not the leaked
| 24k claude.ai one.
| slurpyb wrote:
| It's so cool that we're all actively participating in the
| handover of all our work to these massive companies so we can
| be forever reliant on their blackbox subscriptions. Don't fret;
| there will be a day where those profit numbers will have to go
| up and they will consciously make the product worse, just to
| trigger more queries, and thus extract more money from you.
| Gross.
| brooke2k wrote:
| I don't understand the productivity that people get out of these
| AI tools. I've tried it and I just can't get anything remotely
| worthwhile unless it's something very simple or something
| completely new being built from the ground up.
|
| Like sure, I can ask claude to give me the barebones of a web
| service that does some simple task. Or a webpage with some
| information on it.
|
| But any time I've tried to get AI services to help with
| bugfixing/feature development on a large, complex, potentially
| multi-language codebase, it's useless.
|
| And those tasks are the ones that actually take up the majority
| of my time. On the occasion that I'm spinning a new thing up
| quickly, I don't really need an AI to do it for me -- I mean,
| that's the easy part!
|
| Is there something I'm missing? Am I just not using it right? I
| keep seeing people talk about how addictive it is, how the
| productivity boost is insane, how all their code is now written
| by AI and then audited, and I just don't see how that's possible
| outside of really simple rote programming.
| Starlevel004 wrote:
| > Is there something I'm missing? Am I just not using it right?
|
| The talk about it makes more sense when you remember most
| developers are primarily writing CRUD webapps or adware, which
| is essentially a solved problem already.
| slurpyb wrote:
| You are not alone! I strongly agree and I feel like I am losing
| my mind reading some of the comments people have about these
| services.
| hx8 wrote:
| Probably 80% of the time I spend coding, I'm inside a code file
| I haven't read in the last month. If I need to spend more than
| 30 seconds reading a section of code before I understand it,
| I'll ask AI to explain it to me. Usually, it does a good job of
| explaining code at a level of complexity that would take me
| 1-15 minutes to understand, but does a poor job of answering
| more complex questions or at understanding more complex code.
|
| It's a moderately useful tool for me. I suspect the people that
  | get the most use out of it are those that would take more than 1
| hour to read code I would take 10 minutes to read. Which is to
| say the least experienced people get the most value.
| lukan wrote:
| Yesterday I gave cursor a try and made my first (intentionally
| very lazy) vibe coding approach (a simple threejs project). It
| accepted the task and did things, failed, did things, failed,
| did things ... failed for good.
|
| I guess I could work on the magic incantations to tweak here a
| bit until it works and I guess that's the way it is done. But I
| wasn't hooked.
|
| I do get value out of LLM's for isolated broken down subtasks,
| where asking a LLM is quicker than googling.
|
  | For me, AI will probably become really useful, once I can scan
| and integrate my own complex codebase so it gives me solutions
  | that work there and doesn't hallucinate API endpoints or jump
  | between incompatible library versions (my main issue).
| colechristensen wrote:
| Some people do really repetitive or really boilerplate things,
| others do not.
|
| Also you have to learn to talk to it and how to ask it things.
| UncleOxidant wrote:
| > I have probably spent over $1,000 vibe coding various projects
| into reality
|
| dude, you can use Gemini 2.5 Pro with Cline - it's free and is
| rated at least as good as Claude Sonnet 3.7 right now.
___________________________________________________________________
(page generated 2025-05-14 23:00 UTC)