[HN Gopher] Perverse incentives of vibe coding
___________________________________________________________________
Perverse incentives of vibe coding
Author : laurex
Score : 115 points
Date : 2025-05-14 19:29 UTC (3 hours ago)
(HTM) web link (fredbenenson.medium.com)
(TXT) w3m dump (fredbenenson.medium.com)
| comex wrote:
| > There was no standardization of parts in the probe. Two widgets
| intended to do almost the same job could be subtly different or
| wildly different. Braces and mountings seemed hand carved. The
| probe was as much a sculpture as a machine.
|
| > Blaine read that, shook his head, and called Sally. Presently
| she joined him in his cabin.
|
| > "Yes, I wrote that," she said. "It seems to be true. Every nut
| and bolt in that probe was designed separately. It's less
| surprising if you think of the probe as having a religious
| purpose. But that's not all. You know how redundancy works?"
|
| > "In machines? Two gilkickies to do one job. In case one fails."
|
| > "Well, it seems that the Moties work it both ways."
|
| > "Moties?"
|
| > She shrugged. "We had to call them something. The Mote
| engineers made two widgets do one job, all right, but the second
| widget does two other jobs, and some of the supports are also
| bimetallic thermostats and thermoelectric generators all in one.
| Rod, I barely understand the words. Modules: human engineers work
| in modules, don't they?"
|
| > "For a complicated job, of course they do."
|
| > "The Moties don't. It's all one piece, everything working on
| everything else. Rod, there's a fair chance the Moties are
| brighter than we are."
|
| - The Mote in God's Eye, Larry Niven and Jerry Pournelle (1974)
|
| [...too bad that today's LLMs are not brighter than we are, at
| least when it comes to writing correct code...]
| mnky9800n wrote:
| That book is very much fun and also I never understood why
| Larry Niven is so obsessed with techno feudalism and gender
| roles. I think this is my favourite book but I think his best
| book is maybe Ringworld.
| Loughla wrote:
| Ringworld is a great book. The later books have great
| concepts, but could do without so much. . . rishing. Niven
| plainly inserted his furry porn fetish into those books, for
| reasons unclear to any human alive.
| Suppafly wrote:
| >for reasons unclear to any human alive
|
| Given how prevalent furries seem to be, especially in nerd
| adjacent culture, I'd say he was ahead of his time.
| AlexCoventry wrote:
| The zero-sum mentality which leads people to think that way
      | is already clear in _The Mote In God's Eye_. I think the
| point of the book is that despite being superior to humans in
| every way imaginable, the Moties are condemned to repeated
| violent conflict by Malthusian pressures, because they have
| nowhere to expand. One way I interpret the "mote" in God's
| eye is the authors' belief that no matter how good we get,
| we'll always be in potentially violent conflict with each
| other for limited resources. (The "beam" in our own eye is
| then that we're still fighting each other over less pressing
| concerns. :-)
| Suppafly wrote:
| >I think this is my favourite book but I think his best book
| is maybe Ringworld.
|
    | Ringworld is pretty good; the multiple sequels get kind of
    | out there.
| mnky9800n wrote:
      | I never read any of the sequels, just a couple of the short
      | story collections and some of the Man-Kzin Wars. What's
      | wild about them?
| jerf wrote:
| Yeah, I've had that thought too.
|
| I think a lot about Motie engineering versus human engineering.
| Could Motie engineering be practical? Is human engineering a
| fundamentally good idea, or is it just a reflection of our
| working memory of 7 +/- 2? Biology is Motie-esque, but it's
| pretty obvious we are nowhere near a technology level that
| could ever bring a biological system up from scratch.
|
| If Motie engineering is a good idea, it's not a smooth
| gradient. The Motie-est code I've seen is also the worst. It is
| definitely not the case that getting a bit more Motie-esque,
| all else being equal, produces better results. Is there some
| crossover point where it gets better and maybe passes our
| modular designs? If AIs do get better than us at coding, and it
| turns out they do settle on Motie-esque coding, no human will
| ever be able to penetrate it ever again. We'd have to instruct
| our AI coders to deliberately cripple themselves to stay
| comprehensible, and that is... economically a tricky
| proposition.
|
| After all, anyone can write anything into a novel they want to
| and make anything work. It's why I've generally stopped reading
| fiction that is explicitly meant to make ideological or
| political points to the exclusion of all else; anything can
| work on a page. Does Motie engineering correspond to anything
| that could be manifested practically in reality?
|
| Will the AIs be better at modularization than any human? Will
| they actually manifest the Great OO Promise of vast piles of
| amazingly well-crafted, re-usable code once they mature? Or
| will the optimal solution turn out to be bespoke, locally-
| optimized versions of everything everywhere, and the solution
| to combining two systems is to do whatever locally-sensible
| customizations are called for?
|
| (I speak of the final, mature version, however long that may
| be. Today LLMs are kind of the worst of both worlds. That turns
| out to be a big step up from "couldn't play in this space at
| all", so I'm not trying to fashionably slag on AIs here. I'm
| more saying that the one point we have is not yet enough to
| draw so much as a line through, let alone an entire multi-
| dimensional design methodology utility landscape.)
|
| I didn't expect to live to see the answers, but maybe I will.
| fwip wrote:
| For me, "Motie engineering" always brings to mind "The Story
| of Mel." http://www.catb.org/jargon/html/story-of-mel.html
| bradly wrote:
| > it might be difficult for AI companies to prioritize code
| conciseness when their revenue depends on token count.
|
| Would open source, local models keep pressure on AI companies to
| prioritize usable code, since code quality and engineering time
| saved are critical to build vs buy discussions?
| jsheard wrote:
| Depends if open source models can remain relevant once the
| status quo of "company burns a bunch of VC money to train a
| model, open sources it, and generates little if any revenue"
| runs out of steam. That's obviously not sustainable long term.
| Larrikin wrote:
    | Maybe we will get some university-backed SETI-like projects
    | to replace all those personal mining rigs now that that hype
    | is finally fading.
| Workaccount2 wrote:
| Is using the APIs worth the extra cost vs using the web tools? I
| haven't used any API tools, I am not a programmer, but I have
| generated many millions of tokens in the web canvas, something
| that would cost way more than the $20 I spend for them.
| jfim wrote:
  | If you're using Claude Code or Cursor, for example, they can
  | read files automatically instead of needing the user to copy
  | and paste back and forth.
  |
  | Both can generate code, though. I've generated code using the
  | web interface and it works; it's just a bit tedious to copy
  | back and forth.
| thimabi wrote:
| I think the idea that LLMs are incentivized to write verbose
| code fails when one considers non-API usage.
|
| Like you, I've accumulated tons of LLM usage via apps and web
| apps. I can actually see how the models are much more succinct
| there compared to the API interface.
|
  | My uneducated guess is that LLMs try to fit their responses
  | into the "output tokens" limit, which is surely much lower in
  | UIs than what can be set in pay-as-you-go interfaces.
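  |
  | As a sketch of the knob I mean (assuming the Anthropic Python
  | SDK; the model alias and cap here are my own guesses):
  |
  |   import anthropic
  |
  |   client = anthropic.Anthropic()
  |
  |   # API callers pick this cap; chat UIs pick it for you.
  |   response = client.messages.create(
  |       model="claude-3-7-sonnet-latest",
  |       max_tokens=8192,
  |       messages=[{"role": "user",
  |                  "content": "Refactor this to be concise."}],
  |   )
  |   print(response.content[0].text)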
| tippytippytango wrote:
| This article captures a lot of the problem. It's often
| frustrating how the model tries to work around really simple issues
| complex workarounds that don't work at all. I tell it the secret
| simple thing it's missing and it gets it. It always makes me
| think, god help the vibe coders that can't read code. I actually
| feel bad for them.
| r053bud wrote:
| I fear that's going to end up being a significant portion of
| engineers in the future.
| babyent wrote:
| I think we are in the Flash era again lol.
|
| You remember those days right? All those Flash sites.
| iotku wrote:
| There's a pretty big gap between "make it work" and "make it
| good".
|
| I've found with LLMs I can usually convince them to get me at
| least something that mostly works, but each step compounds with
| excessive amounts of extra code, extraneous comments ("This
| loop goes through each..."), and redundant functions.
|
| In the short term it feels good to achieve something 'quickly',
| but there's a lot of debt associated with running a random
| number generator on your codebase.
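  |
  | The flavor of padding I mean, as a made-up toy example:
  |
  |   items = [1, 2, 3]
  |   results = []
  |   # This loop goes through each item in the list
  |   for item in items:
  |       # Append the current item to the results list
  |       results.append(item)
  |   # ...when the whole thing is just:
  |   results = list(items)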
| didgetmaster wrote:
    | In my opinion, the difference between good code and code that
    | simply works (sometimes barely) is that good code will still
    | work (or error out gracefully) when the state and the inputs
    | are not as expected.
|
| Good programs are written by people who anticipate what might
    | go wrong. If the document says 'don't do X', they know a
| tester is likely to try X because a user will eventually do
| it.
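    |
    | A toy sketch of the difference, in Python:
    |
    |   def parse_age(raw: str) -> int:
    |       # Good code: check the input instead of trusting it.
    |       try:
    |           age = int(raw)
    |       except ValueError:
    |           raise ValueError(f"not a number: {raw!r}") from None
    |       if not 0 <= age <= 150:
    |           raise ValueError(f"age out of range: {age}")
    |       return age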
| grufkork wrote:
| Working as an instructor for a project course for first-year
  | university students, I have run into this a couple of times.
| The code required for the project is pretty simple, but there
| are a couple of subtle details that can go wrong. Had one group
| today with bit shifts and other "advanced" operators
| everywhere, but the code was not working as expected. I asked
| them to just `Serial.println()` so they could check what was
| going on, and they were stumped. LLMs are already great tools,
| but if you don't know basic troubleshooting/debugging you're in
| for a bad time when the brick wall arrives.
|
| On the other hand, it shows how much coding is just repetition.
| You don't need to be a good coder to perform serviceable work,
| but you won't create anything new and amazing either, if you
  | don't learn to think and reason - but that might be fine for
  | some purposes. (Worrying for the ability of the general
  | population, however.)
|
| You could ask whether these students would have gotten anything
  | done without generated code. Probably; it's just a momentarily
  | easier alternative to actual understanding. They did, however,
| realise the problem and decided by themselves to write their
| own code in a simpler, more repetitive and "stupid" style, but
| one that they could reason about. So hopefully a good lesson
| and all well in the end!
| martin-t wrote:
| > I tell it the secret simple thing it's missing and it gets
| it.
|
| Anthropomorphizing LLMs is not helpful. It doesn't get
  | anything; you just gave it new tokens, ones which are more
| closely correlated with the correct answer. It also generates
| responses similar to what a human would say in the same
| situation.
|
  | Note I first wrote "it also mimics what a human would say",
| then I realized I am anthropomorphizing a statistical algorithm
| and had to correct myself. It's hard sometimes but language
| shapes how we think (which is ironically why LLMs are a thing
| at all) and using terms which better describe how it really
| works is important.
| ben_w wrote:
| Given that LLMs are trained on humans, who don't respond well
| to being dehumanised, I expect anthropomorphising them to be
| better than the opposite of that.
|
| https://www.microsoft.com/en-us/worklab/why-using-a-
| polite-t...
| Suppafly wrote:
| >Anthropomorphizing LLMs is not helpful
|
| It's a feature of language to describe things in those terms
| even if they aren't accurate.
|
| >using terms which better describe how it really works is
| important
|
    | Sometimes, especially if you're doing something where that
| matters, but abstracting those details away is also useful
| when trying to communicate clearly in other contexts.
| tippytippytango wrote:
| Patronizing much?
| sigmaisaletter wrote:
| In section 4, the author writes "... cheaper than Claude 3.7
| ($0.80 per token vs. $3)".
|
| This is an obvious mistake; the price is per megatoken (a
| million tokens), not per token.
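|
| A quick sanity check: at $3 per million tokens, a 10,000-token
| prompt costs 10,000 x ($3 / 1,000,000) = $0.03; at a literal $3
| per token it would cost $30,000.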
|
| Source: https://www.anthropic.com/pricing
| vanschelven wrote:
| > Its "almost there" quality -- the feeling we're just one prompt
| away from the perfect solution -- is what makes it so addicting.
| Vibe coding operates on the principle of variable-ratio
| reinforcement, a powerful form of operant conditioning where
| rewards come unpredictably. Unlike fixed rewards, this
| intermittent success pattern ("the code works! it's brilliant! it
| just broke! wtf!"), triggers stronger dopamine responses in our
| brain's reward pathways, similar to gambling behaviors.
|
| Though I'm not a "vibe coder" myself I very much recognize this
| as part of the "appeal" of GenAI tools more generally. Trying to
| get Image Generators to do what I want has a very "gambling-like"
| quality to it.
| dingnuts wrote:
  | it's not like gambling, it is gambling. You exchange dollars
  | for chips (tokens -- some casinos even call the chips tokens)
  | and insert them into the machine in exchange for the chance of a
| prize.
|
| if it doesn't work the first time you pull the lever, it might
| the second time, and it might not. Either way, the house wins.
|
| It should be regulated as gambling, because it is. There's no
| metaphor, the only difference from a slot machine is that AI
| will never output cash directly, only the possibility of an
| output that could make money. So if you're lucky with your
| first gamble, it'll give you a second one to try.
|
| Gambling all the way down.
| princealiiiii wrote:
| > It should be regulated as gambling, because it is.
|
| That's wild. Anything with non-deterministic output will have
| this.
| kagevf wrote:
| > "Anything with non-deterministic output will have this.
|
| Anything with non-deterministic output that charges money
| ...
|
| _Edit_ Added words to clarify what I meant.
| GuinansEyebrows wrote:
| i think at least a lot of things (if not most things)
| that i pay for have an agreed-upon result in exchange for
| payment, and a mitigation system that'll help me get what
| i paid for in the event that something else prevents that
| from happening. if you pay for something and you don't
| know what you're going to get, and you have to keep
| paying for it in the hopes that you get what you want out
| of it... that sounds a lot like gambling. not exactly,
| but like.
| 0cf8612b2e1e wrote:
| If I ask an artist to draw a picture, I still have to pay
          | for the service, even if I am unhappy with the result.
| nkrisc wrote:
| Sounds like you should negotiate a better contract next
| time, such as one that allows for revisions.
| cogman10 wrote:
| In the US? No, you actually do not need to pay for the
| service if you deem the quality of the output to be
| substandard. In particular with art, it's pretty standard
| to put in a non-refundable downpayment with the final
| payment due on delivery.
|
| You only lose those rights in the contracts you sign
| (which, in terms of GPT, you've likely clicked through a
            | T&C which waives all rights to dispute or reclaim payment).
|
| If you ask an artist to draw a picture and decide it's
| crap, you can refuse to take it and to pay for it. They
| won't be too happy about it, but they'll own the picture
| and can sell it on the market.
| 0cf8612b2e1e wrote:
| There must be artists working on an hourly contract rate.
|
| Maybe art is special, but there are other professions
| where someone can invest heaps of time and effort without
| delivering the expected result. A trial attorney,
| treasure hunter, oil prospector, app developer. All
| require payment for hours of service, regardless of
| outcome.
| cogman10 wrote:
| It'll mostly depend on the contract you sign with these
| services and the state you live in.
|
                | When it comes to work that requires craftsmanship, it's
| pretty common to be able to not pay them if they do a
| poor job. It may cost you more than you paid them to fix
                | their mistake, but you can generally reclaim the money you
| you paid them if the work they did was egregiously poor.
| GuinansEyebrows wrote:
| maybe more accurately anything with non-deterministic
| output that you have to pay-per-use instead of paying by
| outcome.
| Suppafly wrote:
| >that you have to pay-per-use instead of paying by
| outcome.
|
| That's still not gambling and it's silly to pretend it
          | is. It _feels_ like gambling but that's it.
| martin-t wrote:
| That's incorrect, gambling is about waiting.
|
| Brain scans have revealed that waiting for a potential win
| stimulates the same areas as the win itself. That's the
| "appeal" of gambling. Your brain literally feels like it's
| winning while waiting because it _might_ win.
| squeaky-clean wrote:
| So how exactly does that work for the $25/mo flat fee that I
    | pay OpenAI for ChatGPT? They want me to keep getting the
| wrong output and burning money on their backend without any
| additional payment from me?
| dwringer wrote:
| Something of an aside, but this is sort of equivalent to
| asking "how does that work for the $50 dollars the casino
| gave me to gamble with for free"? I once made 50 dollars
| exactly in that way by taking the casino's free tokens and
| putting them all on black in a single roulette spin. People
| like that are not the ones companies like that make money
| off of.
| kimixa wrote:
| For the amount of money OpenAI burns that $25/mo is
| functionally the same as zero - they're still in the "first
| one is free" phase.
|
| Though you could say the same thing about pretty much any
| VC funded sector in the "Growth" phase. And I probably
| will.
| AlexCoventry wrote:
| Is it really gambling, if the house always loses? :-)
| mystified5016 wrote:
| I run genAI models on my own hardware for free. How does that
| fit into your argument?
| codr7 wrote:
| The fact that you can get your drugs for free doesn't
| exactly make you less of an addict.
| squeaky-clean wrote:
| It does literally make it not gambling though, which is
        | what's being discussed.
|
| It also kind of breaks the whole argument that they're
| designed to be addictive in order to make you spend more
| on tokens.
| codr7 wrote:
| As long as that argument makes you happy, go for it :)
| latentsea wrote:
| I used to run GenAI image generators on my own hardware,
| and I 200% agree with your stance. Literally wound up
| selling my RTX 4090 to get the dealer to move out of the
| house. I'm better off now, but can't ever really own a
| GPU again without opening myself back up to that. Sigh...
| NathanKP wrote:
| This only makes sense if you have an all or nothing concept
| of the value of output from AI.
|
| Every prompt and answer is contributing value toward your
| progress toward the final solution, even if that value is
| just narrowing the latent space of potential outputs by
| keeping track of failed paths in the context window, so that
| it can avoid that path in a future answer after you provide
| followup feedback.
|
| The vast majority of slot machine pulls produce no value to
| the player. Every single prompt into an LLM tool produces
| some form of value. I have never once had an entirely wasted
| prompt unless you count the AI service literally crashing and
| returning a "Service Unavailable" type error.
|
| One of the stupidest takes about AI is that a partial
| hallucination or a single bug destroys the value of the tool.
| If a response is 90% of the way there and I have to fix the
| 10% of it that doesn't meet my expectations, then I still got
| 90% value from that answer.
| NegativeLatency wrote:
| > Every prompt and answer is contributing value toward your
| progress toward the final solution
|
| This has not been my experience, maybe sometimes, but
| certainly not always.
|
| As an example: asking chatgpt/gemini about how to
| accomplish some sql data transformation set me back in
| finding the right answer because the answer it did give me
| was so plausible but also super duper not correct in the
| end. Would've been better off not using it in that case.
|
| Brings to mind "You can't build a ladder to the moon"
| secabeen wrote:
| > One of the stupidest takes about AI is that a partial
| hallucination or a single bug destroys the value of the
| tool. If a response is 90% of the way there and I have to
| fix the 10% of it that doesn't meet my expectations, then I
| still got 90% value from that answer.
|
| That assumes that the value of a solution is linear with
| the amount completed. If the Pareto Principle holds (80% of
| effects come from 20% of causes), then not getting that
| critical 10+% likely has an outsized effect on the value of
| the solution. If I have to do the 20% of the work that's
| hard and important after taking what the LLM did for the
| remainder, I haven't gained as much because I still have to
| build the state machine in my head to understand the
| problem-space well enough to do that coding.
| PaulDavisThe1st wrote:
| This assumes you can easily and reliably identify the 10%
| you need to fix.
| rapind wrote:
| By this logic:
|
| - I buy stock that doesn't perform how I expected.
|
| - I hire someone to produce art.
|
| - I pay a lawyer to represent me in court.
|
| - I pay a registration fee to play a sport expecting to win.
|
| - I buy a gift for someone expecting friendship.
|
| Are all gambas.
|
| You aren't paying for the result (the win), you are paying
| for the service that _may_ produce the desired result, and in
    | some cases one of many possible desirable results.
| rjbwork wrote:
| >I buy stock that doesn't perform how I expected.
|
| Hence the adage "sir, this is a casino"
| nkrisc wrote:
| None of those are a games of chance, except the first.
| Suppafly wrote:
| >None of those are a games of chance, except the first.
|
| Neither is GenAI, the grandparent comment is dumb.
| abletonlive wrote:
| Yikes. The reactionary reach for more regulation from a
| certain group is just so tiresome. This is the real mind
| virus that I wish would be contained in Europe.
|
| I almost can't believe this idea is being seriously
| considered by anybody. By that logic buying any CPU is
| gambling because it's not deterministic how far you can
| overclock it.
|
| Just so you know, not every llm use case requires paying for
| tokens. You can even run a local LLM and use cline w/ it for
| all your coding needs. Pull that slot machine lever as many
| times as you like without spending a dollar.
| slurpyb wrote:
| Do you understand what electricity is?
| csallen wrote:
    | Books are not like gambling, they _are_ gambling. You
| exchange dollars for chips (money -- some libraries even give
| you digital credits for "tokens") and spend them on a book
| in exchange for the chance of getting something good out of
| it.
|
| If you don't get something good the first time you buy a
| book, you might with the next book, or you might not. Either
| way, the house wins.
|
| It should be regulated as gambling, because it is. There's no
| metaphor -- the only difference from a slot machine is that
| books will never output cash directly, only the possibility
| of an insight or idea that could make money. So if you're
| lucky with your first gamble, you'll want to try another.
|
| Gambling all the way down.
| yewW0tm8 wrote:
| Same with anything though? Startups, marriages, kids.
|
| All those laid off coders gambled on a career that didn't pan
| out.
|
| Want more certainty in life, gonna have to get political.
|
| And even then there is no guarantee the future give a crap.
| Society may well collapse in 30 years, or 100...
|
    | This is all just role play to satisfy the prior generations'
    | story-driven illusions.
| Suppafly wrote:
| >Trying to get Image Generators to do what I want has a very
| "gambling-like" quality to it.
|
| Especially when you try to get them to generate something they
| explicitly tell you they won't, like nudity. It feels akin to
| hacking.
| gitroom wrote:
| man, pricing everywhere is getting nuts. makes me wonder if most
| stuff just gets harder to use over time or im just old now - you
| ever hit a point where you stop caring about new tools because it
| feels like too much work?
| biker142541 wrote:
| Can we please stop using 'vibe coding' to mean 'ai assisted
| coding'?? (best breakdown, imo:
| https://simonwillison.net/2025/Mar/19/vibe-coding/)
|
| Is it really vibe coding if you are building a detailed coding
| plan, conducting "git-based experimentation with ruthless
| pruning", and essentially reviewing the code incrementally for
| correctness and conciseness? Sure, it's a process dependent on
| AI, but it's very far from nearly "forget[ting] that the code even
| exists".
|
| That all said, I do think the article captures some of the
| current cost/quality dilemmas. I wouldn't jump to conclusions
| that these incentives are actually driving most current training
| decisions, but it's an interesting area to highlight.
| Animats wrote:
| "Vibe coding" is a trend.[1]
|
| [1]
| https://trends.google.com/trends/explore?geo=US&q=%22vibe%20...
| Ancapistani wrote:
| There should be a distinction, but I don't think it's really
| clear where it is yet.
|
| In my own usage, I tend to alternate between tiny, well-defined
| tasks and larger-scale, planned architectural changes or new
| features. Things in between those levels are hit and miss.
|
| It also depends on what I'm building and why. If it's a quick-
| and-dirty script for my own use, I'll often write up - or speak
| - a prompt and let it do its thing in the background while I
| work on other things. I care much less about code quality in
| those instances.
| codr7 wrote:
  | It's still gambling; you're trading learning/reinforcing for
| efficiency, which in the long run means losing skills.
| parliament32 wrote:
  | This reads like "is it really gambling when I have a many-step
  | _system_ for predicting roulette outcomes?"
| samtp wrote:
| I've pretty clearly seen the critical thinking ability of
| coworkers who depend on AI too much sharply decline over the past
| year. Instead of taking 30 seconds to break down the problem and
| work through assumptions, they immediately copy/paste into an LLM
| and spit back what it tells them.
|
| This has led to their abilities stalling while their output
| seemingly goes up. But when you look at the quality of their
| output, and their ability to get projects over the last 10% or
| make adjustments to an already completed project without breaking
| things, it's pretty horrendous.
| Etheryte wrote:
| My observations align with this pretty closely. I have a number
| of colleagues who I wager are largely using LLM-s, both by
| changes in coding style and how much they suddenly add
| comments, and I can't help but feel a noticeable drop in the
| quality of the output. Issues that should clearly have no
| business making it to code review are now regularly left for
| others to catch, it often feels like they don't even look at
| their own diffs. What to make of it, I'm not entirely sure. I
| do think there are ways LLM-s can help us work in better ways,
| but they can also lead to considerably worse outcomes.
| jimbokun wrote:
| Just replace your colleagues with the LLMs they are using.
| You will reduce costs with no decrease in the quality of
| work.
| andy99 wrote:
| I think lack of critical thinking is the root cause, not a
| symptom. I think pretty much everyone uses LLMs these days, but
| you can tell who sees the output and considers it "done" vs who
| uses LLM output as an input to their own process.
| mystified5016 wrote:
| I mean, I can tell that I'm having this problem and my
| critical thinking skills are otherwise typically quite sharp.
|
| At work I've inherited a Kotlin project and I've never
| touched Kotlin or android before, though I'm an experienced
| programmer in other domains. ChatGPT has been guiding me
| through what needs to be done. The problem I'm having is that
| it's just too damn easy to follow its advice without
| checking. I might save a few minutes over reading the docs
| myself, but I don't get the context the docs would have given
| me.
|
| I'm a 'Real Programmer' and I can tell that the code is
| logically sound and self-consistent. The code works and it's
| usually rewritten so much as to be distinctly _my_ code and
    | style. But still it's largely magical. If I'm doing things
| the less-correct way, I wouldn't really know because this
| whole process has led me to some pretty lazy thinking.
|
| On the other hand, I _very much_ do not care about this
    | project. I'm very sure that it will be used just a few times
| and never see the light of day again. I don't expect to ever
| do android development again after this, either. I think lazy
| thinking and farming the involved thinking out to ChatGPT is
| acceptable here, but it's clear how easily this could become
| a _very_ bad habit.
|
| I am making a modest effort to understand what I'm doing. I'm
| also completely rewriting or ignoring the code the AI gives
| me, it's more of an API reference and example. I can
| definitely see how a less-seasoned programmer might get
| suckered into blindly accepting AI code and iterating prompts
| until the code works. It's pretty scary to think about how
| the coming generations of programmers are going to experience
| and conceptualize programming.
| jobs_throwaway wrote:
| As someone who vibe codes at times (and is a professional
  | programmer), I'm curious how y'all go about resisting this? Just
| avoid LLMs entirely and do everything by hand? Very rigorously
| go over any LLM-generated code before committing?
|
| It certainly is hard when I'm say writing unit tests to avoid
| the temptation to throw it into Cursor and prompt until it
| works.
| breckenedge wrote:
| Set a budget. Get rate limited. Let the experience remind you
| how much time you're actually wasting letting the model write
| good looking but buggy code, versus just writing code
| responsibly.
| charcircuit wrote:
| This article ignores the enormous demand for AI coding paired with
| competition between providers. Reducing the price of tokens means
| that people can afford to generate more tokens. A code provider
| being cheaper on average to operate than another is a competitive
| advantage.
| chaboud wrote:
| 1. Yes. I've spent several late nights nudging Cline and Claude
| (and other systems) to the right answers. And being able to use
| AWS Bedrock to do this has been great (note: I work at Amazon).
|
| 2. I've had good fortunes keeping the agents to constrained
| areas, working on functions, or objects, with clearly defined (by
| me) boundaries. If the measure of a junior engineer is that you
| correct them once a day, an engineer once a week, a senior once a
| month, a principal once a quarter... Treat these agents like
| hyper-energetic interns. Nudge frequently.
|
| 3. Standard org management coding practices apply. Force the
| agents to show work, plan, unit test, investigate.
|
| And, basically, what I've described is that we're becoming Software
| Development Managers with teams of on-demand low-quality interns.
| That's an incredibly powerful tool, but don't expect hyper-
| elegant and compact code from them. Keep that for the senior
| engineering staff (humans) for now.
|
| (Note: The AlphaEvolve announcement makes me wonder if I'm going
| to have hyper-energetic applied science interns next...)
| xianshou wrote:
| Amusingly, about 90% of my rat's-nest problems with Sonnet 3.7
| are solved by simply appending a few words to the end of the
| prompt:
|
| "write minimum code required"
|
| It's not even that sensitive to the wording - "be terse" or "make
| minimal changes" amount to the same thing - but the resulting
| code will often be at least 50% shorter than the un-guided
| version.
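|
| If you script your prompts, the suffix is a one-line habit (a
| toy sketch; the helper is my own invention):
|
|   SUFFIX = "\n\nwrite minimum code required"
|
|   def terse(prompt: str) -> str:
|       # "be terse" or "make minimal changes" work about as
|       # well
|       return prompt + SUFFIX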
| panstromek wrote:
| Well, the article mentions that this reduces accuracy. Do you
| hit that problem often then?
| andy99 wrote:
| I wish more had been written about the first assertion that using
| an LLM to code is like gambling and you're always hoping that
| just one more prompt will get you what you want.
|
| It really captures how little control one has over the process,
| while simultaneously having the illusion of control.
|
| I don't really believe that code is being made verbose to make
| more profits. There's probably some element of model providers
| not prioritizing concise code, but if conciseness while
| maintaining "quality" was possible is would give one model a
| sufficient edge over others that I suspect providers would do it.
| techpineapple wrote:
| Something I caught about Andrej Karpathy's original tweet, was
| he said "give into the vibes", and I wonder if he meant that
| about outcomes too.
| andy99 wrote:
| I still think the original tweet was tongue-in-cheek and not
| really meant to be a serious description of how to do things.
| Pxtl wrote:
| I can _feel_ how the extreme autocomplete of AI is a drug.
|
| Half of my job is fighting the "copy/paste/change one thing"
| garbage that developers generate. Keeping code DRY. The
| autocompletes do an amazing job of automating the repeated
| boilerplate. "Oh you're doing this little snippet for the first
| and second property? Obviously you want to do that for every
| property! Let me just expand that out for you!"
|
| And I'm like "oooh, that's nice and convenient".
|
| ...
|
| But I also should be looking at that with the stink-eye... part
| of that code is now duplicated a dozen times. Is there any way to
| reduce that duplication to the bare minimum? At least so it's
| only one duplicated declaration or call and all of the rest is
| per-thingy?
|
| Or any way to directly/automatically wrap the thing without going
| property-by-property?
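|
| (A toy sketch of the before/after I mean, in Python:)
|
|   from types import SimpleNamespace
|
|   MAX = 100
|   widget = SimpleNamespace(width=150, height=-5, depth=42)
|
|   def clamp(v, lo, hi):
|       return max(lo, min(v, hi))
|
|   # What the autocomplete cheerfully expands, property by
|   # property:
|   widget.width = clamp(widget.width, 0, MAX)
|   widget.height = clamp(widget.height, 0, MAX)
|   widget.depth = clamp(widget.depth, 0, MAX)
|   # ...and nine more of these...
|
|   # One duplicated call; the rest is per-thingy data:
|   for name in ("width", "height", "depth"):
|       setattr(widget, name, clamp(getattr(widget, name), 0, MAX))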
|
| Normally I'd be asking myself these questions by the 3rd line.
| But this just made a dozen of those in an instant. And it's so
| tempting and addictive to just say "this is fine" and move on.
|
| That kind of code is not fine.
| Ancapistani wrote:
| > That kind of code is not fine.
|
| I agree, but I'm also challenging that position within myself.
|
  | _Why_ isn't it OK? If your primary concern is readability,
| then perhaps LLMs can better understand generated code relative
| to clean, human-readable code. Also, if you're not directly
| interacting with it, who cares?
|
| As for duplication introducing inconsistencies, that's another
| issue entirely :)
| Suppafly wrote:
| >That kind of code is not fine.
|
| Depends on your definition of fine. Is it less readable because
  | it's doing the straightforward thing several times instead of
  | wrapping it into a loop or a method, or is it more readable
  | because of that?
|
| Is it not fine because it's slower, or does it all just compile
| down to the same thing anyway?
|
| Or is it not fine because you actually should be doing
  | different things for the different properties, but assumed you
  | didn't because you let the AI do the thinking for you?
| andrewstuart wrote:
| Claude was last week.
|
| The author should try Gemini; it's _much_ better.
| martin-t wrote:
| Honestly can't tell if satire or not.
| jazoom wrote:
| It's not satire. Gemini is much better for coding, at least
| for me.
|
| Just to illustrate, I asked both about a browser automation
| script this morning. Claude used Selenium. Gemini used
| Playwright.
|
| I think the main reasons Gemini is much better are:
|
| 1. It gets my whole code base as context. Claude can't take
| that many tokens. I also include documentation for newer
| versions of libraries (e.g. Svelte 5) that the LLM is not so
| familiar with.
|
| 2. Gemini has a more recent knowledge cutoff.
|
| 3. Gemini 2.5 Pro is a thinking model.
|
| 4. It's free to use through the web UI.
| neilv wrote:
| I would seriously consider banning "vibe coding" right now,
| because:
|
| 1. Poor solutions.
|
| 2. Solutions not understood by the person who prompted them.
|
| 3. Development team being made dumber.
|
| 4. Legal and ethical concerns about laundering open source
| copyrights.
|
| 5. I'm suspicious of the name "vibe coding", like someone is
| intentionally marketing it to people who don't care to be good at
| their jobs.
|
| 6. I only want to hire people who can do holistically _better_
| work than current "AI". (Not churn code for a growth startup's
| Potemkin Village, nor to only nominally satisfy a client's
| requirements while shipping them piles of counterproductive
| garbage.)
|
| 7. Publicizing that you are a no-AI-slop company might scare away
| the majority of the bad prospective employees, while
| disproportionately attracting the especially good ones. (Not that
| everyone who uses "AI" is bad, but they've put themselves in the
| bucket with all the people who are bad, and that's a vastly
| better filter for the art of hiring than whether someone has
| spent months memorizing LeetCode answers solely for interviews.)
| YossarianFrPrez wrote:
| There are two sets of perverse incentives at play. The main one
| the author focuses on is that LLM companies are incentivized to
| produce verbose answers, so that when you task an LLM on
| extending an already verbose project, the tokens used and
| therefore cost increases.
|
| The second one is more intra/interpersonal: under pressure to
| produce, it's very easy to rely on LLMs to get one 80% of the way
| there and polish the remaining 20%. I'm in a new domain that
| requires learning a new language. So something I've started doing
| is asking ChatGPT to come up with exercises / coding etudes /
| homework for me based on past interactions.
| neonate wrote:
| https://archive.ph/EzbNK
| Vox_Leone wrote:
| Noted -- but honestly, that's somewhat expected. Vibe-style
| coding often lacks structure, patterns, and architectural
| discipline. That means the developer must do more heavy lifting:
| decide what they want, and be explicit -- whether that's 'avoid
| verbosity,' 'use classes,' 'encapsulate logic,' or 'handle errors
| properly.'
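|
| For instance, an explicit prompt might look like this (my own
| illustration):
|
|   PROMPT = """
|   Implement a CSV-to-JSON converter.
|   Constraints:
|   - avoid verbosity; write the minimum code required
|   - encapsulate parsing logic in a single class
|   - handle malformed rows with explicit errors, don't guess
|   """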
| johnea wrote:
| I generally agree with the concerns of this article, and wonder
| about the theory of the LLM having an innate inclination to
| generate bloated code.
|
| Even in this article though, I feel like there is a lot of
| anthropomorphization of LLMs.
|
| > LLMs and their limitations when reasoning about abstract logic
| problems
|
| As I understand them, LLMs don't "reason" about anything. It's
| purely a statistical sequencing of words (or other tokens) as
| determined by the training set and the prompt. Please correct me
| if I'm wrong.
|
| Also, regarding this theory that the models may be biased to
| produce bloated code: I've reposted this once already, and no one
| has replied yet, and I still wonder:
|
| ----------
|
| To me, this represents one of the most serious issues with LLM
| tools: the opacity of the model itself. The code (if provided)
| can be audited for issues, but the model, even if examined, is an
| opaque statistical amalgamation of everything it was trained on.
|
| There is no way (that I've read of) for identifying biases, or
| intentional manipulations of the model that would cause the tool
| to yield certain intended results.
|
| There are examples of DeepSeek generating results that refuse to
| acknowledge Tiananmen Square, etc. These serve as examples of how
| the generated output can intentionally be biased, without the
| ability to readily predict this general class of bias by
| analyzing the model data.
|
| ----------
|
| I'm still looking for confirmation or denial on both of these
| questions...
| sherburt3 wrote:
| Really makes you wonder where this is all going. What is going to
| be the thing where we say "Maybe we took this a little too far."
| I'm sure whatever bloated react apps we see today are nothing in
| comparison to the monstrosities we have in store for us in the
| future.
| deadbabe wrote:
| The future should be less bloat. We don't need frameworks
| anymore, we can produce output to straight html pages with
| vanilla JavaScript. Could be good.
| coolcase wrote:
| Dopamine? That sort of thing triggers cortisol for me if
| anything!
| erulabs wrote:
| These perverse incentives run at the heart of almost all
| Developer Software as a Service tooling. Using someone else's
| hosted model incentivizes increasing token usage, but there's
| nothing special about AI here.
|
| Consider Database-as-a-service companies: They're not
| incentivized to optimize on CPU usage, they charge per cpu.
| They're not incentivized to improve disk compression, they charge
| for disk-usage. There are several DB vendors who explicitly
| disable disk compression and happily charge for storage capacity.
|
| When you run the software yourself, or the model yourself, the
| incentives are aligned: use less power, use less memory, use less
| disk, etc.
| lubujackson wrote:
| I feel like "vibe coding" as a "no look" sort of way to produce
| anything is bad and will probably remain bad for some time.
|
| However... "vibe architecting" is likely going to be the way
| forward. I have had success with generating/tuning an
| architecture plan with AI, having it create stub files/functions
| then filling them out individually. I can get pretty much the
| whole way without typing code, but it does require a fair bit
| more architectural thinking than usual and a good bit of reading
| code (then telling the AI to "do better").
|
| I think of it like the analogy of blind men describing an
| elephant when they can only feel a single part. AI is decent at
| high level architecture and decent at low level production but
| you need a human to understand the big picture and how the pieces
| fit (and which ones are missing).
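|
| For instance, the stub pass might leave signatures like these
| (a made-up example) for later prompts to fill out one by one:
|
|   def load_catalog(path: str) -> list[dict]:
|       """Read product records from a CSV file."""
|       raise NotImplementedError  # filled in by a later prompt
|
|   def dedupe(records: list[dict]) -> list[dict]:
|       """Drop records sharing an id, keeping the newest."""
|       raise NotImplementedError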
| croes wrote:
| If I do the same with a human developer instead of an AI it's
  | called ordering, not vibe coding.
|
| What's the difference?
| ramoz wrote:
| I disagree with the idea that LLM providers are deliberately
| designing solutions to consume more tokens. We're in the early
| days of agentic coding, and the landscape is intensely
| competitive. Providers are focused on building highly capable
| systems to drive adoption, especially with open-source
| alternatives just a git clone away.
|
| Yes, Claude Code can be token-heavy, but that's often a trade-off
| for their current level of capability compared to other options.
| Additionally, Claude Code has built-in levers for cost (I prefer
| they continue to focus on advanced capability, let pricing
| accessibility catch up).
|
| "early days" means:
|
| - Prompt engineering is still very much a required skill for
| better code and lower pricing
|
| - Same with still needing to be an engineer for the same reasons,
| and:
|
| - Devs need to actively guide these agents. This includes
| detailed planning, progress tracking, and careful context
| management - which, as the author notes, is more involved than
| many realize. I've personally found success using Gemini to
| create structured plans for Claude Code to execute, which helps
| manage its verbosity and keep it focused on "thoughtful" execution
| (as guided by Gemini). I drop entire codebases into Gemini (for
| free).
| mecredis wrote:
| Hi! Author here. I don't actually think they're deliberately
| doing this, hence my choice of "perverse incentives" vs.
| something more accusatory. The issue is that they don't have a
| ton of incentive to fix it.
|
| Agree with you on all the rest, and I think writing a post like
| this was very much intended as a gut-check on things since the
| early days are hopefully the times when things can get fixed
| up.
| ramoz wrote:
| My speculation is that these companies have significant
| reason to prioritize lowering the amount of tokens produced
| as well as cost of tokens.
|
| The leaked Claude Code codebase was riddled with "concise",
| "do not add comments", "mimic codestyle", even an explicit
| "You should minimize output tokens as much as possible" etc.
| Btw, Claude Code uses a custom system prompt, not the leaked
| 24k claude.ai one.
| slurpyb wrote:
| It's so cool that we're all actively participating in the
| handover of all our work to these massive companies so we can
| be forever reliant on their blackbox subscriptions. Don't fret;
| there will be a day where those profit numbers will have to go
| up and they will consciously make the product worse, just to
| trigger more queries, and thus extract more money from you.
| Gross.
| brooke2k wrote:
| I don't understand the productivity that people get out of these
| AI tools. I've tried it and I just can't get anything remotely
| worthwhile unless it's something very simple or something
| completely new being built from the ground up.
|
| Like sure, I can ask claude to give me the barebones of a web
| service that does some simple task. Or a webpage with some
| information on it.
|
| But any time I've tried to get AI services to help with
| bugfixing/feature development on a large, complex, potentially
| multi-language codebase, it's useless.
|
| And those tasks are the ones that actually take up the majority
| of my time. On the occasion that I'm spinning a new thing up
| quickly, I don't really need an AI to do it for me -- I mean,
| that's the easy part!
|
| Is there something I'm missing? Am I just not using it right? I
| keep seeing people talk about how addictive it is, how the
| productivity boost is insane, how all their code is now written
| by AI and then audited, and I just don't see how that's possible
| outside of really simple rote programming.
| Starlevel004 wrote:
| > Is there something I'm missing? Am I just not using it right?
|
| The talk about it makes more sense when you remember most
| developers are primarily writing CRUD webapps or adware, which
| is essentially a solved problem already.
| slurpyb wrote:
| You are not alone! I strongly agree and I feel like I am losing
| my mind reading some of the comments people have about these
| services.
| hx8 wrote:
| Probably 80% of the time I spend coding, I'm inside a code file
| I haven't read in the last month. If I need to spend more than
| 30 seconds reading a section of code before I understand it,
| I'll ask AI to explain it to me. Usually, it does a good job of
| explaining code at a level of complexity that would take me
| 1-15 minutes to understand, but does a poor job of answering
| more complex questions or at understanding more complex code.
|
| It's a moderately useful tool for me. I suspect the people that
  | get the most use out of it are those that would take more than 1
| hour to read code I would take 10 minutes to read. Which is to
| say the least experienced people get the most value.
| lukan wrote:
| Yesterday I gave cursor a try and made my first (intentionally
| very lazy) vibe coding approach (a simple threejs project). It
| accepted the task and did things, failed, did things, failed,
| did things ... failed for good.
|
| I guess I could work on the magic incantations to tweak here a
| bit until it works and I guess that's the way it is done. But I
| wasn't hooked.
|
| I do get value out of LLM's for isolated broken down subtasks,
| where asking a LLM is quicker than googling.
|
  | For me, AI will probably become really useful, once I can scan
| and integrate my own complex codebase so it gives me solutions
  | that work there and doesn't hallucinate API endpoints or jump
  | between incompatible library versions (my main issue).
| colechristensen wrote:
| Some people do really repetitive or really boilerplate things,
| others do not.
|
| Also you have to learn to talk to it and how to ask it things.
| UncleOxidant wrote:
| > I have probably spent over $1,000 vibe coding various projects
| into reality
|
| dude, you can use Gemini 2.5 Pro with Cline - it's free and is
| rated at least as good as Claude Sonnet 3.7 right now.
___________________________________________________________________
(page generated 2025-05-14 23:00 UTC)