[HN Gopher] My AI skeptic friends are all nuts
___________________________________________________________________
My AI skeptic friends are all nuts
Author : tabletcorry
Score : 394 points
Date : 2025-06-02 21:09 UTC (1 hours ago)
(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)
| sneak wrote:
| THANK YOU.
|
| I was a 3-4x programmer before. Now I'm a 9-15x programmer when
| wrangling LLMs.
|
| This is a sea change and it's already into "incredible" territory
| and shows no signs of slowing down.
|
| > _Think of anything you wanted to build but didn't. You tried to
| home in on some first steps. If you'd been in the limerent phase
| of a new programming language, you'd have started writing. But
| you weren't, so you put it off, for a day, a year, or your whole
| career._
|
| I have been banging out little projects that I have wanted to
| exist for years but always had on the back burner. Write a
| detailed readme and ask the agent to interrogate you about the
| missing parts of the spec then update the README. Then have it
| make a TODO and start implementing. Give it another code base for
| style guide.
|
| I've made more good and useful and working code in the last month
| than I have in the last two years.
| FridgeSeal wrote:
| That's nothing, I was a 4X programmer and now I'm a 500x
| programmer!
|
| I don't just run one agent, I run all of them!
|
| My time to close tickets is measured in minutes!
|
| I don't even review code, I have a different agent review it
| for me!
| ofjcihen wrote:
| And to make sure that agent doesn't make mistakes I have a
| different agent review that agents work!
| tptacek wrote:
| Why would you do this and not just read the code yourself?
| yifanl wrote:
| Because at thats what you need to do to get a 943x coder
| black belt
| grey-area wrote:
| Well given reading code is more tedious than writing it
| and the author of this article claims gai is most useful
| for tedious or repetitive code, why would you want to
| read it? Since this AI agent understands and reasons
| about the text it writes and reads it should be pretty
| infallible at checking the code too, right?
|
| Just get another agent to review it and merge it, job
| done.
| tsimionescu wrote:
| Plus, the sales agents are running to promote the
| finished product to other companies and close deals, and
| the accounting agents are checking that all expenses are
| being accounted for and we have a positive cash flow.
| Obviously, the audit agents are checking that no errors
| sneak into this process, according to a plan devised by
| the legal agents.
| happytoexplain wrote:
| The parent neglected to add /s
| hooverd wrote:
| why not use AI to summarize the code for you?
| sneak wrote:
| I can't tell if you're being sarcastic or not, but if you
| are, the real world is not far behind. I can imagine a world
| where a mixture of AI agents (some doing hypercritical code
| review) can return you tested and idiomatic PRs faster than
| you can describe the new architecture in issues.
|
| I think a lot of people are unfamiliar with the (expensive)
| SOTA.
| indigodaddy wrote:
| Lol omg I guess your original comment was NOT sarcastic!?
| MegaButts wrote:
| > I was a 3-4x programmer before. Now I'm a 9-15x programmer
|
| What the fuck does this mean?
| mouse_ wrote:
| Nerds got taken aside and talked to about how it's not nice
| or cool to brag about IQ score so they invented a new
| artificial metric to brag about.
| throwawayqqq11 wrote:
| It means cranking out hello world even faster i guess. I
| wonder how complex all these projects are people are proud to
| have completed with the help of AI.
| sneak wrote:
| It's a riff on the "10x programmer" concept. People who
| haven't worked with 10x programmers tend to not believe they
| exist.
|
| I'm nowhere near that, but even unaided I'm quite a bit
| faster than most people I've hired or worked with. With LLMs
| my high quality output has easily tripled.
|
| Writing code may be easier than reading it - but reading it
| is FASTER than writing it. And that's what matters.
| hansvm wrote:
| It depends on the value of x. I think it's safe to assume x
| <= 0.75, else they'd contribute negatively to their teams
| (happens from time to time, but let's be generous).
| Previously they'd be anywhere from a 0/10 to 3/10 programmer,
| and now they get up to 9/10 on a good day but sometimes are a
| net negative, as low as -2.25/10 on a bad day. I imagine that
| happens when tired or distracted and unable to adequately
| police LLM output.
| surgical_fire wrote:
| Ox3 and 0x15 is the same value.
| nico wrote:
| I'm not sure about giving specific metrics or kpis of
| efficiency or performance
|
| It definitely feels different to develop using LLMs, especially
| things from scratch. At this point, you can't just have the LLM
| do everything. Sooner or later you need to start intervening
| more often, and as the complexity of the project grows, so does
| the attention you need to give to guiding the LLM. At that
| point the main gains are mostly in typing and quickly looking
| some things up, which are still really nice gains
| retrac wrote:
| Machine translation and speech recognition. The state of the art
| for these is a multi-modal language model. I'm hearing impaired
| veering on deaf, and I use this technology all day every day. I
| wanted to watch an old TV series from the 1980s. There are no
| subtitles available. So I fed the show into a language model
| (Whisper) and now I have passable subtitles that allow me to
| watch the show.
|
| Am I the only one who remembers when that was the stuff of
| science fiction? It was not so long ago an open question if
| machines would ever be able to transcribe speech in a useful way.
| How quickly we become numb to the magic.
| dmonitor wrote:
| Old TV series should have closed captions available (which are
| apparently different from subtitles), however the question of
| where to obtain aside from VHS copies them might be difficult.
| worble wrote:
| And of course, a lot of modern "dvd players" do not properly
| transmit closed captions as subtitles over HDMI, so that sure
| isn't helping
|
| A slightly off topic but interesting video about this
| https://www.youtube.com/watch?v=OSCOQ6vnLwU
| anotherevan wrote:
| Many DVDs of old movies and TV shows may contain the closed
| captions, but they are not visible through HDMI. You have to
| connect your DVD player to your TV via the composite video
| analogue outputs.
|
| This video explains all about it:
| https://youtu.be/OSCOQ6vnLwU
| clvx wrote:
| I feel you. In the late 00's/early 10's, downloading and
| getting American movies were fairly easy but getting the
| subtitles was a challenge. It was even worse with movies from
| other regions. Even now I know people that record conversations
| to be replayed using Whisper so they can get 100% the info from
| it.
|
| Disclaimer: I'm not praising piracy but outside of US borders
| is a free for all.
| albertzeyer wrote:
| That's not quite true. State of the art both in speech
| recognition and translation is still a dedicated model only for
| this task alone. Although the gap is getting smaller and
| smaller, and it also heavily depends on who invests how much
| training budget.
|
| For example, for automatic speech recognition (ASR), see:
| https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
|
| The current best ASR model has 600M params (tiny compared to
| LLMs, and way faster than any LLM: 3386.02 RTFx vs 62.12 RTFx,
| much cheaper) and was trained on 120,000h of speech. In
| comparison, the next best speech LLM (quite close in WER, but
| slightly worse) has 5.6B params and was trained on 5T tokens,
| 2.3M speech hours. It has been always like this: With a
| fraction of the cost, you will get a pure ASR model which still
| beats every speech LLM.
|
| The same is true for translation models, at least when you have
| enough training data, so for popular translation pairs.
|
| However, LLMs are obviously more powerful in what they can do
| despite just speech recognition or translation.
| BeetleB wrote:
| It's not the speech recognition model alone that's fantastic.
| It's coupling it to an LLM for cleanup that makes all the
| difference.
|
| See https://blog.nawaz.org/posts/2023/Dec/cleaning-up-speech-
| rec...
|
| (This is not the best example as I gave it free rein to
| modify the text - I should post a followup that has an
| example closer to a typical use of speech recognition).
|
| Without that extra cleanup, Whisper is simply not good
| enough.
| edflsafoiewq wrote:
| What translation models are better than LLMs?
|
| The problem with Google-Translate-type models is the
| interface is completely wrong. Translation is not
| sentence->translation, it's (sentence,context)->translation
| (or even (sentence,context)->(translation,commentary)). You
| absolutely have to be able to input contextual information,
| instructions about how certain terms are to be translated,
| etc. This is trivial with an LLM.
| thatjoeoverthr wrote:
| This is true, and LLMs crush Google in many translation
| tasks, but they do too many other things. They can and do
| go off script, especially if they "object" to the content
| being translated.
|
| "As a safe AI language model, I refuse to translate this"
| is not a valid translation of "spierdalaj".
| selfhoster11 wrote:
| That's literally an issue with the tool being made
| defective by design by the manufacturer. Not with the
| tool-category itself.
| albertzeyer wrote:
| I'm not sure what type of model Google uses nowadays for
| their webinterface. I know that they also actually provide
| LLM-based translation via their API.
|
| Also the traditional cross-attention-based encoder-decoder
| translation models support document-level translation, and
| also with context. And Google definitely has all those
| models. But I think the Google webinterface has used much
| weaker models (for whatever reason; maybe inference
| costs?).
|
| I think DeepL is quite good. For business applications,
| there is Lilt or AppTek and many others. They can easily
| set up a model for you that allows you to specify context,
| or be trained for some specific domain, e.g. medical texts.
|
| I don't really have a good reference for a similar
| leaderboard for translation models. For translation, the
| metric to measure the quality is anyway much more
| problematic than for speech recognition. I think for the
| best models, only human evaluation is working well now.
| gpm wrote:
| I've been using small local LLMs for translation recently
| (<=7GB total vram usage) and they, even the small ones,
| definitely beat Google Translate in my experience. And they
| don't require sharing whatever I'm reading with Google,
| which is nice.
| yubblegum wrote:
| What are you using? whisper?
| gpm wrote:
| Uh, translation, not transcription.
|
| Just whatever small LLM I have installed as the default
| for the `llm` command line tool at the time. Currently
| that's gemma3:4b-it-q8_0 though it's generally been some
| version of llama in the past. And then this fish shell
| function (basically a bash alias)
| function trans llm "Translate \"$argv\" from
| French to English please" end
| Terr_ wrote:
| > However, LLMs are obviously more powerful in what they can
| do despite just speech recognition
|
| Unfortunately, one of those powerful features is "make up new
| things that fit well but nobody _actually said_ ", and...
| well, there's no way to disable it. :p
| pants2 wrote:
| That leaderboard omits the current SOTA which is
| GPT-4o-transcribe (an LLM)
| albertzeyer wrote:
| Do you have any comparisons in terms of WER? I doubt that
| GPT-4o-transcribe is better than the best models from that
| leaderboard (https://huggingface.co/spaces/hf-
| audio/open_asr_leaderboard). A quick search on this got me
| here: https://www.reddit.com/r/OpenAI/comments/1jvdqty/gpt4
| otransc... https://scribewave.com/blog/openai-launches-
| gpt-4o-transcrib...
|
| It is stated that GPT-4o-transcribe is better than Whisper-
| large. That might be true, but what version of Whisper-
| large actually exactly? Looking at the leaderboard, there
| are a lot of Whisper variants. But anyway, the best Whisper
| variant, CrisperWhisper, is currently only at rank 5. (I
| assume GPT-4o-transcribe was not compared to that but to
| some other Whisper model.)
|
| It is stated that Scribe v1 from elevenlabs is better than
| GPT-4o-transcribe. In the leaderboard, Scribe v1 is also
| only at rank 6.
| kulahan wrote:
| Translation seems like _the_ ideal application. It seems as
| though an LLM would truly have no issues integrating societal
| concepts, obscure references, pop culture, and more, and be
| able to compare it across culture to find a most-perfect
| translation. Even if it has to spit out three versions to
| perfectly communicate, it's still leaps and bounds ahead of
| traditional translators already.
| crote wrote:
| > it's still leaps and bounds ahead of traditional
| translators already
|
| Traditional _machine_ translators, perhaps. Human translation
| is still miles ahead when you actually care about the quality
| of the output. But for getting a general overview of a
| foreign-language website, translating a menu in a restaurant,
| or communicating with a taxi driver? Sure, LLMs would be a
| great fit!
| og_kalu wrote:
| >Human translation is still miles ahead when you actually
| care about the quality of the output.
|
| The current SOTA LLMs are better than Traditional machine
| translators (there is no perhaps) and most human
| translators.
|
| If a 'general overview' is all you think they're good for,
| then you've clearly not seriously used them.
| BeetleB wrote:
| Reference?
|
| (Not saying I don't believe you - it would be fascinating
| if true).
| troupo wrote:
| > It seems as though an LLM would truly have no issues
| integrating societal concepts, obscure references, pop
| culture, and more, and be able to compare it across culture
| to find a most-perfect translation.
|
| Somehow LLMs can't do that for structured code with well
| defined semantics, but sure, they will be able to extract
| "obscure references" from speech/text
| BeetleB wrote:
| > Machine translation and speech recognition.
|
| Yes, yes and yes!
|
| I tried speech recognition many times over the years (Dragon,
| etc). Initially they all were "Wow!", but they simply were not
| good enough to use. 95% accuracy is not good enough.
|
| Now I use Whisper to record my voice, and have it get passed to
| an LLM for cleanup. The LLM contribution is what finally made
| this feasible.
|
| It's not perfect. I still have to correct things. But only
| about a tenth of the time I used to. When I'm transcribing
| notes for myself, I'm at the point I don't even bother
| verifying the output. Small errors are OK for my own notes.
| andrepd wrote:
| What is the relevance of this comment? The post is about LLMs
| in programming. Not about translation or NLP, two things
| transformers do quite well and that hardly anyone contests.
| hiAndrewQuinn wrote:
| Definitely not. I took this same basic idea of feeding videos
| into Whisper to get SRT subtitles and took it a step further to
| make automatic Anki flashcards for listening practice in
| foreign languages [1]. I literally feel like I'm living in the
| future every time I run across one of those cards from whatever
| silly Finnish video I found on YouTube pops up in my queue.
|
| These models have made it possible to robustly practice all 4
| quadrants of language learning for most common languages using
| nothing but a computer, not just passive reading. Whisper is
| directly responsible for 2 of those quadrants, listening and
| speaking. LLMs are responsible for writing [2]. We absolutely
| live in the future.
|
| [1]: https://github.com/hiandrewquinn/audio2anki
|
| [2]: https://hiandrewquinn.github.io/til-site/posts/llm-
| tutored-w...
| tipofthehat wrote:
| Hi Andrew, I've been trying to get a similar audio language
| support app hacked together in a podcast player format (I
| started with Anytime Player) using some of the same
| principles in your project (transcript generation, chunking,
| level & obscurity aware timestamped hints and translations).
|
| I really think support for native content is the ideal way to
| learn for someone like me, especially with listening.
|
| Thanks for posting and good luck.
| backtoyoujim wrote:
| I don't think you are also including having AI lie of
| "hallucinating" to us which is an important point even if the
| article is only about having AI write code for an organization.
| mtklein wrote:
| I completely agree that technology in the last couple years has
| genuinely been fulfilling the promise established in my
| childhood sci-fi.
|
| The other day, alone in a city I'd never been to before, I
| snapped a photo of a bistro's daily specials hand-written on a
| blackboard in Chinese, copied the text right out of the photo,
| translated it into English, learned how to pronounce the menu
| item I wanted, and ordered some dinner.
|
| Two years ago this story would have been: notice the special
| board, realize I don't quite understand all the characters well
| enough to choose or order, and turn wistfully to the menu to
| hopefully find something familiar instead. Or skip the bistro
| and grab a pre-packaged sandwich at a convenience store.
| ryoshoe wrote:
| >The other day, alone in a city I'd never been to before, I
| snapped a photo of a bistro's daily specials hand-written on
| a blackboard in Chinese, copied the text right out of the
| photo, translated it into English, learned how to pronounce
| the menu item I wanted, and ordered some dinner.
|
| To be fair apps dedicated apps like Pleco have supported
| things like this for 6+ years, but the spread of modern
| language models has made it more accessible
| asolus wrote:
| > I snapped a photo of a bistro's daily specials hand-written
| on a blackboard in Chinese, copied the text right out of the
| photo, translated it into English, learned how to pronounce
| the menu item I wanted, and ordered some dinner.
|
| > Two years ago
|
| This functionality was available in 2014, on either an iPhone
| or android. I ordered specials in Taipei way before Covid.
| Here's the blog post celebrating it:
|
| https://blog.google/products/translate/one-billion-installs/
|
| This is all a post about AI, hype, and skepticism. In my
| childhood sci-fi, the idea of people working multiple jobs to
| still not be able to afford rent was written as shocking or
| seen as dystopian. All this incredible technology is a double
| edges sword, but doesn't solve the problems of the day, only
| the problems of business efficiency, which exacerbates the
| problems of the day.
| taurath wrote:
| > I snapped a photo of a bistro's daily specials hand-written
| on a blackboard in Chinese, copied the text right out of the
| photo, translated it into English, learned how to pronounce
| the menu item I wanted, and ordered some dinner.
|
| > Two years ago
|
| This functionality was available in 2014, on either an iPhone
| or android. I ordered specials in Taipei way before Covid.
| Here's the blog post celebrating it:
|
| https://blog.google/products/translate/one-billion-installs/
|
| This is all a post about AI, hype, and skepticism. In my
| childhood sci-fi, the idea of people working multiple jobs to
| still not be able to afford rent was written as shocking or
| seen as dystopian. All this incredible technology is a double
| edges sword, but doesn't solve the problems of the day, only
| the problems of business efficiency, which exacerbates the
| problems of the day.
| archagon wrote:
| Last time I used Whisper with a foreign language (Chinese)
| video, I'm pretty sure it just made some stuff up.
|
| The captions looked like they would be correct in context, but
| I could not cross-reference them with snippets of manually
| checked audio, to the best of my ability.
| makeitdouble wrote:
| > Am I the only one who remembers when that was the stuff of
| science fiction?
|
| Would you go to a foreign country and sign a work contract
| based on the LLM translation ?
|
| Would you answer a police procedure based on the speech
| recognition alone ?
|
| That to me was the promise of the science fiction. Going to
| another planet and doing inter-species negotiations based on
| machine translation. We're definitely not there IMHO, and I
| wouldn't be surprised if we don't quite get there in our
| lifetime.
|
| Otherwise if we're lowering the bar, speech to text has been
| here for decades, albeit clunky and power hungry. So
| improvements have been made, but watching old movies is a way
| too low stake situation IMHO.
| hot_topic wrote:
| We have the tools to do this, and will have commercial
| products for everything you listed in the next couple years.
| anotherevan wrote:
| Using AI to generate subtitles is inventive. Is it smart enough
| to insert the time codes such that the subtitle is well enough
| synchronised to the spoken line?
|
| As someone who has started losing the higher frequencies and
| thus clarity, I have subtitles on all the time just so I don't
| miss dialogue. The only pain point is when the subtitles (of
| the same language) are not word-for-word with the spoken line.
| The discordance between what you are reading and hearing is
| really distracting.
|
| This is my major peeve with my The West Wing DVDs, where the
| subtitles are often an abridgement of the spoken line.
| kerryritter wrote:
| A well articulated blog, imo. Touches on all the points I see
| argued about on LinkedIn all the time.
|
| I think leveling things out at the beginning is important. For
| instance, I recently talked to a senior engineer who said "using
| AI to write programming is so useless", but then said they'd
| never heard of Cursor. Which is fine - but I so often see strong
| vocal stances against using AI tools but then referring to early
| Copilot days or just ChatGPT as their experience, and the game
| has changed so much since then.
| gdubs wrote:
| One thing that I find truly amazing is just the simple fact that
| you can now be fuzzy with the input you give a computer, and get
| something meaningful in return. Like, as someone who grew up
| learning to code in the 90s it always seemed like science fiction
| that we'd get to a point where you could give a computer some
| vague human level instructions and get it more or less do what
| you want.
| csallen wrote:
| It's mind blowing. At least 1-2x/week I find myself shocked
| that this is the reality we live in
| FridgeSeal wrote:
| And you only need the energy and water consumption of a small
| town to do it!
|
| Truly the most incredible times!
| oblio wrote:
| Were you expecting builders of Dyson Spheres to drive
| around in Yugo cars? They're obviously all driving Ford
| F-750s for their grocery runs.
| IshKebab wrote:
| Some people are never happy. Imagine if you demonstrated
| ChatGPT in the 90s and someone said "nah... it uses, like
| 500 watts! no thank you!".
| postalrat wrote:
| Much less than building an iphone.
| ACCount36 wrote:
| Wait till you hear about the "energy and water consumption"
| of Netflix.
| mentos wrote:
| It's surreal to me been using ChatGPT everyday for 2 years,
| makes me question reality sometimes like 'howtf did I live to
| see this in my lifetime'
|
| I'm only 39, really thought this was something reserved for
| the news on my hospital tv deathbed.
| csallen wrote:
| I turned 38 a few months ago, same thing here. I would love
| to go back in time 5 years and tell myself about what's to
| come. 33yo me wouldn't have believed it.
| malfist wrote:
| Today I had a dentist appointment and the dentist suggested I
| switch toothpaste lines to see if something else works for my
| sensitivity better.
|
| I am predisposed to canker sores and if I use a toothpaste
| with SLS in it I'll get them. But a lot of the SLS free
| toothpastes are new age hippy stuff and is also fluoride
| free.
|
| I went to chatgpt and asked it to suggest a toothpaste that
| was both SLS free and had fluoride. Pretty simple ask right?
|
| It came back with two suggestions. It's top suggestion had
| SLS, it's backup suggestion lacked fluoride.
|
| Yes, it is mind blowing the world we live in. Executives want
| to turn our code bases over to these tools
| sneak wrote:
| "an LLM made a mistake once, that's why I don't use it to
| code" is exactly the kind of irrelevant FUD that TFA is
| railing against.
|
| Anyone not learning to use these tools well (and cope with
| and work around their limitations) is going to be left in
| the dust in months, perhaps weeks. It's insane how much
| utility they have.
| breuleux wrote:
| They won't. The speed at which these models evolve is a
| double-edged sword: they give you value quickly... but
| any experience you gain dealing with them also becomes
| obsolete quickly. One year of experience using agents
| won't be more valuable than one week of experience using
| them. No one's going to be left in the dust because no
| one is more than a few weeks away from catching up.
| grey-area wrote:
| Looking forward to seeing you live up to your hyperbole
| in a few weeks, the singularity is near!
| malfist wrote:
| Once? Lol.
|
| I present a simple problem with well defined parameters
| that LLMs can use to search product ingredient lists
| (that are standardized). This is the type of problems
| LLMs are supposed to be good at and it failed in every
| possible way.
|
| If you hired master woodworker and he didn't know what
| wood was, you'd hardly trust him with hard things, much
| less simple ones
| pmdrpg wrote:
| Feel similarly, but even if it is wrong 30% of the time,
| you can (as the author of this op ed points out) pour an
| ungodly amount of resources into getting that error down by
| chaining them together so that you have many chances to
| catch the error. And as long as that only destroys the
| environment and doesn't cost more than a junior dev, then
| they're going to trust their codebases with it yes, it's
| the competitive thing to do, and we all know competition
| produces the best outcome for everyone... right?
| csallen wrote:
| It takes very little time or brainpower to circumvent AI
| hallucinations in your daily work, if you're a frequent
| user of LLMs
| gertlex wrote:
| Feels like you're comparing how LLMs handle unstandardized
| and incomplete marketing-crap that is virtually all product
| pages on the internet, and how LLMs handle the corpus of
| code on the internet that can generally be trusted to be at
| least semi functional (compiles or at least lints; and
| often easily fixed when not 100%).
|
| Two very different combinations it seems to me...
|
| If the former combination was working, we'd be using
| chatgpt to fill our amazon carts by now. We'd probably be
| sanity checking the contents, but expecting pretty good
| initial results. That's where the suitability of AI for
| lots of coding-type work feels like it's at.
| malfist wrote:
| Product ingredient lists are mandated by law and follow a
| standard. Hard to imagine a better codified NLP problem
| gertlex wrote:
| I hadn't considered that, admittedly. It seems like that
| would make the information highly likely to be present...
|
| I've admittedly got an absence of anecdata of my own
| here, though: I don't go buying things with ingredient
| lists online much. I was pleasantly surprised to see a
| very readable list when I checked a toothpaste page on
| amazon just.
| layer8 wrote:
| At the very least, it demonstrates that you can't trust
| LLMs to correctly assess that they couldn't find the
| necessary information, or if they do internally, to tell
| you that they couldn't. The analogous gaps of awareness
| and acknowledgment likely apply to their reasoning about
| code.
| NikkuFox wrote:
| If you've not found a toothpaste yet, see if UltraDex is
| available where you live.
| pmdrpg wrote:
| I remember the first time I played with GPT and thought "oh,
| this is fully different from the chatbots I played with
| growing up, this isn't like anything else I've seen" (though
| I suppose it is implemented much like predictive text, but
| the difference in experience is that predictive text is
| usually wrong about what I'm about to say so it feels silly
| by comparison)
| cosmic_cheese wrote:
| Though I haven't embraced LLM codegen (except for non-
| functional filler/test data), the fuzziness is why I like to
| use them as talking documentation. It makes for a lot less of
| fumbling around in the dark trying to figure out the magic
| combination of search keywords to surface the information
| needed, which can save a lot of time in aggregate.
| pixl97 wrote:
| Honestly LLMs are a great canary if your documentation /
| language / whatever is 'good' at all.
|
| I wish I would have kept it around but had ran into an issue
| where the LLM wasn't giving a great answer. Look at the
| documentation, and yea, made no sense. And all the forum
| stuff about it was people throwing out random guessing on how
| it should actually work.
|
| If you're a company that makes something even moderately
| popular and LLMs are producing really bad answers there is
| one of two things happening.
|
| 1. Your a consulting company that makes their money by
| selling confused users solutions to your crappy product 2.
| Your documentation is confusing crap.
| NooneAtAll3 wrote:
| (you're)
| progval wrote:
| The other side of the coin is that if you give it a precise
| input, it will fuzzily interpret it as something else that is
| easier to solve.
| BoorishBears wrote:
| It will, or it might? Because if every time you use an LLM is
| misinterprets your input as something easier to solve, you
| might want to brush up on the fundamentals of the tool
|
| (I see some people are quite upset with the idea of having to
| mean what you say, but that's something that serves you well
| when interacting with people, LLMs, and even when programming
| computers.)
| progval wrote:
| Might, of course. And in my experience it's what happens
| most times I ask a LLM to do something I can't trivially do
| myself.
| BoorishBears wrote:
| Well everyone's experience is different, but that's been
| a pretty atypical failure mode in my experience.
|
| That being said, I don't primarily lean on LLMs for
| things I have no clue how to do, and I don't think I'd
| recommend that as the primary use case either at this
| point. As the article points out, LLMs are pretty useful
| for doing tedious things you know how to do.
|
| Add up enough "trivial" tasks and they can take up a non-
| trivial amount of energy. An LLM can help reduce some of
| the energy zapped so you can get to the harder, more
| important, parts of the code.
|
| I also do my best to communicate clearly with LLMs: like
| I use words that mean what I intend to convey, not words
| that mean the opposite.
| jacobgkau wrote:
| I use words that convey very clearly what I mean, such as
| "don't invent a function that doesn't exist in your next
| response" when asking what function a value is coming
| from. It says it understands, then proceeds to do what I
| specifically asked it not to do anyway.
|
| The fact that you're responding to someone who found AI
| non-useful with "you must be using words that are the
| opposite of what you really mean" makes your rebuttal
| come off as a little biased. Do you really think the
| chances of "they're playing opposite day" are higher than
| the chances of the tool not working well?
| BoorishBears wrote:
| But that's _exactly_ what I mean by brush up on the tool:
| "don't invent a function that doesn't exist in your next
| response" doesn't mean anything to an LLM.
|
| It implies you're continuing with a context window where
| it already hallucinated function calls, yet your fix is
| to give it an instruction that relies on a kind of
| introspection it can't really demonstrate.
|
| My fix in that situation would be to start a fresh
| context and provide as much relevant documentation as
| feasible. If that's not enough, then the LLM probably
| won't succeed for the API in question no matter how many
| iterations you try and it's best to move on.
|
| > ... makes your rebuttal come off as a little biased.
|
| Biased how? I don't personally benefit from them using
| AI. They used wording that was contrary to what they
| meant in the comment I'm responding to, that's why I
| brought up the possibility.
| jacobgkau wrote:
| > Biased how?
|
| Biased as in I'm pretty sure he didn't write an AI prompt
| that was the "opposite" of what he wanted.
|
| And generalizing something that "might" happen as
| something that "will" happen is not actually an
| "opposite," so calling it that (and then basing your
| assumption of that person's prompt-writing on that
| characterization) was a stretch.
| khasan222 wrote:
| I find this very very much depends on the model and
| instructions you give the llm. Also you can use other
| instructions to check the output and have it try again.
| Definitely with larger codebases it struggles but the
| power is there.
|
| My favorite instruction is using component A as an
| example make component B
| pessimizer wrote:
| When you have a precise input, why give it to an LLM? When I
| have to do arithmetic, I use a calculator. I don't ask my
| coworker, who is generally pretty good at arithmetic,
| although I'd get the right answer 98% of the time. Instead, I
| use my coworker for questions that are less completely
| specified.
|
| Also, if it's an important piece of arithmetic, and I'm in a
| position where I need to ask my coworker rather than do it
| myself, I'd expect my coworker (and my AI) to grab (spawn) a
| calculator, too.
| Barrin92 wrote:
| >simple fact that you can now be fuzzy with the input you give
| a computer, and get something meaningful in return
|
| I got into this profession precisely because I wanted to give
| precise instructions to a machine and get _exactly_ what I
| want. Worth reading Dijkstra, who anticipated this, and the
| foolishness of it, half a century ago
|
| _" Instead of regarding the obligation to use formal symbols
| as a burden, we should regard the convenience of using them as
| a privilege: thanks to them, school children can learn to do
| what in earlier days only genius could achieve. (This was
| evidently not understood by the author that wrote --in 1977--
| in the preface of a technical report that "even the standard
| symbols used for logical connectives have been avoided for the
| sake of clarity". The occurrence of that sentence suggests that
| the author's misunderstanding is not confined to him alone.)
| When all is said and told, the "naturalness" with which we use
| our native tongues boils down to the ease with which we can use
| them for making statements the nonsense of which is not
| obvious.[...]_
|
| _It may be illuminating to try to imagine what would have
| happened if, right from the start our native tongue would have
| been the only vehicle for the input into and the output from
| our information processing equipment. My considered guess is
| that history would, in a sense, have repeated itself, and that
| computer science would consist mainly of the indeed black art
| how to bootstrap from there to a sufficiently well-defined
| formal system. We would need all the intellect in the world to
| get the interface narrow enough to be usable "_
|
| Welcome to prompt engineering and vibe coding in 2025, where
| you have to argue with your computer to produce a formal
| language, that we invented in the first place so as to not have
| to argue in imprecise language
|
| https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
| vector_spaces wrote:
| right: we don't use programming languages instead of natural
| language simply to make it hard. For the same reason, we use
| a restricted dialect of natural language when writing math
| proofs -- using constrained languages reduces ambiguity and
| provides guardrails for understanding. It gives us some hope
| of understanding the behavior of systems and having
| confidence in their outputs
|
| There are levels of this though -- there are few instances
| where you actually _need_ formal correctness. For most
| software, the stakes just aren 't that high, all you need is
| predictable behavior in the "happy path", and to be within
| some forgiving neighborhood of "correct".
|
| That said, those championing AI have done a very poor job at
| communicating the value of constrained languages, instead
| preferring to parrot this (decades and decades and decades
| old) dream of "specify systems in natural language"
| gdubs wrote:
| It sounds like you think I don't find value in using machines
| in their precise way, but that's not a correct assumption. I
| love code! I love the algorithms and data structures of data
| science. I also love driving 5-speed transmissions and
| shooting on analog film - but it isn't always what's needed
| in a particular context or for a particular problem. There
| are lots of areas where a 'good enough solution done quickly'
| is way more valuable than a 100% correct and predictable
| solution.
| jiggawatts wrote:
| You can be fuzzier than a soft fluff of cotton wool. I've had
| incredible success trying to find the name of an old TV show or
| specific episode using AIs. The hit rate is surprisingly good
| even when using the vaguest inputs.
|
| "You know, that show in the 80s or 90s... maybe 2000s with the
| people that... did things and maybe didn't do things."
|
| "You might be thinking of episode 11 of season 4 of such and
| such snow where a key plot element was both doing and not doing
| things on the penalty of death"
| floren wrote:
| See I try that sort of thing, like asking Gemini about a
| science fiction book I read in 5th grade that (IIRC) involved
| people living underground near/under a volcano, and food in
| pill form, and it immediately hallucinates a non-existent
| book by John Christopher named "The City Under the Volcano"
| wyre wrote:
| Claude tells me it's City of Ember, but notes the pill-food
| doesn't match the plot and asks for more details of the
| book.
| bityard wrote:
| I was a big fan of Star Trek: The Next Generation as a kid and
| one of my favorite things in the whole world was thinking about
| the Enterprise's computer and Data, each one's strengths and
| limitations, and whether there was really any fundamental
| difference between the two besides the fact that Data had a
| body he could walk around in.
|
| The Enterprise computer was (usually) portrayed as fairly close
| to what we have now with today's "AI": it could synthesize,
| analyze, and summarize the entirety of Federation knowledge and
| perform actions on behalf of the user. This is what we are
| using LLMs for now. In general, the shipboard computer didn't
| hallucinate except during most of the numerous holodeck
| episodes. It could rewrite portions of its own code when the
| plot demanded it.
|
| Data had, in theory, a personality. But that personality was
| basically, "acting like a pedantic robot." We are told he is
| able to grow intellectually and acquire skills, but with
| perfect memory and fine motor control, he can already basically
| "do" any human endeavor with a few milliseconds of research.
| Although things involving human emotion (art, comedy, love) he
| is pretty bad at and has to settle for sampling, distilling,
| and imitating thousands to millions of examples of human
| creation. (Not unlike "AI" art of today.)
|
| Side notes about some of the dodgy writing:
|
| A few early epsiodes of Star Trek: The Next Generation treated
| the Enterprise D computer as a semi-omniscient character and it
| always bugged me. Because it seemed to "know" things that it
| shouldn't and draw conclusions that it really shouldn't have
| been able to. "Hey computer, we're all about to die, solve the
| plot for us so we make it to next week's episode!" Thankfully
| someone got the memo and that only happened a few times.
| Although I always enjoyed episodes that centered around the
| ship or crew itself somehow instead of just another run-in with
| aliens.
|
| The writers were always adamant that Data had no emotions (when
| not fitted with the emotion chip) but we heard him say things
| _all the time_ that were rooted in emotion, they were just not
| particularly strong emotions. And he claimed to not grasp
| humor, but quite often made faces reflecting the mood of the
| room or indicating he understood jokes made by other crew
| members.
| gdubs wrote:
| Thanks, love this - it's something I've thought about as
| well!
| jacobgkau wrote:
| > The writers were always adamant that Data had no
| emotions... but quite often made faces reflecting the mood of
| the room or indicating he understood jokes made by other crew
| members.
|
| This doesn't seem too different from how our current AI
| chatbots don't actually understand humor or have emotions,
| but can still explain a joke to you or generate text with a
| humorous tone if you ask them to based on samples, right?
|
| > "Hey computer, we're all about to die, solve the plot for
| us so we make it to next week's episode!"
|
| I'm curious, do you recall a specific episode or two that
| reflect what you feel boiled down to this?
| AnotherGoodName wrote:
| >"Being a robot's great, but we don't have emotions and
| sometimes that makes me very sad".
|
| From Futurama in a obvious parody of how Data was portrayed
| ofjcihen wrote:
| I feel like we get one of these articles that addresses valid AI
| criticisms with poor arguments every week and at this point I'm
| ready to write a boilerplate response because I already know what
| they're going to say.
|
| Interns don't cost 20 bucks a month but training users in the
| specifics of your org is important.
|
| Knowing what is important or pointless comes with understanding
| the skill set.
| mrkurt wrote:
| PLEASE write your response. We'll publish it on the Fly.io
| blog. Unedited. If you want.
| kubb wrote:
| Maybe make a video of how you're vibecoding a valuable
| project in an existing codebase, and how agents are saving
| you time by running your tools in a loop.
| metaltyphoon wrote:
| Seriously... thats the one thing I never see being posted?
| Is it because Agent mode will take 30-40 minutes to just
| bookstrap a project and create some file?
| csallen wrote:
| It takes like 2-6 minutes to do that, depending on the
| scope of the project
| andrepd wrote:
| So they can cherry pick the 1 out of 10 times that it
| actually performs in an impressive manner? That's the
| essence of most AI demos/"benchmarks" I've seen.
|
| Testing for myself has always yielded unimpressive results.
| Maybe I'm just unlucky?
| ofjcihen wrote:
| I'm uninterested in giving you content. In particular because
| of your past behavior.
|
| Thanks for the offer though.
| tptacek wrote:
| Kurt, how dare you.
| ofjcihen wrote:
| You wouldn't happen to work for fly.io as well, would
| you?
|
| Edit: Nm, thought I remembered your UN and see on your
| profile that you do.
| grzm wrote:
| Yes. And the author of the submission.
| throwaway314155 wrote:
| > past behavior
|
| Do go on.
| csallen wrote:
| Can you direct me somewhere with superior counterarguments? I'm
| quite curious
| briandrupieski wrote:
| > with poor arguments every week
|
| This roughly matches my experience too, but I don't think it
| applies to this one. It has a few novel things that were new
| ideas to me and I'm glad I read it.
|
| > I'm ready to write a boilerplate response because I already
| know what they're going to say
|
| If you have one that addresses what this one talks about I'd be
| interested in reading it.
| slg wrote:
| >> with poor arguments every week
|
| >This roughly matches my experience too, but I don't think it
| applies to this one.
|
| I'm not so sure. The argument that any good programming
| language would inherently eliminate the concern for
| hallucinations seems like a pretty weak argument to me.
| ofjcihen wrote:
| It's a confusing one for sure.
|
| To be honest I'm not sure where the logic for that claim
| comes from. Maybe an abundance of documentation is the
| assumption?
|
| Either way, being dismissive of one of LLMs major flaws and
| blaming it on the language doesn't seem like the way to
| make that argument.
| simonw wrote:
| Why does that seem weak to you?
|
| It seems obviously true to me: code hallucinations are
| where the LLM outputs code with incorrect details - syntax
| errors, incorrect class methods, invalid imports etc.
|
| If you have a strong linter in a loop those mistakes can be
| automatically detected and passed back into the LLM to get
| fixed.
|
| Surely that's a solution to hallucinations?
|
| It won't catch other types of logic error, but I would
| classify those as bugs, not hallucinations.
| ofjcihen wrote:
| A good example of where a linter wouldn't work is when
| the LLM has you import a package that doesn't exist.
| slg wrote:
| >It won't catch other types of logic error, but I would
| classify those as bugs, not hallucinations.
|
| Let's go a step further, the LLM can produce bug free
| code too if we just call the bugs "glitches".
|
| You are making a purely arbitrary decision on how to
| classify an LLM's mistakes based on how easy it is to
| catch them, regardless of their severity or cause. But
| simply categorizing the mistakes in a different bucket
| doesn't make them any less of a problem.
| layer8 wrote:
| I don't see why an LLM wouldn't hallucinate project
| requirements or semantic interface contracts. The only
| way you could escape that is by full-blown formal
| verification and specification.
| kubb wrote:
| There's also the reverse genre: valid criticism of absolutely
| strawman arguments that nobody makes.
| tptacek wrote:
| Which of the arguments in this post hasn't occurred on HN in
| the past month or so?
| mountainriver wrote:
| I feel the opposite, and pretty much every metric we have shows
| basically linear improvement of these models over time.
|
| The criticisms I hear are almost always gotchas, and when
| confronted with the benchmarks they either don't actually know
| how they are built or don't want to contribute to them. They
| just want to complain or seem like a contrarian from what I can
| tell.
|
| Are LLMs perfect? Absolutely not. Do we have metrics to tell us
| how good they are? Yes
|
| I've found very few critics that actually understand ML on a
| deep level. For instance Gary Marcus didn't know what a test
| train split was. Unfortunately, rage bait like this makes money
| attemptone wrote:
| >I feel the opposite, and pretty much every metric we have
| shows basically linear improvement of these models over time.
|
| Wait, what kind of metric are you talking about? When I did
| my masters in 2023 SOTA models where trying to push the
| boundaries by minuscule amounts. And sometimes blatantly
| changing the way they measure "success" to beat the previous
| SOTA
| Night_Thastus wrote:
| Models are absolutely not improving linearly. They improve
| logarithmically with size, and we've already just about hit
| the limits of compute without becoming totally unreasonable
| from a space/money/power/etc standpoint.
|
| We can use little tricks here and there to try to make them
| better, but fundamentally they're about as good as they're
| ever going to get. And none of their shortcomings are growing
| pains - they're fundamental to the way an LLM operates.
| calf wrote:
| What valid AI criticisms? Most criticisms of AI are not very
| deep nor founded in complexity theoretic arguments, whereas
| Yann LeCun himself gave an excellent 1 slide explanation of the
| limits of LLMs. Most AI criticisms are low quality arguments.
| cobertos wrote:
| It's been so much more rewarding playing with AI coding tools on
| my own than through the subtle and not so subtle nudges at work.
| The work AI tools are a walled garden, have a shitty interface,
| feel made to extract from me than to help me. In my personal
| stuff, downloading models, playing with them, the tooling, the
| interactions, it all been so much more rewarding to give me
| stable comfortable workflows I can rely on and that work with my
| brain.
|
| The dialog around it is so adversarial it's been hard figuring
| out how to proceed until dedicating a lot of effort to diving
| into the field myself, alone, on my personal time and learned
| what's comfortable to use it on.
| j-bos wrote:
| Exactly, seems much skepticism comes from only scratching the
| surface of what's possible.
| FridgeSeal wrote:
| Is there a term for "skeptics just haven't used it enough"
| argument?
|
| Because it frequently got rolled out in crypto-currency
| arguments too.
| oblio wrote:
| Now, now, be nice. There is value to obtain for the user
| from current gen AI tools. From cryptocurrencies... uh...
| nawgz wrote:
| I think if you asked someone who's owned Bitcoin from
| 2015 to now if cryptocurrency has any value they'd
| probably have a good an$wer for you.
|
| Edit: downvoters, are you denying that monetary value is
| the ultimate value?
| steveklabnik wrote:
| I do think that's a poor argument, but there's a better
| version: tools take skills to use properly.
|
| The other day, I needed to hammer two drywall anchors into
| some drywall. I didn't have a hammer handy. I used the back
| of a screwdriver. It sucked. It even technically worked!
| But it wasn't a pleasant experience. I could take away from
| this "screwdrivers are bullshit," but I'd be wrong: I was
| using a tool the wrong way. This doesn't mean that "if you
| just use a screwdriver more as a hammer, you'll like it",
| it means that I should use a screwdriver for screwing in
| screws and a hammer for hammering things.
| 01HNNWZ0MV43FF wrote:
| Damn. Well I'll spend a few bucks trying it out and I'll ask my
| employer if they're okay with me using agents on company time,
| but
|
| But I'm not thrilled about centralized, paid tools. I came into
| software during a huge FOSS boom. Like a huge do it yourself,
| host it yourself, Publish Own Site, Syndicate Elsewhere, all the
| power to all the people, borderline anarchist communist boom.
|
| I don't want it to be like other industries where you have to buy
| a dog shit EMR and buy a dog shit CAD license and buy a dog shit
| tax prep license.
|
| Maybe I lived through the whale fall and Moloch is catching us. I
| just don't like it. I rage against dying lights as a hobby.
| renjimen wrote:
| You can self host an open-weights LLM. Some of the AI-powered
| IDEs are open source. It does take a little more work than just
| using VSCode + Copilot, but that's always been the case for
| FOSS.
| Philpax wrote:
| An important note is that the models you can host at home
| (e.g. without buying ten(s of) thousand dollar rigs) won't be
| as effective as the proprietary models. A realistic size
| limit is around 32 billion parameters with quantisation,
| which will fit on a 24GB GPU or a sufficiently large MBP.
| These models are roughly on par with the original GPT-4 -
| that is, they will generate snippets, but they won't pull off
| the magic that Claude in an agentic IDE can do. (There's the
| recent Devstral model, but that requires a specific harness,
| so I haven't tested it.)
|
| DeepSeek-R1 is on par with frontier proprietary models, but
| requires a 8xH100 node to run efficiently. You can use
| _extreme_ quantisation and CPU offloading to run it on an
| enthusiast build, but it will be closer to seconds-per-token
| territory.
| jay_kyburz wrote:
| Yeah, I'm ready to jump in, but I need an agent running on my
| hardware at home without internet access.
|
| How far away are we from that? How many RYX 50s do I need?
|
| This is a serious question btw.
| pixl97 wrote:
| It's unfortunate that AMD isn't in on the AI stuff, because
| they are releasing a 96GB card ($10k so it's pricey
| currently) which would drop the number you need.
| guywithahat wrote:
| I mean it depends on the model; some people running deepseek
| report they have better performance at home running on a CPU
| with lots of ram (think a few hundred gigabytes). Even when
| running locally vram is more relevant than the performance of
| the GPU. That said I'm really not the person to ask about
| this, as I don't have AI agents running amuck on my machine
| yet
| mouse_ wrote:
| > Extraordinarily talented people are doing work that LLMs
| already do better, out of spite.
|
| So what, people should just stop doing any tasks that LLMs do
| subjectively better?
| tptacek wrote:
| I don't know the full answer to this question, but I have a
| partial answer: they should at least stop doing _tedious_ tasks
| that LLMs do better.
| asadotzler wrote:
| Some of us thrive in tedium, and also do it better than bots.
| pixl97 wrote:
| And then sit in shock when you're replaced with an auto-
| loom.
| p-o wrote:
| Unrelated to your friends, but a big part of learning is to
| do tedious tasks. Maybe once you master a topic LLMs can be
| better, but for many folks out there, using LLMs as a
| shortcut can impede learning.
| tptacek wrote:
| I'm ~8,000 XP into MathAcademy right now, doing the
| calculus stuff I skipped by not going to college. I'm doing
| a lot, lot, lot of tedious practice. But I know why I'm
| doing it, and when I'm doing doing it, I'm going to go back
| to using SageMath to do actual work.
| metalliqaz wrote:
| Can someone explain to me what this means?
|
| > People coding with LLMs today use agents. Agents get to poke
| around your codebase on their own. They author files directly.
| They run tools. They compile code, run tests, and iterate on the
| results. ...
|
| Is this what people are really doing? Who is just turning AI
| loose to modify things as it sees fit? If I'm not directing the
| work, how does it even know what to do?
|
| I've been subjected to forced LLM integration from management,
| and there are no "Agents" anywhere that I've seen.
|
| Is anyone here doing this that can explain it?
| adamgordonbell wrote:
| you are giving it instructions but it's running a while loop
| with a list of tools and it can poke around in your code base
| until it thinks it's done whatever you ask for.
|
| See Claude Code, windsurf, amp, Kilcode, roo, etc.
|
| I might describe a change I need to have made and then it does
| it and then I might say "Now the tests are failing. Can you fix
| them?" and so on.
|
| Sometimes it works very great. sometimes you find yourself
| arguing with the computer.
| williamcotton wrote:
| I run Cursor in a mode that starts up shell processes, runs
| linters, tests etc on its own, updates multiple files, runs the
| linter and tests again, fixes failures, and so on. It auto
| stops at 20 iterations through the feedback loop.
|
| Depending on the task it works really well.
| metalliqaz wrote:
| This example seems to keep coming up. Why do you need an AI
| to run linters? I have found that linters actually add very
| little value to an experience programmer, and actually get in
| the way when I am in the middle of active development. I have
| to say I'm having a hard time visualizing the amazing
| revolution that is alluded to by the author.
| williamcotton wrote:
| Static errors are caught by linters before runtime errors
| are caught by a test suite. When you have an LLM in a
| feedback loop, otherwise known as an _agent_ , then
| iterative calls to the LLM will include requests and
| responses from linters and test suites, which can assure
| the user, who typically follows along with the entire
| process, that the agent is writing better code than it
| would otherwise.
| tsimionescu wrote:
| You're missing the point. The main thing the AI does is to
| generate code based on a natural-language description of a
| problem. The liners and tests and on exist to guide this
| process.
|
| The initial AI-based work flows were "input a prompt into
| ChatGPT's web UI, copy the output into your editor of
| choice, run your normal build processes; if it works,
| great, if not, copy the output back to ChatGPT, get new
| code, rinse and repeat".
|
| The "agent" stuff is trying to automate this loop. So as a
| human, you still write more or less the same prompt, but
| now the agent code automates that loop of generating code
| with an LLM and running regular tools on it and sending
| those tools' output back to the LLM until they succeed for
| you. So, instead of getting code that may not even be in
| the right programming language as you do from an LLM, you
| get code that is 100% guaranteed to run and passes your
| unit tests and any style constraints you may have imposed
| in your code base, all without extra manual interaction (or
| you get some kind of error if the problem is too hard for
| the LLM).
| tptacek wrote:
| I cut several paragraphs from this explaining how agents work,
| which I wrote anticipating this exact comment. I'm very happy
| to have brought you to this moment of understanding --- it's a
| big one. The answer is "yes, that's exactly what people are
| doing": "turning LLMs loose" (really, giving them some fixed
| number of tool calls, some of which might require human
| approval) to do stuff on real systems. This is exactly what
| Cursor is about.
|
| I think it's really hard to undersell how important agents are.
|
| We have an intuition for LLMs as a function blob -> blob
| (really, token -> token, but whatever), and the limitations of
| such a function, ping-ponging around in its own state space,
| like a billion monkeys writing plays.
|
| But you can also get go blob -> json, and json -> tool-call ->
| blob. The json->tool interaction isn't stochastic; it's simple
| systems code (the LLM could indeed screw up the JSON, since
| that process is stochastic --- but it doesn't matter, because
| the agent isn't stochastic and won't accept it, and the LLM
| will just do it over). The json->tool-call->blob process is
| entirely fixed system code --- and simple code, at that.
|
| Doing this _grounds the code generation process_. It has a
| directed stochastic structure, and a closed loop.
| metalliqaz wrote:
| I'm sorry but this doesn't explain anything. Whatever it is
| you have in your mind, I'm afraid it's not coming across on
| the page. There is zero chance that I'm going to let an AI
| start running arbitrary commands on my PC, let alone anything
| that resembles a commit.
|
| What is an actual, real world example?
| IshKebab wrote:
| > There is zero chance that I'm going to let an AI start
| running arbitrary commands on my PC
|
| The interfaces prompt you when it wants to run a command,
| like "The AI wants to run 'cargo add anyhow', is that ok?"
| seabrookmx wrote:
| They're not arbitrary, far from it. You have a very
| constrained set of tools each agent can do. An agent has a
| "job" if you will.
|
| Maybe the agent feeds your PR to the LLM to generate some
| feedback, and posts a the text to the PR as a comment.
| Maybe it can also run the linters, and use that as input to
| the feedback.
|
| But the at the end of the day, all it's really doing is
| posting text to a github comment. At worst it's useless
| feedback. And while I personally don't have much AI in my
| workflow today, when a bunch of smart people are telling me
| the feedback can be useful I can't help but be curious!
| tsimionescu wrote:
| This all works something like this: an "agent" is a small
| program that takes a prompt as input, say "//fix
| ISSUE-0451".
|
| The agent code runs a regex that recognizes this prompt as
| a reference to a JIRA issue, and runs a small curl with
| predefined credentials to download the bug description.
|
| It then assembles a larger text prompt such as "you will
| act as a master coder to understand and fix the following
| issue as faithfully as you can: {JIRA bug description
| inserted here}. You will do so in the context of the
| following code: {contents of 20 files retrieved from Github
| based on Metadata in the JIRA ticket}. Your answer must be
| in the format of a Git patch diff that can be applied to
| one of these files".
|
| This prompt, with the JIRA bug description and code from
| your Github filled in, will get sent to some LLM chosen by
| some heuristic built into the agent - say it sends it to
| ChatGPT.
|
| Then, the agent will parse the response from ChatGPT and
| try to parse it as a Git patch. If it respects git patch
| syntax, it will apply it to the Git repo, and run something
| like `make build test`. If that runs without errors, it
| will generate a PR in your Github and finally output the
| link to that PR for you to review.
|
| If any of the steps fails, the agent will generate a new
| prompt for the LLM and try again, for some fixed number of
| iterations. It may also try a different LLM or try to
| generate various follow-ups to the LLM (say, it will send a
| new prompt in the same "conversation" like "compilation
| failed with the following issue: {output from make build}.
| Please fix this and generate a new patch."). If there is no
| success after some number of tries, it will give up and
| output error information.
|
| You can imagine many complications to this workflow - the
| agent may interrogate the LLM for more intermediate steps,
| it may ask the LLM to generate test code or even to
| generate calls to other services that the agent will then
| execute with whatever credentials it has.
|
| It's a byzantine concept with lots of jerry-rigging that
| apparently actually works for some use cases. To me it has
| always seemed far too much work to get started before
| finding out if there is any actual benefit for the
| codebases I work on, so I can't say I have any experience
| with how well these things work and how much they end up
| costing.
| steveklabnik wrote:
| > Is this what people are really doing?
|
| Some people are, and some people are not. This is where some of
| the disconnect is coming from.
|
| > Who is just turning AI loose to modify things as it sees fit?
|
| In the advent of source control, why not? If it does something
| egregiously wrong, you can throw it away easily and get back to
| a previous state with ease.
|
| > If I'm not directing the work, how does it even know what to
| do?
|
| You're directing the work, but at a higher level of
| abstraction.
| metalliqaz wrote:
| > You're directing the work, but at a higher level of
| abstraction.
|
| The article likens this to a Makefile. I gotta say, why not
| just use a Makefile and save the CO2?
| steveklabnik wrote:
| Being kind of like a Makefile does not mean that they're
| equivalent. They're different tools, good for different
| things. That they happen to both be higher level than
| source code doesn't mean that they're substitutes.
| aykutcan wrote:
| This is how I work:
|
| I use Cursor by asking it exactly what I want and how I want
| it. By default, Cursor has access to the files I open, and it
| can reference other files using grep or by running specific
| commands. It can edit files.
|
| It performs well in a fairly large codebase, mainly because I
| don't let it write everything. I carefully designed the
| architecture and chose the patterns I wanted to follow. I also
| wrote a significant portion of the initial codebase myself and
| created detailed style guides for my teammates.
|
| As a result, Cursor (or you can say models you selecting
| because cursor is just a router for commercial models) handles
| small, focused tasks quite well. I also review every piece of
| code it generates. It's particularly good at writing tests,
| which saves me time.
| jsheard wrote:
| > But all day, every day, a sizable chunk of the front page of HN
| is allocated to LLMs: incremental model updates, startups doing
| things with LLMs, LLM tutorials, screeds against LLMs. It's
| annoying!
|
| You forgot the screeds against the screeds (like this one)
| philosophty wrote:
| "they're smarter than me" feels like false humility and an
| attempt to make the medicine go down better.
|
| 1. Thomas is obviously very smart.
|
| 2. To be what we think of as "smart" is to be in touch with
| reality, which includes testing AI systems for yourself and
| recognizing their incredible power.
| mrkurt wrote:
| It's not false. He's talking about people smarter than him (at
| writing and shipping infrastructure code).
|
| Thomas is the smartest at other things.
| philosophty wrote:
| It is false and you're proving it. Smarter means smarter.
|
| Smarter does not mean "better at writing and shipping
| infrastructure code."
|
| Some of the smartest people I know are also infra engineers
| and none of them are AI skeptics in 2025.
| bitpush wrote:
| A writing for the ages. I've found most of the LLM skeptics are
| either being hypocritical or just being gate-keepy (we dont want
| everyone to write code)
| fellowniusmonk wrote:
| I think the hardest part is not spending the next 3 months of my
| life in a cave finishing all the hobby/side projects I didn't
| quite get across the line.
|
| It really does feel like I've gone from being 1 senior engineer
| to a team that has a 0.8 Sr. Eng, 5 Jrs. and one dude that spends
| all his time on digging through poorly documented open source
| projects and documenting them for the team.
|
| Sure I can't spend quite as much time working on hard problems as
| I used to, but no one knows that I haven't talked to a PM in
| months, no one knows I haven't written a commit summary in
| months, it's just been my AI doppelgangers. Compared to myself a
| year ago I think I now PERSONALLY write 150% more HARD code than
| I did before. So maybe, my first statement about being 0.8 is
| false.
|
| I think of it like electric bikes, there seems to be indication
| that people with electric assist bikes actually burn more
| calories/spend more time/go farther on an electric bike than
| those who have manual bikes
| https://www.sciencedirect.com/science/article/abs/pii/S22141....
| halpow wrote:
| > I haven't written a commit summary in months
|
| I don't know what you're posting, but if it's anything like
| what I see being done by GitHub copilot, your commit messages
| are junk. They're equivalent to this and you're wasting
| everyone's time: // Sets the value
| const value = "red"
| fullstackchris wrote:
| this behaviour is literally removable with proper prompting.
|
| this is a strawmans argument... of whatever your are arguing
| fellowniusmonk wrote:
| One of the most interesting things in all of this is it is
| clear some people are struggling with the feeling of a loss
| in status.
|
| I see it myself, go to a tech/startup meetup as a
| programmer today vs in 2022 before ZIRP ended.
|
| It's like back to my youth where people didn't want to hear
| my opinion and didn't view me as "special" or "in demand"
| because I was "a nerd who talked to computers", that's
| gotta be tough for a lot of people who grew up in the post
| "The Social Network" era.
|
| But anyone paying attention knew where the end of ZIRP was
| going to take us, the fact that it dovetailed with the rise
| of LLMs is a double blow for sure.
| IshKebab wrote:
| Yeah I tried Copilot's automatic commit messages and they're
| trash, but the agent-based ones are much better.
| fellowniusmonk wrote:
| If you've ever run or been part of a team that does thorough,
| multi-party, pull request reviews you know what I am talking
| about.
|
| The only part I don't automate is the pull request review (or
| patch review, pre-commit review, etc. before git.), thats
| always been the line to hold for protecting codebases with
| many contributors of varying capability, this is explicitly
| addressed in the article as well.
|
| You can fight whatever straw man you want. Shadowbox the
| hypotheticals in your head, etc. I don't get all these recent
| and brand new accounts just straight up insulting and
| insinuating all this crap all over HN today, the agro slop is
| out of control today, is it humans even writing all this
| stuff?
|
| Maybe that's the real problem.
| halpow wrote:
| I told you how it is. Copilot writes crap descriptions that
| just distract from the actual code and the _intention_ of
| the change. If your commit messages are in any way better
| than that, then please enlighten us rather than calling me
| a bot.
| gigel82 wrote:
| What this boils down to is an argument for slop. Yeah, who cares
| about the quality, the mediocrity, the craft... get the slop,
| push it in, call it done. It mostly works in the golden path,
| it's about 6 or 7 orders of magnitude slower than hand-written
| software but that's ok, just buy more AWS resources, bill the
| client, whatever.
|
| I can maybe even see that point in some niches, like outsourcing
| or contracting where you really can't be bothered to care about
| what you leave behind after the contract is done but holy shit,
| this is how we end up with slow and buggy crap that no one can
| maintain.
| trinix912 wrote:
| It's not much different without the AI. Managers don't care
| about efficient code, they care about code that meets the
| business goals - whether that's good or bad is debatable.
| Agencies duct-taping together throwaway code isn't new. The
| classic "just buy more AWS resources" & such have been around
| for quite a while.
| pixl97 wrote:
| >Yeah, who cares about the quality, the mediocrity, the craft..
|
| Just about no-one in the F100 unless they are on very special
| teams.
|
| If you care about the craft you're pushed out for some that
| drops out 10x LOC a day because your management has no ability
| to measure what good software is. Extra bonus points for
| including 4GB of node_modules in your application.
| mattwad wrote:
| There's a huge caveat i don't see often, which is that it depends
| on your language for programming. IE. AI is reallllly good at
| writing Next.js/Typescript apps, but not so much Ruby on Rails.
| YMMV
| el_memorioso wrote:
| I agree with this. People who are writing Python, Javascript,
| or Typescript tell me that they get great results. I've had
| good results using LLMs to flesh out complex SQL queries, but
| when I write Elixir code, what I get out of the LLM often
| doesn't even compile even when given function and type specs in
| the prompt. As the writer says, maybe I should be using an
| agent, but I'd rather understand the limits of the lower-level
| tools before adding other layers that I may not have access to.
| dgb23 wrote:
| My hunch is that to exploit LLMs one should lean on data
| driven code more. LLMs seem to have a very easy time to
| generate data literals. Then it's far less of an issue to
| write in a niche language.
|
| Not familiar with Elixir but I assume it's really good at
| expressing data driven code, since it's functional and has
| pattern matching.
| stopachka wrote:
| tptacek, curious question: what agent / stack do you currently
| use?
| bloat wrote:
| Seconded. I am very much still in the mode of copying from the
| chat window and then editing. I would like to have whatever she
| is having.
| tptacek wrote:
| I use Codex CLI for casual stuff, because of the ergonomics of
| just popping open another terminal tab.
|
| I use Zed as my primary interface to "actually doing project
| work" LLM stuff, because it front-ends both OpenAI and
| Google/Gemini models, and because I really like the interface.
| I still write code in Emacs; Zed is kind of like the Github PR
| viewer for me.
|
| I'm just starting to use Codex Web for asynchronous agents
| because I have a friend who swears by queueing up a dozen async
| prompts every morning and sifting through them in the
| afternoon. The idea of just brainstorming a bunch of shit --- I
| can imagine keeping focus and motivation going long enough to
| just rattle ideas off! --- and then making coffee while it all
| gets tried, is super appealing to me.
| stopachka wrote:
| Thank you!
| mrmansano wrote:
| > I'm just starting to use Codex Web for asynchronous agents
| because I have a friend who swears by queueing up a dozen
| async prompts every morning and sifting through them in the
| afternoon
|
| Bunch of async prompts for the same task? Or are you
| parallelizing solving different issues and just reviewing in
| the afternoon?
|
| Sounds intriguing either way.
| andersa wrote:
| I'm curious how much you paid in the past month for API fees
| generated by these tools. Or at least what order of magnitude
| we're talking about.
| guywithahat wrote:
| I develop space-borne systems, so I can't use the best LLM's for
| ITAR/etc reasons, but this article really makes me feel like I'm
| missing out. This line in particular makes me wonder if my skills
| are becoming obsolete for general private industry:
|
| > People coding with LLMs today use agents. Agents get to poke
| around your codebase on their own. They author files directly.
| They run tools. They compile code, run tests, and iterate on the
| results. They also:
|
| Every once in a while I see someone on X posting how they have 10
| agents running at once building their code base, and I wonder if
| in 3 years most private industry coders will just be attending
| meetings to discuss what their agents have been working on, while
| people working on DoD contracts will be typing things into vim
| like a fool
| ghc wrote:
| > while people working on DoD contracts will be typing things
| into vim like a fool
|
| Forget LLMs, try getting Pandas approved. Heck I was told by
| some AF engineers they were banned from opening Chrome Dev
| Tools by their security office.
|
| FWIW I think the LLM situation is changing quite fast and
| they're appearing in some of our contracts. Azure-provided
| ones, of course.
| fellowniusmonk wrote:
| Frankly, as someone who is engaged in fields where LLMs can be
| used heavily.
|
| I would stay in any high danger/high precision/high regulation
| role.
|
| The speed at which LLM stuff is progressing is insane, what is
| cutting edge today wasn't available 6 months ago.
|
| Keep up as a side hobby if you wish, I would definitely
| recommend that, but I just have to imagine that in 2 years a
| turnkey github project will get you pretty much all the way
| there.
|
| Idk, that's my feeling fwiw.
|
| I love LLMs but I'm much less confident that people and
| regulation will keep up with this new world in a way that
| benefits the very people who created the content that LLMs are
| built on.
| threeseed wrote:
| > The speed at which LLM stuff is progressing is insane
|
| You clearly haven't been following the space or maybe
| following too much.
|
| Because the progress has been pretty slow over the last
| years.
|
| Yes modals are cheaper and faster but they aren't
| substantially better.
| fellowniusmonk wrote:
| Over the last years? As in two years or more? Could you
| explain that a bit more?
|
| I consider "LLM stuff" to be all inclusive of the eco-
| system of "coding with LLMs" in the current threads
| context, not specific models.
|
| Would you still say, now that the definition has been
| clarified, that there has been slow progress in the last 2+
| years?
|
| I am also curious if you could clarify where we would need
| to be today for you to consider it "fast progress"? Maybe
| there is a generational gap between us in defining fast vs
| slow progress?
| bloat wrote:
| So we replace the task of writing tedious boilerplate with the
| task of reading the AI's tedious boilerplate. Which takes just as
| long. And leaves you with less understanding. And is more boring.
| Philpax wrote:
| You are either a very fast producer or a very slow reader.
| Claude and Gemini are _much_ faster at producing code than I
| am, and reviewing their code - twice over, even - still takes
| less time than writing it myself.
| ckiely wrote:
| But you definitely don't understand it nearly as well as if
| you wrote it. And you're the one that needs to take
| responsibility for adding it to your codebase.
| oblio wrote:
| Are you, though? Reading code is harder, potentially much
| harder.[1]
|
| And I suspect the act of writing it yourself imparts some
| lower level knowledge you don't get by skimming the output of
| an AI.
|
| [1] https://www.joelonsoftware.com/2000/05/26/reading-code-
| is-li...
| KyleBerezin wrote:
| I think he is specifically referring to boilerplate code.
| It is not hard to understand boilerplate code.
| seadan83 wrote:
| Reviewing code is often slower than writing it. You don't
| have to be an exceptionally fast coder or slow reviewer for
| that to be true.
| thegeomaster wrote:
| If this was the case, regular code review as a practice
| would be entirely unworkable.
| tart-lemonade wrote:
| The amount of time I spend going back and forth between the
| implementation and the test cases to verify that the tests
| actually fully cover the possible failure cases alone can
| easily exceed the time spent writing it, and that's
| assuming I don't pull the branch locally and start stepping
| through it in the debugger.
|
| The idea that AI will make development faster because it
| eliminates the boring stuff seems quite bold because until
| we have AGI, someone still needs to verify the output, and
| code review tends to be even more tedious than writing
| boilerplate unless you're speed-reading through reviews.
| HDThoreaun wrote:
| > Which takes just as long.
|
| This has never once been my experience. Its definitely less fun
| but it takes way less time.
| seadan83 wrote:
| Indeed, instead of writing code to shave a Yak, we're now
| instead reviewing how the Yak was (most-shittily) shaved.
| flufluflufluffy wrote:
| and probably results in a greater net energy consumption/carbon
| output
| mostlysimilar wrote:
| All of these people advocating for AI software dev are
| effectively saying they would prefer to review code instead of
| write it. To each their own I guess but that just sounds like
| torture to me.
| stock_toaster wrote:
| > pull in arbitrary code from the tree, or from other trees
| online, into their context windows,
|
| I guess this presupposes that it is ok for 3rd parties to slurp
| up your codebase? And possibly (I guess it ostensibly depends on
| what plan you are on?) using that source code for further
| training (and generating that same code for others)?
|
| I imagine in some domains this would not be ok, but in others is
| not an issue.
| thousand_nights wrote:
| i feel like surprisingly, front end work which used to be viewed
| by programmers as "easier" is now more difficult of the two,
| because it's where LLMs suck the most
|
| you get a link to a figma design and you have to use your eyes
| and common sense to cobble together tailwind classes, ensure
| responsiveness, accessibility, try out your components to make
| sure they're not janky, test out on a physical mobile device,
| align margins, padding, truncation, wrapping, async loading
| states, blah blah you get it
|
| LLMs still suck at all that stuff that requires a lot of visual
| feedback, after all, you're making an interface for humans to
| use, and you're a human
|
| in contrast, when i'm working on a backend ticket ai feels so
| much more straightforward and useful
| simonw wrote:
| Programmers who think front end is "easier" than backend have
| been wrong for well over a decade.
| https://simonwillison.net/2012/Feb/13/why-are-front-end/
| smy20011 wrote:
| > Often, it will drop you precisely at that golden moment where
| shit almost works, and development means tweaking code and
| immediately seeing things work better. That dopamine hit is why I
| code.
|
| Only if you are familiar with the project/code. If not, you were
| throw into a foreign codebase and have no idea how to tweak it.
| ofjcihen wrote:
| And potentially make incredibly risky mistakes while the AI
| assures you it's fine.
| leoh wrote:
| >but it's bad at rust
|
| I have to say, my ability to learn Rust was massively accelerated
| via LLMs. I highly recommend them for learning a new skill. I
| feel I'm roughly at the point (largely sans LLMs) now where I can
| be nearly as productive in Rust as Python. +1 to RustRover as
| well, which I strongly prefer to any other IDE.
| sleepy_keita wrote:
| Me too -- actually, I'd say that the LLMs I use these days
| (Sonnet 4 and GPT4.1, o4, etc) are pretty good at rust.
| reaperducer wrote:
| _I have to say, my ability to learn Rust was massively
| accelerated via LLMs._
|
| How would you know?
|
| If you didn't know Rust already, how would you know the LLM was
| teaching you the right things and the best way to do things?
|
| Just because it compiles doesn't mean it works. The world is
| full of bad, buggy, insecure, poor code that compiles.
| metaltyphoon wrote:
| No only this, but I would challenge the OP to see if he
| really knows Rust but turning off LLM and see "how much you
| truly know".
|
| This is akin to be on tutorial hell and you "know the
| language "
| leoh wrote:
| Well, I coded at Google (in addition to other places) for
| over 10 years without LLMs in several languages and I feel
| like I'm about at par with Rust as I was with those
| languages. I'm open to being humbled, which I have felt by
| LLMs and ofc other folks -- "good" is subjective.
| empath75 wrote:
| I've been writing Rust code in production for 4+ years, and I
| can write Rust pretty well, and I've learned a lot from using
| chatgpt and co-pilot/cursor.
|
| In particular, it helped me write my first generic functions
| and macros, two things that were pretty intimidating to try
| and get into.
| ezst wrote:
| How much of that proficiency remains once you switch it off?
| leoh wrote:
| Quite a lot, but hey, feel free to put me to the test
| empath75 wrote:
| It is not bad at rust. I don't think I could even function well
| as a Rust programmer without chatgpt and now Cursor. It removes
| a lot of the burden of remembering how to write generic code
| and fixing borrow checking stuff. I can just write a generic
| function with tons of syntax errors and then tell cursor to fix
| it.
| ckiely wrote:
| The argument that programmers are into piracy and therefore
| should shut up about theft is nonsensical. Not defending piracy,
| but at least an artist or creator is still credited and their
| work is unadulterated. Piracy != plagiarism.
| hello_computer wrote:
| amoral & self-serving, yet rational & to-the-point. if an
| actual person wrote that, it's like they hired jabba the hutt
| as a staff writer. fly.io guy can be sanguine about LLM
| collateral damage since he'll fit right in with the cartels
| when our civilization finally burns out.
| grose wrote:
| It's also ignoring the fact that much plagiarized code is
| already under permissive licenses. If Star Wars or Daft Punk
| were CC-BY-SA nobody would need to pirate them, and there may
| even be a vibrant remix culture... which is kind of the whole
| point of open source, is it not?
| iLoveOncall wrote:
| I simply do not get this argument about LLMs writing tedious code
| or scaffolding. You don't need or want LLMs for that, you want
| libraries and frameworks.
|
| I barely write any scaffolding code, because I use tools that
| setup the scaffolding for me.
| halpow wrote:
| If you're lucky to work in such an environment, more power to
| you. A lot of people have to deal with React where you need so
| much glue for basic tasks, and React isn't even the worst
| offender. Some boilerplate you can't wrap.
| iLoveOncall wrote:
| I use React at work, there is barely any boilerplate. I
| actually started a brand new project based on React recently
| and the initial setup before working on actual components was
| minutes.
| TrackerFF wrote:
| My take is: It is OK to don't buy into the hype. There's a lot of
| hype, no denying that.
|
| But if you're actively _avoiding_ everything related to it, you
| might find yourself in a position where you 're suddenly being
| left in the dust. Maybe not now, not next month, not next year,
| but who some time in the future. The models really are improving
| fast!
|
| I've talked with devs that (claim they) haven't touched a model
| since ChatGPT was released - because it didn't live up to their
| expectations, and they just concluded it was a big nothingburger.
|
| Even though I don't follow the development religiously anymore, I
| do try to get acquainted with new releases every 3 months or so.
|
| I hate the term "vibe coding", but I personally know non-tech
| people that have vibe coded products / apps, shipped them, and
| make more money in sales than what most "legit" coders are
| making. These would be the same "idea people" that previously
| were looking for a coder to do all the heavy lifting. Something
| is changing, that's for sure.
|
| So, yeah, don't sleepwalk through it.
| FridgeSeal wrote:
| The counter-argument as I see it is that going from "not using
| LLM tooling" to "just as competent with LLM tooling" is...maybe
| a day? And lessening and the tools evolve.
|
| It's not like "becoming skilled and knowledgeable in a
| language" which took time. Even if you're theoretically being
| left behind, you can be back at the front of the pack again in
| a day or so. So why bother investing more than a little bit
| every few months?
| abdullin wrote:
| It takes deliberate practice to learn how to work with a new
| tool.
|
| I believe that AI+Coding is no different from this
| perspective. It usually takes senior engineers a few weeks
| just to start building an intuition of what is possible and
| what should be avoided. A few weeks more to adjust the
| mindset and properly integrate suitable tools into the
| workflow.
| breuleux wrote:
| In theory, but how long is that intuition going to remain
| valid as new models arrive? What if you develop a solid
| workflow to work around some limitations you've identified,
| only to realize months late that these limitations don't
| exist anymore and your workflow is suboptimal? AI is a new
| tool, but it's a very unstable one at the moment.
| abdullin wrote:
| I'd say that the core principles stayed the same for more
| than a year by now.
|
| What is changing - constraints are relaxing, making
| things easier than they were before. E.g. where you
| needed a complex RAG to accomplish some task, now Gemini
| Pro 2.5 can just swallow 200k-500k of cacheable tokens in
| prompt and get the job done with a similar or better
| accuracy.
| stock_toaster wrote:
| I think the more "general" (and competent) AI gets, the less
| being an early adopter _should_ matter. In fact, early
| adopters would in theory have to suffer through more
| hallucinations and poor output than late adopters.
|
| Here, the early bird gets the worm with 9 fingered hands, the
| late bird just gets the worm.
| simonw wrote:
| > The counter-argument as I see it is that going from "not
| using LLM tooling" to "just as competent with LLM tooling"
| is...maybe a day? And lessening and the tools evolve.
|
| Very much disagree with that. Getting productive and
| competent with LLM tooling takes _months_. I 've been deeply
| invested in this world for a couple of years now and I still
| feel like I'm only scraping the surface of what's possible
| with these tools.
| prisenco wrote:
| How many serious engineers completely avoid AI though? I argue
| against the hype all day but have found decent workflows for it
| yet people read my comments as the "AI skeptic" defined at the
| beginning of this piece. I assume I'd be included in people's
| mental statistic because I'm not a cheerleader even though it's
| not true at all.
|
| The conclusions I've come to from the AI boom is that marketers
| and those who believe them are going to be severely
| disappointed and that the importance of learning your craft at
| a deep level is going to be even more important, not less. We
| don't need to "get good at AI" because in my experience that
| takes less than a week. But we still need to intuitively
| understand how data moves between LLC and main memory or the
| problem the borrow checker solves or how networking latency
| effects a solution, etc. etc.
|
| Also non-technical people really need to step off and stop
| telling people they _need_ to use these tools. Nobody
| appreciates being told by someone who has never done their job
| anywhere near their level how to do their job. And technical
| people need to start publishing in-depth reproducible tutorials
| and stop relying on justtrustmebro-ism.
| pie_flavor wrote:
| I have one very specific retort to the 'you are still
| responsible' point. High school kids write lots of notes. The
| notes frequently never get read, but the performance is worse
| without them: the act of writing them embeds them into your head.
| I allegedly know how to use a debugger, but I haven't in years:
| but for a number I could count on my fingers, nearly every bug
| report I have gotten I know exactly down to the line of code
| where it comes from, because I wrote it or something next to it
| (or can immediately ask someone who probably did). You don't
| _get_ that with AI. The codebase is always new. Everything must
| be investigated carefully. When stuff slips through code review,
| even if it is a mistake you might have made, you would remember
| that you made it. When humans do not do the work, humans do not
| accrue the experience. (This may still be a good tradeoff, I
| haven 't run any numbers. But it's not such an obvious tradeoff
| as TFA implies.)
| sublinear wrote:
| I have to completely agree with this and nobody says this
| enough.
|
| This tradeoff of unfamiliarity with the codebase is a very well
| understood problem for decades. Maintaining a project is 99% of
| the time spent on a successful project.
|
| In my opinion though, having AI write the initial code is just
| putting most people in a worse situation with almost no upside
| long term.
| CurrentB wrote:
| I agree I'm bullish on AI for coding generally, but I am
| curious how they'd get around this problem. Even if they can
| code at super human level, then you just get rarer super
| human bugs. Or is another AI going to debug it? Unless this
| loop is basically fail proof, does the human's job just
| becoming debugging the hardest things to debug (or at least a
| blindspot of the AI)
| derefr wrote:
| So do the thing that a student copying their notes from the
| board does: look at the PR on one monitor, and write your own
| equivalent PR by typing the changes line-for-line into your IDE
| on the other. Pretend copy/paste doesn't exist. Pretend it's
| code you saw in a YouTube video of a PowerPoint presentation,
| or a BASIC listing from one of those 1980s computing magazines.
|
| (And, if you like, do as TFA says and rephrase the code into
| your own house style as you're transcribing it. It'll be better
| for it, _and_ you'll be mentally parsing the code you're
| copying at a deeper level.)
| galleywest200 wrote:
| Is this just repeating labor? Why not just write it all
| yourself in the first place if you are just going to need to
| copy it over later?
| chaps wrote:
| This is how a (video game) programming class in my high
| school was taught. You had to transcribe the code from a
| Digipen book.... then fix any broken code. Not entirely sure
| if _their many typos_ were intentional, but they very much
| helped learn because we had no choice but to correct their
| logic failures and taypos to move onto the next section. I 'm
| still surprised 20 years later how well that system worked to
| teach and push us to branch our understandings.
| SirHumphrey wrote:
| Yes, I was just about to say. Typing out code is a way to
| lear syntax of a new language and it's often recommended to
| not copy paste while you start learning.
| roarcher wrote:
| You still didn't have to build the mental model, understand
| the subtle tradeoffs and make the decisions that arrived at
| that design.
|
| I'm amazed that people don't see this. Absolutely nobody
| would claim that copying a novel is the same thing as writing
| a novel.
| the_snooze wrote:
| I feel like the dismissal of mental models is a direct
| consequence on the tech industry's manaical focus on scale
| and efficiency as the end-all be-all values to optimize.
|
| Nevermind other important values like resilience,
| adaptability, reliability, and scrutability. An AI writes a
| function foo() that does a thing correctly; who has the
| know-how that can figure out if foo() kills batteries, or
| under what conditions it could contribute to an ARP storm
| or disk thrashing, or what implicit hardware requirements
| it has?
| derefr wrote:
| I am suspicious of this argument, because it would imply
| that you can't understand the design intent / tradeoffs /
| etc of code written by your own coworkers.
| tabletcorry wrote:
| This level of knowledge is nearly impossible to maintain as the
| codebase grows though, beyond one or two people at a typical
| company. And tools need to exist for the new hire as well as
| the long-standing employee.
| ezst wrote:
| Welcome to project architecting, where the job isn't about
| putting more lines of code into this world, but more systems
| in place to track them. A well layered and structured
| codebase can grow for a very long time before it becomes too
| hard to maintain. And generally, the business complexity
| bites before the algorithmic one, and there's no quick fix
| for that.
| throw_nbvc1234 wrote:
| It's cultural too. I've heard people say along the lines
| "we don't ship the org chart here" in a positive light,
| then in a later meeting complain that nobody understands
| what's going on in their owner-less monorepo.
|
| Shipping the org chart isn't the only way to solve this
| problem but it is one that can work. But if you don't
| acknowledge the relationship between those problems, AGI
| itself probably isn't going to help (partially sarcastic).
| skissane wrote:
| > When stuff slips through code review, even if it is a mistake
| you might have made, you would remember that you made it.
|
| I don't know. Ever had the experience of looking at 5+ year old
| code and thinking "what idiot wrote this crap" and then
| checking "git blame" and realising "oh, I'm the idiot... why
| the hell did I do this? struggling to remember" - given enough
| time, humans start to forget why they did things a certain
| way... and sometimes the answer is simply "I didn't know any
| better at the time, I do now"
|
| > You don't get that with AI. The codebase is always new.
|
| It depends on how you use AI... e.g. I will often ask an AI to
| write me code to do X because it gets me over the "hump" of
| getting started... but now this code is in front of me on the
| screen, I think "I don't like how this code is written, I'm
| going to refactor it..." and by the time I'm done it is more my
| code than the AI's
| LegionMammal978 wrote:
| Oddly, I don't tend to get that experience very much. More
| often, it's "That's not how I'd naively write that code,
| there must be some catch to it. If only I had the foresight
| to write a comment about it..." Alas, I'm still not very good
| at writing enough comments.
| mrguyorama wrote:
| Understanding code takes more effort than writing it,
| somehow. That's always been a huge problem in the industry,
| because code you wrote five years ago was written by someone
| else, but AI coding takes that from "all code in your org
| except the code you wrote in the past couple years" to "all
| code was written by someone else".
|
| How well does your team work when you can't even answer a
| simple question about your system because _nobody wrote,
| tested, played with the code in question_?
|
| How do you answer "Is it possible for our system to support
| split payments?" when not a single member of your team has
| even worked on the billing code?
|
| No, code reviews do not familiarize an average dev to the
| level of understanding the code in question.
| stock_toaster wrote:
| I read a study[1] (caveat, not peer reviewed yet I don't
| think?) that seems to imply that you are correct.
| < When using GenAI tools, the effort invested in critical
| thinking < shifts from information gathering to
| information verification; < from problem-solving to AI
| response integration; and from task < execution to task
| stewardship.
|
| [1]: https://www.microsoft.com/en-us/research/wp-
| content/uploads/...
| wayvey wrote:
| This is a good point I think, and these steps take time and
| should definitely be done. I'm not sure people take this into
| account when talking about having AI code for them.
| ezst wrote:
| > The codebase is always new. Everything must be investigated
| carefully.
|
| That's dreadful. Not only is familiarity with the code not
| valued, it is impossible to build for your own sake/sanity.
| kubav027 wrote:
| +1
|
| Writing code is easier than long term maintenance. Any
| programmer is able to write so much code that he will not be
| able to maintain it. Unless there are good AI tools helping
| with maintenance there is no point to use generative tools for
| production code. From my experience AI tools are great for
| prototyping or optimizing procrastination.
| ddddang wrote:
| 100%, i had gemini write code for a blog in golang - it has
| some bugs and it took me a some time to find them.
|
| To me the sweet spot is, i write the code with the "Help" of an
| LLM. It means i double check everything it generates and prompt
| it to write code block by block - frequently acting as an
| editor.
|
| Either you want human intervention for correctness and
| extension or you don't. Having LLM's write large swaths of code
| is like completely relying on tesla's autopilot - you are
| probably more stressed than if you just drove yourself.
| wayvey wrote:
| The careful vetting of code and thoroughly testing it is
| super important, I would never even think of putting any
| generated code into any use without doing that.
|
| Also your last comparison made me chuckle, good one :)
| JoshTriplett wrote:
| Exactly. See also https://hazelweakly.me/blog/stop-building-ai-
| tools-backwards... for a detailed look at this aspect of AI
| coding.
| mgraczyk wrote:
| The important thing you are missing is that the learning
| landscape has now changed.
|
| You are now responsible for learning how to use LLMs well. If
| an untrained vibe coder is more productive for me, while
| knowing nothing about how the code actually works, I will hire
| the vibe coder instead of you.
|
| Learning is important, but it's most important that you learn
| how to use the best tools available so you can be productive.
| LLMs are not going away and they will only get better, so today
| that means you are responsible for learning how to use them,
| and that is already more important for most many roles than
| learning how to code yourself.
| Arainach wrote:
| >pull in arbitrary code from the tree, or from other trees
| online, into their context windows, run standard Unix tools to
| navigate the tree and extract information, interact with Git, run
| existing tooling, like linters, formatters, and model checkers,
| and make essentially arbitrary tool calls (that you set up)
| through MCP.
|
| ....for the vast majority of my career, anyone who suggested
| doing this - much less letting code that no one in the world
| (much less the company) truly understands the logic flow of do
| this - would be fired.
| jsnell wrote:
| > Almost nothing it spits out for me merges without edits. I'm
| sure there's a skill to getting a SOTA model to one-shot a
| feature-plus-merge!
|
| How does this section fit in with the agent section just after?
| In an agentic model, isn't the merge getting done by either the
| model or a tool, and the retry-loops on failures would be mostly
| invisible?
|
| E.g. when using Aider + Gemini Flash 2.5, probably 90% of the
| changes apply cleanly from my perspective (maybe half actually
| apply cleanly, the other half after a couple of roundtrips of
| Aider telling the model that the patch didn't apply). The 10%
| that only apply partially I usually throw away and redo the
| prompt, it's really rare that I start merging the code manually.
| computerfan494 wrote:
| Maybe it's only me, but I just don't write that much code. I try
| to change less than 100ish lines per day. I try to keep codebases
| small. I don't want to run a codebase with hundreds of thousands
| of lines of code in a production environment.
| kakadu wrote:
| A hammer hammers.
|
| It hammers 100% of the time, with no failure.
|
| It requires the same amount of labour from my part but it
| delivers the same outcome every time.
|
| That is what tools do, they act as an extension and allow you to
| do things not easily done otherwise.
|
| If the hammer sometimes hammers, sometimes squeaks and sometimes
| screws then it requires extra labour from my part just to make it
| do what purpose specific tools do, and that is where frustrations
| arise.
|
| Make it do one thing excellent and we talk then.
| IshKebab wrote:
| This is the kind of non-serious argument he's talking about.
| There are plenty of tools that require supervision to get good
| results. That doesn't make them useless.
|
| My 3D printer sometimes prints and sometimes makes spaghetti.
| Still useful.
| bawolff wrote:
| There is a big difference between "not entirely useless" and
| best tool for the job.
| threeseed wrote:
| They never said it was useless. You just invented that straw
| man in your head.
|
| 3D printing is largely used for prototyping where its lossy
| output is fine. But using it for production use cases
| requires fine tuning it can be 99.9% reliable. Unfortunately
| we can't do that for LLMs hence why it's still only suitable
| for prototyping.
| Philpax wrote:
| But you can adjust the output of a LLM and still come out
| ahead in both time and mental effort than writing it by
| hand. Unlike a 3D printer, it doesn't have to be right the
| first time around to still be useful.
| bigstrat2003 wrote:
| > But you can adjust the output of a LLM and still come
| out ahead in both time and mental effort than writing it
| by hand.
|
| No you can't, or at least I can't. LLMs are _more work_
| than just doing it by hand.
| okanat wrote:
| You don't use 3D printing to do large-scale production. If
| you agree that AI should only be used in prototype code and
| nothing else, then your argument makes sense.
| voxl wrote:
| I have a very simple counter argument: I've tried it and it's not
| useful. Maybe it is useful for you. Maybe even the things you're
| using it for are not trivial or better served by a different
| tool. That's fine, I don't mind you using a tool far away from my
| codebase and dependency tree. It has not been useful for me, and
| it's very unlikely it's ever going to be.
|
| Except that's not the argument people are making. They are
| arguing it will replace humans. They are arguing it will do
| research level mathematics. They are arguing this is the start of
| AGI. So if you want to put your head in the sand and ignore the
| greater message that is plastered everywhere then perhaps some
| self reflection is warranted.
| IshKebab wrote:
| > I have a very simple counter argument: I've tried it and it's
| not useful. Maybe it is useful for you.
|
| Indeed but the tedious naysaying that this is arguing against
| is that AI isn't good _full stop_. They aren 't saying "I tried
| it and it's not for me but I can see why other people would
| like it".
| simonw wrote:
| You have to learn to filter out the people who say "it's going
| to replace human experts" and listen to the people who say "I'm
| a human expert and this stuff is useful to me in these ways".
| meroes wrote:
| > LLMs can write a large fraction of all the tedious code you'll
| ever need to write.
|
| But, you still have to read it:
|
| > Reading other people's code is part of the job...I have to read
| the code line-by-line anyways.
|
| So instead of writing the tedious code, I only have to _read it_.
| Oh but don 't worry, I don't have to read it too carefully
| because:
|
| > Agents lint. They compile and run tests. If their LLM invents a
| new function signature, the agent sees the error
|
| But remember...
|
| > You've always been responsible for what you merge to main.
|
| So now I have to oversee this web of agents and AI ontop of
| coding? Am I doing more now for the same pay? Am I just
| speedrunning myself toward lower pay? Is AI adoption a prisoner's
| dilemma toward lowing my wages hardest?
|
| Because is good at coding compared to many other disciplines
| (e.g. math), it makes the internal AI politics among programmers
| more of an issue. Add fuel to that fire baby!
| TheCraiggers wrote:
| > "For art, music, and writing? I got nothing. I'm inclined to
| believe the skeptics in those fields."
|
| You've already lost me, because I view programming as an art
| form. I would no more use AI to generate code than I would use it
| to paint my canvas.
|
| I think the rest of the article is informative. It made me want
| to try some things. But it's written from the perspective of a
| CEO thinking all his developers are just salt miners; miners go
| into the cave and code comes out.
|
| I think that's actually what my hangup is. It's the old adage of
| programmers simply "copying and pasting from stack overflow" but
| taken to the extreme. It's the reduction of my art into mindless
| labor.
| tptacek wrote:
| Woodworking is also an art form. But most people just need
| furniture, fixtures, and structures. Nobody would take
| seriously the idea that new construction all be done with
| sashimono joinery in order to preserve the art form, but
| somehow we're meant to take seriously the idea of hand-
| dovetailed CRUD apps.
| layer8 wrote:
| I don't think that analogy matches very well. Most software
| is bespoke, the domain requirements, usage aspects, and
| architectural trade-offs are subtly, or often non-subtly,
| different each time, and take different trajectories over
| time. It's not like you're producing the same software 10,000
| times, like a piece of furniture. And AI isn't able to
| produce the exact same thing reproducibly anyway. A better
| argument would be that AI is actually approaching the
| craftsmanship/artisanal capabilities.
| sneak wrote:
| Most line of business apps and business logic are only
| bespoke and custom insofar as field names and relations and
| what APIs they trigger on which events.
|
| Just because software is "bespoke" doesn't mean it's
| complicated or special.
| layer8 wrote:
| > Most line of business apps and business logic are only
| bespoke and custom insofar as field names and relations
| and what APIs they trigger on which events.
|
| That's not my experience. Of course, everything is just a
| finite state machine operating on a memory tape.
| drbojingle wrote:
| That's cause there's an element of mindless labour to it. It's
| easier to spot that so it gets more focus.
| pydry wrote:
| If you find that theres an element of mindless labor to
| coding then you're probably doing it wrong.
| ACCount36 wrote:
| People don't pay programmers to produce great art. No one sees
| that "art" and no one cares. They pay programmers to get shit
| done.
| skydhash wrote:
| A functional code that is easy to maintain is art (but you
| have to be an experienced programmer to see it). A shoddy
| project isn't, but the whole company feels the pain.
| deanCommie wrote:
| I'm sure salt miners needed to make peace with their toil and
| also focused on tools and techniques to be more productive; how
| to remove the salt most elegantly in nice clean blocks,
| minimize waste, reduce burden on their physical bodies.
|
| But to their bosses their output was salt.
|
| I'm sorry but unless you're working in open source for the pure
| love of the tech/craft, the output of software engineering is
| PROBLEM SOLVING.
|
| That's why "build vs. buy" exists - sometimes it's better to
| buy a solution than buy one. That's why a valid solution to a
| problem sometimes is to convince a customer that their ask is
| wrong or unreasonable, and something simpler or easier would
| get them 99% of what they need with 1% of the effort.
|
| That's our job.
| dannyobrien wrote:
| This is also where I am, and I guess it has been a source of mild
| and growing consternation since I first blagged an OpenAI GPT
| account when they were private, in an attempt to get ahead of
| what was coming -- both the positive and negative sides of the
| advances. Most people either ignored the advances, or quickly
| identified and connected to the negative side, and effectively
| filtered out the rest.
|
| As somebody who comes from a politically left family, and was
| also around in the early days of the Web, let me tentatively note
| that this issue has a particular political slant, too. The left
| has strong roots in being able to effectively critique new
| developments, economic and social, that don't come from its own
| engines of innovation. The movement's theoriest work far more
| slowly on how to integrate the effect of those changes into its
| vision. That means when something like this comes along, the
| left's cultural norms err on the side of critique. Which is fine,
| but it makes any other expression both hard to convey, and
| instantly suspect in those communities. I saw this in the early
| Web, where from a small group of early adopters of all political
| slants, it was the independents, heterodox leftists, and the
| right, -- and most vocally, the libertarians -- who were able to
| most quickly adapt to and adopt the new technology. Academic
| leftists, and those who were inspired by them took a lot longer
| to accomodate the Net into their theses (beyond disregarding or
| rejecting it) and even longer to devise practical uses for it.
|
| It wasn't _that_ long, I should say -- a matter of months or
| years, and any latent objections were quickly swamped by younger
| voices who were familiar with the power of the Net; but from my
| point of view it seriously set back that movement in practicality
| and popularity during the 80s and 90s.
|
| I see the same with AI: the left has attracted a large
| generational of support across the world from providing an
| emotionally resonant and practical alternative to the status quo
| many people face. But you quickly lose the mandate of heaven if
| you fail to do more than just simplistically critique or reject a
| thing that the average person in the world feels they know
| better, or feels differently toward, than you do. This is
| something to consider, even if you still strongly believe
| yourselves to be correct in the critiques.
| baobun wrote:
| The privacy aspect and other security risks tho? So far all the
| praise I hear on productivity are from people using cloud-hosted
| models.
|
| Claude, Gemini, Copilot and and ChatGPT are non-starters for
| privacy-minded folks.
|
| So far, local experiements with agents have left me underwhelmed.
| Tried everything on ollama that can run on my dedicated Ryzen
| 8700G with 96GB DDR5. I'm ready to blow ~10-15k USD on a better
| rig if I see value in it but if I extrapolate current results I
| believe it'll be another CPU generation before I can expect
| positive productivity output from properly securely running local
| models when factoring in the setup and meta.
| oblio wrote:
| This is probably the biggest danger. Everyone is assuming
| optimization work reduces cost faster than these companies burn
| through capital. I'm half inclined to assume optimization work
| will do it, but it's far from as obvious as they want to
| portray it.
| BeetleB wrote:
| Privacy is not binary, and it would make it easier if you
| outlined specific scenarios.
|
| Most providers promise not to train on inputs if used via an
| API (and otherwise have a retention timeline for other
| reasons).
|
| I'm not sure the privacy concern is greater than using pretty
| much _any_ cloud provider for _anything_. Storing your data on
| AWS: Privacy concern?
| simonw wrote:
| Almost all of the cloud vendors have policies saying that they
| will not train on your input if you are a paying customer.
|
| The single biggest productivity boost you can get in LLM world
| is believing them when they make those promises to you!
| fossuser wrote:
| Yes! It's amazing how even in a field that tends to lean more
| early adopter than average you still get a lot of the default
| knee-jerk dismissal and cynicism - even when it's something
| _clearly_ amazing and useful as thinking machines.
|
| We're in the middle of a major shift - there will benefits to
| those that adapt first. People outside the field have no idea
| what's coming, even those of us in the field are underestimating
| the shift.
|
| There were a few outliers in the 60s who understood what the
| computing revolution meant and would mean, but most did not. This
| is likely an even bigger change than that.
| AnotherGoodName wrote:
| This happened with the introduction of smartphones too. Every
| slashdot post had a haughty and upvoted 'why would i want such a
| thing!!!'.
|
| It was obviously huge. You could see it taking off. Yet a lot of
| people proudly displayed ignorance and backed each other up on it
| to the point that discussion around the topic was often drowned
| out by the opposition to change. Now today it takes minutes of
| playing with ai coding agents to realise that it's extremely
| useful and going to be similarly huge.
|
| Resistance to change is not a virtue!
| ceejayoz wrote:
| Sometimes it works that way.
|
| Sometimes it's more like NFTs.
| postalrat wrote:
| Sometimes. Most people with an opinion of NFTs thought they
| were a joke. Hardly anyone thinks LLMs are a joke.
| jsheard wrote:
| I have bad news about our illustrious hosts:
| https://www.ycombinator.com/companies?query=web3
|
| They're not alone either, a bunch of the AI bankroll is
| coming from people who were also sold on crypto taking over
| the world.
| Karrot_Kream wrote:
| Slashdot then and HN now, predicted 100 out of the last 10
| recessions.
| oblio wrote:
| Smartphones were financially viable from day 1, though. I think
| LLMs will be used a lot and in a lot of places but the current
| level of investment they're getting right now feels out of line
| to me. Kind of like what I expect them to get in 10 years from
| now, when they're mature.
| BeetleB wrote:
| To be frank, I do think smartphones have made my life worse.
| I'd happily forego them if it were not for 2FA and how too many
| businesses expect I can receive texts.
| bawolff wrote:
| Some days i still don't understand why anyone would want a
| smart phone. I think being connected all the time has a
| significant negative impact on mental health (I say, as i type
| this from a smartphone)
| homebrewer wrote:
| > today it takes minutes of playing with ai coding agents to
| realise that it's extremely useful
|
| Yet some of us spent hours over the past three years playing
| with LLMs, and remain completely unimpressed by what we see.
| suddenlybananas wrote:
| Well don't you realise you need to try Jean 4.31 or Cocaptain
| 8.3E and then you'll see what the models are capable of!
| okanat wrote:
| I still think smartphones are a huge negative to humanity. They
| improve a narrow case: having access to ephemeral knowledge.
| Nobody writes articles or does deep knowledge work with
| smartphones.
|
| My position with the AI is almost the same. It is overall a net
| negative for cognitive abilities of people. Moreover I do think
| all AI companies need to pay fair licensing cost to all authors
| and train their models to accurately cite the sources. If they
| want more data for free, they need to propose copyright changes
| retroactively invalidating everything older than 50 years and
| also do the legwork for limiting software IP to 5 to 10 years.
| Karrot_Kream wrote:
| Just smartphones? I'd start at agriculture. Before
| agriculture, human society had little hierarchy. We were free
| the way we were meant to be.
| simonw wrote:
| "Nobody writes articles or does deep knowledge work with
| smartphones."
|
| I don't think that's true.
|
| I do most of my reading on a smart phone - including wading
| through academic papers, or reading full books in the kindle
| app and jotting down notes in the digital margins.
|
| A sizable number of my short form blog entries are written on
| my phone, and my long form writing almost always starts out
| in Apple Notes on my phone before transferring to a laptop.
|
| Predictive text and voice dictation has got good enough now
| that I suspect there have been entire books written on mobile
| devices.
|
| Whether you want to consider it "deep knowledge work" or not
| is up to you, but apparently a lot of Fifty Shades of Grey
| was written on a BlackBerry!
| https://www.huffpost.com/archive/ca/entry/fifty-shades-of-
| gr...
| steve_adams_86 wrote:
| > It is overall a net negative for cognitive abilities of
| people.
|
| I agree. A bunch of us here might use it to scaffold
| applications we already understand, use it as a rubber duck
| to help understand and solve new problems, research more
| effectively, or otherwise magnify skills and knowledge we
| already have in a manner that's directed towards improving
| and growing.
|
| That's cool. That's also not what most people will do with
| it. A bunch of us are total nerds, but most of the world
| really isn't like that. They want more entertainment, they
| want problems solved for them, they want ease. AI could allow
| a lot of people to use their brains less and lose function
| far more. For the minority among us who use it to do more and
| learn more, great. That group is a tiny minority from what I
| can tell.
|
| Take for example that a huge use case for generative AI is
| just... More sophisticated meme images. I see so much of
| that, and I'm really not looking for it. It's such an insane
| waste of cycles. But it's what the average person wants.
| tom_ wrote:
| And today everybody has a smartphone, pretty much. So what
| difference did it make, the opinion you had, whatever it was?
| In the end, none at all.
| storus wrote:
| Soon all coding will look like L3 support - debugging something
| you've never seen before, and under pressure. AI is really taking
| away the fun parts from everything and leaving just the drudgery
| in place.
| FridgeSeal wrote:
| "What do you mean you want to think about our architecture?
| Just get the LLM to do it, and we'll get it to fix it if
| anything goes wrong"
|
| "No we're not allocating any time to thinking about the design,
| just get the LLM to do it"
|
| I'm so excited for the bleak future.
| threeseed wrote:
| People said the same about VB style coding then low-code and
| now AI.
|
| They have been wrong every time and will continue to be wrong.
| storus wrote:
| This feels different; I asked DeepSeek R1 to give me an
| autoregressive image generation code in pytorch and it did a
| marvelous job. Similar for making a pytorch model for a
| talking lip-synced face; those two would take me weeks to do,
| AI did it in a few minutes.
|
| Autoregressive LLMs still have some major issues like over-
| dependency on the first few generated tokens and the problems
| with commutative reasoning due to one-sided masked attention
| but those issues are slowly getting fixed.
| threeseed wrote:
| People used to tell me all the amazing things no-code and
| low-code was able to do as well.
|
| And at the end of the day they went nowhere. Because (a)
| they will never be perfect for every use and (b) they
| abstract you from understanding the problem and solution.
| So often it will be easier to just write the code from
| scratch.
| sanderjd wrote:
| No-code and low-code tools have been very successful...
| sanderjd wrote:
| The key is to figure out how to move up the ladder of
| abstraction. You don't want to be a "coder" in a world where AI
| can code, but you do want to be a "person who makes software"
| in a world where making software just got easier.
| mjburgess wrote:
| Can we get a video of a workday conducted by these people?
|
| Unless there's a significant sense of what people are working on,
| and how LLMs are helping -- there's no point engaging -- there's
| no detail here.
|
| Sure, if your job is to turn out tweaks to a wordpress theme,
| presumably that's now 10x faster. If its to work on a new in-
| house electric motor in C for some machine, presumably that's
| almost entirely unaffected.
|
| No doubt junior web programmers working on a task backlog,
| specifically designed for being easy for juniors, are loving
| LLMs.
|
| I use LLMs all the time, but each non-trivial programming project
| that has to move out of draft-stage needs rewriting. In several
| cases, to such a degree that the LLM was a net impediment.
| jsnell wrote:
| Not exactly what you're asking for, but
| https://news.ycombinator.com/item?id=44159166 from today is not
| a junior web programming working through the backlog, and the
| commit history contains all the prompts.
| mjburgess wrote:
| Sure, thanks. I mean it's a typescript OAuth library, so
| perhaps we might say mid-level web programmer developing a
| library from scratch with excellent pre-existing references,
| and with a known good reference API to hand. I'd also count
| that as a good use case for an LLM.
| ramraj07 wrote:
| There's never winning with someone who is not arguing in
| bad faith. What do you do for your day job, write the next
| stuxnet?
| mjburgess wrote:
| I think you mistook my comment. Insofar as its anything,
| it was a concession to that use case.
|
| I gave an example below: debugging a microservice-based
| request flow from a front-end, thru various middle layers
| and services, to a back-end, perhaps triggering other
| systems along the way. Something similar to what I worked
| on in 2012 for the UK olympics.
|
| Unless I'm mistaken, and happy to be, I'm not sure where
| the LLM is supposed to offer a significant productivity
| factor here.
|
| Overall, my point is -- indeed -- that we cannot really
| have good faith conversations in blog posts and comment
| sections. These are empirical questions which need
| substantial evidence from both sides -- ideally, videos
| of a workday.
|
| Its very hard to guess what anyone is really talking
| about at the level of abstraction that all this hot air
| is conducted at.
|
| As far as i can tell the people hyping LLMs the most are
| juniors, data scientists who do not do software
| engineering, and people working on greenfield/blank-page
| apps.
|
| These groups never address the demand from these
| sceptical senior software engineers -- for obvious
| reasons.
| jcranmer wrote:
| Most of my day job is worrying about the correctness of
| compiler optimizations. LLMs frequently can't even
| accurately summarize the language manual (especially on
| the level of detail I need).
| mgraczyk wrote:
| I have done everything from architecture design for a DSP
| (Qualcomm), to training models that render photos on Pixel
| phones, to redoing Instagrams comments ranking system. I can't
| imaging doing anything without LLMs today, they would have made
| me much more productive at all of those things, whether it be
| Verilog, C++, python, ML, etc. I use them constantly now.
| mjburgess wrote:
| I use LLMs frequently also. But my point is, with respect to
| the scepticism from some engineers -- that we need to know
| what people are working on.
|
| You list what look like quite greenfield projects, very self-
| contained, and very data science oriented. These are quite
| significantly uncharacteristic of software engineering in the
| large. They have nothing to do with interacting systems each
| with 100,000s lines of code.
|
| Software engineers working on large systems (eg., many micro-
| services, data integration layers, etc.) are working on very
| different problems. Debugging a microservice system isn't
| something an LLM can do -- it has no ability, e.g., to trace
| a request through various apis from, eg., a front-end into a
| backend layer, into some db, to be transfered to some other
| db etc.
|
| This was all common enough stuff for software engineers 20
| years ago, and was part of some of my first jobs.
|
| A very large amount of this pollyanna-LLM view, which isnt by
| jnr software engineers, is by data scientists who are
| extremely unfamiliar with software engineering.
| mgraczyk wrote:
| Hmm how did you get that from what I listed?
|
| Every codebase I listed was over 10 years old and had
| millions of lines of code. Instagram is probably the
| world's largest and most used python codebase, and the
| camera software I worked on was 13 years old and had
| millions of lines of c++ and Java. I haven't worked on many
| self contained things in my career.
|
| LLMs can help with these things if you know how to use
| them.
| mjburgess wrote:
| OK, great. All I'm saying is until we really have videos
| (or equivalent empirical analysis) of these use cases,
| it's hard to assess these claims.
|
| Jobs comprise different tasks, some more amenable to LLMs
| than others. My view is that where scepticism exists
| amongst professional senior engineers, its probably well-
| founded and grounded in the kinds of tasks that they are
| engaged with.
|
| I'd imagine everyone in the debate is using LLMs to some
| degree; and that it's mostly about what productivity
| factor we imagine exists.
| CraigJPerry wrote:
| > it has no ability, e.g., to trace a request through
| various apis
|
| That's more a function of your tooling more than of your
| LLM. If you provide your LLM with tool use facilities to do
| that querying, i don't see the reason why it can't go off
| and perform that investigation - but i haven't tried it
| yet, off the back of this comment though, it's now high on
| my todo list. I'm curious.
|
| TFA covers a similar case:
|
| >> But I've been first responder on an incident and fed 4o
| -- not o4-mini, 4o -- log transcripts, and watched it in
| seconds spot LVM metadata corruption issues on a host we've
| been complaining about for months. Am I better than an LLM
| agent at interrogating OpenSearch logs and Honeycomb
| traces? No. No, I am not.
| mjburgess wrote:
| Great, let's see it. If it works, it works.
|
| For the first 10 years of my career I was a contractor
| walking into national and multinational orgs with large
| existing codebases, working within pre-existing _systems_
| not merely "codebases". Both hardware systems (e.g., new
| 4g networking devices just as they were released) and
| distributed software systems.
|
| I can think of many daily tasks I had across these roles
| that would not be very significantly speed-up by an LLM.
| I can also see that there's a few that would be. I also
| shudder to think what time would be wasted by me trying
| to learn 4g networking from LLM summarisation of new
| docs; and spending as much time working from improperly
| summarised code (etc.).
|
| I don't think snr software engineers are so scepticial
| here that they're saying LLMs are not, locally, helpful
| to their jobs. The issue is how local this help seems to
| be.
| ryukoposting wrote:
| I write embedded firmware for wireless mesh networks and
| satcom. Blend of Rust and C.
|
| I spent ~4 months using Copilot last year for hobby projects,
| and it was a pretty disappointing experience. At its best, it
| was IntelliSense but slower. At its worst, it was trying to
| inject 30 lines of useless BS.
|
| I only realized there was an "agent" in VS Code because they
| hijacked my ctrl+i shortcut in a recent update. You can't point
| it at a private API without doing some GitHub org-level
| nonsense. As far as my job is concerned, it's a non-feature
| until you can point it your own API without jumping through
| hoops.
| ramraj07 wrote:
| You used one AI tool that was never more than autocomplete a
| year ago and you think you have a full hold of all that AI
| offers today? That's like reviewing thai food when you've
| only had Chinese food.
| manmal wrote:
| Here's a 3+h video of the PSPDFKit (Nutrient) founder vibe-
| coding a Mac app. Can be watched at 2x:
| https://steipete.me/posts/2025/the-future-of-vibe-coding?utm...
| bsder wrote:
| Here's what to do: Show me a video of LLM fixing four filed
| issues in the KiCad codebase.
|
| If you do that, I'll swallow my AI skepticism.
|
| I would love to have an LLM that I can turn loose on an
| unfamiliar codebase that I can ask questions of. I would love to
| have an LLM that will fill in my Vulkan boilerplate. etc.
|
| I use emacs and Mercurial. You can demonstrate magic to me and I
| can be convinced even if it's not mainstream.
|
| Rewriting Javascript slop to StackOverflow standards is not
| convincing me.
|
| Get to it.
|
| (The OAuth stuff posted earlier certainly moved my needle, but
| the fact that they needed a gaggle of reviewers as well as hand
| holding when the LLM got stuck mutes the impact significantly.)
| sarchertech wrote:
| This article feels incredibly defensive. If you have really have
| a technique that makes you 100x, 50x, or even just 2x more
| productive, you don't need to write an article calling people who
| don't agree with you nuts.
|
| You keep using that tool, to your advantage. I'd you're really
| altruistic you post some videos of how productive you can be like
| DHH did with his blog in 15 minute videos.
|
| If you're really that much more productive, the skeptics won't be
| able to keep up and it should only take 6 months or some for that
| to become self evident.
| capnrefsmmat wrote:
| The argument seems to be that for an expert programmer, who is
| capable of reading and understanding AI agent code output and
| merging it into a codebase, AI agents are great.
|
| Question: If everyone uses AI to code, how does someone _become_
| an expert capable of carefully reading and understanding code and
| acting as an editor to an AI?
|
| The expert skills needed to be an editor -- reading code,
| understanding its implications, knowing what approaches are
| likely to cause problems, recognizing patterns that can be
| refactored, knowing where likely problems lie and how to test
| them, holding a complex codebase in memory and knowing where to
| find things -- currently come from long experience _writing_
| code.
|
| But a novice who outsources their thinking to an LLM or an agent
| (or both) will never develop those skills on their own. So where
| will the experts come from?
|
| I think of this because of my job as a professor; many of the
| homework assignments we use to develop thinking skills are now
| obsolete because LLMs can do them, permitting the students to
| pass without thinking. Perhaps there is another way to develop
| the skills, but I don't know what it is, and in the mean time I'm
| not sure how novices will learn to become experts.
| kulahan wrote:
| > If everyone uses AI to code, how does someone become an
| expert
|
| The same way they do now that most code is being
| copied/integrated from StackOverflow.
| groos wrote:
| I had this conversation with a friend:
|
| HIM: AI is going to take all entry level jobs soon. ME: So
| the next level one up will become entry level? HIM: Yes. ME:
| Inductively, this can continue up to the CEO. What about the
| CEO? HIM. Wait...
| bawolff wrote:
| I dont know if im convinced by this. Like if we were talking
| about novels, you don't have to be a writer to check grammar
| and analyze plot structure in a passable way. It is possible to
| learn by reading instead of doing.
| capnrefsmmat wrote:
| Sure, you could learn about grammar, plot structure,
| narrative style, etc. and become a reasonable novel critic.
| But imagine a novice who wants to learn to do this and has
| access to LLMs to answer any question about plots and style
| that they want. What should they do to become a good LLM-
| assisted author?
|
| The answer to that question is very different from how to
| become an author before LLMs, and I'm not actually sure what
| the answer _is_. It 's not "write lots of stories and get
| feedback", the conventional approach, but something new. And
| I doubt it's "have an LLM generate lots of stories for you",
| since you need more than that to develop the skill of
| understanding plot structures and making improvements.
|
| So the point remains that there is a step of learning that we
| no longer know how to do.
| andersa wrote:
| If no one really becomes an expert anymore, that seems like
| great news for the people who are already experts. Perhaps
| people actively desire this.
| stackskipton wrote:
| Problem is, at some point those experts retire or change
| their focus and you end up with COBOL problem.
|
| Except instead of just one language on enterprise systems no
| one wants to learn because there is no money in them, it's
| everything.
| andersa wrote:
| That seems like even better news for the people about to be
| paid large sums to fix all that stuff because no one else
| knows how any of it works.
| ofjcihen wrote:
| It's a great point and one I've wondered myself.
|
| Arguments are made consistently about how this can replace
| interns or juniors directly. Others say LLMs can help them
| learn to code.
|
| Maybe, but not on your codebase or product and not with a
| seniors knowledge of pitfalls.
|
| I wonder if this will be programmings iPhone moment where we
| start seeing a lack of deep knowledge needed to troubleshoot. I
| can tell you that we're already seeing a glut of security
| issues being explained by devs as "I asked copilot if it was
| secure and it said it was fine so I committed it".
| gwbas1c wrote:
| > Question: If everyone uses AI to code, how does someone
| become an expert capable of carefully reading and understanding
| code and acting as an editor to an AI?
|
| Well, if everyone uses a calculator, how do we learn math?
|
| Basically, force students to do it by hand long enough that
| they understand the essentials. Introduce LLMs at a point
| similar to when you allow students to use a calculator.
| mmasu wrote:
| While I agree with your suggestion, the comparison does not
| hold: calculators do not tell you which numbers to input and
| compute. With an LLM you can just ask vaguely, and get an
| often passable result
| rglover wrote:
| > So where will the experts come from?
|
| They won't, save for a relative minority of those who enjoy
| doing things the hard way or those who see an emerging market
| they can capitalize on (slop scrubbers).
|
| I wrote this post [1] last month to share my concerns about
| this exact problem. It's not that using AI is bad necessarily
| (I do every day), but it disincentivizes real learning and
| competency. And once using AI is normalized to the point where
| true learning (not just outcome seeking) becomes optional, all
| hell will break loose.
|
| > Perhaps there is another way to develop the skills
|
| Like sticking a fork in a light socket, the only way to truly
| learn is to try it and see what happens.
|
| [1] https://ryanglover.net/blog/chauffeur-knowledge-and-the-
| impe...
| ldjkfkdsjnv wrote:
| This is such a non issue and so far down the list of questions.
| Weve invented AI that can code, and you're asking about career
| progression? Thats the the top thing to talk about? Weve given
| life to essentially an alien life form
| hiAndrewQuinn wrote:
| I'll take the opposite view of most people. Expertise is a bad
| thing. We should embrace technological changes that render
| expertise economically irrelevant with open arms.
|
| Take a domain like US taxation. You can certainly become an
| expert in that, and many people do. Is it a _good_ thing that
| US taxes are so complicated that we have a market demand for
| thousands of such experts? Most people would say no.
|
| Don't get my wronf, I've been coding for more years of being
| alive than I haven't by this point, I love the craft. I still
| think younger me would have far preferred a world where he
| could have just had GPT do it all for him so he didn't need to
| spend his lunch hours poring over the finer points of e.g.
| Python iterators.
| open592 wrote:
| The question then becomes whether or not it's possible (or
| will be possible) to effectively use these LLMs for coding
| without already being an expert. Right now, building anything
| remotely complicated with an LLM, without scouring over every
| line of code generated, is not possible.
| jacobgkau wrote:
| > We should embrace technological changes that render
| expertise economically irrelevant with open arms.
|
| To use your example, is using AI to file your taxes
| _actually_ "rendering [tax] expertise economically
| irrelevant?" Or is it just papering over the over-complicated
| tax system?
|
| From the perspective of someone with access to the AI tool,
| you've somewhat eased the burden. But you haven't actually
| solved the underlying problem (with the actual solution
| obviously being a simpler tax code). You have, on the other
| hand, added an extra dependency on top of an already over-
| complicated system.
| layer8 wrote:
| In addition, a substantial portion of the complexity in
| software is essential complexity, not just accidental
| complexity that could be done away with.
| superb_dev wrote:
| But that is incompatible with the fact that you need be an
| expert to wield this tool effectively.
| hooverd wrote:
| Don't think of it from someone who had to learn. Think of it
| from someone who has never had the experience the friction of
| learning at all.
| TheOtherHobbes wrote:
| By the same logic we should allow anyone with an LLM to
| design ships, bridges, and airliners.
|
| Clearly, it would be very unwise to buy a bridge designed by
| an LLM.
|
| It's part of a more general problem - the engineering
| expectations for software development are much lower than for
| other professions. If your AAA game crashes, people get
| annoyed but no one dies. If your air traffic control system
| fails, you - and a large number of other poeple - are going
| to have a bad day.
|
| The industry that has a kind of glib unseriousness about
| engineering quality - not _theoretical_ quality, based on
| rules of thumb like DRY or faddy practices, but measurable
| reliability metrics.
|
| The concept of reliability metrics doesn't even figure in the
| LLM conversation.
|
| That's a _very_ bizarre place to be.
| triceratops wrote:
| Counter-counter point. The existence of tools like this can
| allow the tax code to become _even more complex_.
| layer8 wrote:
| I mean, we already have vibe tariffs, so vibe taxation
| isn't far off. ;)
| mgraczyk wrote:
| Deliberate practice, which may take a form different from
| productive work.
|
| I believe it's important for students to learn how to write
| data structures at some point. Red black trees, various heaps,
| etc. Students should write and understand these, even though
| almost nobody will ever implement one on the job.
|
| Analogously electrical engineers learn how to use conservation
| laws and Ohm's law to compute various circuit properties.
| Professionals use simulation software for this most of the
| time, but learning the inner workings is important for
| students.
|
| The same pattern is true of LLMs. Students should learn how to
| write code, but soon the code will write itself and
| professionals will be prompting models instead. In 5-10 years
| none of this will matter though because the models will do
| nearly everything.
| capnrefsmmat wrote:
| I agree with all of this. But it's already very difficult to
| do even in a college setting -- to force students to get
| deliberate practice, without outsourcing their thinking to an
| LLM, you need various draconian measures.
|
| And for many professions, true expertise only comes after
| years on the job, building on the foundation created by the
| college degree. If students graduate and immediately start
| using LLMs for everything, I don't know how they will
| progress from novice graduate to expert, unless they have the
| self-discipline to keep getting deliberate practice. (And
| that will be hard when everyone's telling them they're an
| idiot for not just using the LLM for everything)
| r3trohack3r wrote:
| I don't know about you, but I use LLMs as gateways to
| knowledge. I can set a deep research agent free on the internet
| with context about my current experience level, preferred
| learning format (books), what I'm trying to ramp up on, etc. A
| little while later, I have a collection of the definitive books
| for ramping up in a space. I then sit down and work through the
| book doing active recall and practice as I go. And I have the
| LLM there for Q&A while I work through concepts and "test the
| boundaries" of my mental models.
|
| I've become faster at the novice -> experienced arc with LLMs,
| even in domains that I have absolutely no prior experience
| with.
|
| But yeah, the people who just use LLMs for "magic oracle please
| tell me what do" are absolutely cooked. You can lead a horse to
| water, but you can't make it drink.
| dimal wrote:
| > a novice who outsources their thinking to an LLM or an agent
| (or both) will never develop those skills on their own. So
| where will the experts come from?
|
| Well, if you're a novice, don't do that. I _learn_ things from
| LLMs all the time. I get them to solve a problem that I'm
| pretty sure can be solved using some API that I'm only vaguely
| aware of, and when they solve it, I read the code so I can
| understand it. Then, almost always, I pick it apart and
| refactor it.
|
| Hell, just yesterday I was curious about how signals work under
| the hood, so I had an LLM give me a simple example, then we
| picked it apart. These things can be amazing tutors if you're
| curious. I'm insatiably curious, so I'm learning a lot.
|
| Junior engineers should not vibe code. They should use LLMs as
| pair programmers to learn. If they don't, that's on them. Is it
| a dicey situation? Yeah. But there's no turning back the clock.
| This is the world we have. They still have a path if they want
| it and have curiosity.
| capnrefsmmat wrote:
| > Well, if you're a novice, don't do that.
|
| I agree, and it sounds like you're getting great results, but
| they're all going to do it. Ask anyone who grades their
| homework.
|
| Heck, it's even common among expert users. Here's a study
| that interviewed scientists who use LLMs to assist with tasks
| in their research: https://doi.org/10.1145/3706598.3713668
|
| Only a few interviewees said they read the code through to
| verify it does what they intend. The most common strategy was
| to just run the code and see if it appears to do the right
| thing, then declare victory. Scientific codebases rarely have
| unit tests, so this was purely a visual inspection of output,
| not any kind of verification.
| killerstorm wrote:
| I think a large fraction of my programming skills come from
| looking through open source code bases. E.g. I'd download some
| code and spend some time navigating through files looking for
| something specific, e.g. "how is X implemented?", "what do I
| need to change to add Y?".
|
| I think it works a bit like pre-training: to find what you want
| quickly you need to have a model of coding process, i.e. why
| certain files were put into certain directories, etc.
|
| I don't think this process is incompatible with LLM use...
| sanderjd wrote:
| Yep, this is the thing I worry about as well.
|
| I find these tools incredibly useful. But I constantly edit
| their output and frequently ask for changes to other peoples'
| code during review, some of which is AI generated.
|
| But all of that editing and reviewing is informed by decades of
| writing code without these tools, and I don't know how I would
| have gotten the reps in without all that experience.
|
| So I find myself bullish on this for myself and the experienced
| people I work with, but worried about training the next
| generation.
| timewizard wrote:
| > You'll only notice this happening if you watch the chain of
| thought log your agent generates. Don't.
|
| "You're nuts!" says the guy with his head intentionally buried in
| the sand. Also way to tell me your business model is a joke
| without telling me your business model is a joke. Enjoy it while
| it lasts.
| d--b wrote:
| Yeah, and it's progressing so fast. Singularity is definitely on
| the table.
|
| Whoever says otherwise should read their own comments from 2
| years ago and see how wrong they were about where AI is today.
|
| Not saying singularity will happen for sure, but is it a
| possibility? Hell yeah.
| suddenlybananas wrote:
| It's not really that different than 2 years ago. Better but not
| qualitatively so.
| bawolff wrote:
| Sometimes i feel like people who really like AI have a very
| different experience programming then i do.
|
| They are constantly talking about AI doing all the tedious
| boilerplate bullshit. Don't get me wrong, some of my code is that
| too and its not fun. However the pro-AI people talk as if 80% of
| your day is dealing with that. For me its simply a rare enough
| occurence that the value proposition isn't that big. If that is
| the killer app of AI, it just doesn't sound that exciting to me.
| FridgeSeal wrote:
| When I see someone talk about the reams of boilerplate they're
| getting the LLM to write for them, I really do wonder what
| godawful sounding tools and tech-stack they're being subjected
| to.
| JoshTriplett wrote:
| Exactly. Back in the day, people talked about "design
| patterns". It took a while for (some of) the industry to
| recognize that "design patterns" are a sign that your
| libraries and tools aren't good enough, because you're having
| to write the same patterns repeatedly.
| rudedogg wrote:
| Anything where you're doing basic CRUD apps. Yes there are
| generators, but not for everything. For me that's where LLMs
| have been the most useful.
| prisenco wrote:
| Unpopular opinion, boilerplate is good for you. It's a warmup
| before a marathon. Writing it can be contemplative and zen-like
| and allows you to consider the shape of the future.
| alkonaut wrote:
| I can't even get copilot to autocomplete 5 working lines
| consistently. I spend hours every day arguing with ChatGPT about
| things it's hallucinating. And Agents? It took me a year to
| convince anyone to buy me a copilot subscription. It's not good
| enough now? But it was the bees knees just a year or two ago? See
| I _hate_ the thing where the JS-framework tempo thing happens to
| the part of the software world I'm in.
| darepublic wrote:
| I do use LLMs for coding and the newer models have definitely
| been a blessing. I don't know about using coding agents (or
| agentic coding) though. I personally do not find this better than
| chatting with the llm, getting the code back and then copy /
| pasting it and grokking / editing it. The author of this seems to
| suggest that.. there is one correct flow, his flow (which he
| doesn't entirely detail) and everything else is not appropriate.
| He doesn't go into what his process is when the LLM hallucinates
| either. Not all hallucinations show up in static analysis.
| 1970-01-01 wrote:
| Everything works right until it doesn't. LLMs are trained on
| things that have worked. Let's revisit in 2027 when things are
| insanely faster, but not much better.
| timr wrote:
| As someone who has spent the better part of today fixing the
| utter garbage produced by repeated iteration with these
| supposedly magical coding agents, I'm neither in the camp of the
| "AI skeptic" (at least as defined by the author), nor am I in the
| camp of people who thinks these things can "write a large
| fraction of all the tedious code you'll ever need to write."
|
| Maybe I'm doing it wrong, but I seem to have settled on the
| following general algorithm:
|
| * ask the agent to green-field a new major feature.
|
| * watch the agent spin until it is satisfied with its work.
|
| * run the feature. Find that it does not work, or at least has
| major deficiencies [1]
|
| * cycle through multiple independent iterations with the agent,
| doing something resembling "code review", fixing deficiencies one
| at a time [2]
|
| * eventually get to a point where I have to re-write major pieces
| of the code to extract the agent from some major ditch it has
| driven into, leading to a failure to make forward progress.
|
| Repeat.
|
| It's not that the things are useless or "a fad" -- they're
| clearly very useful. But the people who are claiming that
| programmers are going to be put out of business by bots are
| either a) talking their book, or b) extrapolating wildly into the
| unknown future. And while I am open to the argument that (b)
| _might_ be true, what I am observing in practice is that the rate
| of improvement is slowing rapidly, and /or the remaining problems
| are getting much harder to solve.
|
| [1] I will freely grant that at least some of these major
| deficiencies typically result from my inability / unwillingness
| to write a detailed enough spec for the robot to follow, or
| anticipate every possible problem with the spec I did bother to
| write. T'was ever thus...
|
| [2] This problem is fractal. However, it's at least fun, in that
| I get to yell at the robot in a way that I never could with a
| real junior engineer. One Weird Fact about working with today's
| agents is that if you threaten them, they seem to do better work.
| therealmarv wrote:
| Results can vary significantly, and in my experience, both the
| choice of tools and models makes a big difference.
|
| It's a good idea to periodically revisit and re-evaluate AI and
| tooling. I've noticed that many programmers tried AI when, for
| example, GPT-3.5 was first released, became frustrated, and
| never gave it another chance--even though newer models like
| o4-mini are now capable of much more, especially in programming
| tasks.
|
| AI is advancing rapidly. With the latest models and the right
| tools, what's possible today far exceeds what was possible even
| just a short time ago (3-12 months ago even).
|
| Take a look at Cursor or Windsurf or Roo code or aider to
| "feed" AI with code and take a look at models like Google
| Gemini 2.5 Pro, Claude Sonnet 4, OpenAI o4mini. Also educate
| yourself about agents and MCP. Soon that will be standard for
| many/every programmer.
| timr wrote:
| I am using all of the models you're talking about, and I'm
| using agents, as I mentioned.
|
| There is no magic bullet.
| artursapek wrote:
| Which model? Are you having it write unit tests first? How
| large of a change at a time are you asking for? How specific
| are your prompts?
| wordofx wrote:
| Programmers who don't use ai will absolutely be put out of
| business.
|
| Ai is a tool that makes us go faster. Even if there is
| iteration and tidy up. You can still smash out feature in a
| fraction of the time it takes to manually roll it.
|
| Anyone who disagrees with this or thinks ai is not useful is
| simply not good at what they do to begin with and feel
| threatened. They will be replaced.
| anonymousab wrote:
| Or they work with languages, libraries, systems or problem
| areas where the LLMs fail to perform anywhere near as well as
| they do for you and me.
| wordofx wrote:
| Still haven't seen an example. It's always the same. People
| don't want to give hints or context. The moment you start
| doing things properly it's "oh no this is just a bad
| example. It still can't do what u do"
| SirHumphrey wrote:
| Nixos module for the latest version of Cura(3d slicer).
| layer8 wrote:
| To be fair, there is a substantial overlap with NDAs.
| therealmarv wrote:
| About libraries or systems unknown to AI: you can fine tune
| or RAG LLMs e.g. with a MCP server like Context7 about
| special knowledge/libraries to make it a more knowledgeable
| companion when it was not trained so well (or at all) about
| the topic you need for your work. Also own defined specs
| etc. help.
| foldr wrote:
| You need a good amount of example code to train it on. I
| find LLMs moderately useful for web dev, but fairly
| useless for embedded development. They'll pick up some
| project-specific code patterns, but they clearly have no
| concept of what it means to enable a pull-up on a GPIO
| pin.
| jakebasile wrote:
| Why is this line of thinking so common with AI folk? Is it
| just inconceivable to you that other people have different
| experiences with a technology that has only become widespread
| in the past couple years and that by its very nature is non
| deterministic?
| timr wrote:
| For what it's worth, I basically accept the premise of the
| GP comment, in the same way that I would accept a statement
| that "loggers who don't use a chainsaw will be put out of
| business". Sure, fine, whatever.
|
| I still think the tone is silly and polarizing,
| particularly when it's replying to a comment where I am
| very clearly _not_ arguing against use of the tools.
| jakebasile wrote:
| It assumes the result though. These comments presuppose
| that LLMs are universally good and useful and positive
| when that is the very argument that is being debated, and
| then uses the presupposition to belittle the other side
| of the debate.
| wordofx wrote:
| > These comments presuppose that LLMs are universally
| good and useful and positive when that is the very
| argument that is being debated
|
| No
|
| They can be good but people spend more time fighting them
| and throwing up imaginary walls and defending their
| skillset rather than actually learning how to use these
| tools to be successful.
| andrepd wrote:
| I've not yet been in a position where reading + cleaning up
| the LLMs bad code was faster and/or produced better code than
| if I wrote it by hand. I've tried. Every time someone comes
| up and says "yeah of course you're not using GPT4.7-turbo-
| plus-pro" I go and give a spin on the newfangled thing. Nope,
| hasn't happened yet.
|
| I admit my line of work may not be exactly generic crud work,
| but then again if it's not useful for anything just one step
| above implementing a user login for a website or something,
| then is it really gonna take over the world and put me out of
| a job in 6 months?
| TheRoque wrote:
| Same for me. My last try was with claude code on a fairly
| new and simple Angular 19 side project. Spew garbage code
| using the old angular stuff (without signals). Failed to
| reuse the code that was already here so needed refactor.
| The features I asked for were simple, so I clearly lost my
| time prompting + reading + refactoring the result. So I
| spent the credits and never used it again.
| TheRoque wrote:
| The thing is, the AI tools are so easy to use and can be
| picked up in a day or too by an experienced programmer
| without any productivity loss
|
| I don't get why people push this LLM fomo. The tools are
| evolving so fast anyways
| SirHumphrey wrote:
| As if going faster is the only goal of a programmer.
|
| Some simulation I worked on for 2 months were in total 400
| lines of code. Typing it out was never the bottleneck. I need
| to understand the code so that when I am studying the code
| for the next 1 1/2 months I can figure out if the problem is
| a bug in my code, or the underlying model is wrong.
| deergomoo wrote:
| Absurd take. Speed is not the issue! Optimising for speed of
| production is what got us into the utter quagmire that is
| modern software.
|
| Lack of correctness, lack of understanding and ability to
| reason about behaviour, and poor design that builds up from
| commercial pressure to move quickly are the problems we need
| to be solving. We're accelerating the rate at which we add
| levels to a building with utterly rotten foundations.
|
| God damn it, I'm growing to loathe this industry.
| wordofx wrote:
| It is absolutely hilarious to read the responses from people
| who can't use ai make attempts to justify their ability to
| code better than ai. These are the people who will be
| replaced. They are fighting so hard against it instead of
| learning how to use it.
|
| "I wrote 400 lines of code I don't understand and need months
| to understand it because ai obviously cant understand it or
| break it down and help me document it"
|
| "Speed is what caused problems! Because I don't know how to
| structure code and get ai to structure it the same it's
| obviously going rogue and doing random things I cannot
| control so it's wrong and causing a mess!!!"
|
| "I haven't been able to use it properly so don't know how to
| rein it in to do specific tasks so it produces alot of stuff
| that takes me ages to read! I could have written it
| faster!!!"
|
| I would love to see what these people are doing 1-2 years
| from now. If they eventually click or if they are unemployed
| complaining ai took their jobs.
| zaptrem wrote:
| Even on stuff it has no chance of doing on its own, I find it
| useful to basically git reset repeatedly and start with more
| and more specific instructions. At the very least it helps me
| think through my plan better.
| timr wrote:
| Yeah...I've toyed with that, but there's still a productivity
| maximum where throwing it all away and starting from scratch
| is a worse idea, probabilistically, than just fixing whatever
| thing is clearly wrong.
|
| Just to make it concrete, today I spent a few hours going
| through a bunch of HTML + embedded styles and removing gobs
| and gobs of random styles the LLMs glommed on that "worked",
| but was brittle and failed completely as soon as I wanted to
| do something slightly different than the original spec. The
| cycle I described above led to a lot of completely
| unnecessary markup, paired with unnecessary styles to
| compensate for the crappiness of the original DOM. I was able
| to refactor to a much saner overall structure, but it took
| some time and thinking. Was I net ahead? I don't really know.
|
| Given that LLMs almost always write this kind of "assembled
| from StackOverflow" code, I have precisely 0% confidence that
| I'd end up in a better place if I just reset the working
| branch and started from scratch.
| yodsanklai wrote:
| My workflow is similar. While the agent is running, I browse
| the web or day dream. If I'm lucky, the agent produced correct
| code (after possibly several cycles). If I'm not, I need to
| rewrite everything myself. I'm also not in any camp and I
| genuinely don't know if I'm more or less productive overall.
| But I think that a disciplined use of a well-integrated agent
| will make people more productive.
| andrepd wrote:
| > eventually get to a point where I have to re-write major
| pieces of the code to extract the agent from some major ditch
| it has driven into, leading to a failure to make forward
| progress.
|
| As it stands AI can't even get out of Lt Surge's gym in Pokemon
| Red. When an AI manages to beat Lance I'll start to think about
| using it for writing my code :-)
| ddoolin wrote:
| I have been using agentic AI to help me get started writing an
| OpenGL-targeted game from scratch (no engine). I have almost no
| background experience with computer graphics code, but I
| understand most of the fundamentals pretty well and I have almost
| 13 years of software experience. It's just that the exact syntax
| as well as the various techniques used to address common problems
| are not in my arsenal yet.
|
| My experience has been decent. I don't know that it has truly
| saved me much time but I can understand how it FEELS like it has.
| Because it's writing so much code (sometimes), it's hard to vet
| all of it and it can introduce subtle bugs based on faulty
| assumptions it made about different things. So, it will dump a
| lot of code at once, which will get me 90% of the way there, but
| I could spend an hour or two trying to nudge it to fix it to get
| it to 100%. And then I will probably still need to go back and
| reorganize it, or have it go back and reorganize it. And then
| sometimes it will make small adjustments to existing, committed
| code that will subtly break other things.
|
| Something that has surprised me (in hindsight, it isn't
| surprising) is that sometimes when I feel like it misunderstood
| something or made a faulty assumption, it was actually me that
| had the misunderstanding or ignorance which is humbling at times
| and a good learning experience. It is also pretty good at bug
| hunting and DEFINITELY very good at writing unit tests.
|
| I count myself as pretty lucky that this domain seems to be very
| well covered in training. Given the law of averages, most
| people's domains will probably be covered. I'm not sure how it
| would fare with a niche domain.
| jandrese wrote:
| > which will get me 90% of the way there
|
| This is roughly my experience as well. The AI is great at the
| first 90% of the work and actively counterproductive for the
| remaining 90%
| shinycode wrote:
| And wait until there is 500 million of generated loc no one
| read and the product needs to evolve every week.
| root_axis wrote:
| The problem with LLMs for code is that they are still way too
| slow and expensive to be _generally_ practical for non-trivial
| software projects. I 'm not saying that they aren't useful, they
| are excellent at filling out narrow code units that don't require
| a lot of context and can be quickly or automatically verified to
| be correct. You will save a lot of time using them this way.
|
| On the other hand, if you slip up and give it too much to chew on
| or just roll bad RNG, it will spin itself into a loop attempting
| many variations of crap, erasing and trying over, but never
| actually coming closer to a correct solution, eventually
| repeating obviously incorrect solutions over and over again that
| should have been precluded based on feedback from the previous
| failed solutions. If you're using a SOTA model, you can easily
| rack up $5 or more on a single task if you give it more than 30
| minutes of leeway to work it out. Sure, you could use a cheaper
| model, but all that does is make the fundamental problem worse -
| i.e. you're spending money but not actually getting any closer to
| completed work.
|
| Yes, the models are getting smarter and more efficient, but we're
| still at least a decade away from being able to run useful models
| at practical speeds locally. Aggressively quantized 70b models
| simply can't cut it, and even then, you need something like 10k
| tps to start building LLM tools that can overcome the LLM's lack
| of reasoning skills through brute force guess and check
| techniques.
|
| Perhaps some of the AI skeptics are a bit too harsh, but they're
| certainly not crazy in the context of breathless hype.
| kopecs wrote:
| > Meanwhile, software developers spot code fragments seemingly
| lifted from public repositories on Github and lose their shit.
| What about the licensing? If you're a lawyer, I defer. But if
| you're a software developer playing this card? Cut me a little
| slack as I ask you to shove this concern up your ass. No
| profession has demonstrated more contempt for intellectual
| property.
|
| Seriously? Is this argument in all earnestly "No profession has
| been more contemptuous therefore we should keep on keeping on"?
| Should we as an industry not bother to try and improve our
| ethics? Why don't we all just make munitions for a living and
| wash our hands of guilt because "the industry was always like
| this".
|
| Seems a bit ironic against the backdrop of
| <https://news.ycombinator.com/user?id=tptacek>:
|
| > All comments Copyright (c) 2010, 2011, 2012, 2013, 2015, 2018,
| 2023, 2031 Thomas H. Ptacek, All Rights Reserved.
|
| (although perhaps this is tongue-in-cheek given the last year)
| panorama wrote:
| It's a fib sequence
| holoduke wrote:
| "If you're making requests on a ChatGPT page and then pasting the
| resulting (broken) code into your editor, you're not doing what
| the AI boosters are doing"
|
| I am actually doing this the whole day long. For example i have
| setup today a fresh new debian vps for some interns. U had to
| provide them with a docker system, support for go, nginx stuff
| and i made a quick hello world app in angular with a go backend.
| I could have done it myself. But i asked chatgpt to provide me
| with all the commands and code. No idea how an agent could do
| this for me. I got everything running in like 30 minutes.
| panny wrote:
| Every time I read one of these it feels like I'm reading an AI
| generated sales pitch for AI.
| popalchemist wrote:
| This guy may be right about a lot of things said here but he's
| smug and it's off-putting. preaching to the choir.
| JoshTriplett wrote:
| > but the plagiarism
|
| This entire section reads like, oddly, the _reverse_ of the
| "special pleading" argument that I usually see from artists.
| Instead of "Oh, it's fine for other fields, but for _my_ field it
| 's a horrible plagiarism machine", it's the reverse: "Oh, it's a
| problem for those other fields, but for _my_ field get over it,
| you shouldn 't care about copyright anyway".
|
| I'm all for eliminating copyright. The day I can ignore the
| license on every single piece of proprietary software as I see
| fit, I'll be all for saying that AIs should be able to do the
| same. What I will continue to complain about is the _asymmetry_ :
| individual developers don't get to violate individual licenses,
| but oh, if we have an AI slurp up millions of codebases and
| ignore their licenses, _that 's_ fine.
|
| No. No, it isn't. If you want to ignore copyright, abolish it for
| everyone. If it still applies to everyone else, it should still
| apply to AIs. No special exceptions for mass-scale Open Source
| license violations.
| mwcampbell wrote:
| I think where tptacek is right, though, is that if we're going
| to hold this position without hypocrisy, then we need to
| respect copyright as long as it exists. He's right that many of
| us have not done that; it's been very common to violate
| copyright for mere entertainment. If we want the licenses of
| our own work to be respected, then we need to extend that
| respect to others as well, regardless of the size of the
| copyright holder.
| marginalia_nu wrote:
| It's so all-or-nothing this debate. If you're drawing a benefit
| from using AI tools, great. If you aren't, then maybe don't use
| them, or try some other approach to using them.
|
| Personally I find AI coding tools situationally useful. I
| certainly wouldn't use them to write all my code, but I also
| think I'd be a fool not to leverage them at all.
| rorylaitila wrote:
| I can sum it up like this: if I could know in advance the exact
| right thing to build, producing the physical code, has not for a
| long time, been the bottleneck. I've been vibe coding long before
| it was cool. It's sometimes called model driven development.
|
| For those that think only procedurally, I can see how it helps
| them. Because procedural first development has a lot of
| boilerplate logic.
|
| For those who think model first, the AI may help them rubber
| duck, but ultimately the physical writing of the characters is
| minimal.
|
| Most of my time is thinking about the data model. The AI writes
| almost all of my procedures against said data model. But that is
| about 20% speedup.
| rafavento wrote:
| There's something that seems to be missing in all these posts and
| that aligns with my personal experience trying to use AI coding
| assistants.
|
| I think in code.
|
| To me, having to translate the into natural language for the LLM
| to translate it back into code makes very little sense.
|
| Am I alone in this camp? What am I missing?
| ACCount36 wrote:
| I know what you mean. The thing is: if you already have the
| solution put together in your mind, it might be faster to just
| implement it by hand.
|
| But if you don't have the shape of a solution? Might be faster
| to have an AI find it. And then either accept AI's solution as
| is, or work off it.
| simonw wrote:
| If you think in code, try prompting them in code.
|
| I quite often prompt with code in a different language, or
| pseudo-code describing roughly what I am trying to achieve, or
| a Python function signature without the function body.
|
| Or I will paste in a bunch of code I have already written with
| a comment somewhere that says "TODO: retrieve the information
| from the GitHub API" and have the model finish it for me.
| davidclark wrote:
| >If you were trying and failing to use an LLM for code 6 months
| ago +, you're not doing what most serious LLM-assisted coders are
| doing.
|
| Here's the thing from the skeptic perspective: This statement
| keeps getting made on a rolling basis. 6 months ago if I wasn't
| using the life-changing, newest LLM at the time, I was also doing
| it wrong and being a luddite.
|
| It creates a never ending treadmill of boy-who-cried-LLM. Why
| should I believe anything outlined in the article is
| transformative _now_ when all the same vague claims about
| productivity increases were being made about the LLMs from 6
| months ago which we now all agree are bad?
|
| I don't really know what would actually unseat this epistemic
| prior at this point for me.
|
| In six months, I predict the author will again think the LLM
| products of 6 month ago (now) were actually not very useful and
| didn't live up to the hype.
| simonw wrote:
| tptacek wasn't making this argument six months ago.
|
| LLMs get better over time. In doing so they occasionally hit
| points where things that didn't work start working. "Agentic"
| coding tools that run commands in a loop hit that point within
| the past six months.
|
| If your mental model is "people say they got better every six
| months, therefore I'll never take them seriously because
| they'll say it again in six months time" you're hurting your
| own ability to evaluate this (and every other) technology.
| hn_throwaway_99 wrote:
| I felt this article was a lot of strawman-ing.
|
| Yes, there are people who think LLMs are just a fad, just like
| NFTs, and I agree these people are not really serious and that
| they are wrong. I think anyone who has used an AI coding agent
| recently knows that they are highly capable and can enhance
| productivity _in the right hands_.
|
| But, as someone who gets a lot of value in AI coding agents, my
| issue is not with gen AI as a productivity enhancing tool - it's
| with the absolute torrent of BS about how AI is soon going to
| make coders obsolete, and the way AI has been shoved onto many
| engineering teams _is_ like yet another incarnation of the latest
| management fad. My specific arguments:
|
| 1. As the author pretty much acknowledges, AI agents still
| basically suck at large, system-wide "thinking" and changes. And
| the way they work with their general "guess and check" method
| means they can churn out code that is kinda sorta right, but
| often leaves huge holes or outright laughable bugs.
|
| 2. Hallucinations are the worst possible failure modes - they
| _look_ correct, which makes it all the more difficult to
| determine they 're actually bullshit. I shudder to think about
| who will need to maintain the mountains of "vibe code" that is
| now being generated. Certainly not fucking me; I had a good
| career but I think now is definitely the right time to peace out.
|
| 3. Even if I could totally agree that there is a strong business
| case for AI, I can still, as an individual, think it makes my job
| generally shittier, and there is nothing wrong with having that
| opinion.
|
| I don't think I'd be so anti-AI if I saw a rational, cautious
| debate about how it can enhance productivity. But all I see are
| folks with a vested interest overselling its capabilities and
| minimizing its downsides, and it just feels really tiresome.
| sottol wrote:
| I think another thing that comes out of not knowing the codebase
| is that you're mostly relegated to being a glorified _tester_.
|
| Right now (for me) it's very frequent, depending on the type of
| project, but in the future it could be less frequent - but at
| some you've gotta test what you're rolling out. I guess you can
| use another AI to do that but I don't know...
|
| Anyway, my current workflow is:
|
| 1. write detailed specs/prompt,
|
| 2. let agent loose,
|
| 3. pull down and test... usually _something_ goes wrong.
|
| 3.1 converse with and ask agent to fix,
|
| 3.2 let agent loose again,
|
| 3.3 test again... if something goes wrong again:
|
| 3.3.1 ...
|
| Sometimes the Agent gets lost in the fixes but now have a better
| idea what can go wrong and you can start over with a better
| initial prompt.
|
| I haven't had a lot of success with pre-discussing (planning,
| PRDing) implementations, as in it worked, but not much better
| than directly trying to prompt what I want and takes a lot
| longer. But I'm not usually doing "normal" stuff as this is
| purely fun/exploratory side-project stuff and my asks are usually
| complicated but not complex if that makes sense.
|
| I guess development is always a lot of testing, but this feels
| different. I click around but don't gain a lot of insight. It
| feels more shallow. I can write a new prompt and explain what's
| different but I haven't furthered my understanding much.
|
| Also, not knowing the codebase, you might need a couple attempts
| at phrasing your ask just the right way. I probably had to ask my
| agent 5+ times, trying to explain in different ways how translate
| phone IMU yaw/pitch/roll into translations of the screen
| projection. Sometimes it's surprisingly hard to explain what you
| want to happen when you don't know the how it's implemented.
| mgraczyk wrote:
| We are beyond the point of trying to convince naysayers.
|
| I will simply not hire anybody who is not good at using LLMs, and
| I don't think I would ever work with anybody who thinks they
| aren't very useful. It's like working with somebody who things
| compilers are useless. Obviously wrong, not worth spending time
| trying to convince.
|
| To anyone who reads this article and disagrees with the central
| point: You are missing the most important thing that will happen
| in your career. You should reevaluate because you will be
| unemployable in a few years.
| andrepd wrote:
| I don't think most people with mixed feelings in LLMs (or
| heretic naysayers as you put it) would want to work in a place
| like that, so perhaps you are doing everyone a favour!
| mgraczyk wrote:
| It reminds me of many of the people I worked with early in my
| career.
|
| They were opposed to C++ (they thought C was all you need),
| opposed to git (they used IBM clearcase or subversion),
| opposed to putting internal tools in a web browser (why not
| use Qt and install the tool), opposed to using python or
| javascript for web services (it's just a script kiddie
| language), opposed to sublime text/pycharm/vscode (IDEs are
| for people who don't know how to use a CLI).
|
| I have encountered it over and over, and each time these
| people get stuck in late career jobs making less than 1/3 of
| what most 23 year old SWEs I know are making.
| SirHumphrey wrote:
| They were probably also opposed to some other things that
| failed.
|
| But then hindsight is 20/20.
| sanderjd wrote:
| Yes, but honestly, I was this way at the beginning of my
| career, and I can't think of any examples of things I was
| right about.
|
| My most successful "this is doomed to fail" grouchiness
| was social media games (like Farmville).
|
| But I just can't think of any examples in the dev tooling
| space.
| sanderjd wrote:
| I think this is a reasonable response. But I also think it's
| worth taking the parent's compiler analogy seriously as a
| thought experiment.
|
| Back when I was in college in the 00s, if I had developed a
| preference for not using compilers in my work, I might have
| been able to build a career that way, but my options would
| have been significantly limited. And that's not because
| people were just jerks who were biased against compiler
| skeptics, or evil executives squeezing the bottom line, or
| whatever. It's because the kind of software most people were
| making at that period of time would have been untenable to
| create without higher level languages.
|
| In my view, we clearly aren't at this point yet with llm-
| based tooling, and maybe we never will be. But it seems a lot
| more plausible to me that we will than it did a year or even
| six months ago.
| dnoberon wrote:
| This reads even more like an angry teenager than my angsty high
| school diary. I'm not sure how many more strawmans and dismissive
| remarks I can handle in one article.
| puttycat wrote:
| The language in this post is just terrible.
| thetwentyone wrote:
| The author posits that people don't like using LLMs with Rust
| because LLMs aren't good with Rust. Then people would migrate
| towards languages that do will with LLMs. However, if that were
| true, then Julia would be more popular since LLMs do very well
| with it: https://www.stochasticlifestyle.com/chatgpt-performs-
| better-...
| slg wrote:
| >We imagine artists spending their working hours pushing the
| limits of expression. But the median artist isn't producing
| gallery pieces. They produce on brief: turning out competent
| illustrations and compositions for magazine covers, museum
| displays, motion graphics, and game assets.
|
| One of the more eye-opening aspects of this technology is finding
| out how many of my peers seemingly have no understanding or
| respect for the concept of art.
| simonw wrote:
| How do you mean?
| slg wrote:
| Whole libraries have been written over millennia about the
| importance and purpose of art, and that specific quote
| reduced it all down to nothing more than the creation of a
| product with a specific and mundane function as part of some
| other product. I genuinely feel bad for people with that
| mindset towards art.
| pmdrpg wrote:
| This op ed suggests that it's easier to audit a huge amount of
| code before merging it in than is to write the code from scratch.
| I don't know about anyone else, but I generally find it easier to
| write exactly what I want than to mentally model what a huge
| volume of code I've never seen before will do?
|
| (Especially if that code was spit out by an alien copypasta that
| is really good at sounding plausible with zero actual
| intelligence or intent?)
|
| Like, if all I care about is: does it have enough unit tests and
| do they pass, then yeah I can audit that.
|
| But if I was trying to solve truly novel problems like modeling
| proteins, optimizing travel routes, or new computer rendering
| techniques, I wouldn't even know where to begin, it would take
| tons of arduous study to understand how the new project full of
| novel algorithms is going behave?
| DebtDeflation wrote:
| > Some of the smartest people I know share a bone-deep belief
| that AI is a fad -- the next iteration of NFT mania
|
| It's not that it's a fad. It's that the hype has gotten way ahead
| of the capability. CEOs laying off double digit percentages of
| their workforce because they believe that in 6 months AI will
| actually be able to do all those jobs and they want to get the
| message out to Wall St to juice the stock price today.
| sanderjd wrote:
| Both things can be true, and in my view, they are. I think
| there is a lot of "there" there with these tools, and
| increasingly so, and also that lots of people are out over
| their skis with the hype.
|
| The key is to learn the useful tools and techniques while
| remaining realistic and open-eyed about their limitations.
| keeganpoppen wrote:
| this is one of those fascinating cases where i agree with none of
| the arguments, but vehemently agree with the conclusion... it
| ordinarily would give me pause, but in this case i am reminded
| that nonsense arguments are equally applicable to both sides of
| the debate. if the arguments actually had logical connection to
| the conclusion, and i disliked the arguments but liked the
| conclusion, _that_ would be real cause for introspection.
| grey-area wrote:
| I'd love to see the authors of effusive praise of generative AI
| like this provide the proof of the unlimited powers of their
| tools in code. If GAI (or agents, or whatever comes next ...) is
| so effective it should be quite simple to prove that by creating
| an AI only company and in short order producing huge amounts of
| serviceable code to do useful things. So far I've seen no sign of
| this, and the best use case seems to be generating text or
| artwork which fools humans into thinking it has coherent meaning
| as our minds love to fill gaps and spot patterns even where there
| are none. It's also pretty good at reproducing things it has seen
| with variations - that can be useful.
|
| So far in my experience watching small to medium sized companies
| try to use it for real work, it has been occasionally useful for
| exploring apis, odd bits of knowledge etc, but overall wasted
| more time than it has saved. I see very few signs of progress.
|
| The time has come for llm users to put up or shut up - if it's so
| great, stop telling us and show and use the code it generated on
| its own.
| sanderjd wrote:
| What kind of proof are you looking for here, exactly? Lots of
| businesses are successfully using AI... There are many
| anecdotes of this, which you can read here, or even in the
| article you commented on.
|
| What else are you looking for?
| Hammershaft wrote:
| I'd like to see any actual case studies. So far I have only
| heard vague hype.
| ewild wrote:
| i mean i can state that i built a company wihtin the last
| year where id say 95% of my code involved using an LLM. I
| am an experienced dev so yes it makes mistakes and it
| requires my expertise to be sure the code works and to fix
| subtle bugs; however, i built this company me and 2 others
| in about 7 months for what wouldve easily taken me 3 years
| without the aid of LLMs. Is that an indictment of my
| ability? maybe, but we are doing quite well for ourselves
| at 3M arr already on only 200k expense.
| grey-area wrote:
| That's genuinely _far_ more interesting and exciting to
| me (and I'm sure others too) than this sort of breathless
| provocation, esp if code and prompts etc are shared. Have
| you written about it?
| frank_nitti wrote:
| What do you mean by "successfully using AI", do you just mean
| some employee used it and found it helpful at some stage of
| their dev process, e.g. in lieu of search engines or existing
| codegen tooling?
|
| Are there any examples of businesses deploying production-
| ready, nontrivial code changes without a human spending a
| comparable (or much greater) amount of time as they'd have
| needed to with the existing SOTA dev tooling outside of LLMs?
|
| That's my interpretation of the question at hand. In my
| experience, LLMs have been very useful for developers who
| don't know where to start on a particular task, or need to
| generate some trivial boilerplate code. But on nearly every
| occasion of the former, the code/scripts need to be heavily
| audited and revised by an experienced engineer before it's
| ready to deploy for real.
| jagged-chisel wrote:
| > ... if it's so great, stop telling us and show ...
|
| If you're selling shovels to gold miners, you don't need to
| demonstrate the shovel - you just need decent marketing to
| convince people there's gold in them thar hills.
| XorNot wrote:
| This is actually a great metaphor and phrasing and I'm filing
| it away for later btw.
| tsimionescu wrote:
| Note that it's a pretty common cliche, usually phrased
| something like "in a gold rush, the only people guaranteed
| to make money are the guys selling the shovels".
| kwertyoowiyop wrote:
| It's actually an offbeat take on that common cliche.
| citizenpaul wrote:
| Yeah exactly.
|
| Whats nuts is watching all these people shill for something
| that we all have used to mediocre results. Obviously Fly.io
| benefits if people start hosting tons of slopped together AI
| projects on their platform.
|
| Its kinda sad to watch what I thought was a good company shill
| for AI. Even if they are not directly getting money from some
| PR contract.
|
| We must not be prompting hard enough....
| blibble wrote:
| > Whats nuts is watching all these people shill for something
| that we all have used to mediocre results.
|
| this sort of post is the start of next phase in the battle
| for mindshare
|
| the tools are at the very best mediocre replacements for
| google, and the people with a vested interest in promoting
| them know this, so they switch to attacking critics of the
| approach
|
| > Its kinda sad to watch what I thought was a good company
| shill for AI.
|
| yeah, I was sad too, then I scrolled up and saw the author.
| double sadness.
| simonw wrote:
| Saying "this tool is genuinely useful to me and it's baffling
| how many people refuse to acknowledge that could possible be
| true" is not a sign that someone is being paid to "shill for
| AI".
|
| (If it is then damn, I've been leaving a ton of money on the
| table.)
| ofjcihen wrote:
| Honestly it's really unfortunate that LLMs seem to have picked
| up the same hype men that attached themselves to blockchains
| etc.
|
| LLMs are very useful. I use them as a better way to search the
| web, generate some code that I know I can debug but don't want
| to write and as a way to conversationally interact with data.
|
| The problem is the hype machine has set expectations so high
| and refused criticism to the point where LLMs can't possibly
| measure up. This creates the divide we see here.
| busymom0 wrote:
| I think LLM hype is more deserved and different from that of
| blockchain.
|
| There's still a significant barrier to entry to get involved
| with blockchain and most people don't even know what it is.
|
| LLMs on the other hand have very low barrier to at least use-
| one can just go to google, ChatGPT etc and use it and see its
| effectiveness. There's a reason why in the last year, a
| significant portion of school students are now using LLMs to
| cheat. Blockchains still don't have that kind of utilization.
| ofjcihen wrote:
| I agree with all of these points.
|
| Honestly I think that makes the argument stronger though
| that it's unfortunate they jumped on.
| vohk wrote:
| I think I agree with the general thrust but I have to say
| I've yet to be impressed with LLMs for web search. I think
| part of that comes from most people using Google as the
| benchmark, which has been hot garbage for years now. It's not
| hard to be better than having to dig 3 sponsored results deep
| to get started parsing the list of SEO spam, let alone the
| thing you were actually searching for.
|
| But compared to using Kagi, I've found found LLMs end up
| wasting more of my time by returning a superficial survey
| with frequent oversights and mistakes. At the final tally
| I've still found it faster to just do it myself.
|
| I will say I do love LLMs for getting a better idea of _what_
| to search for, and for picking details out of larger blocks.
| jcranmer wrote:
| > I think part of that comes from most people using Google
| as the benchmark, which has been hot garbage for years now.
|
| Honestly, I think part of the decline of Google Search is
| because it's trying to increase the amount of AI in search.
| antithesizer wrote:
| There's not much riding on convincing the broader public that
| AI is the real deal before it's proved itself beyond the
| shadow of any doubt. There's nothing they can do to prepare
| at this point.
| avanai wrote:
| A "eulogy" is a speech you make at a funeral in honor of the
| dead person. I think you meant "apology".
| grey-area wrote:
| Yes I think I was thinking more a paean or apology though not
| sure apology is used in that sense much nowadays - perhaps
| apologia is clearer. In praise of would be better, thanks
| will edit just now.
| antithesizer wrote:
| The Greek transliteration "apologia" is often used for that
| sense of "apology" to skirt any ambiguity.
| tsimionescu wrote:
| While that is the most common sense of eulogy, it's not the
| only one. A eulogy is also any speech that highly praises
| someone or something - which is most commonly done at
| funerals, which is how the funeral association came about
| (also probably by association with an elegy, which is an
| etymologically unrelated word that refers to a Greek poem
| dedicated to someone who passed away).
|
| In many romance languages, eulogy doesn't have the funeral
| connotation, only the high praise one - so the GP may be a
| native speaker of a romance language who didn't realize this
| meaning is less common in English.
| keybored wrote:
| > The time has come for llm users to put up or shut up - if
| it's so great, stop telling us and show and use the code it
| generated on its own.
|
| I'm open to that happening. I mean them showing me. I'm less
| open to the Nth "aww shucks, the very few doubters that are
| left at this point are about to get a rude awakening" FOMO
| concern trolling. I mean I guess it's nice for me that you are
| so concerned about my well-being, soon to be suffering-being?
|
| Now, AI can do a lot of things. Don't get me wrong. It has
| probably written a million variations on the above sentiment.
| geoduck14 wrote:
| You think that the only code that is valuable is code that is
| written by a professional SWE.
|
| There are LOADS of people who need "a program" but aren't
| equipped to write code or hire an SWE that are empowered by
| this. And example: last week, I saw a PM vibe code several
| different applications to demo what might get built after it
| gets prioritized by SWEs
| grey-area wrote:
| Not really I'm fine with anyone knocking stuff together but I
| think people should be aware of the limitations and dangers.
| Writing like this does nothing to inform and is overly
| positive IMO.
|
| It'd be like insisting llms will replace authors of novels.
| In some sense they could but there are serious shortcomings
| and things like agents etc just don't fix them.
| conradev wrote:
| Have you used a language model to program yet?
| grey-area wrote:
| Yes sure, I said so in the post, and have watched others try
| to do so too.
| mvdtnz wrote:
| > If GAI (or agents, or whatever comes next ...) is so
| effective it should be quite simple to prove that by creating
| an AI only company and in short order producing huge amounts of
| serviceable code to do useful things.
|
| I don't think this follows. Anyone can see that 10-ton
| excavator is hundreds or even thousands of times more efficient
| than a man with a shovel. That doesn't mean you can start a
| company up staffed only with excavators. Firstly you obviously
| need people operating the excavator. Secondly the excavator is
| incredibly efficient at moving lots of dirt around, but no crew
| could perform any non-trivial job without all the tasks that
| the excavator is not good out - planning, loading/unloading,
| prepping the site, fine work (shovelling dirt around pipes and
| wires), etc.
|
| AI is a tool. It will mean companies can run much leaner. This
| doesn't imply they can do everything a company needs to do.
| steego wrote:
| Approximately speaking, what do you want to see put up?
|
| I ask this because it reads like you have a _specific_
| challenge in mind when it comes to generative AI and it sounds
| like anything short of "proof of the unlimited powers" will
| fall short of being deemed "useful".
|
| Here's the deal: Reasonable people aren't claiming this stuff
| is a silver bullet or a panacea. They're not even suggesting it
| should be used without supervision. It's useful when used by
| people who understand its limitations and leverage its
| strengths.
|
| If you want to see how it's been used by someone who was happy
| with the results, and is willing to share their results, you
| can scroll down a few stories on the front-page and check the
| commit history of this project:
|
| https://github.com/cloudflare/workers-oauth-provider/commits...
|
| Now here's the deal: These people aren't trying to prove
| anything to you. They're just sharing the results of an
| experiment where a very talented developer used these tools to
| build something useful.
|
| So let me ask you this: Can we at least agree that these tools
| can be of _some_ use to talented developers?
| hooverd wrote:
| It's useful, but the promise of every AI company is very
| explicitly that they will burn the seed corn and choke off
| the pipeline that created those "very talented" developers
| who reviewed it!
| grey-area wrote:
| I'm less worried about this as the best way to learn to
| code is to read as well as write it IMO.
|
| If capabilities don't improve it's not replacing anyone, if
| they do improve and it can write good code, people can
| learn from reading that.
|
| I don't see a pathway to improvement though given how these
| models work.
| grey-area wrote:
| Yes sure I've checked in code generated by AI myself. I've
| not experienced the excitement this article exudes though and
| it seems very limited in usefulness due to the by now well-
| documented downsides. Frankly I haven't bothered using it
| much recently, it's just not there yet IME.
|
| What I'm interested in really is just case studies with
| prompts and code - that's a lot more interesting for hackers
| IMO than hype.
| marxism wrote:
| I think we're talking past each other. There's always been a
| threshold: above it, code changes are worth the effort; below
| it, they sit in backlog purgatory. AI tools so far seem to
| lower implementation costs, moving the threshold down so more
| backlog items become viable. The "5x productivity" crowd is
| excited about this expanded scope, while skeptics correctly
| note the highest value work hasn't fundamentally changed.
|
| I think what's happening is two groups using "productivity" to
| mean completely different things: "I can implement 5x more code
| changes" vs "I generate 5x more business value." Both
| experiences are real, but they're not the same thing.
|
| https://peoplesgrocers.com/en/writing/ai-productivity-parado...
| AnnaPali wrote:
| I agree 100%! It's amazing how few people grok this.
| surgical_fire wrote:
| > The "5x productivity" crowd is excited about this expanded
| scope, while skeptics correctly note the highest value work
| hasn't fundamentally changed.
|
| This is true, LLMs can speed up development (some asterisks
| are required here, but that is generally true).
|
| That said, I've seen, mainly here on HN, so many people
| hyping it up way beyond this. I've got into arguments here
| with people claiming it codes at "junior level". Which is an
| absurd level of bullshit.
| yencabulator wrote:
| You seem to think generating 5x more code results in _better_
| code, in the left column. I highly doubt this.
| grey-area wrote:
| Yes there are huge unstated downsides to this approach if
| this is production code (which prototypes often become).
| cube2222 wrote:
| I think this is actually a really good point. I was just
| recently thinking that LLMs are (amongst other things) great
| for streamlining these boring energy-draining items that "I
| just want done" and aren't particularly interesting, but at
| the same time they do very little to help us juggle more
| complex codebases right now.
|
| Sure, they might help you onboard into a complex codebase,
| but that's about it.
|
| They help in breadth, not depth, really. And to be clear, to
| me that's extremely helpful, cause working on "depth" is fun
| and invigorating, while working on "breadth" is more often
| than not a slog, which I'm happy to have Claude Code write up
| a draft for in 15 minutes, review, do a bunch of tweaks, and
| be done with.
| bicx wrote:
| This is exactly what I've experienced. For the top-end high-
| complexity work I'm responsible for, it often takes a lot
| more effort and research to write a granular, comprehensive
| product spec for the LLM than it does to just jump in and do
| it myself.
|
| On the flip side, it has allowed me to accomplish many lower-
| complexity backlog projects that I just wouldn't have even
| attempted before. It expands productivity on the low end.
|
| I've also used it many times to take on quality-of-life tasks
| that just would have been skipped before (like wrapping
| utility scripts in a helpful, documented command-line tool).
| rybosome wrote:
| Many, many people are in fact "using the code it generated on
| its own". I've been putting LLM-assisted PRs into production
| for months.
|
| With no disrespect meant, if you're unable to find utility in
| these tools, then you aren't using them correctly.
| surgical_fire wrote:
| > LLM-assisted PRs
|
| This does not counter what GP said. Using LLM as a code
| assistant is not the same as "I don't need to hire developers
| because LLMs code in their place"
| detaro wrote:
| Which one is the article talking about?
| kalkin wrote:
| Which one, in your understanding, is the OP advocating for?
| lando2319 wrote:
| yep I've used Devon and now Google Jules, for the big stuff,
| it has lots of wrong code, but it still end up giving my a
| much better start than starting from scratch certainly. When
| it all comes together it give me a 6X boost. But def fixing
| all the wrong code and thoroughly testing it is the time
| consuming part.
| lelandbatey wrote:
| If you read post, the article is mostly agreeing with you. What
| they're pointing out is not "the AI can do everything you do",
| it's that "an AI coder can do a lot of the boring typing a lot
| faster than you, leaving you right at the point of 'real
| implementation'".
|
| Having something else write a lot of the boring code that
| you'll need and then you finish up the final touches, that's
| amazing and a huge accelerator (so they claim).
|
| The claim is not "AI will replace us all", the claim of the
| parent article is "AI is a big deal and will change how we
| work, the same way IDEs/copy-paste/autocomplete/online
| documentation have radically changed our work."
| NotAnOtter wrote:
| The author's central argument seems to be that the current
| state of LLM development is such that 1 Senior + LLM === 1
| Senior + 4 juniors
|
| With that as a metric, 1 Senior + 4 juniors cannot build the
| company with the scope you are describing.
|
| A 50-eng company might have 1 CTO, 5 staff, 15 Seniors, and 29
| juniors. So the proposition is you could cut the company in
| ~half but would still require the most-expensive aspects of
| running a company.
| surgical_fire wrote:
| > The author's central argument seems to be that the current
| state of LLM development is such that 1 Senior + LLM === 1
| Senior + 4 juniors
|
| This is such an outlandish claim, to the point where I call
| it plain bullshit.
|
| LLMs are useful in a completely different way that a Junior
| developer is. It is an apples and oranges comparison.
|
| LLMs does things in some way that it helps me beyong what a
| Junior would. It also is completely useless to perform many
| tasks that a Junior developer can.
| protocolture wrote:
| >I'd love to see the authors of effusive praise of generative
| AI like this
|
| He spent a large tranche of the article specifically hanging a
| lantern on how mediocre the output is.
|
| >by creating an AI only company
|
| He specifically says that you need to review the code over and
| over and over.
| ghostly_s wrote:
| Did you even glance at the link? The author is advocating for a
| human-supervised LLM agent workflow.
| Karrot_Kream wrote:
| This 2 year old Goroutine pool implementation [1] is 95% GPT
| generated and has commit history showing what GPT did. It's an
| older example, but it is one.
|
| [1]: https://github.com/devchat-ai/gopool
| pj_mukh wrote:
| I think this is a misunderstanding coder productivity. A 10x
| engineer isn't 10x faster at popping out Unit tests, that stuff
| is mind-numbingly boring that turns out a next token predictor
| can do it with ease. In fact I would guess that really
| "productive" software engineers, slow down considerably when
| forced to do this important but slow work*.
|
| The 10x engineer is _really_ good at deducing the next most
| important thing to do is and doing it quickly. This involves
| quickly moving past 100 's of design decisions in a week to
| deliver something quickly. It requires you to think partly like
| a product manager and partly like a senior engineer but that's
| the game and LLM's are zero help there.
|
| Most engineering productivity is probably locked up in this. So
| yes, LLM's probably help a lot, just not in the way that would
| show on some Jira board?
|
| *One could claim that doing this slow work gives the brain a
| break to then be good at strategizing the higher order more
| important work. Not sure.
| SatvikBeri wrote:
| I don't think I would notice a 100% improvement in software
| productivity in most companies, from the outside. Most of the
| time, that would just translate to the company being able to
| hire fewer developers, and having slightly higher profit
| margins - but not enormously higher, because developers are
| only one part.
|
| I recently used Claude Code to develop & merge an optimization
| that will save about $4,000 a month. It was relatively simple
| but tedious, so I probably wouldn't have done it on my own. I
| don't even expect most of my coworkers to notice.
| georgemcbay wrote:
| I don't know if you are the same (S.G.) greyarea I'm familiar
| with but I hope so because the idea of having a couple of 90s
| era irc people take opposing viewpoints on LLMs in 2025 amuses
| me.
| xyst wrote:
| Ask me again in 15 years. Assuming the world hasn't already
| entered a war for the remaining resources on this planet.
| forty wrote:
| Why would anyone rather read and fix someone else code rather
| than writing the code themselves? I do a lot of code review for
| other human code and it use so much more energy than writing my
| own code (and surely, as I have competent colleagues, this is not
| even as bad as if I expected that the code that I'm reading could
| be totally random shit)
| pona-a wrote:
| > It's projection. People say "LLMs can't code" when what they
| really mean is "LLMs can't write Rust". Fair enough! But people
| select languages in part based on how well LLMs work with them,
| so Rust people should get on that.
|
| How is it the responsibility of the Rust community that there
| weren't enough metric tons of free code for the machine to slurp
| up? And the phrasing makes it sound like it's the community's
| fault for not feeding OpenAI enough code to be stripped of its
| license and authorship and get blended into a fine latent soup.
| It's a lot like people coming to a one-man FOSS project with a
| laundry list of demands, expecting to be treated with the
| religious reverence of a major enterprise contract.
|
| And the whole tone, the pervasive "use it or you'll be left
| behind"--where users saying they don't want or need it only
| proves further evidence of its imminent apotheosis--superficially
| reminds me of previous FUDs.
|
| And how is it not concerning that the thing described as
| intelligent needs billions of lines to generalize a language a
| human can learn from a single manual? Will it need hundreds of
| kLOC to internalize a new library, or even its new version,
| beyond in-context learning? The answer is yes; you are choosing
| to freeze the entire tech stack, when to its abstractions could
| actually save you from boilerplate, just so the machine can write
| it for you at $200 a month with a significant error rate.
| ChrisMarshallNY wrote:
| I use AI every day, basically as a "pair coder."
|
| I used it about 15 minutes ago, to help me diagnose a UI issue I
| was having. It gave me an answer that I would have figured out,
| in about 30 minutes, in about 30 seconds. My coding style (large
| files, with multiple classes, well-documented) works well for AI.
| I can literally dump the entire file into the prompt, and it can
| scan it in milliseconds.
|
| I also use it to help me learn about new stuff, and the "proper"
| way to do things.
|
| Basically, what I used to use StackOverflow for, but without the
| sneering, and _much_ faster turnaround. I 'm not afraid to ask
| "stupid" questions -That is _critical_.
|
| Like SO, I have to take what it gives me, with a grain of salt.
| It's usually too verbose, and doesn't always match my style, so I
| end up doing a lot of refactoring. It can also give rather
| "naive" answers, that I can refine. The important thing, is that
| I usually get something that works, so I can walk it back, and
| figure out a better way.
|
| I also won't add code to my project, that I don't understand, and
| the refactoring helps me, there.
|
| I have found the best help comes from ChatGPT. I heard that
| Claude was supposed to be better, but I haven't seen that.
|
| I don't use agents. I've not really ever found automated
| pipelines to be useful, in my case, and that's sort of what
| agents would do for me. I may change my mind on that, as I learn
| more.
| disambiguation wrote:
| Man the redbull is oozing off this post, talk about sipping
| rocket fuel.
|
| I mean a tool is a tool, nothing wrong with that - but most of
| the resistence stems from AI being shoved down our throats at
| warp speed. Its already everywhere and I can't opt out, that
| stinks.
|
| As for the skepticism in terms of adoption and usefulness, its
| mainly a question of whether or not it will continue improving -
| there's no way to no what lies ahead, but if it came to a
| grinding halt today well then the high water mark just isn't all
| that impressive.
|
| > Yeah, we get it. You don't believe in IPR. Then shut the fuck
| up about IPR. Reap the whirlwind.
|
| This is the point that matters, and I don't think everyone is on
| the same page that LLMs are essentially over glorified data
| laundering.
|
| The industry would get just as much "value" if we declared a
| jubilee and wiped out all licenses and allowed unlimited
| plagiarism (Looking at Zuckerburg and his 10 TB of pirated data).
| In fact, if AI owners published their training data sets with a
| capable search engine, I would bet money of it out performing
| LLMs in most cases. Why waste all that man power reinventing
| Netflix again? Just copy paste the code and give everyone their
| time back, sheesh.
|
| > Kids today don't just use agents; they use asynchronous agents.
| They wake up, free-associate 13 different things for their LLMs
| to work on, make coffee, fill out a TPS report, drive to the Mars
| Cheese Castle, and then check their notifications. They've got 13
| PRs to review. Three get tossed and re-prompted. Five of them get
| the same feedback a junior dev gets. And five get merged.
|
| I'm in a role that is behind the times, using a bespoke in-house
| framework that is immune to the benefits of LLMs, so I don't get
| to see what you see - so as a skeptic, I'm not convinced this
| isn't just the illusion of speed. I have not seen convincing
| results, show me the amazing things being made by AI (AI tooling
| itself does not count) - but yes, maybe that's because its all
| siloed into walled gardens.
|
| > But something real is happening. My smartest friends are
| blowing it off. Maybe I persuade you. Probably I don't. But we
| need to be done making space for bad arguments.
|
| Yeah all the arguments have been made, good and bad, we're all
| waiting to see how it plays out. But I'd rather take the side of
| being a skeptic - if I'm right then I'm in the right place. If
| I'm wrong, that's cool too, I don't mind playing catch-up. But
| fully embracing the hype is, IMO, tantamount to putting all your
| eggs in one basket, seems like a needless risk but if that's
| worth it to you to get ahead then by all means, slurp up the
| hype.
| TheRoque wrote:
| One of the biggest anti LLM arguments for me at the moments is
| about security. In case you don't know, if you open a file with
| copilot active or cursor, containing secrets, it might be sent to
| a server a thus get leaked. The companies say that if that file
| is in a cursorignore file, it won't be indexed, but it's still a
| critical security issue IMO. We all know what happened with the
| "smart home assistants" like Alexa.
|
| Sure, there might be a way to change your workflow and never ever
| open a secret file with those editors, but my point is that a
| software that sends your data without your consent, and without
| giving you the tools to audit it, is a no go for many companies,
| including mine.
| consumer451 wrote:
| I am just some shmoe, but I believe that devs fall into to major
| categories when it comes to LLMs: those with their own product
| ideas, and those without their own product ideas.
|
| The prior look upon Claude Code/Cursor/Windsurf much more
| favorably, as they are able to ship their ideas much faster.
|
| This is a bit of hot take, so I would love any replies to bring
| me back down to earth.
| abdullin wrote:
| My current workflow with Codex is (coding environment from
| OpenAI):
|
| (1) Ask to write an implementation plan for a specific change or
| a feature. It will go through the source code, look up
| references, make notes and produce a plan
|
| (2) Review the plan. Point out missing things, or stuff that
| needs improvement.
|
| (3) Once I'm satisfied with the plan - ask to draft PR. Launch a
| few attempts in parallel and pick the one that I like the most.
|
| (4) While drafting PR, Codex will run unit tests (even can run
| E2E tests in its container), linting and type checkers at every
| single step. This helps a lot with the stability.
|
| (5) I review the code and merge the PR if I like it. Ask to
| cleanup - if not.
|
| This feels like working with a remote team - very patient and
| diligent at that.
|
| Ultimately, I get to get more features done per day. But I also
| feel more tired by the end of the day due to a higher level of
| cognitive load. There are more decisions to make and less idle
| time (e.g. no more hours spent tidying up the code or doing
| relaxing and pretty refactoring).
|
| TLDR; this AI thing works really well at least for me. But it
| comes with trade-offs that might slow down its adoption by
| companies en masse.
| donatj wrote:
| And this goes at least part of the way towards explaining why
| Fly.io has been the second least reliable cloud provider I've
| ever used, after Azure.
| CraigJPerry wrote:
| That was a stellar read. I've had (at least parts of) many of
| these thoughts floating around my head over the past few weeks /
| months, but it'd have been beyond my ken to write them down as
| lucidly.
| jongjong wrote:
| I believe that AI is very useful in software development but I
| don't buy the narrative that AI is responsible for layoffs over
| the past few years (at least not most of them). I think that
| narrative is a convenient cover for systemic misallocation which
| created a need to contain inflation. I think big tech execs
| understood that, beyond increasing their company stock prices,
| they also need to work together to keep the monetary system
| itself under control. This is why they've been firing people
| whilst having record profits. They've reached such scale and the
| system has reached such fragility that they have to think and act
| like economists to keep the thing going. The economy itself has
| become the responsibility of big tech.
|
| But who knows, maybe AI will accelerate so rapidly that it will
| fix the economy. Maybe we'll have robots everywhere doing all the
| work. But I worry about the lack of market incentives for people
| to adapt AI to real world use cases.
|
| For example, I'm an open source developer who likes to tinker but
| I've been booted out of the opportunity economy. I can't afford
| to program robots. People like me are too busy using AI to parse
| spreadsheets and send targeted ads to even think about automating
| stuff. We work for companies and have no autonomy in the markets.
|
| If things had worked out differently for me, I'd probably own a
| farm now and I'd be programming robots to do my harvest and
| selling the robots or licensing the schematics (or maybe I'd have
| made them open source, if open source had worked out so well for
| me). I don't have access to such opportunity unfortunately. The
| developers who worked for big tech are good at politics but often
| disconnected from value-creation. Few of them have the skills or
| interest to do the work that needs to be done now... They will
| just continue leveraging system flaws to make money, so long as
| those flaws exist.
| aucisson_masque wrote:
| > Professional software developers are in the business of solving
| practical problems for people with code. We are not, in our day
| jobs, artisans. Steve Jobs was wrong: we do not need to carve the
| unseen feet in the sculpture. Nobody cares if the logic board
| traces are pleasingly routed. If anything we build endures, it
| won't be because the codebase was beautiful.
|
| I think it comes all down to that, do you have pride in what you
| do or you don't ?
|
| I make a wall with bricks, even if it will be covered with
| coating i will do my best to have regular joints and pacing.
|
| Could make it faster, no one would notice the difference but
| me... i hate that feeling when you done something and you know
| it's barely enough, just barely, it's kind of shit and you really
| don't want others to see it.
|
| On the opposite side, some people will take pride in building
| wall twice as fast as me and won't care it's horrendous.
|
| Both cases are valid, but me i know i can't do a work I'm not
| proud of.
| matt_s wrote:
| > This was the craftsman's 'Golden Age' and much time and trouble
| was taken over the design of tools. Craftsmen were being called
| upon to do more skilful and exacting work and the use of tools
| and the interest in development had become very widespread.
|
| Above pulled from A Brief History of the Woodworking Plane [0]. A
| woodworking tool that has evolved over 2,000 years. Now there are
| electric planers, handheld electric planers and lots of heavy
| machinery that do the same thing in a very automated way. If a
| company is mass producing kitchen cabinets, they aren't hand
| planing edges on boards, a machine is doing all that work.
|
| I feel like with AI we are on the cusp of moving beyond a "Golden
| age" and into an "industrial age" for coding, where it will
| become more important to have code that AI understands vs.
| something that is carefully crafted. Simple business pressure
| will demand it (whether we like it or not).
|
| ^ A comment I made just yesterday on a different thread.
|
| For software developers AI is like the cabinet maker that gets a
| machine to properly mill and produce cabinet panels, sure you can
| use a hand plane to do that but you're producing a very different
| product and likely one that not many people will care about,
| possibly not even your employer when they see all the other wood
| shops pumping out cabinetry and taking their market share.
|
| [0] https://www.handplane.com/879/a-brief-history-of-the-
| woodwor...
| puttycat wrote:
| > An LLM can be instructed to just figure all that shit out.
| Often, it will drop you precisely at that golden moment where
| shit almost works, and development means tweaking code and
| immediately seeing things work better.
|
| Well, except that in order to fix that 1% you'd need to read and
| understand whatever the LLM did and then look for that 1%. I get
| the shills just thinking about this, whether the original
| programmer human or not. I'd rather just write everything myself
| to begin with.
| nixpulvis wrote:
| I keep talking to people who've had a good bit of success using
| gemini or cluade to build quick prototype front ends for some
| applications. I think theres some questions in my head of how
| well the process scales when you want to keep adding features,
| but according to them it's not been hard getting it to make the
| needed changes.
|
| My issue with it is that it gates software development behind
| paid services with various levels of context supported.
| Absolutely not the dream I have of how more software should be
| open source and everyone should be empowered to make the changes
| they need.
| keybored wrote:
| Thankfully the uncrazy person is going to get us on that sane VC
| AI wavelength.
|
| > If you're making requests on a ChatGPT page and then pasting
| the resulting (broken) code into your editor, you're not doing
| what the AI boosters are doing. No wonder you're talking past
| each other.
|
| They're playing 3D chess while you're stuck at checkers.
|
| I do things suboptimally while learning the ropes or just doing
| things casually. That doesn't mean that I judge the approach
| itself by my sloppy workflow. I'm able to make inferences about
| what a serious/experienced person would do. And it wouldn't
| involve pasting things through three windows like I would do.
|
| So of course I don't judge AI by "ask chatty and paste the
| response".
|
| Yes indeed: "deploying agents" is what I would imagine the Ask
| Chatty And Paste workflow taken to Perfection to look like.
|
| > LLMs can write a large fraction of all the tedious code you'll
| ever need to write. And most code on most projects is tedious.
| LLMs drastically reduce the number of things you'll ever need to
| Google. They look things up themselves. Most importantly, they
| don't get tired; they're immune to inertia.
|
| Most Rube Goldberg machines are very tedious and consist of
| fifty-too-many parts. But we can automate most of that for you--
|
| I could not have ever imagined a more Flintstones meets Science
| Fiction clash than AI According To Software Engineers. You're
| using AI to generate code. And no one cares how much. It's just
| so tedious in any case.
|
| A wortwhile approach would have been to aspire to make or
| generate technology artifacts that could be hidden behind a black
| box surface with a legible interface in front. Is the code
| tedious? Then make the AI come up with something that is well-
| designed, where the obvious things you want is given freely,
| where minor customizations are just minor tweaks, and larger
| deviations require only proportionally larger changes. Uh, how
| about no? How about generating 20KLOC line "starter" some-
| framework project with all the 20KLOC "tedious" bits hanging out,
| then we can iterate from there. The AI made a Git log and
| everything so it's ya know audited.
|
| But maybe I'm being unfair. Maybe we are moving towards something
| not quite as stupid as Deploy ChatGPT 50X? Or maybe it's
| effectively going to behind a black box. Because ya know the AI
| will deal with it all by itself?
|
| > Are you a vibe coding Youtuber? Can you not read code? If so:
| astute point. Otherwise: what the fuck is wrong with you?
|
| > You've always been responsible for what you merge to main. You
| were five years go. And you are tomorrow, whether or not you use
| an LLM.
|
| No!, and what the fuck is wrong with you? We are Flintstone
| technologists and I'll be damned if I can't get my AI brain chip-
| injected, genetically enhanced for speed horsey cyborg for my
| modern horse-drawn carriage patent.
| nitwit005 wrote:
| > Does an intern cost $20/month? Because that's what Cursor.ai
| costs.
|
| > Part of being a senior developer is making less-able coders
| productive, be they fleshly or algebraic. Using agents well is
| both a both a skill and an engineering project all its own, of
| prompts, indices, and (especially) tooling. LLMs only produce
| shitty code if you let them.
|
| A junior developer often has negative value to a team, because
| they're sapping the time of more senior developers who have to
| help train them, review code, fix mistakes, etc. It can take a
| long while to break even.
|
| The raw cost of Cursor's subscription is surely dwarfed by your
| own efforts, given that description. The actual calculous here
| should be the cost to corral Cursor, against the value of the
| code it generated.
| omot wrote:
| I think of programming languages as an interface between humans
| and computers. If anything, the industry expanded because of this
| abstraction. Not everyone has to learn assembly to build cool
| shit. To me AI is the next step in this abstraction where you
| don't need to learn programming languages to potentially build
| cool projects. The hard part of software engineering is scale
| anyways. My bet is that this will expand the industry in
| unprecedented ways. Will there be contraction of traditional
| programming jobs? Absolutely. The growth in tech jobs over the
| last 20 years weren't more assembly programmers. They were
| abstraction experts. I'm sure the next wave will be even bigger,
| professional prompting will explode in size.
| TheRoque wrote:
| The C abstracting the assembly or the GC a abstracting away
| memory management work because they were possible to implement
| in a deterministic and reliable way (well, in the case of
| garbage collection, not all the time)
|
| But I don't think that's a similar situation for LLMs, where
| the hallucinations or failure to debug their own issues are way
| too frequent to just "vibe code"
| deadbabe wrote:
| I can't wait for the day when people no longer manually write
| text messages to each other, but instead just ask LLMs to read
| and respond from a few prompted words.
| Yossarrian22 wrote:
| Where is the counter argument to this not being sustainable?
| blibble wrote:
| > We're not East Coast dockworkers; we won't stop progress on our
| own.
|
| we could choose to be
|
| of course if you're a temporarily embarrassed billionaire like
| ptacek, you certainly don't want the workers doing this
| martythemaniak wrote:
| At 0 temperature an LLM is a Map<String,String> - a string input
| (key) will give you the same string output (value) every time.
| Hypothesis: there exists a key whose value is a complete,
| working, fully-tested application which meets your requirements
| 100% and fulfills your business need. This key is the smallest,
| most complete description of what your application does. It is
| written in natural language and represents a significantly
| compressed version of your application code.
|
| My part-time obsession over the last few months has been trying
| to demonstrate this and come up with a method for finding these
| magic keys (I even tried to get the LLMs to search for me, lol).
| What I really want is to give the latest thinking models (200k
| input, 100k output) a 5-6 page design doc (4k words, 5k tokens)
| and have them produce a complete 5kloc (50k tokens) microservice,
| which would show a 10x compression. It's hard, but I haven't seen
| any reason to think it wouldn't work.
|
| For better or worse, I think this will be close to what IC jobs
| will be like in few years. Fundamentally, our jobs are to try
| work with other functions to agree to some system that needs to
| exist, then we talk to the computers to actually implement this.
| If we switch kotlin+compiler for design doc+llm, it still going
| to be somewhat the same, but far more productive. Agents and such
| are somewhat of a stop-gap measure, you don't want people giving
| tasks to machines, you want to accurately describe some idea and
| then let the computers make it work. You can change your
| description and they can also figure out their own tasks to
| evolve the implementation.
| nostrademons wrote:
| Curious how he reconciles this:
|
| > If you build something with an LLM that people will depend on,
| read the code. In fact, you'll probably do more than that. You'll
| spend 5-10 minutes knocking it back into your own style.
|
| with Joel Spolsky's fundamental maxim:
|
| > It's harder to read code than to write it.
|
| https://www.joelonsoftware.com/2000/04/06/things-you-should-...
| fnordpiglet wrote:
| FWIW with proper MDC/ rules I've found LLM programming agents
| excellent at rust. There's a lot of complex and tedious minutiae
| in rust that I know but forget to apply everywhere it's helpful
| while a SOTA LLM agent does well, especially with proper rule
| guidance to remember to use it.
|
| Generally though I find LLMs have a pretty rapidly diminishing
| return on what you can expect out of them. They're like a 3-5
| year senior programmer that has really learned their domain well,
| but doesn't have the judgement of a principal engineer. You get
| to a point where you need to reach in and right things and really
| pay attention, and at that point the diminishing returns set it
| rapidly and you're better off just doing the rest yourself.
| Refactors and stuff can be delegated but that's about it.
|
| I find this true regardless of the language. None the less, I've
| been able to improve my overall velocity dramatically completing
| several projects in the last few months in the span of one
| typically. If tooling improves I hope to continue that but I'm
| already getting close to the limit of how fast I can conceive of
| useful creative things.
| morning-coffee wrote:
| Check, please.
| ChicagoDave wrote:
| There are many aspects to AI push back.
|
| - all creatives are flat against it because it's destroying their
| income streams and outright stealing their intellectual property
|
| - some technical leaders are skeptical because early returns were
| very bad and they have not updated their investigations to the
| latest tools and models, which are already significantly ahead of
| even six months ago
|
| - a tech concern is how do we mentor new developers if they don't
| know how to code or develop logic. LLMs are great IF you already
| know what you're doing
|
| - talent is deeply concerned that they will be reduced and
| replaced, going from high paying careers to fast food salaries
|
| We have a lot of work to balance productivity with the benefits
| to society. "Let them eat cake," is not going to work this time
| either.
| eqvinox wrote:
| > but the plagiarism [...] Cut me a little slack as I ask you to
| shove this concern up your ass. No profession has demonstrated
| more contempt for intellectual property.
|
| Speeding is quite common too, yet if you get caught -- especially
| overdoing it -- you'll have a problem.
|
| Also, in this case, presumably everything produced with AI is
| fair game too? The argument being made here isn't even "it's not
| plagiarism", rather "it's plagiarism but I don't care" -- why
| would anyone else respect such an author's copyrights?
| jjcm wrote:
| The most important thing in this article in my mind is in the
| level setting section - if you are basing your perspective on the
| state of AI from when you tested it 6mo+ ago, your perspective is
| likely not based on the current reality.
|
| This is kind of a first though for any kind of technology. The
| speed of development and change here is unreal. Never before has
| a couple months of not being on top of things led to you being
| considered "out of date" on a tool. The problem is that this kind
| of speed requires not just context, but a cultural shift on the
| speed of updating that context. Humanity just isn't equipped to
| handle this rate of change.
|
| Historically in tech, we'd often scoff at the lifecycle of other
| industries - Airlines haven't changed their software in 20
| years?? Preposterous! For the vast majority of us though, _we 're
| the other industry now_.
| munificent wrote:
| _" Kids today don't just use agents; they use asynchronous
| agents. They wake up, free-associate 13 different things for
| their LLMs to work on, make coffee, fill out a TPS report, drive
| to the Mars Cheese Castle, and then check their notifications.
| They've got 13 PRs to review. Three get tossed and re-prompted.
| Five of them get the same feedback a junior dev gets. And five
| get merged."_
|
| I would jump off a bridge before I accepted that as my full-time
| job.
|
| I've been programming for 20+ years and I've never wanted to move
| into management. I got into programming because I _like
| programming,_ not because I like asking others to write code on
| my behalf and review what they come up with. I 've been in a lead
| role, and I certainly do lots of code review and enjoy helping
| teammates grow. But the last fucking thing I want to do is
| delegate _all_ the code writing to someone or something else.
|
| I like writing code. Yes, sometimes writing code is tedious, or
| frustrating. Sometimes it's yak-shaving. Sometimes it's Googling.
| Very often, it's debugging. I'm happy to have AI help me with
| some of that drudgery, but if I ever get to the point that I feel
| like I spend my entire day in virtual meetings with AI agents,
| then I'm changing careers.
|
| I get up in the morning to make things, not to watch others make
| things.
|
| Maybe the kind of software engineering role I love is going to
| disappear, like stevedores and lamplighters. I will miss it
| dearly, but at least I guess I got a couple of good decades out
| of it. If this is what the job turns into, I'll have to find
| something else to do with my remaining years.
| mrbungie wrote:
| You know what's nuts? How so many articles about supporting LLMs
| and against skeptics are so full of fallacies and logical
| inconsistencies like strawmans, false dichotomies, appeals to
| emotion and to authority when they have supposedly almost AGI
| machines to assist them in their writing. They could at least do
| a "please take a look at my article and see if I'm commiting any
| logical fallacies" prompt iteration session if they trust these
| tools so much.
|
| These kinds of articles that heavily support LLM usage in
| programming seem to FOMO you or at least suggest that "you are
| using it wrong" in a weak way just to invalidate contrary or
| conservative opinions out of the discussion. These are pure
| rhetorics with such an empty discourse.
|
| I use these tools everyday and every hour in strange loops
| (between at least Cursor, ChatGPT and now Gemini) because I do
| see some value in them, even if only to simulate a peer or rubber
| duck to discuss ideas with. They are extremely useful to me due
| to my ADHD and because they actually support me through my
| executive disfunction and analysis paralysis even if they produce
| shitty code.
|
| Yet I'm still an AI skeptic because I've seen enough failure
| modes in my daily usage. I do not know how to feel when faced
| with these ideas because I feel out of the false dichotomy (pay
| for them, use them every day, but won't think them as valuable as
| the average AI bro). What's funny is that I'm yet to see an
| article that actually shows LLMs strengths and weaknesses in a
| serious manner and with actual examples. If you are going to
| defend a position, do it seriously ffs.
| creativenolo wrote:
| If you're leaning out, spend two weeks leaning in.
|
| I did, and learned a ton, and likely not going back to how I was
| before, or how I used it a week ago.
|
| The comments in the article about not reading the agent is good
| but it's more than that...
|
| Vibe coding is for non-coders. Yet, you get a feel for the vibe
| of the AI. With windsurf, you have two or three files open, and
| working in one. It starts smashing out the multi, interspersed,
| line edits and you know with a flutter of your eyes, it's got
| your vibe and correctly predicted your next ten lines. And for a
| moment you forgive it for leading you astray when you read what
| it said.
| kevinsync wrote:
| Not to derail, but NFT mania (part of the opening salvo in the
| article) was the giant shitshow that it was -not- because the
| concept of unique digital bits in the possession of a single
| owner was a bad idea (or, the concept of unique verification of
| membership in a club was a bad idea) -- it was a diarrhea-slicked
| nightmare because it was implemented via blockchains and their
| related tokens, which inherently peg fluctuating fiat value to
| the underlying mechanisms of assigning and verifying said
| ownership or membership, and encourages a reseller's market ..
| not to mention the perverse, built-in economic incentives
| required to get nodes to participate in that network to make the
| whole thing go.
|
| Had NFTs simply been deployed as some kind of protocol that could
| be leveraged for utility rather than speculation, I think the
| story would be a complete 180. No clue personally how to achieve
| that, but it feels like it could be done.. except that, too,
| would have been completely perverted and abused by centralized
| behemoths, leading to a different but terrible outcome. Can you
| imagine if all data became non-fungible? Convince all the big
| identity vendors (Google, Apple, etc) to issue key pairs to users
| that then get used by media companies to deliver audio and video
| keyed only to you that's embedded with maybe some kind of
| temporal steganographic signature that's hard to strip and can be
| traced back to your key? It's not just cracking AACS once and
| copying the bytes. It becomes this giant mess of you literally
| can't access anything without going through centralized
| authorities anymore. Then build more anti-patterns on top of that
| lol. Prolly better that it was mostly just monkey JPEGs and rug
| pulls.
|
| Anyways, I'm so far off topic from what's actually being
| discussed -- just couldn't help myself from veering into left
| field.
| bob1029 wrote:
| I am finding the most destructive aspect of LLM assistance to be
| the loss of flow state.
|
| Most of the time I can go faster than these tools if I have
| confidence in myself and allow the momentum to build up over the
| course of 20-30 minutes. Every time I tab out to an LLM is like a
| 5 minute penalty over what I could have done unaided on a good
| day.
|
| Getting the model prepared to help you in a realistic domain
| often takes a few minutes of arranging code & comments so that it
| is forced toward something remotely sane. I'll scaffold out
| entire BS type hierarchies just so I can throw a //TODO: ....
| line in the middle somewhere. Without this kind of structure, I
| would be handling unfiltered garbage most of the time.
|
| It's not that these tools are bad, it's that we need to recognize
| the true cost of engaging with them. ChatGPT is like a
| jackhammer. It will absolutely get you through that concrete
| slab. However, it tends to be quite obnoxious & distracting in
| terms of its operational principles.
| imiric wrote:
| > People coding with LLMs today use agents. Agents get to poke
| around your codebase on their own. They author files directly.
| They run tools.
|
| I'll be damned if I give up control of my machine to a tool that
| hallucinates actions to take using hastily put together and
| likely AI-generated "agents". I still want to be the primary user
| of my machine, and if that means not using cutting edge tools
| invented in the last 6 months, so be it. I don't trust the vast
| majority of tools in this space anyway.
|
| > I'm sure there are still environments where hallucination
| matters.
|
| Still? The output being correct matters in _most_ environments,
| except maybe art and entertainment. It especially matters in
| programming, where a 99% correct program probably won't compile.
|
| > But "hallucination" is the first thing developers bring up when
| someone suggests using LLMs, despite it being (more or less) a
| solved problem.
|
| No, it's not. It's _the_ problem that's yet to be solved. And yet
| every AI company prefers chasing benchmarks, agents, or whatever
| the trend du jour is.
|
| > I work mostly in Go. [...] LLMs kick ass generating it.
|
| I also work mostly in Go. LLMs do an awful job generating it,
| just as with any other language. I've had the same shitty
| experience generating Go, as I've had generating JavaScript or
| HTML. I've heard this excuse that the language matters, and IME
| it's just not the case.
|
| Sure, if you're working with an obscure and niche language for
| which there is less training data, I suppose that could be the
| case. But you're telling me that there is no good training data
| for Rust, the trendiest systems language of the past ~decade?
| C'mon. Comparing Rust to Brainfuck is comical.
|
| I won't bother responding to all points in this article. I will
| say this: just as AI doomsayers and detractors deserve criticism,
| so does this over-the-top praising. Yes, LLMs are a great
| technology. But it is also part of a wildly overhyped market that
| will inevitably crash as we approach the trough of
| disillusionment. Their real value is somewhere in the middle.
| lapcat wrote:
| > If you were trying and failing to use an LLM for code 6 months
| ago, you're not doing what most serious LLM-assisted coders are
| doing.
|
| This sounds like the "No true Scotsman" fallacy.
|
| > People coding with LLMs today use agents. Agents get to poke
| around your codebase on their own.
|
| That's a nonstarter for closed source, unless everything is
| running on-device, which I don't think it is?
|
| > Part of being a senior developer is making less-able coders
| productive
|
| Speak for yourself. It's not my job.
| paulsutter wrote:
| The best I can offer skeptics is the more you work with the tools
| the more productive you become. Because yes the tools are
| imperfect.
|
| If you've had a dog you know that "dog training" classes are
| actually owner training.
|
| Same with AI tools. I see big gains for people who spend time to
| train themselves to work within the limitations. When the next
| generation of tools come out they can adapt quickly.
|
| If this sounds tedious, thats becuase it is tedious. I spent many
| long weekends wrestling with tools silently wrecking my entire
| codebase, etc. And that's what I had to do to get the
| productivity improvements I have now.
| greybox wrote:
| > Does an intern cost $20/month? Because that's what Cursor.ai
| costs.
|
| So then where do the junior developers come from? And then where
| do the senior developers come from?
| api wrote:
| A big problem is that you're either hearing breathless over the
| top insane hype (or doomerism, which is breathless over the top
| hype taken to a dark place) or skepticism that considers AI/LLMs
| to be in the same league as NFTs.
|
| Neither of these is accurate, but I guess nuanced thinking or
| considering anything below surface vibes is out these days.
|
| So far after playing with them I'm using them as:
|
| 1. A junior intern that can google _really really fast_ and has
| memorized a large chunk of the Internet and the library, and can
| do rough first-pass research and dig for things.
|
| 2. Autocomplete 2.0 that can now generate things like boilerplate
| or fairly pedestrian unit tests.
|
| 3. Rubber duck debugging where the rubber duck talks back.
|
| 4. A helper to explain code, at least for a first pass. I can
| highlight a huge piece of code and ask it to summarize and then
| explain and walk me through it and it does a passable job. It
| doesn't get everything right but as long as you know that, it's a
| good way to break things down and get into it.
|
| For those things it's pretty good, and it's definitely a lot of
| fun to play with.
|
| I expect that it will get better. I don't expect it to replace
| programmers for anything but the most boring mindless tasks (the
| ones I hate doing), but I expect it to continue to become more
| and more useful as super-autocomplete and all the other things I
| listed.
| tedious-coder wrote:
| AI makes me sad. When I started my CS degree, I didn't even know
| what silicon valley was. I was unaware of what the SWE job
| landscape was like. I went to school in a no-name town.
|
| Computer science was an immensely fun subject to learn. I moved
| to one of the big cities and was bewildered with how much there
| was to learn, and loved every second of it. I gradually became
| good enough to help anyone with almost anything, and spent lots
| of my free time digging deeper and learning.
|
| I liked CS and programming - but I did not like products built by
| the companies where I was good enough to be employed. These were
| just unfortunate annoyances that allowed me to work close enough
| to what I actually enjoyed, which was just code, and the
| computer.
|
| Before LLMs, those like me could find a place within most
| companies - the person you don't go to for fast features, but for
| weird bugs or other things that the more product-minded people
| weren't interested in. There was still, however, an uncomfortable
| tension. And now that tension is even greater. I do not use an
| LLM to write all my code, because I enjoy doing things myself. If
| I do not have that joy, then it will be immensely difficult for
| me to continue the career I have already invested so much time
| in. If I could go back in time and choose another field I would -
| but since that's not possible, I don't understand why it's so
| hard for people to have empathy for people like me. I would never
| have gone down this path if I knew that one day, my hard-earned-
| knowledge would become so much less valuable, and I'd be forced
| to delegate the only part of the job I enjoyed to the computer
| itself.
|
| So Thomas, maybe your AI skeptic friends aren't nuts, they just
| have different priorities. I realize that my priorities are at
| odds for the companies I work for. I am just tightly gripping the
| last days that I can get by doing this job the way that I enjoy
| doing it.
| jleyank wrote:
| If they're regurgitating what's been learned, is there a risk of
| copyright/IP issues from whomever had the code used for training?
| Last time I checked, there's a whole lotta lawyers in the us
| who'd like the business.
| greybox wrote:
| > All this is to say: I write some Rust. I like it fine. If LLMs
| and Rust aren't working for you, I feel you. But if that's your
| whole thing, we're not having the same argument.
|
| Yes we are, because the kind of work you need to do in C++ or
| Rust is probably entirely different from the work this person
| manages to get the LLM to do in Go.
| terminatornet wrote:
| > Meanwhile, software developers spot code fragments seemingly
| lifted from public repositories on Github and lose their shit.
| What about the licensing? If you're a lawyer, I defer. But if
| you're a software developer playing this card? Cut me a little
| slack as I ask you to shove this concern up your ass. No
| profession has demonstrated more contempt for intellectual
| property.
|
| Loved this style of writing in 2005 from Maddox on the best site
| in the universe or whatever.
|
| Sorry if I don't want google and openAI stealing my or anyone
| else's work.
___________________________________________________________________
(page generated 2025-06-02 23:00 UTC)