[HN Gopher] AI is a floor raiser, not a ceiling raiser
___________________________________________________________________
AI is a floor raiser, not a ceiling raiser
Author : jjfoooo4
Score : 175 points
Date : 2025-07-31 17:01 UTC (5 hours ago)
(HTM) web link (elroy.bot)
(TXT) w3m dump (elroy.bot)
| amelius wrote:
| AI is an interpolator, not an extrapolator.
| throe23486 wrote:
| I read this as interloper. What's an extraloper?
| exasperaited wrote:
| Opposite of "inter-" is "intra-".
|
| Intraloper, weirdly enough, is a word in use.
| djeastm wrote:
| So we also have the word "extra", but oddly the word
| "exter" is left out.
|
| I'm exter mad about that.
| jjk166 wrote:
| "inter-" means between, "intra-" means within, "extra-"
| means outside. "intra-" and "inter" aren't quite synonyms
| but they definitely aren't opposites of eachother.
| exasperaited wrote:
| Inter- implies relationships between entities, intra-
| implies relationships within entities.
|
| In any single sentence context they cannot refer to the
| same relationships, and that which they are _not_ is
| precisely the domain of the other word: they are true
| antonyms.
| shagie wrote:
| An interloper being someone who intrudes or meddles in a
| situation (inter "between or amid" + loper "to leap or run" -
| https://en.wiktionary.org/wiki/loper ), an extraloper would
| be someone who dances or leaps around the outside of a
| subject or meeting with similar annoyances.
| canadaduane wrote:
| Very concise, thank you for sharing this insight.
| lupire wrote:
| OP doesn't understand that almost everything is neither at the
| floor nor the ceiling.
| givemeethekeys wrote:
| AI is a floor destroyer, not a ceiling destroyer. Hang on for dear
| life!! :P
| bfigares wrote:
| AI raises everything - the ceiling is just being more productive.
| Productivity comes from the adequacy and potency of tools. We've
| got a hell of a strong tool in our hands; therefore, the more
| adequate the usage, the higher the leverage.
| infecto wrote:
| Surprised to see this downvoted. It feels true to me. Sure
| there are definitely novel areas where folks might not benefit
| but I can see a future where this tool becomes helpful for the
| vast majority of roles.
| andrenotgiant wrote:
| This tracks for other areas of AI I am more familiar with.
|
| Below average people can use AI to get average results.
| itsoktocry wrote:
| That explains why people here are against it, because everyone
| is above average I guess.
| falcor84 wrote:
| I'm not against it. I wonder where in the distribution it
| puts me.
| leptons wrote:
| At the "Someone willing to waste their time with slop" end?
| pcrh wrote:
| This is in line with another quip about AI: You need to know
| more than the LLM in order to gain any benefit from it.
| hirvi74 wrote:
| I am not certain that is entirely true.
|
| I suppose it's all a matter of what one is using an LLM for,
| no?
|
| GPT is great at citing sources for most of my requests --
| even if not always prompted to do so. So, in a way, I kind of
| use LLMs as a search engine/Wikipedia hybrid (used to follow
| links on Wiki a lot too). I ask it what I want, ask for
| sources if none are provided, and just follow the sources to
| verify information. I just prefer the natural language
| interface over search engines. Plus, results are not
| cluttered with SEO ads and clickbait rubbish.
| jononor wrote:
| Above average people can also use it to get average results.
| Which can actually be useful: for many tasks and use cases, the
| good-enough threshold can be quite low.
| djeastm wrote:
| >Below average people can use AI to get average results.
|
| But that would shift the average up.
| bitwize wrote:
| AI is a shovel capable of breaking through the bottom of the
| barrel.
| erlend_sh wrote:
| Only for the people already affluent enough to afford the ever-
| more expensive subscriptions. Those most in need of a floor-
| raising don't have the disposable income to take a bet on AI.
| intended wrote:
| Either you are the item being sold or you are paying for the
| service.
|
| Nothing is free, and I for one prefer a subscription model, if
| only as a change from the ad model.
|
| I am sure we will see the worst of all worlds, but for now, for
| this moment in history, subscription is better than ads.
|
| Let's also never have ads in GenAI tools. The kind of invasive,
| intent-level influence these things can achieve will make our
| current situation look like a paradise.
| quirkot wrote:
| I'd never buy anything as overt as an advertisement in an AI
| tool. I just want to buy influence. Just coincidentally use
| my product as the example. Just suggest my preferred
| technology when asked a few % more often than my competitors.
| I'd never want someone to observe me pulling the strings.
| LtWorf wrote:
| Normally even if you pay you're still the product anyway. See
| buying smartphones for example... you pay a lot but you're
| the product.
| pdntspa wrote:
| It's very easy to sign up for an API account and pay per-call,
| or even pay nothing. Free offerings out there are great (Gemini,
| OpenRouter...) and a few are even suitable for agentic
| development.
| LtWorf wrote:
| And how long until they raise the prices?
| pdntspa wrote:
| API prices have been moving in a downward direction, not upward.
| recroad wrote:
| AI isn't a pit. AI is a ladder.
| anthk wrote:
| Yeah, like in Nethack, while being blind and stepping on a
| cockatrice.
| layer8 wrote:
| A ladder that doesn't reach the ceiling and sometimes ends up
| in imaginary universes.
| billyp-rva wrote:
| Mixing this with a metaphor from earlier: giving a child a credit
| card is also a floor raiser.
| manmal wrote:
| Since agents are good only at greenfield projects, the logical
| conclusion is that existing codebases have to be prepared such
| that new features are (opinionated) greenfield projects - let all
| the wiring dangle out of the wall so the intern just has to plug
| in the appliance. All the rest has to be done by humans, or the
| intern will rip open the wall to hang a picture.
| PaulHoule wrote:
| Hogwash. If you can't figure out how to do something with
| project Y from npm, try checking it out from GitHub with
| WebStorm and asking Junie how to do it -- often you get a good
| answer right away. If not, you can ask questions that can help
| you understand the code base. Don't understand some data
| structure which is a maze of Map<String, Object>(s)? It will
| scan how it is used and give you draft documentation.
|
| Sure, you can't point it to a Jira ticket and get a PR, but you
| certainly can use it as a pair programmer. I wouldn't say it is
| much faster than working alone, but I end up writing more tests,
| and arguing with it over error handling means I do a better job
| in the end.
| falcor84 wrote:
| > Sure, you can't point it to a Jira ticket and get a PR
|
| You absolutely can. This is exactly what SWE-Bench[0]
| measures, and I've been amazed at how quickly AIs have been
| climbing those ladders. I personally have been using Warp [1]
| a lot recently, and in quite a lot of low-to-medium difficulty
| cases it can one-shot a decent PR. For most of my work I
| still find that I need to pair with it to get sufficiently
| good results (and that's why I still prefer it to something
| cloud-based like Codex [2], but otherwise it's quite good
| too), and I expect the situation to flip over the coming
| couple of years.
|
| [0] https://www.swebench.com/
|
| [1] https://www.warp.dev/
|
| [2] https://openai.com/index/introducing-codex/
| esafak wrote:
| How does Warp compare to others you have tried?
| falcor84 wrote:
| I've not used it for long enough yet for this to be a
| strong opinion, but so far I'd say that it is indeed a
| bit better than Claude Code, as per the results on
| Terminal Bench[0]. And on a side note, I quite like the
| fact that I can type shell commands and chat commands
| interchangeably into the same input and it just knows
| whether to run it or respond to it (accidentally
| forgetting the leading exclamation mark has been a
| recurring mistake for me in Claude Code).
|
| [0] https://www.tbench.ai/
| manmal wrote:
| What you describe is not using agents at all, which is what my
| comment was aimed at, if you read the first sentence again.
| PaulHoule wrote:
| Junie is marketed as an "agent" and it definitely works
| harder than the JetBrains AI assistant.
| yoz-y wrote:
| They're not. They're good at many things and bad at many
| things. The more I use them the more I'm confused about which
| is which.
| spion wrote:
| I think agents have a curve where they're kinda bad at
| bootstrapping a project, very good if used in a small-to-
| medium-sized existing project and then it goes downhill from
| there as size increases, slowly.
|
| Something about a brand-new project often makes LLMs drop to
| "example grade" code, the kind you'd never put in production.
| (An example: Claude implemented per-task file logging in my
| prototype project by pushing to an array of log lines,
| serializing the entire thing to JSON, and rewriting the entire
| file, for every logged event.)
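|
| A minimal sketch of that anti-pattern next to the plain append
| you'd expect in production (function names are illustrative;
| only Node's built-in fs module is assumed):
|
|     import { appendFileSync, writeFileSync } from "node:fs";
|
|     // What the generated code did: keep every line in memory and
|     // rewrite the whole file on each event -- O(n) work per call.
|     const lines: string[] = [];
|     function logRewrite(file: string, msg: string) {
|       lines.push(msg);
|       writeFileSync(file, JSON.stringify(lines, null, 2));
|     }
|
|     // Production-grade equivalent: append one JSONL line per event.
|     function logAppend(file: string, msg: string) {
|       appendFileSync(file, JSON.stringify({ ts: Date.now(), msg }) + "\n");
|     }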
| 42lux wrote:
| AI is chairs.
| furyofantares wrote:
| I feel like nobody remembers that Facebook ad ("Facebook is
| chairs"), but it's seared into my own memory.
| manyaoman wrote:
| AI is a wall raiser.
| msgodel wrote:
| You'll find many people lack the willpower and confidence to even
| get on the floor though. If it weren't for that they'd already
| know a programming language and be selling something.
| TimPC wrote:
| People should be worried because right now AI is on an
| exponential growth trajectory and no-one knows when it will level
| off into an s-curve. AI is starting to get close to good enough.
| If it becomes twice as good in seven months then what?
| mattnewport wrote:
| What's the basis for your claim that it is on an exponential
| growth trajectory? That's not the way it feels to me as a
| fairly heavy user, it feels more like an asymptotic approach to
| expert human level performance where each new model gets a bit
| closer but is not yet reaching it, at least in areas where I am
| expert enough to judge. Improvements since the original ChatGPT
| don't feel exponential to me.
| samsartor wrote:
| This also tracks with my experience. Of course, technical
| progress never looks smooth through the steep part of the
| s-curve, more a sequence of jagged stair-steps (each their
| own little s-curve in miniature). We might only be at the top
| of a stair. But my feeling is that we're exhausting the form-
| factor of LLMs. If something new and impressive comes along,
| it'll be shaped differently and fill a different niche.
| nwienert wrote:
| Let's look:
|
| GPT-1 June 2018
|
| GPT-2 February 2019
|
| GPT-3 June 2020
|
| GPT-4 March 2023
|
| Claude tells me this is the rough improvement of each:
|
| GPT-1 to 2: 5-10x
|
| GPT-2 to 3: 10-20x
|
| GPT 3 to 4: 2-4x
|
| Now it's been 2.5 years since 4.
|
| Are you expecting 5 to be 2-4x better, or 10-20x better?
| esafak wrote:
| How are you measuring this improvement factor? We have
| numerous benchmarks for LLMs and they are all saturating. We
| are rapidly approaching AGI by that measure, and headed
| towards ASI. They still won't be "human" but they will be
| able to _do_ everything humans can, and more.
| LeftHandPath wrote:
| I was worried about that a couple of years ago, when there was
| a lot of hope that deeper reasoning skills and hallucination
| avoidance would simply arrive as emergent properties of a large
| enough model.
|
| More recently, it seems like that's not the case. Larger models
| sometimes even hallucinate more [0]. I think the entire sector
| is suffering from a Dunning-Kruger effect -- making an LLM _is
| difficult_, and they managed to get something incredible
| working in a much shorter timeframe than anyone really expected
| back in the early 2010s. But that led to overconfidence and
| hype, and I think there will be a much longer tail in terms of
| future improvements than the industry would like to admit.
|
| Even the more advanced reasoning models will struggle to play a
| _valid_ game of chess, much less win one, despite having plenty
| of chess games in their training data [1]. I think that,
| combined with the trouble of hallucinations, hints at where the
| limitations of the technology really are.
|
| Hopefully LLMs will scare society into planning how to handle
| mass automation of thinking and logic, before a more powerful
| technology that can really do it arrives.
|
| [0]: https://techcrunch.com/2025/04/18/openais-new-reasoning-
| ai-m...
|
| [1]: https://dev.to/maximsaplin/can-llms-play-chess-ive-
| tested-13...
| esafak wrote:
| Really? I find newer models hallucinate less, and I think
| they have room for improvement with better training.
|
| I believe hallucinations are partly an artifact of imperfect
| model training, and thus can be ameliorated with better
| technique.
| LeftHandPath wrote:
| Yes, really!
|
| Smaller models may hallucinate less: https://www.intel.com/
| content/www/us/en/developer/articles/t...
|
| The RAG technique uses a smaller model and an external
| knowledge base that's queried based on the prompt. The
| technique allows small models to outperform far larger ones
| in terms of hallucinations, at the cost of performance.
| That is, to eliminate hallucinations, we should alter how
| the model works, not increase its scale:
| https://highlearningrate.substack.com/p/solving-
| hallucinatio....
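|
| A toy sketch of that flow, with an in-memory document list and
| a hypothetical callModel function standing in for the LLM call
| (real RAG uses embedding-based vector search, not keyword
| matching):
|
|     declare function callModel(prompt: string): Promise<string>;
|
|     // Tiny stand-in for an external knowledge base.
|     const docs = [
|       "The Eiffel Tower is 330 metres tall.",
|       "Mount Everest is 8,849 metres tall.",
|     ];
|
|     // Naive keyword retrieval over the knowledge base.
|     function retrieve(query: string): string[] {
|       const words = query.toLowerCase().split(/\W+/);
|       return docs.filter((d) =>
|         words.some((w) => w.length > 3 && d.toLowerCase().includes(w))
|       );
|     }
|
|     async function answer(query: string): Promise<string> {
|       const context = retrieve(query).join("\n");
|       // Grounding the model in retrieved text is what suppresses
|       // hallucinated "knowledge".
|       return callModel(
|         `Answer using ONLY this context:\n${context}\n\nQ: ${query}`
|       );
|     }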
|
| Pruned models, with fewer parameters, generally have a
| lower hallucination risk: https://direct.mit.edu/tacl/artic
| le/doi/10.1162/tacl_a_00695.... "Our analysis suggests that
| pruned models tend to generate summaries that have a
| greater lexical overlap with the source document, offering
| a possible explanation for the lower hallucination risk."
|
| At the same time, all of this should be contrasted with the
| "Bitter Lesson" (https://www.cs.utexas.edu/~eunsol/courses/
| data/bitter_lesson...). IMO, making a larger LLM does
| indeed produce a generally superior LLM. It produces more
| trained responses to a wider set of inputs. However, it
| does not change that it's an LLM, so fundamental traits of
| LLMs - like hallucinations - remain.
| roadside_picnic wrote:
| People don't consider that there are real
| physical/thermodynamic constraints on intelligence. It's easy
| to imagine some Skynet scenario, but all evidence suggests that
| it takes significant increases in energy consumption to
| increase intelligence.
|
| Even in nature this is clear. Humans are a great example:
| cooked food predates _Homo sapiens_ and it is largely
| considered to be a pre-requisite for having human level
| intelligence because of the enormous energy demands of our
| brains. And nature has given us wildly more efficient brains in
| almost every possible way. The human brain runs on about 20
| watts of power; my RTX uses 450 watts at full capacity.
|
| The idea of "runaway" super intelligence has baked in some very
| extreme assumptions about the nature of thermodynamics and
| intelligence that are largely just hand-waved away.
|
| On top of that, AI hasn't changed in a notable way for me
| personally in a year. The difference between 2022 and 2023 was
| wild, between 2023 and 2024 changed some of my workflows, 2024
| to today largely is just more options around which tooling I
| used and how these tools can be combined, but nothing really at
| a fundamental level feels improved for me.
| LeftHandPath wrote:
| There are some things that you still can't do with LLMs. For
| example, if you tried to learn chess by having the LLM play
| against you, you'd quickly find that it isn't able to track a
| series of moves for very long (usually 5-10 turns; the longest
| I've seen it last was 18) before it starts making illegal
| choices. It also generally accepts invalid moves from your side,
| so you'll never be corrected if you're wrong about how to use a
| certain piece.
|
| Because it can't actually model these complex problems, it really
| requires awareness from the user regarding what questions should
| and shouldn't be asked. An LLM can probably tell you how a knight
| moves, or how to respond to the London System. It probably can't
| play a full game of chess with you, and will virtually never be
| able to advise you on the best move given the state of the board.
| It probably can give you information about big companies that are
| well-covered in its training data. It probably can't give you
| good information about most sub-$1b public companies. But, if you
| ask, it will give a confident answer.
|
| They're a minefield for most people and use cases, because people
| aren't aware of how wrong they can be, and the errors take effort
| and knowledge to notice. It's like walking on a glacier and
| hoping your next step doesn't plunge through the snow and into a
| deep, hidden crevasse.
| smiley1437 wrote:
| > people aren't aware of how wrong they can be, and the errors
| take effort and knowledge to notice.
|
| I have friends who are highly educated professionals (PhDs,
| MDs) who just assume that AI/LLMs make no mistakes.
|
| They were shocked that it's possible for hallucinations to
| occur. I wonder if there's a halo effect where the perfect
| grammar, structure, and confidence of LLM output causes some
| users to assume expertise?
| bayindirh wrote:
| Computers are always touted as deterministic machines. You
| can't argue with a compiler, or Excel's formula editor.
|
| AI, in all its glory, is seen as an extension of that. A
| deterministic thing which is meticulously crafted to provide
| an undisputed truth, and it can't make mistakes because
| computers are deterministic machines.
|
| The idea of LLMs being networks with weights plus some
| randomness is an abstraction both too vague and too complicated
| for most people. Also, companies tend to say this part very
| quietly, so when people read the fine print, they get
| shocked.
| rplnt wrote:
| Have they never used it? The majority of the responses that I
| can verify are wrong. Sometimes outright nonsense, sometimes
| believable. Be it general knowledge or something where deeper
| expertise is required.
| viccis wrote:
| > I wonder if there's a halo effect where the perfect
| grammar, structure, and confidence of LLM output causes some
| users to assume expertise?
|
| I think it's just that LLMs are modeling generative
| probability distributions of sequences of tokens so well that
| what they actually are nearly infallible at is producing
| convincing results. Oftentimes the correct result is the
| most convincing, but other times what seems most convincing
| to an LLM just happens to also be most convincing to a human
| regardless of correctness.
| throwawayoldie wrote:
| https://en.wikipedia.org/wiki/ELIZA_effect
|
| > In computer science, the ELIZA effect is a tendency to
| project human traits -- such as experience, semantic
| comprehension or empathy -- onto rudimentary computer
| programs having a textual interface. ELIZA was a symbolic
| AI chatbot developed in 1966 by Joseph Weizenbaum and
| imitating a psychotherapist. Many early users were
| convinced of ELIZA's intelligence and understanding,
| despite its basic text-processing approach and the
| explanations of its limitations.
| jasonjayr wrote:
| I worry that the way the models "Speak" to users will cause
| users to drop their 'filters' about what to trust and not
| trust.
|
| We are barely talking about modern media literacy, and now we
| have machines that talk like 'trusted' face-to-face humans, and
| can be "tuned" to suggest specific products or use any
| specific tone the owner/operator of the system wants.
| throwawayoldie wrote:
| My experience, speaking over a scale of decades, is that most
| people, even very smart and well-educated ones, don't know a
| damn thing about how computers work and aren't interested in
| learning. What we're seeing now is just one unfortunate
| consequence of that.
|
| (To be fair, in many cases, I'm not terribly interested in
| learning the details of their field.)
| yifanl wrote:
| If I wasn't familiar with the latest in computer tech, I
| would also assume LLMs never make mistakes, after hearing
| such excited praise for them over the last 3 years.
| dsjoerg wrote:
| > I have friends who are highly educated professionals (PhDs,
| MDs) who just assume that AI/LLMs make no mistakes.
|
| Highly educated professionals in my experience are often very
| bad at applied epistemology -- they have no idea what they do
| and don't know.
| physicsguy wrote:
| It's super obvious even if you try to use something like agent
| mode for coding: it starts off well but drifts off more and
| more. I've even had it try to do totally irrelevant things,
| like indenting some code, with various Claude models.
| poszlem wrote:
| My favourite example is something that happens quite often
| even with Opus, where I ask it to change a piece of code, and
| it does. Then I ask it to write a test for that code, and it
| dutifully writes one. Next, I tell it to run the test, and of
| course, the test fails. I ask it to fix the test, it tries,
| but the test fails again. We repeat this dance a couple of
| times, and then it seemingly forgets the original request
| entirely. It decides, "Oh, this test is failing because of
| that new code you added earlier. Let me fix that by removing
| the new code." Naturally, now the functionality is gone, so
| it confidently concludes, "Hey, since that feature isn't
| there anymore, let me remove the test too!"
| DougBTX wrote:
| Yeah, the chess example is interesting. The best specialised
| AIs for chess are all clearly better than humans, but our best
| general AIs are barely able to play legal moves. The ceiling
| for AI is clearly much higher than current LLMs.
| pharrington wrote:
| Large Language Models aren't general AIs. It's in the name.
| nomel wrote:
| > you'd quickly find that it isn't able to track a series of
| moves for very long (usually 5-10 turns; the longest I've seen
| it last was 18)
|
| In chess, previous moves are irrelevant, and LLMs aren't good
| at filtering out irrelevant data [1]. For better performance,
| you should include only the relevant data in the context
| window: the current state of the board.
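|
| A sketch of the idea, assuming the chess.js npm package for
| tracking board state (the prompt wording is just illustrative):
|
|     import { Chess } from "chess.js";
|
|     const game = new Chess();
|     game.move("e4");
|     game.move("e5");
|
|     // Send only the current position (FEN), not the move history.
|     const prompt =
|       "You are playing chess. Current position (FEN):\n" +
|       game.fen() +
|       "\nReply with the single best legal move in SAN.";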
|
| [1] https://news.ycombinator.com/item?id=44724238
| og_kalu wrote:
| LLMs playing chess isn't a big deal. You can train a model on
| chess games and it will play at a decent Elo and very rarely
| make illegal moves (i.e. a 99.8% legal move rate). There are a
| few such models around. I think post-training messes with chess
| ability and OpenAI et al. just don't really care about that.
| But LLMs can play chess just fine.
|
| [0] https://arxiv.org/pdf/2403.15498v2
|
| [1] https://github.com/adamkarvonen/chess_gpt_eval
| LeftHandPath wrote:
| Jeez, that arXiv paper invalidates my assumption that it
| can't model the game. Great read. Thank you for sharing.
|
| Insane that the model actually does seem to internalize a
| representation of the state of the board -- rather than just
| hitting training data with similar move sequences.
|
| ...Makes me wish I could get back into a research lab. Been a
| while since I've stuck to reading a whole paper out of
| legitimate interest.
|
| (Edit) At the same time, it's still worth noting the accuracy
| errors and the potential for illegal moves. That's still
| enough to prevent LLMs from being applied to problem domains
| with severe consequences, like banking, security, medicine,
| law, etc.
| tayo42 wrote:
| I was thinking about this sentiment on my long car drive today.
|
| It feels like when you need to paint walls in your house. If
| you've never done it before, you'll probably reach for tape to
| make sure you don't ruin the ceiling and floors. The tape is a
| tool that lets amateur wall painters get decent results somewhat
| efficiently, compared to if they didn't use it. If you're an
| actually good wall painter, tape only slows you down. You'll go
| faster without the "help".
| stillpointlab wrote:
| This mirrors insights from Andrew Ng's recent AI startup talk
| [1].
|
| I recall he mentions in this video that the new advice they are
| giving to founders is to throw away prototypes when they pivot
| instead of building onto a core foundation. This is because of
| the effects described in the article.
|
| He also gives some provisional numbers (see the section "Rapid
| Prototyping and Engineering", slides at ~10:30) where he suggests
| prototype development sees a 10x boost compared to a 30-50%
| improvement for existing production codebases.
|
| This feels vaguely analogous to the switch from "pets" to
| "livestock" when the industry switched from VMs to containers.
| Except, the new view is that your codebase is more like livestock
| and less like a pet. If true (and no doubt this will be a
| contentious topic to programmers who are excellent "pet" owners)
| then there may be some advantage in this new coding agent world
| to getting in on the ground floor and adopting practices that
| make LLMs productive.
|
| 1. https://www.youtube.com/watch?v=RNJCfif1dPY
| falcor84 wrote:
| Great point, but just mentioning (nitpicking?) that I've never
| heard machines/containers referred to as "livestock", but
| rather in my milieu it's always "pets" vs "cattle". I now
| wonder if it's a geographical thing.
| HPsquared wrote:
| Boxen? (Oxen)
| bayindirh wrote:
| AFAIK, Boxen is a permutation of Boxes, not Oxen.
| mananaysiempre wrote:
| There seems to be a pattern of humorous plurals in
| English where by analogy with ox ~ oxen you get -x ~
| -xen: boxen, Unixen, VAXen.
|
| Before you call this pattern silly, consider that the
| fairly normal plural "Unices" is by analogy with Latin
| plurals in -x = -c|s ~ -c|es, where I've expanded -x into
| -cs to make it clear that the Latin singular comprises a
| noun stem ending in -c- and a (nominative) _singular_
| ending -s, which does exist in Latin but is otherwise
| completely nonexistent in English. (This is extra funny
| for Unix < Unics < Multics.) Analogies are the order of
| the day in this language.
| bayindirh wrote:
| Yeah. After reading your comment, I thought "maybe the
| Xen hypervisor is named because of this phenomenon". "xen"
| just means "many" in that context.
|
| Also, probably because of approaching graybeard
| territory, thinking about boxen of VAXen running UNIXen
| makes me feel warm and fuzzy. :D
| bayindirh wrote:
| Yeah, the CERN talk* [0] coined the Pets vs. Cattle analogy,
| and it was way before VMs were cheap on bare metal.
| I think the word just evolved as the idea got rooted in the
| community.
|
| We've used the same analogy for the last 20 years or so.
| Provisioning 150 cattle servers takes 15 minutes or so, and we
| can provision a pet in a couple of hours, at most.
|
| [0]: https://www.engineyard.com/blog/pets-vs-cattle/
|
| *: Engine Yard post notes that Microsoft's Bill Baker used
| the term earlier, though CERN's date (2012) checks out with
| our effort timeline and how we got started.
| lubujackson wrote:
| Oo, the "pets vs. livestock" analogy really works better than
| the "craftsmen vs. slop-slinger" arguments.
|
| Because using an LLM doesn't mean you devalue well-crafted or
| understandable results. But it does indicate a significant
| shift in how you view the code itself. It is more about the
| emotional attachment to code vs. code as a means to an end.
| recursive wrote:
| I don't think it's exactly emotional attachment. It's the
| likelihood that I'm going to get an escalated support ticket
| caused by this particular piece of slop/artisanally-crafted
| functionality.
| stillpointlab wrote:
| Not to slip too far into analogy, but that argument feels a
| bit like a horse-drawn carriage operator saying he can't
| wait to pick up all of the stranded car operators when
| their mechanical contraptions break down on the side of the
| road. But what happened instead was the creation of a brand
| new job: the mechanic.
|
| I don't have a crystal ball and I can't predict the actual
| future. But I can see the list of potential futures and I
| can assign likelihoods to them. And among the potential
| futures is one where the need for humans to fix the
| problems created by poor AI coding agents dwindles as the
| industry completely reshapes itself.
| recursive wrote:
| Both can be true. There were probably a significant
| number of stranded motorists that were rescued by horse-
| powered conveyance. And eventually cars got more
| convenient and reliable.
|
| I just wouldn't want to be responsible for servicing a
| guarantee about the reliability of early cars.
|
| And I'll feel no sense of vindication if I do get that
| support case. I will probably just sigh and feel a little
| more tired.
| stillpointlab wrote:
| Yes, the whole point is that it _is_ true. But only for a
| short window.
|
| So consider differing perspectives. Like a teenage kid
| that is hanging around the stables, listening to the
| veteran coachmen laugh about the new loud, smoky
| machines. Proudly declaring how they'll be the ones
| mopping up the mess, picking up the stragglers, cashing
| it in.
|
| The career advice you give to the kid may be different
| than the advice you'd give to the coachman. That is the
| context of my post: Andrew Ng isn't giving you advice, he
| is giving advice to people at the AI school who hope to
| be the founders of tomorrow.
|
| And you are probably mistaken if you think the solution
| to the problems that arise due to LLMs will result in
| those kids looking at the past. Just like the ultimate
| solution to car reliability wasn't a return to horses but
| rather the invention of mechanics, the solution to
| problems caused by AI may not be the return to some
| software engineering past that the old veterans still
| hold dear.
| bluefirebrand wrote:
| AI is not a floor raiser
|
| It is a false confidence generator
| falcor84 wrote:
| I agree with most of TFA but not this:
|
| > This means cheaters will plateau at whatever level the AI can
| provide
|
| From my experience, the skill of using AI effectively is one of
| treating the AI with a "growth mindset" rather than a "fixed"
| one. What I do is roleplay as the AI's manager, giving it
| a task, and as long as I know enough to tell whether its output
| is "good enough", I can lend it some of my metacognition via
| prompting to get it to continue working through obstacles until
| I'm happy with the result.
|
| There are diminishing returns of course, but I found that I can
| get significantly better quality output than what it gave me
| initially without having to learn the "how" of the skill myself
| (i.e. I'm still "cheating"), and only focusing my learning on the
| boundary of what is hard about the task. By doing this, I feel
| that over time I become a better manager in that domain, without
| having to spend the amount of effort to become a practitioner
| myself.
| tailspin2019 wrote:
| I wouldn't classify what you're doing as "cheating"!
| righthand wrote:
| How do you know it's significantly better quality if you don't
| know any of the "how"? The quality increase seems relative to
| the garbage you start with. I guess as long as you impress
| yourself with the result it doesn't matter if it's not actually
| higher quality.
| fellowniusmonk wrote:
| The greatest use of LLMs is the ability to get accurate answers
| to queries in a normalized format without having to wade through
| UI distractions like ads and social media.
|
| It's the opposite of finding an answer on reddit, insta,
| tvtropes.
|
| I can't wait for the first distraction-free OS that is a thinking
| and imagination helper and not a consumption device where I have
| to block URLs on my router so my kids don't get sucked into a
| Skinner box.
|
| I love being able to get answers from documentation and work
| questions without having to wade through some arbitrary UI bs a
| designer has implemented in ad-hoc fashion.
| leptons wrote:
| I don't find the "AI" answers all that accurate, and in some
| cases they are bordering on a liability even if way down below
| all the "AI" slop it says "AI responses may include mistakes".
|
| >It's the opposite of finding an answer on reddit, insta,
| tvtropes.
|
| Yeah, it really is, because I can tell when someone doesn't know
| the topic well on Reddit or other forums, but usually someone
| does and the answer is there. Unfortunately the "AI" was
| trained on all of this, and the "AI" is just as likely to spit
| out the wrong answer as the correct one. That is not an
| improvement on anything.
|
| > wade through UI distraction like ads and social media
|
| Oh, so you think "AI" is going to be free and clear forever?
| Enjoy it while it lasts, because these "AI" companies are in
| way over their heads, they are bleeding money like their aorta
| is a fire hose, and there will be plenty of ads and social
| whatever coming to brighten your day soon enough. The free ride
| won't go on forever - think of it as a "loss leader" to get you
| hooked.
| margalabargala wrote:
| I agree with the whole first half, but I disagree that LLM
| usage is doomed to ad-filled shittiness. AI companies may be
| hemorrhaging money, but that's because their product costs so
| much to run; it's not like they don't have revenue. The thing
| that will bring profitability isn't ads, it will be
| innovations that let current-gen-quality LLMs run at a
| fraction of the electricity and power cost.
|
| Will some LLMs have ads? Sure, especially at a free tier. But
| I bet the option to pay $20/month for ad-free LLM usage will
| always be there.
| leptons wrote:
| Silicon will improve, but not fast enough to calm
| investors. And better silicon won't change the fact that
| the current zeitgeist is basically a word guessing game.
|
| $20/month won't get you much, if you're paying above what
| it costs to run the "AI", and for what? Answers that are in
| the ballpark of suspicious and untrustworthy?
|
| Maybe they just need to keep spending until all the people
| who can tell slop from actual knowledge are dead and
| gone.
| LtWorf wrote:
| "accurate"
| gruez wrote:
| The blog post has a bunch of charts, which gives it a veneer of
| objectivity and rigor, but in reality it's just all vibes and
| conjecture. Meanwhile recent empirical studies actually point in
| the opposite direction, showing that AI use increases inequality,
| not decreases it.
|
| https://www.economist.com/content-assets/images/20250215_FNC...
|
| https://www.economist.com/finance-and-economics/2025/02/13/h...
| Calavar wrote:
| The graphic has four studies that show increased inequality and
| six that show reduced inequality.
| gruez wrote:
| Read my comment again. The keyword here is "recent". The second
| link also expands on why it's relevant. It's best to read the
| whole article, but here's a paragraph that captures the
| argument:
|
| >The shift in recent economic research supports his
| observation. Although early studies suggested that lower
| performers could benefit simply by copying AI outputs, newer
| studies look at more complex tasks, such as scientific
| research, running a business and investing money. In these
| contexts, high performers benefit far more than their lower-
| performing peers. In some cases, less productive workers see
| no improvement, or even lose ground.
| jjk166 wrote:
| All of the studies were done 2023-2024 and are not listed
| in the order they were conducted. The studies showing
| reduced equality all apply to uncommon tasks like material
| discovery and debate points, whereas the ones showing
| increased equality are broader and more commonly
| applicable, like writing, customer interaction, and coding.
| gruez wrote:
| >All of the studies were done 2023-2024 and are not
| listed in the order they were conducted
|
| Right, the reason why I pointed out "recent" is that it's
| new evidence that people might not be aware of, given
| that there were also earlier studies showing AI had the
| opposite effect on inequality. The "recent" studies also
| had varied methodology compared to the earlier studies.
|
| >The studies showing reduced equality all apply to
| uncommon tasks like material discovery and debate points
|
| "Debating points" is uncommon? Maybe not everyone was in
| the high school debate club, but "debating points" is
| something that anyone in a leadership position does on a
| daily basis. You're also conveniently omitting
| "investment decisions" and "profits and revenue", which
| basically everyone is trying to optimize. You might be
| tempted to think "Coding efficiency" represents a high
| complexity task, but the abstract says the test involved
| "Recruited software developers were asked to implement an
| HTTP server in JavaScript as quickly as possible". The
| same is true of the task used in the "legal analysis"
| study, which involved drafting contracts or complaints.
| This seems exactly like the type of cookie cutter tasks
| that the article describes would become like cashiers and
| have their wages stagnate. Meanwhile the studies with
| negative results were far more realistic and measured
| actual results. Otis et al. 2023 measured profits and
| revenue of actual Kenyan SMBs. Roldan-Mones measured
| debate performance as judged by humans.
| bgwalter wrote:
| Thanks for the links. That should be obvious to anyone who
| believes that $70 billion datacenters (Meta) are needed and the
| investment will be amortized by subscriptions (in the case of
| Meta also by enhanced user surveillance).
|
| The means of production are in a small oligopoly, the rest will
| be redundant or exploitable sharecroppers.
|
| (All this under the assumption that "AI" works, which its
| proponents affirm in public at least.)
| devonbleak wrote:
| Yeah, the graphs make some really big assumptions that don't
| seem to be backed up anywhere except AI-maximalist headcanon.
|
| There's also a gap in addressing vibe-coded "side projects"
| that get deployed online as a business. Is the code base super
| large and complex? No. Is AI capable of taking input from a
| novice and making something "good enough" in this space? Also
| no.
| skhameneh wrote:
| The latter remarks rest on very strong assumptions that
| underestimate the power AI tools offer.
|
| AI tools are great at unblocking and helping their users
| explore beyond their own understanding. The tokens in are
| limited to the users' comprehension, but the tokens out are
| generated from a vast collection of greater comprehension.
|
| For the novice, it's great at unblocking and expanding
| capabilities. "Good enough" results from novices are
| tangible. There is no doubt the volume of "good enough" is
| perceived as very low by many.
|
| For large and complex codebases, unfortunately the effects of
| tech debt (read: objectively subpar practices) translate into
| context rot at development time. A properly architected and
| documented codebase that adheres to common, well-structured
| patterns can easily be broken down into small, easily
| digestible contexts. I.e., a fragmented codebase does not
| scale well with LLMs, because the fragmentation is seeding
| the context for the model. The model reflects and acts as an
| amplifier to what it's fed.
| verelo wrote:
| Oh man, I love this take. It's how I've been selling what I do
| when I speak with a specific segment of my audience: "My goal
| isn't to make the best realtors better, it's to make the worst
| realtors acceptable".
|
| And my client is often the brokerage, they just want their agents
| to produce commissions so they make a cut. They know their top
| producers probably won't get much from what I offer, but we all
| see that their worst performers could easily double their
| business.
| cropher wrote:
| Really liked this article.
|
| I wonder: the graphs treat learning with and without AI as two
| different paths. But obviously people can switch between learning
| methods or abandon one of them.
|
| Then again, I wonder how many people go from learning about a
| topic using LLMs to then leaving them behind to continue the old
| school way. I think the early spoils of LLM usage could poison
| your motivation to engage with the topic on your own later on.
| serial_dev wrote:
| I learn about different subjects mixing traditional resources
| and AI.
|
| I can watch a video about the subject; when I want to go
| deeper, I go to LLMs, throw a bunch of questions at them, because
| thanks to the videos I now know what to ask. Then the AI
| responses tell me what I need to understand deeper, so I pick a
| book that addresses those subjects. Then as I read the book and
| I don't understand something, or I have some questions that I
| want the answer for immediately, I consult ChatGPT (or any
| other tool I want to try). At different points in the journey,
| I find something I could build myself to deepen my
| understanding. I Google open-source implementations, read them,
| ask LLMs again, watch summary videos, and work my way through
| the problem.
|
| LLMs serve as a "much better StackOverflow / Google".
| bloomca wrote:
| I use a similar approach. I tried experimenting with going into
| a topic with no knowledge, and it kinda fumbles; I highly
| recommend having an overview first.
|
| But once you know the basics, LLMs are really good for deepening
| the knowledge, though using only them is quite challenging. As a
| complementary tool, I find them excellent.
| precompute wrote:
| I'd argue that AI reduces the distance between the floor and the
| ceiling, only both the floor and ceiling move -- the floor moves
| up, the ceiling downwards. Just using AI makes the floor move up,
| while over-reliance on it (a very personal metric) pushes the
| ceiling downwards.
|
| Unlike the telephone (telephones excited a certain class of
| people into believing that world-wide enlightenment was on their
| doorstep), LLMs don't just reduce reliance on visual tells and
| mannerisms, they reduce reliance on thinking itself. And that's a
| very dangerous slope to go down. What will happen to the next
| generation when their parents supply substandard socially-
| computed results of their mental work (aka language)? Culture
| will decay and societal norms will veer towards anti-
| civilizational trends. And that's exactly what we're witnessing
| these days. The things that were commonplace are now rare and
| sometimes mythic.
|
| Everyone has the same number of hours and days and years. Some
| people master some difficult, arcane field while others while it
| away in front of the television. LLMs make it easier for the
| television-watchers to experience "entertainment nirvana" while
| enticing the smart, hard-workers to give up their toil and engage
| in "just a little" rest, which, due to the insidious nature of
| AI-based entertainment, meshes more readily with their more
| receptive minds.
| guywithahat wrote:
| Wouldn't it be both by this definition? It raises the bar for
| people who maybe have a lower IQ ("mastery"), but people who can
| use AI can then do more than ever before, raising the ceiling as
| well.
| kruffalon wrote:
| Wouldn't "more" in this house metaphor be like expanding the
| floor rather than raising the ceiling?
| sabakhoj wrote:
| In things that I am comparatively good at (e.g., coding), I can
| see that it helps 'raise the ceiling' as a result of allowing me
| to complete more of the low-level tasks more effectively. But it
| is true as well that it hasn't raised my personal bar in
| capability, as far as I can measure.
|
| When it comes to things I am not good at, it has given me the
| illusion of getting 'up to speed' faster. Perhaps that's a
| personal ceiling raise?
|
| I think a lot of these upskilling utilities will come down to
| delivery format. If you use a chat that gives you answers, don't
| expect to get better at that topic. If you use a tool that forces
| you to come up with answers yourself and get personalized
| validation, you might find yourself leveling up.
| resters wrote:
| AI will be both a floor and a ceiling raiser, since there is a
| practical limit to how many domains one person or team can be
| expert in, and AI does/will have very strong levels of
| expertise/competency across a large number of domains and will
| thus _offer significant level-ups in areas where cross-domain
| synthesis is crucial_ or where the limits of human working memory
| and pattern recognition make cross-domain synthesis unlikely to
| occur.
|
| AI also enables much more efficient early-stage idea validation,
| the point at which ideas/projects are the least anchored in
| established theory/technique. Thus AI will be a great aid in idea
| generation and early-stage refinement, which is where most novel
| approaches stall or sit on a shelf as a hobby project because the
| progenitor doesn't have enough spare time to work through it.
| righthand wrote:
| It's definitely about wage stagnation.
| buffzebra wrote:
| Only the first two mastery-time graphs make sense.
| resters wrote:
| AI is going to cause a regression to the most anodyne output
| across many industries. As humans who had to develop analytical
| skills, writing skills, etc., we struggle to imagine the
| undeveloped brains of those who come of age in the zero-
| intellectual-gravity world of AI. OpenAI's study mode is at best
| a fig leaf.
|
| edit: this comment was posted tongue-in-cheek after my comment
| reflecting my actual opinion was downvoted with no rebuttals:
|
| https://news.ycombinator.com/item?id=44749957
| elcritch wrote:
| I would say the modern digital world itself has already had the
| bigger impact on human thinking, at least at work.
|
| It seems with computers we often think and reason far less than
| without. Everything required thought previously; now we can
| just copy and paste Word docs for everything. PowerPoints
| are how key decisions are communicated in most professional
| settings.
|
| Before modern computers and especially the internet we also had
| more time for deep thinking and reasoning. The sublimity of
| deep thought in older books amazes me, and it feels like modern
| authors are just slightly less deep on average.
|
| So LLMs are, in my view, an incremental change rather than a
| stepwise change with respect to their effects on human cognition.
|
| In some ways LLMs allow us to return a bit to more humanistic
| deep thinking. Instead of spending hours looking up minutiae on
| Google, StackOverflow, etc now we can ask our favorite LLM
| instead. It gives us responses with far less noise.
|
| Unlike with textbooks, we can have dialogues and have it take
| different perspectives, whereas textbooks only gave you the
| author's perspective.
|
| Of course, it's up to individuals to use them well, as a tool
| to sharpen thinking rather than replace it.
| michaelhoney wrote:
| I think all of this is true, but the shape of the chart changes
| as AI gets better.
|
| Think of how a similar chart for chess/Go/StarCraft-playing
| proficiency has changed over the years.
|
| There will come a time when the hardest work is being done by AI.
| Will that be three years from now or thirty? We don't know yet,
| but it will come.
___________________________________________________________________
(page generated 2025-07-31 23:01 UTC)