[HN Gopher] A bear case: My predictions regarding AI progress
       ___________________________________________________________________
        
       A bear case: My predictions regarding AI progress
        
       Author : suryao
       Score  : 176 points
       Date   : 2025-03-10 04:20 UTC (18 hours ago)
        
 (HTM) web link (www.lesswrong.com)
 (TXT) w3m dump (www.lesswrong.com)
        
       | stego-tech wrote:
       | > At some point there might be massive layoffs due to ostensibly
       | competent AI labor coming onto the scene, perhaps because OpenAI
       | will start heavily propagandizing that these mass layoffs must
       | happen. It will be an overreaction/mistake. The companies that
       | act on that will crash and burn, and will be outcompeted by
       | companies that didn't do the stupid.
       | 
       | We're already seeing this with tech doing RIFs and not
       | backfilling domestically for developer roles (the whole, "we're
       | not hiring devs in 202X" schtick), though the not-so-quiet secret
       | is that a lot of those roles just got sent overseas to save on
       | labor costs. The word from my developer friends is that they are
        | _sick and tired_ of having to force an (often junior/outsourced)
       | colleague to explain their PR or code, only to be told "it works"
       | and for management to overrule their concerns; this is embedding
       | AI slopcode into products, which I'm sure won't have _any lasting
       | consequences_.
       | 
       | My bet is that software devs who've been keeping up with their
       | skills will have another year or two of tough times, then back
       | into a cushy Aeron chair with a sparkling new laptop to do what
        | they do best: _write readable, functional, maintainable code_,
       | albeit in more targeted ways since - and I hate to be _that
       | dinosaur_ - LLMs produce passable code, provided a competent
       | human is there to smooth out its rougher edges and rewrite it to
       | suit the codebase and style guidelines (if any).
        
         | dartharva wrote:
         | One could argue that's not strictly "AI labor", just cheap (but
         | real) labor using shortcuts because they're not paid enough to
         | give a damn.
        
           | stego-tech wrote:
           | Oh, no, you're 100% right. One of these days I will pen my
           | essay on the realities of outsourced labor.
           | 
            | Spoiler alert: they give just barely enough effort to not get
            | fired prematurely, because they know that if you're cheap enough
           | to outsource in the first place, you'll give the contract to
           | whoever is cheapest at renewal anyway.
        
         | carlosdp wrote:
         | I'll take that bet, easily.
         | 
         | There's absolutely no way that we're not going to see a massive
         | reduction in the need for "humans writing code" moving forward,
         | given how good LLMs are getting at writing code.
         | 
         | That doesn't mean people won't need devs! I think there's a
         | real case where increased capabilities from LLMs leads to
         | bigger demand for people that know how to direct the tools
         | effectively, of which most would probably be devs. But thinking
         | we're going back to humans "writing readable, functional,
         | maintainable code" in two years is cope.
        
           | rahimnathwani wrote:
            | > increased capabilities from LLMs leads to bigger demand for
           | people that know how to direct the tools effectively
           | 
           | This is the key thing.
        
           | crote wrote:
           | > There's absolutely no way that we're not going to see a
           | massive reduction in the need for "humans writing code"
           | moving forward, given how good LLMs are getting at writing
           | code.
           | 
           | Sure, but in the same way that Squarespace and Wix killed web
           | development. LLMs are going to replace a decent bunch of low-
           | hanging fruit, but those jobs were always at risk of being
           | outsourced to the lowest bidder over in India anyways.
           | 
           | The real question is, what's going to happen to the interns
           | and the junior developers? If 10 juniors can create the same
            | output as a single average developer equipped with an LLM,
            | _who's going to hire the juniors_? And if nobody is hiring
           | juniors, how are we supposed to get the next generation of
           | seniors?
           | 
           | Similarly, what's going to happen to outsourcing? Will it be
           | able to compete on quality and price? Will it secretly turn
           | into nothing more than a proxy to some LLM?
        
           | torginus wrote:
           | Hate to be the guy to bring it up but Jevons paradox - in my
           | experience, people are much more eager to build software in
           | the LLM age, and projects are getting started (and done!)
           | that were considered 'too expensive to build' or people
           | didn't have the necessary subject matter expertise to build
           | them.
           | 
           | Just a simple crud-ish project needs frontend, backend,
            | infra, cloud, and ci/cd experience, and people who could
            | build that as a one-man show were like unicorns - a lot of
            | people had a general idea of how most of this stuff worked,
            | but lacked hands-on familiarity with it. LLMs made that
            | knowledge easy and accessible. They certainly did for me.
           | 
           | I've shipped more software in the past 1-2 years than the 5
           | years before that. And gained tons of experience doing it.
            | LLMs helped me figure out the necessary software and gain a
            | ton of experience along the way. I picked up all those
            | skills, and I feel quite confident that I could rebuild all
            | these apps, this time without the help of LLMs - so even the
            | fearmongering that LLMs will 'make people forget how to
            | code' doesn't seem to ring true.
        
         | fragmede wrote:
          | What lasting consequences? CrowdStrike and the 2017 Equifax
          | hack that leaked all our data didn't stop them. The
          | CrowdStrike shares I bought after the incident are up more
          | than the S&P 500. Elon went through Twitter and fired
          | everybody, but it hasn't collapsed. A carpenter has a lot of
          | opinions about the
         | woodworking used on cheap IKEA cabinets, but mass manufacturing
         | and plastic means that building a good solid high quality chair
         | is no longer the craft it used to be.
        
       | bloomingkales wrote:
       | Let's imagine that we all had a trillion dollars. Then we would
       | all sit around and go "well dang, we have everything, what should
       | we do?". I think you'll find that just about everyone would
       | agree, "we oughta see how far that LLM thing can go". We could be
       | in nuclear fallout shelters for decades, and I think you'll still
       | see us trying to push the LLM thing underground, through duress.
       | We dream of this, so the bear case is wrong in spirit. There's no
       | bear case when the spirit of the thing is that strong.
        
         | mola wrote:
          | Wdym all of us? I certainly would find much better uses for
          | the money.
         | 
          | What about reforming democracy? Use the corrupt system to buy
          | the votes, then abolish all the laws allowing the kind of
          | donations that buy votes.
         | 
          | I'll litigate the hell out of all the oligarchs now that they
          | can't outspend justice.
         | 
         | This would pay off more than a moon shot. I would give a bit of
         | money for the moon shot, why not, but not all of it.
        
           | jayemar wrote:
           | "So, after Rome's all yours you just give it back to the
           | people? Tell me why."
        
         | bookofjoe wrote:
         | leave dang out of this
        
       | liuliu wrote:
        | I think all these articles beg the question: what are the
        | author's credentials to claim these things?
       | 
       | Be careful about consuming information from chatters, not doers.
       | There is only knowledge from doing, not from pondering.
        
         | wavemode wrote:
         | I'm generally more skeptical when reading takes and predictions
         | from people working at AI companies, who have a financial
         | interest in making sure the hype train continues.
         | 
         | To make an analogy - most people who will tell you not to
         | invest in cryptocurrency are not blockchain engineers. But does
         | that make their opinion invalid?
        
           | JackYoustra wrote:
            | The crypto people have no coherent story about why crypto is
            | fundamentally earth-shaking beyond gambling or regulatory
            | avoidance, whereas the story for AI, if you believe it, is a
            | second industrial revolution and labor automation - and that
            | story, to at least some small extent, is undeniable.
        
           | liuliu wrote:
            | Of course I trust people who work on L2 chains to tell me
            | how to scale Bitcoin, and people who work on cryptography to
            | walk me through the ETH PoS algorithms.
           | 
           | You cannot lead to truth by learning from people who don't
           | know. People who know can be biased, sure, so the best way to
           | learn is to learn the knowledge, not the "hot-takes" or
           | "predictions".
        
         | friendzis wrote:
         | > Be careful about consuming information from chatters, not
         | doers
         | 
         | The doers produce a new javascript framework every week,
         | claiming it finally solves all the pains of previous
         | frameworks, whereas the chatters pinpoint all the deficiencies
         | and pain points.
         | 
         | One group has an immensely better track record than the other.
        
           | liuliu wrote:
            | I would listen to people who used the previous frameworks
            | talk about the deficiencies and pain points, not people who
            | just casually browse the documentation and offer high-flying
            | ideas about why these have deficiencies and pain points.
           | 
           | One group has an immensely more convincing power to me.
        
         | usaar333 wrote:
         | LW isn't a place that cares about credentialism.
         | 
         | He has tons of links for the objective statements. You either
         | accept the interpretation or you don't.
        
           | NitpickLawyer wrote:
           | > He has tons of links for the objective statements.
           | 
           | I stopped at this quote
           | 
           | > LLMs still seem as terrible at this as they'd been in the
           | GPT-3.5 age.
           | 
           | This is so plainly, objectively and quantitatively wrong that
           | I need not bother. I get hyperbole, but this isn't it. This
           | shows a doubling-down on biases that the author has, and no
           | amount of proof will change their mind. Not an article /
           | source for me, then.
        
       | viccis wrote:
       | Regarding "AGI", is there any evidence of true synthetic a priori
       | knowledge from an LLM?
        
         | cheevly wrote:
          | Produce true synthetic a priori knowledge of your own, and
          | I'll show you an automated LLM workflow that can arrive at the
          | same outcome without hints.
        
           | viccis wrote:
           | Build an LLM on a corpus with all documents containing
           | mathematical ideas removed. Not a single one about numbers,
           | geometry, etc. Now figure out how to get it to tell you what
           | the shortest path between two points in space is.
        
       | dartharva wrote:
       | >At some point there might be massive layoffs due to ostensibly
       | competent AI labor coming onto the scene, perhaps because OpenAI
       | will start heavily propagandizing that these mass layoffs must
       | happen. It will be an overreaction/mistake. The companies that
       | act on that will crash and burn, and will be outcompeted by
       | companies that didn't do the stupid.
       | 
       | (IMO) Apart from programmer assistance (which is already
       | happening), AI agents will find the most use in secretarial,
       | ghostwriting and customer support roles, which generally have a
       | large labor surplus and won't immediately "crash and burn"
       | companies even if there are failures. Perhaps if it's a new
       | startup or a small, unstable business on shaky grounds this could
       | become a "last straw" kind of a factor, but for traditional
       | corporations with good leeway I don't think just a few mistakes
       | about AI deployment can do too much harm. The potential benefits,
       | on the other hand, far outmatch the risk taken.
        
         | hattmall wrote:
          | I see engineering - not software, but the other technical
          | areas - as facing the biggest threat: high-paid, knowledge-
          | based fields that don't rely on interpersonal communication.
          | Secretarial and customer support roles less so; they aren't
          | terribly high-paid, and anything that relies on interacting
          | with people is going to meet a lot of pushback. US-based call
          | centers are already a big selling point for a lot of
          | companies, and chat bots have been around for years in
          | customer support; people hate them, and there's a long way to
          | go to change that perception.
        
       | readthenotes1 wrote:
        | LLMs seem less hyped than blockchains were back in the day
        
         | kfarr wrote:
          | Agreed, and unlike blockchain, people actually use this
          | product
        
           | randomNumber7 wrote:
           | Some people use blockchain to buy drugs...
        
       | n_ary wrote:
        | Hmm, I didn't read the article, but from the gist of the other
        | comments, we seem to have bought into Sama's "agents so good,
        | you don't need developers/engineers/support/secretaries/
        | whatever anymore". The issue is, it's almost the same as
        | claiming "pocket calculators are so good, we don't need
        | accountants anymore", or even "computers are so good, we don't
        | need accountants anymore". This AI claims to be the motor-car
        | moment when the horse cart got replaced. But the horse cart got
        | replaced by the taxi (and taxi drivers also have unions
        | protecting them!). With AI, all these "to be replaced" people
        | are like the accountants: more productive. It's the same as
        | with higher-level languages compared to assembly: many new devs
        | became productive. And despite cars replacing the horse carts
        | of the distant past, we still don't have self-driving cars;
        | someone still needs to learn to drive that massive hunk of
        | metal, just as whoever plans to deploy LLMs to lay off devs
        | must learn to drive those LLMs and know what they are doing.
        | 
        | I believe it is high time we come out of this madness and
        | reveal the lies of the marketers and grifters of AI for what
        | they are. If AI can replace anyone, it should begin with
        | doctors: they work from rote knowledge and provide a service
        | based on explicit (though ambiguous) inputs, same as an LLM
        | needs. But I still have doctors, and I wait for hours on end in
        | the waiting room to get prescribed a cough drop, only to come
        | back later because it was actually covid and my doctor had a
        | brain fart.
        
       | a-dub wrote:
       | > LLMs are not good in some domains and bad in others. Rather,
       | they are incredibly good at some specific tasks and bad at other
       | tasks. Even if both tasks are in the same domain, even if tasks A
       | and B are very similar, even if any human that can do A will be
       | able to do B.
       | 
        | i think this is true of ai/ml systems in general. we tend to
        | anthropomorphise their capability curves to match the cumulative
        | nature of human capabilities, when oftentimes the capability
        | curve of the machine is discontinuous and has surprising gaps.
        
       | worik wrote:
       | > It blows Google out of the water at being Google
       | 
       | That is enough for me.
        
         | mandevil wrote:
          | I sincerely wonder how long that will be true. Google in 1999
          | was amazing: nothing more than small, easily ignorable ads,
          | none of the tracking they do now, just an all-around better
          | experience than Google delivers today.
          | 
          | I'm not sure it's a technology difference that makes LLMs a
          | better experience than search today; it's that the VCs are
          | still willing to subsidize the user experience, and won't
          | start looking for a return on their investment for a few more
          | years. Give OpenAI 10 years to pull all the levers to pay back
          | the VC investment, and what will it be like?
        
           | timmy-turner wrote:
           | They will sell "training data slots". So that when I'm
           | looking for a butter cookie recipe, ChatGPT says I'll have to
           | use 100g of "Brand (TM) Butter" instead of just "Butter".
           | 
           | Ask it how to deploy an app to the cloud and it will insist
           | you need to deploy it to Azure.
           | 
           | These ads would be easily visible though. You can probably
           | sell far more malicious things.
        
       | andsoitis wrote:
       | This poetic statement by the author sums it up for me:
       | 
       |  _"People are extending LLMs a hand, hoping to pull them up to
       | our level. But there 's nothing reaching back."_
        
         | blitzar wrote:
          | When you (attempt to) save a person from drowning, there is a
          | ridiculously high chance of them drowning you.
        
           | nakedneuron wrote:
           | Haha.
           | 
           | Shame on you for making me laugh. That was very
           | inappropriate.
        
       | csomar wrote:
       | > LLMs still seem as terrible at this as they'd been in the
       | GPT-3.5 age. Software agents break down once the codebase becomes
       | complex enough, game-playing agents get stuck in loops out of
       | which they break out only by accident, etc.
       | 
        | This has been my observation. I got into GitHub Copilot as soon
        | as it launched, back when GPT-3 was the model. By that time
        | (late 2021) Copilot could already write tests for my Rust
        | functions, and simple documentation. _This_ was revolutionary.
        | We haven't had another similar moment since.
        | 
        | The GitHub Copilot vim plugin is always on. As you keep typing,
        | it keeps suggesting the rest of the context in faded text.
        | Because it is always on, I can kind of read into the AI "mind".
        | The more I coded, the more I realized it's just search with
        | structured results. The results got better with 3.5/4, but only
        | slightly after that, and sometimes not at all (i.e., 4o or o1).
       | 
        | I don't care what anyone says; just yesterday I made a comment
        | about how truth has essentially died:
        | https://news.ycombinator.com/item?id=43308513 If you have a
        | revolutionary intelligence product, why is it not working for
        | me?
        
         | kiratp wrote:
         | You're not using the best tools.
         | 
         | Claude Code, Cline, Cursor... all of them with Claude 3.7.
        
           | csomar wrote:
            | Nope. I try the latest models as they come, and I have a
            | self-made custom setup (as in, a custom Lua plugin) in
            | Neovim. What I am not doing is selling AI or AI-driven
            | solutions.
        
             | hattmall wrote:
              | Similar experience: I try so hard to make AI useful, and
              | there are some decent spots here and there. Overall,
              | though, I see the fundamental problem as being that people
              | need information. Language isn't strictly information; the
              | LLMs are very good at language, but they aren't great at
              | information. I think anything more than the novelty of
              | "talking" to the AI is very overhyped.
             | 
             | There is some usefulness to be had for sure, but I don't
             | know if the usefulness is there with the non-subsidized
             | models.
        
               | fragmede wrote:
               | what does subsidization have to do with your use of a
               | thing?
        
             | cheevly wrote:
             | Perhaps we could help if you shared some real examples of
             | what walls you're hitting. But it sounds like you've
             | already made up your mind.
        
             | RamtinJ95 wrote:
              | Do you mean that you have successfully managed to get the
              | same experience as Cursor, but in Neovim? I have been
              | looking for something like that to move back to my Neovim
              | setup instead of using Cursor. Any hints would be greatly
             | appreciated!
        
               | csomar wrote:
               | Start with Avante or CopilotChat. Create your own Lua
               | config/plugin (easy with Claude 3.5 ;) ) and then use
               | their chat window to run copilot/models. Most of my
               | custom config was built with Claude 3.5 and some
               | trial/error/success.
        
             | demosthanos wrote:
              | It's worth actually trying Cursor, because it _is_ a
              | valuable step change over previous products, and you might
              | find it's better in some ways than your custom setup. The
              | process they use for creating the context seems to be
              | really good. And their autocomplete is far better than
              | Copilot's in ways that could provide inspiration.
             | 
             | That said, you're right that it's not as overwhelmingly
             | revolutionary as the internet would lead you to believe.
             | It's a step change over Copilot.
        
             | kiratp wrote:
             | The entire wrapped package of tested prompts, context
             | management etc. is a whole step change from what you can
             | build yourself.
             | 
             | There is a reason Cursor is the fastest startup to $100M in
             | revenue, ever.
        
         | roncesvalles wrote:
         | The last line has been my experience as well. I only trust what
         | I've verified firsthand now because the Internet is just so
         | rife with people trying to influence your thoughts in a way
         | that benefits them, over a good faith sharing of the truth.
         | 
         | I just recently heard this quote from a clip of Jeff Bezos:
         | "When the data and the anecdotes disagree, the anecdotes are
         | usually right.", and I was like... wow. That quote is the
         | zeitgeist.
         | 
         | If it's so revolutionary, it should be immediately obvious to
         | me. I knew Uber, Netflix, Spotify were revolutionary the first
         | time I used them. With LLMs for coding, it's like I'm groping
         | in the dark trying to find what others are seeing, and it's
         | just not there.
        
           | roenxi wrote:
           | > I knew Uber, Netflix, Spotify were revolutionary the first
           | time I used them.
           | 
           | Maybe re-tune your revolution sensor. None of those are
           | revolutionary companies. Profitable and well executed, sure,
           | but those turn up all the time.
           | 
           | Uber's entire business model was running over the legal
           | system so quickly that taxi licenses didn't have time to
           | catch up. Other than that it was a pretty obvious idea. It is
           | a taxi service. The innovations they made were almost
           | completely legal ones; figuring out how to skirt employment
           | and taxi law.
           | 
            | Netflix was anticipated online by YouTube, and is probably
            | inferior to it except for the fact that they have a pretty
            | traditional content creator lab tacked on the side to do
            | their own programs. And torrenting had been a thing for a
            | long time already, showing how to do online distribution of
            | video content.
        
             | roncesvalles wrote:
              | They were revolutionary as product genres, not necessarily
              | as individual companies. Ordering a cab without making a
              | phone call was revolutionary. Netflix, at least with its
              | initial promise of having all the world's movies and TV,
              | was revolutionary, though it didn't live up to that.
              | Spotify, because of how cheap and easy it made access to
              | _all_ the music - this was the era when people were paying
              | 99c per song on iTunes.
             | 
             | I've tried some AI code completion tools and none of them
             | hit me that way. My first reaction was "nobody is actually
             | going to use this stuff" and that opinion hasn't really
             | changed.
             | 
             | And if you think those 3 companies weren't revolutionary
             | then AI code completion is even less than that.
        
               | xnx wrote:
               | > Ordering a cab without making a phone call was
               | revolutionary.
               | 
               | With the power of AI, soon you'll be able to say "Hey
               | Siri, get me an Uber to the airport". As easy as making a
               | phone call.
        
               | jemmyw wrote:
               | And end up at an airport in an entirely different city.
        
               | roncesvalles wrote:
               | There was a gain in precision going from phone call to
               | app. There is a loss of precision going from app to
               | voice. The tradeoff of precision for convenience is
               | rarely worth it.
               | 
               | Because if it were, Uber would just make a widget asking
               | "Where do you want to go?" and you'd enter "Airport" and
               | that would be it. If a widget of some action is a bad
               | idea, so is the voice command.
        
               | esafak wrote:
               | Easier, because you don't have to search for a phone
               | number.
        
               | alabastervlog wrote:
               | And they'll be able to tack an extra couple dollars onto
               | the price because that's a good signal you're not gonna
               | comparison shop.
               | 
               | Innovation!
        
               | nitwit005 wrote:
               | You can book a flight or a taxi with a personal assistant
               | app like Siri today. People don't seem very interested in
               | doing so.
               | 
               | Barring some sort of accessibility issue, it's far easier
               | to deal with a visual representation of complex schedule
               | information.
        
               | immibis wrote:
               | "Do something existing with a different mechanism" is
               | innovative, but not revolutionary, and certainly not a
               | new "product genre". My parents used to order pizza by
               | phone calls, then a website, then an app. It's the same
               | thing. (The friction is a little bit less, but maybe
               | forcing another human to bring food to you because you're
               | feeling lazy _should_ have a little friction. And as a
               | side effect, we all stopped being as comfortable talking
               | to real people on phone calls!)
               | 
               | Napster came before Spotify.
        
               | HelloMcFly wrote:
               | > innovative, but not revolutionary
               | 
               | The experience of Netflix, Spotify, and Uber were
               | revolutionary. It felt like the future, and it worked as
               | expected. Sure, we didn't realize the poison these
               | products were introducing into many creative and labor
               | ecosystems, nor did we fully appreciate how they would
               | operate as means to widen the income inequality gap by
               | concentrating more profits to executives. But they fit
               | cleanly into many of our lives immediately.
               | 
               | Debating whether that's "revolutionary" or "innovative"
               | or "whatever-other-word" is just a semantic sideshow
               | common to online discourse. It's missing the point. I'll
               | use whatever word you want, but it doesn't change the
               | point.
        
               | immibis wrote:
               | Making simple, small improvements _feel_ revolutionary is
               | good marketing.
        
               | HelloMcFly wrote:
               | "Simple, small" and "good marketing" seem like obvious
               | undersells considering the titanic impacts Netflix and
               | Spotify (for instance) have had on culture, personal
               | media consumption habits, and the economics of
               | industries. But if that's the semantic construction that
               | works for you, so be it.
        
               | rchaud wrote:
               | > They were revolutionary as product genres, not
               | necessary individual companies.
               | 
               | Even then, they were evolutionary at best.
               | 
                | Before Netflix and Spotify, streaming movies and music
                | already existed as a technology - ask anybody with a
                | Megaupload or Sopcast account. What changed was that the
                | DMCA acquired political muscle and cross-border reach,
                | wiping out waves of torrent sites and P2P networks. That
                | left a new generation of users on locked-down mobile
                | devices with no option but to use legitimate apps that
                | had deals in place with the record labels and movie
                | studios.
               | 
               | Even the concept of "downloading MP3s" disappeared
               | because every mobile OS vendor hated the idea of giving
               | their customers access to the filesystem, and iOS didn't
               | even have a file manager app until well into the next
               | decade (2017).
        
               | _Algernon_ wrote:
               | >every mobile OS vendor
               | 
               | Maybe half? Android has consistently had this capability
               | since its inception.
        
               | jimbokun wrote:
               | > streaming movies and music were already there as a
               | technology, ask anybody with a Megaupload or Sopcast
               | account.
               | 
               | You can't have a revolution without users. It's the
               | ability to reach a large audience, through superior UX,
               | superior business model, superior marketing, etc. which
               | creates the possibility for revolutionary impact.
               | 
               | Which is why Megaupload and Sopcast didn't revolutionize
               | anything.
        
               | Izkata wrote:
               | > What changed was that DMCA acquired political muscle
               | and cross-border reach, wiping out waves of torrent sites
               | and P2P networks.
               | 
               | Half true - that was happening some, but wasn't why music
               | piracy mostly died out. DMCA worked on centralized
               | platforms like YouTube, but the various avenues for
               | downloading music people used back then still exist,
               | they're just not used as much anymore. Spotify was proof
               | that piracy is mostly a service problem: it was suddenly
               | easier for most people to get the music they wanted
               | through official channels than through piracy.
        
             | csomar wrote:
             | > None of those are revolutionary companies.
             | 
              | Not only were Uber/Grab (and delivery apps) revolutionary,
              | they are still revolutionary. I could live without LLMs;
              | my life would be only slightly impacted when coding. If
              | delivery apps were not available, my life would be
              | _severely_ degraded. The
             | other day I was sick. I got medicine and dinner with Grab.
             | Delivered to the condo lobby which is as far as I can get.
             | That is revolutionary.
        
               | InfiniteTitan wrote:
               | Is it revolutionary to order from a screen rather than
               | calling a restaurant for delivery? I don't think so.
        
               | Dakizhu wrote:
               | Honestly, yes. Calling in an order can result in the
               | restaurant botching the order and you have no way to
               | challenge it unless you recorded the call. Also, as
               | someone who's been on both sides of the transaction, some
               | people have poor audio quality or speak accented English,
               | which is difficult to understand. Ordering from a screen
               | saves everyone valuable time and reduces confusion.
        
               | philwelch wrote:
               | I've had app delivery orders get botched, drivers get
               | lost on their way to my apartment, food show up cold or
               | ruined, etc.
               | 
               | The worst part is that when DoorDash fucks up an order,
               | the standard remediation process every other business
               | respects--either a full refund or come back, pick up the
               | wrong order, and bring you the correct order--is just not
               | something they ever do. And if you want to avoid
               | DoorDash, you can't because if you order from the
               | restaurant directly it often turns out to be white label
               | DoorDash.
               | 
               | Some days I wish there was a corporate death penalty and
               | that it could be applied to DoorDash.
        
               | fragmede wrote:
                | Practically or functionally? Airbnb was invented by
                | people posting on craigslist message boards, and even
                | existed before the Internet, if you had rich friends
                | with spare apartments. But by packaging it up into an
                | online platform it became a company with $2.5 billion in
                | revenue last year. So you can dismiss ordering from a
                | screen, instead of looking at a piece of paper and using
                | the phone, as not being revolutionary, because if you
                | squint they're the same thing. But I can now order
                | takeout from restaurants I previously would never have
                | ordered from, and Uber Eats generated $13.7 billion in
                | revenue last year, up from $12.2 billion.
        
               | rlnvlc wrote:
               | Were you not able to order food before Uber/Grab?
        
               | csomar wrote:
                | I am not in the US, and indeed it was not a thing
                | (though there was a pizza place that took phone orders,
                | but that's rather the exception).
        
               | sjsdaiuasgdia wrote:
               | Before the proliferation of Uber Eats, Doordash, GrubHub,
               | etc, most of the places I've lived had 2 choices for
               | delivered food: pizza and Chinese.
               | 
               | It has absolutely massively expanded the kinds of food I
                | can get delivered, living in a suburban, bordering-on-
                | rural area. It might be a different experience in cities
                | where
               | the population size made delivery reasonable for many
               | restaurants to offer on their own.
        
               | Rediscover wrote:
               | FWIW, local Yellow Cab et al, in the U.S., has been doing
               | that for /decades/ in the areas I've lived.
               | 
               | Rx medicine delivery used to be quite standard for taxis.
        
             | jimbokun wrote:
             | > The innovations they made were almost completely legal
             | ones; figuring out how to skirt employment and taxi law.
             | 
             | The impact of this was quite revolutionary.
             | 
             | > except for the fact that they have a pretty traditional
             | content creator lab tacked on the side to do their own
             | programs.
             | 
             | The way in which they did this was quite innovative, if not
             | "revolutionary". They used the data they had from the
             | watching habits of their large user base to decide what
             | kinds of content to invest in creating.
        
           | fragmede wrote:
           | > it's just not there
           | 
            | Build the much-maligned Todo app with Aider and Claude for
            | yourself. Give it one sentence and have it spit out working,
            | if imperfect, code. Iterate. Add a graph for completion or
            | something, and watch it pick and find a library without you
            | having to know the details of that library. Fine, sure, it's
            | just a Todo app, and it'll never work for a "real" codebase,
            | whatever that means. But holy shit, how much programming did
            | you used to need to get down and dirty with to build that
            | "simple" Todo app? Obviously building a Todo app before LLMs
            | was possible, but abstracted out, the fact that it can be
            | generated like that isn't a game changer?
        
           | mlsu wrote:
           | Revolutionary things are things that change how society
           | actually works at a fundamental level. I can think of four
           | technologies of the past 40 years that fit that bill:
           | 
           | the personal computer
           | 
           | the internet
           | 
           | the internet connected phone
           | 
           | social media
           | 
           | those technologies are revolutionary, because they caused
           | fundamental changes to how people behave. People who behaved
           | differently in the "old world" were _forced_ to adapt to a
           | "new world" with those technologies, whether they wanted to
            | or not. A newer, more convenient way of ordering a taxicab
            | or watching a movie or listening to music is a great
            | consumer product story, and certainly a big money maker.
            | They don't cause complex and not fully understood changes to
            | the way people work, play, interact, self-identify, etc. the
            | way that revolutionary
           | technologies do.
           | 
           | Language models _feel_ like they have the potential to be a
            | full-blown sociotechnological phenomenon like the above
            | four. They don't have a convenient consumer product story
            | beyond
           | ChatGPT today. But they are slowly seeping into the fabric of
           | things, especially on social media, and changing the way
           | people apply to jobs, draft emails, do homework, maybe
           | eventually communicate and self-identify at a basic level.
           | 
           | I'd almost say that the lack of a smash bang consumer product
           | story is even more evidence that the technology is diffusing
           | all over the place.
        
         | grumbel wrote:
          | While I don't disagree with that observation, it falls into
          | the "well, duh!" category for me. The models are built with no
          | mechanism for long-term memory and thus suck at tasks that
          | require long-term memory. There is nothing surprising here.
          | There was never any expectation that LLMs would magically
          | develop long-term memory, as that's impossible given the
          | architecture. They predict the next word, and once the old
          | text moves out of the context window, it's gone. The models
          | neither learn as they work nor remember the past.
         | 
         | It's not even like humans are all that different here. Strip a
         | human of their tools (pen&paper, keyboard, monitor, etc.) and
         | have them try solving problems with nothing but the power of
         | their brain and they'll struggle a hell of a lot too, since our
         | memory ain't exactly perfect either. We don't have perfect
         | recall, we look things up when we need to, a large part of our
         | "memory" is out there in the world around us, not in our head.
         | 
          | The open question is how to move forward. But calling AI
          | progress a dead end before we've even started exploring long-
          | term memory, tool use, and on-the-fly learning is a tad
          | premature. It's like calling it quits on the development of
          | the car before you put the wheels on.
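          | 
          | (A toy C++ sketch of that mechanism, to make it concrete - a
          | fixed-size context that silently drops the oldest turns. The
          | type and the character-based "token" counting are purely
          | illustrative, not any real inference API:)
          | 
          |     #include <cstddef>
          |     #include <deque>
          |     #include <string>
          |     
          |     // Toy context window: holds conversation turns up to a
          |     // token budget. Once the budget is exceeded, the oldest
          |     // turns are dropped for good - the model has no other
          |     // memory of them.
          |     struct ContextWindow {
          |         std::deque<std::string> turns;
          |         std::size_t budget;  // max total "tokens" (here:
          |                              // characters, for simplicity)
          |     
          |         void add(const std::string& turn) {
          |             turns.push_back(turn);
          |             while (total() > budget && !turns.empty())
          |                 turns.pop_front();  // earliest turn is gone,
          |                                     // permanently
          |         }
          |     
          |         std::size_t total() const {
          |             std::size_t n = 0;
          |             for (const auto& t : turns) n += t.size();
          |             return n;
          |         }
          |     };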
        
         | _huayra_ wrote:
         | Ultimately, every AI thing I've tried in this era seems to want
         | to make me happy, even if it's wrong, instead of helping me.
         | 
         | I describe it like "an eager intern who can summarize a 20-min
         | web search session instantly, but ultimately has insufficient
         | insight to actually help you". (Note to current interns: I'm
         | mostly describing myself some years ago; you may be fantastic
         | so don't take it personally!)
         | 
         | Most of my interactions with it via text prompt or builtin code
         | suggestions go like this:
         | 
         | 1. Me: I want to do X in C++. Show me how to do it only using
         | stdlib components (no external libraries).
         | 
         | 2. LLM: Gladly! Here is solution X
         | 
         | 3. Me: Remove the undefined behavior from foo() and fix the
         | methods that call it
         | 
         | 4. LLM: Sure! Here it is (produces solution X again)
         | 
         | 5. Me: No you need to remove the use of uninitialized variables
         | as the out parameters.
         | 
         | 6. LLM: Oh certainly! Here is the correct solution (produces a
         | completely different solution that also has issues)
         | 
         | 7. Me: No go back to the first one
         | 
         | etc
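          | 
          | (For reference, a minimal hypothetical reconstruction of the
          | kind of bug in steps 3-5; foo() is just the made-up name from
          | the dialogue above, not code an LLM actually produced:)
          | 
          |     #include <iostream>
          |     
          |     // Assigns the out parameter only on success, so on the
          |     // failure path the caller is left with whatever garbage
          |     // happened to be in the variable.
          |     bool foo(int& out) {
          |         bool ok = false;   // stand-in failure condition
          |         if (!ok)
          |             return false;  // bug: out is never written here
          |         out = 42;
          |         return true;
          |     }
          |     
          |     int main() {
          |         int value;           // uninitialized
          |         foo(value);          // return value ignored
          |         std::cout << value;  // UB: reads an uninitialized int
          |     }
          | 
          | The fix it kept missing: initialize at the declaration (int
          | value = 0;) and/or check foo()'s return before reading value.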
         | 
         | For the ones that suggest code, it can at least suggest some
         | very simple boilerplate very easily (e.g. gtest and gmock stuff
         | for C++), but asking it to do anything more significant is a
         | real gamble. Often I end up spending more time scrutinizing the
         | suggested code than writing a version of it myself.
        
           | rchaud wrote:
           | The difference is that interns can learn, and can benefit
           | from reference items like a prior report, whose format and
           | structure they can follow when working on the revisions.
           | 
           | AI is just AI. You can upload a reference file for it to
           | summarize, but it's not going to be able to look at the
           | structure of the file and use that as a template for future
           | reports. You'll still have to spoon-feed it constantly.
        
           | red-iron-pine wrote:
            | interns can generally also tell me "tbh i have no damn
            | idea", while AI just talks out its virtual ass, and I can't
            | read from its voice or behavior that maybe it's not sure.
           | 
           | interns can also be clever and think outside the box. this is
           | mostly not good, but sometimes they will surprise you in a
           | good way. the AI by definition can only copy what someone
           | else has done.
        
           | yifanl wrote:
           | 7 is the worst part about trying to review my coworker's code
           | that I'm 99% confident is copilot output - and to be clear, I
           | don't really care how someone chooses to write their code,
           | I'll still review it as evenly as I can.
           | 
           | I'll very rarely ask someone to completely rewrite a patch,
           | but so often a few minor comments get addressed with an
           | entire new block of code that forces me to do a full re-
           | review, and I can't get it across to him that that's not what
           | I'm asking for.
        
         | kledru wrote:
          | GitHub Copilot is a bit outdated technology, to be fair...
        
       | colonCapitalDee wrote:
       | Yeah, I'd buy it. I've been using Claude pretty intensively as a
       | coding assistant for the last couple months, and the limitations
       | are obvious. When the path of least resistance happens to be a
       | good solution, Claude excels. When the best solution is off the
        | beaten track, Claude struggles. When all the good solutions lie
        | off the beaten track, Claude falls flat on its face.
       | 
       | Talking with Claude about design feels like talking with that one
       | coworker who's familiar with every trendy library and framework.
       | Claude knows the general sentiment around each library and has
       | gone through the quickstart, but when you start asking detailed
       | technical questions Claude just nods along. I wouldn't bet money
       | on it, but my gut feeling is that LLMs aren't going to be a
       | straight or even curved shot to AGI. We're going to see plenty
        | more development in LLMs, but it'll just be that: better LLMs
       | that remain LLMs. There will be areas where progress is fast and
       | we'll be able to get very high intelligence in certain
       | situations, but there will also be many areas where progress is
       | slow, and the slow areas will cripple the ability of LLMs to
       | reach AGI. I think there's something fundamentally missing, and
       | finding what that "something" is is going to take us decades.
        
         | randomNumber7 wrote:
          | Yes, but on the other hand, I don't understand why people
          | think you can train something on pattern matching and it
          | magically becomes intelligent.
        
           | danielbln wrote:
           | We don't know what exactly makes us humans as intelligent as
            | we are. And while I don't think that LLMs will be generally
            | intelligent without some other advancements, I don't get the
           | confident statements that "clearly pattern matching can't
           | lead to intelligence" when we don't really know what leads to
           | intelligence to begin with.
        
             | nyrikki wrote:
             | We can't even define what intelligence is.
             | 
             | We know or have strong hints at the limits of
             | math/computation related to LLMs + CoT
             | 
              | Note how PARITY and MEDIAN are hard here:
             | 
             | https://arxiv.org/abs/2502.02393
             | 
             | We also know HALT == open frame == symbol grounding ==
             | system identification problems.
             | 
             | The definition of AGI is also not well defined, but given
             | the following:
             | 
             | > Strong AI, also called artificial general intelligence,
             | refers to machines possessing generalized intelligence and
             | capabilities on par with human cognition.
             | 
              | We know enough to say that, for _any mechanical method_ on
              | either current machines or even quantum machines, what is
              | needed is impossible under the above definition.
             | 
             | Walter Pitts drank himself to death, in part because of the
             | failure of the perceptron model.
             | 
             | Humans and machines are better at different things, and
             | while ANNs are inspired by biology, they are very
             | different.
             | 
             | There are some hints that the way biological neurons work
             | is incompatible with math as we know it.
             | 
             | https://arxiv.org/abs/2311.00061
             | 
             | Computation and machine learning are incredibly powerful
              | and useful, but are fundamentally different, and that
              | difference is both a benefit and a limit.
             | 
              | There are dozens of 'no effective procedure', 'no
              | approximation', etc. results that demonstrate that ML as we
              | know it today is incapable of most definitions of AGI.
             | 
             | That is why particular C* types shift the goal post,
             | because we know that the traditional definition of strong
             | AI is equivalent to solving HALT.
             | 
             | https://philarchive.org/rec/DIEEOT-2
             | 
              | There is another path following PAC learning as
              | compression, and NP being about finding parsimonious
              | reductions (P being in NP).
        
               | zero_bias wrote:
                | Humans can't solve NP-hard problems either, so the
                | definition of intelligence shouldn't lie there, and
                | these particular limits shouldn't matter either.
        
           | throw4847285 wrote:
           | This is the difference between the scientific approach and
           | the engineering approach. Engineers just need results. If
           | humans had to mathematically model gravity first, there would
            | be no pyramids. Plus, look at how many psychiatric
            | medications are demonstrably effective even though their
            | mechanisms of action are poorly understood. The flip side is
           | Newton doing alchemy or Tesla claiming to have built an
           | earthquake machine.
           | 
           | Sometimes technology far predates science and other times you
           | need a scientific revolution to develop new technology. In
           | this case, I have serious doubts that we can develop
           | "intelligent" machines without understanding the scientific
           | and even philosophical underpinnings of human intelligence.
           | But sometimes enough messing around yields results. I guess
           | we'll see.
        
         | danielbln wrote:
         | A tip: ask Claude to put a critical hat on. I find the output
         | afterwards to be improved.
        
           | mehphp wrote:
           | Do you have an example?
        
         | Paradigma11 wrote:
          | I am not so sure about that. Using Claude yesterday, I got a
          | correct function that returned an array. But the algorithm it
          | used did not produce the items sorted in one pass, so it had
          | to run a separate sort at the end. The fascinating thing is
          | that it realized that, commented on it, and went on to return
          | a single-pass function.
          | 
          | That seems a pretty human thought process, and it suggests
          | that fundamental improvements might depend less on the
          | quality of the LLM itself than on the cognitive structure it
          | is embedded in.
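          | 
          | (A hypothetical C++ reconstruction of the two versions - the
          | actual function isn't quoted here, so the filtering predicate
          | is made up:)
          | 
          |     #include <algorithm>
          |     #include <vector>
          |     
          |     // First attempt: collect matches, then sort in a
          |     // separate pass at the end - the extra pass it noticed.
          |     std::vector<int> collect_then_sort(
          |             const std::vector<int>& xs) {
          |         std::vector<int> out;
          |         for (int x : xs)
          |             if (x % 2 == 0) out.push_back(x);
          |         std::sort(out.begin(), out.end());
          |         return out;
          |     }
          |     
          |     // Revision: a single pass over the input, inserting each
          |     // match at its sorted position as it goes.
          |     std::vector<int> single_pass(const std::vector<int>& xs) {
          |         std::vector<int> out;
          |         for (int x : xs)
          |             if (x % 2 == 0)
          |                 out.insert(std::upper_bound(out.begin(),
          |                                             out.end(), x), x);
          |         return out;
          |     }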
        
           | jemmyw wrote:
           | I've been writing code that implements tournament algorithms
           | for games. You'd think an LLM would excel at this because it
            | can explain the algorithms to me. I've been using Cline on
            | lots of other tasks with varying success. But it just totally
           | failed with this one: it kept writing edge cases instead of a
           | generic implementation. It couldn't write coherent enough
           | tests across a whole tournament.
           | 
           | So I wrote tests thinking it could implement the code from
           | the tests, and it couldn't do that either. At one point it
           | went so far with the edge cases that it just imported the
           | test runner into the code so it could check the test name to
           | output the expected result. It's like working with a VW
           | engineer.
           | 
           | Edit: I ended up writing the code and it wasn't that hard, I
           | don't know why it struggled with this one task so badly. I
           | wasted far more time trying to make the LLM work than just
           | doing it myself.
        
       | gymbeaux wrote:
       | Yeah agree 100%. LLMs are overrated. I describe them as the "Jack
       | of all, master of none" of AI. LLMs are that jackass guy we all
       | know who has to chime in to every topic like he knows everything,
       | but in reality he's a fraud with low self-esteem.
       | 
       | I've known a guy since college who now has a PhD in something
       | niche, supposedly pulls a $200k/yr salary. One of our first
        | conversations (in college, circa 2014) was about how he had this
        | clever and easy way to mint money - by selling Minecraft servers
       | installed on Raspberry Pis. Some of you will recognize how
        | asinine this idea was and is. For everyone else - back then,
       | Minecraft only ran on x86 CPUs (and I doubt a Pi would make a
       | good Minecraft server today, even if it were economical). He had
       | no idea what he was talking about, he was just spewing shit like
       | he was God's gift. Actually, the problem wasn't that he had _no_
        | idea - it was that he knew a tiny bit, enough to sound smart to
        | an idiot (remind you of anyone?).
       | 
       | That's an LLM. A jackass with access to Google.
       | 
       | I've had great success with SLMs (small language models), and
       | what's more I don't need a rack of NVIDIA L40 GPUs to train and
       | use them.
        
       | usaar333 wrote:
       | Author also made a highly upvoted and controversial comment about
       | o3 in the same vein that's worth reading:
       | https://www.lesswrong.com/posts/Ao4enANjWNsYiSFqc/o3?comment...
       | 
        | Of course lesswrong, being heavily AI doomers, may be slightly
       | biased against near term AGI just from motivated reasoning.
       | 
       | Gotta love this part of the post no one has yet addressed:
       | 
       | > At some unknown point - probably in 2030s, possibly tomorrow
       | (but likely not tomorrow) - someone will figure out a different
       | approach to AI. Maybe a slight tweak to the LLM architecture,
       | maybe a completely novel neurosymbolic approach. Maybe it will
       | happen in a major AGI lab, maybe in some new startup. By default,
       | everyone will die in <1 year after that
        
         | demaga wrote:
         | I would expect similar doom predictions in the era of nuclear
         | weapon invention, but we've survived so far. Why do people
         | assume AGI will be orders of magnitude more dangerous than what
         | we already have?
        
           | amoss wrote:
           | Nuclear weapons are not self-improving or self-replicating.
        
             | colonial wrote:
             | Self-improvement (in the "hard takeoff" sense) is hardly a
             | given, and hostile self-replication is nothing special in
             | the software realm (see: worms.)
             | 
             | Any technically competent human knows the foolproof
             | strategy for malware removal - pull the plug, scour the
             | platter clean, and restore from backup. What makes an out-
             | of-control pile of matrix math any different from WannaCry?
             | 
             | AI doom scenarios _seem_ scary, but most are premised on
             | the idea that we can create an uncontainable, undefeatable
             | "god in a box." I reject such premises. The whole idea is
             | silly - Skynet Claude or whatever is not going to last very
             | long once I start taking an axe to the nearest power pole.
        
               | dsign wrote:
               | You have a point that a powerful malicious AI can still
               | be unplugged, if you are close to each and every power
               | cord that would feed it, and react and do the right thing
               | each and every time. Our world is far too big and too
               | complicated to guarantee that.
        
               | colonial wrote:
               | Again, that's the "god in a box" premise. In the real
               | world, you wouldn't need a perfectly timed and
               | coordinated response, just like we haven't needed one for
               | human-programmed worms.
               | 
               | Any threat can be physically isolated case-by-case at the
               | link layer, neutered, and destroyed. Sure, it could cause
               | some destruction in the meantime, but our digital
               | infrastructure can take a _lot_ of heat and bounce back -
               | the CrowdStrike outages didn't destroy the world, now
               | did they?
        
           | usaar333 wrote:
           | More ability to kill everyone. That's harder to do with
           | nukes.
           | 
           | That said, the actual forecast odds on metaculus are pretty
           | similar for nuclear and AI catastrophes:
           | https://possibleworldstree.com/
        
           | randomNumber7 wrote:
           | Most people are just ignorant and dumb; don't listen to it.
        
         | HelloMcFly wrote:
         | Was that comment intended seriously? I thought it was a wry
         | joke.
        
           | usaar333 wrote:
           | I think so. Thane is aligned with the high-p(doom) folks.
           | 
           | 1 year may be slightly exaggerated, but it aligns with his
           | view.
        
         | gwern wrote:
         | I never thought I'd see the day that LessWrong would be accused
         | of being biased _against_ near-term AGI forecasts (and for none
         | of the 5 replies to question this description either). But here
         | we are. Indeed do many things come to pass.
        
       | cglace wrote:
       | The thing I can't wrap my head around is that I work on extremely
       | complex AI agents every day and I know how far they are from
       | actually replacing anyone. But then I step away from my work and
       | I'm constantly bombarded with "agents will replace us".
       | 
       | I wasted a few days trying to incorporate aider and other tools
       | into my workflow. I had a simple screen I was working on for
       | configuring an AI Agent. I gave screenshots of the expected
       | output. Gave a detailed description of how it should work. Hours
       | later I was trying to tweak the code it came up with. I scrapped
       | everything and did it all myself in an hour.
       | 
       | I just don't know what to believe.
        
         | hattmall wrote:
         | There are some fields though where they can replace humans in
         | significant capacity. Software development is probably one of
         | the least likely for anything more than entry level, but A LOT
         | of engineering has a very very real existential threat. Think
         | about designing buildings. You basically just need to know a
         | lot of rules / tables and how things interact to know what's
         | possible and the best practices. A purpose built AI could
         | develop many systems and back test them to complete the design.
         | A lot of this is already handled or aided by software, but a
         | main role of the engineer is to interface with the non-
         | technical persons or other engineers. This is something where
         | an agent could truly interface with the non-engineer to figure
         | out what they want, then develop it and interact with the
         | design software quite autonomously.
         | 
         | I think there is a lot of focus on AI agents in software
         | development, though, because that's just an early-adopter
         | market - just like how it's always been possible to find a lot
         | of information on web development on the web!
        
           | drysine wrote:
           | >a main role of the engineer is to interface with the non-
           | technical persons or other engineers
           | 
           | The main role of the engineer is being responsible for the
           | building not collapsing.
        
             | randomNumber7 wrote:
             | ChatGPT will probably take more responsibility than Boeing
             | for their airplane software.
        
             | tobr wrote:
             | I keep coming back to this point. Lots of jobs are
             | fundamentally about taking responsibility. Even if AI were
             | to replace most of the work involved, only a human can
             | meaningfully take responsibility for the outcome.
        
               | dogmayor wrote:
               | I think about this a lot when it comes to self-driving
               | cars. Unless a manufacturer assumes liability, why would
               | anyone purchase one and subject themselves to potential
               | liability for something they by definition did not do?
               | This issue will be a big sticking point for adoption.
        
           | arkh wrote:
           | > just
           | 
           | In my experience this word means you don't know what you're
           | talking about. "Just" almost always hides a ton of unknown
           | unknowns. After being burned enough times, nowadays when I'm
           | about to use it I try to stop and start asking more
           | questions.
        
             | fragmede wrote:
             | It's a trick of human psychology. Asking "why don't you
             | just..." provokes one reaction, while asking "what are the
             | roadblocks to completing..." provokes a different one,
             | even though the answer is the same. But thinking "just" is
             | good when you see it as a learning opportunity.
        
           | gerikson wrote:
           | Most engineering fields are _de jure_ professional, which
           | means they can and probably will enforce limitations on the
           | use of GenAI or its successor tech before giving up that kind
           | of job security. Same goes for the legal profession.
           | 
           | Software development does not have that kind of protection.
        
             | red-iron-pine wrote:
             | For ~3 decades IT could pretend it didn't need unions
             | because wages and opportunities were good. Now the
             | pendulum is swinging back -- maybe they do need those
             | kinds of protections.
             | 
             | And professional orgs are more than just union-ish
             | cartels: they exist to ensure standards and enforce
             | responsibility on their members. Do shitty, unethical
             | stuff as a lawyer and you get disbarred; doctors lose
             | medical licenses, etc.
        
           | ForHackernews wrote:
           | Good freaking luck! The inconsistencies of the software world
           | pale in comparison to trying to construct any real world
           | building: http://johnsalvatier.org/blog/2017/reality-has-a-
           | surprising-...
        
           | seanhunter wrote:
           | > "you basically just need to know a lot of rules..."
           | 
           | This comment commits one of the most common fallacies that I
           | see really often in technical people, which is to assume that
           | any subject you don't know anything about must be really
           | simple.
           | 
           | I have no idea where this comment comes from, but my father
           | was a chemical engineer and his father was mechanical
           | engineer. A family friend is a structural engineer. I don't
           | have a perspective about AI replacing people's jobs in
           | general that is any more valuable than anyone else's, but I
           | can say with a great deal of confidence that in those three
           | engineering disciplines specifically, none of their jobs is
           | about knowing a bunch of rules and best practices.
           | 
           | Don't make the mistake of thinking that just because you
           | don't know what someone does, that their job is easy and/or
           | unnecessary or you could pick it up quickly. It may or may
           | not be true but assuming it to be the case is unlikely to
           | take you anywhere good.
        
         | spaceman_2020 wrote:
         | You're biased because if you're here, you're likely an A-tier
         | player used to working with other A-tier players.
         | 
         | But the vast majority of the world is not A players. They're B
         | and C players
         | 
         | I don't think the people evaluating AI tools have ever worked
         | in wholly mediocre organizations - or even know how many
         | mediocre organizations exist
        
           | code_for_monkey wrote:
           | Wish this didn't resonate with me so much. I'm far from a
           | 10x developer, and I'm in an organization that feels like a
           | giant, half-dead whale. Sometimes people here seem like they
           | work on a different planet.
        
         | cheevly wrote:
         | I promise the amount of time, experiments, and novel
         | approaches you've tested is 0.0001% of what others have
         | running in stealth projects. I've spent an average of 10 hours
         | per day, constantly, since 2022 working on LLMs, and I know
         | that even what I've built pales in comparison to other labs.
         | (And I'm well beyond agents at this point.) Agentic AI is
         | what's popular in the mainstream, but it's going to be
         | trounced by at least 2 new paradigms this year.
        
           | cglace wrote:
           | So what is your prediction?
        
           | handfuloflight wrote:
           | Say more.
        
             | zanfr wrote:
             | seems like OP ran out of tokens
        
         | aaronbaugher wrote:
         | It kind of reminds me of the Y2K scare. Leading up to that,
         | there were a _lot_ of people in groups like
         | comp.software.year-2000 who claimed to be doing Y2K fixes at
         | places like the IRS and big corporations. They said they were
         | just doing triage on the most critical systems, and that most
         | things wouldn 't get fixed, so there would be all sorts of
         | failures. The "experts" who were closest to the situation,
         | working on it in person, turned out to be completely wrong.
         | 
         | I try to keep that in mind when I hear people who work with
         | LLMs, who usually have an emotional investment in AI and often
         | a financial one, speak about them in glowing terms that just
         | don't match up with my own small experiments.
        
         | naasking wrote:
         | > But then I step away from my work and I'm constantly
         | bombarded with "agents will replace us".
         | 
         | An assembly language programmer might have said the same about
         | C programming at one point. I think the point is that once you
         | depend on a more abstract interface that lets you ignore
         | certain details, you enable decades of improvements to the
         | backend without having to do anything yourself. People are
         | still experimenting with what this abstract interface to AI is
         | and how it will work, but it has already come leaps and bounds
         | from where it was only a couple of years ago, and it's only
         | going to get better.
        
       | tibbar wrote:
       | LLMs make it very easy to cheat, both academically and
       | professionally. What this looks like in the workplace is a junior
       | engineer not understanding their task or how to do it but
       | stuffing everything into the LLM until lint passes. This breaks
       | the trust model: there are many requirements that are a little
       | hard to verify and that an LLM might miss, and the junior
       | engineer can now represent to you that they "did what you
       | asked" without
       | really certifying the work output. I believe that this kind of
       | professional cheating is just as widespread as academic cheating,
       | which is an epidemic.
       | 
       | What we really need is people who can certify that a task was
       | done correctly, who can use LLMs as an aid. LLMs simply cannot be
       | responsible for complex requirements. There is no way to hold
       | them accountable.
        
       | swazzy wrote:
       | I see no reason to believe the extraordinary progress we've seen
       | recently will stop or even slow down. Personally, I've benefited
       | so much from AI that it feels almost alien to hear people
       | downplaying it. Given the excitement in the field and the sheer
       | number of talented individuals actively pushing it forward, I'm
       | quite optimistic that progress will continue, if not accelerate.
        
         | danielbln wrote:
         | I hear you. I feel constantly bewildered by comments like
         | "LLMs haven't really changed since GPT-3.5." I mean, really?
         | They went from an exciting novelty to a core pillar of my
         | daily work; they've allowed me and my entire (granted, quite
         | senior) org to be incredibly more productive and creative with
         | our solutions.
         | 
         | And then I stumble across a comment where some LLM
         | hallucinated a library, which apparently means AI is clearly
         | useless.
        
         | Workaccount2 wrote:
         | If LLMs are bumpers on a bowling lane, HN is a forum of pro
         | bowlers.
         | 
         | Bumpers are not gonna make you a pro bowler. You aren't going
         | to be hitting tons of strikes. Most pro bowlers won't notice
         | any help from bumpers, except in some edge cases.
         | 
         | If you are an average joe however, and you need to knock over
         | pins with some level of consistency, then those bumpers are a
         | total revolution.
        
           | esafak wrote:
           | That is not a good analogy. They are closer to assistants to
           | me. If you know how and what to delegate, you can increase
           | your productivity.
        
       | spaceman_2020 wrote:
       | The impression I get from using all cutting edge AI tools:
       | 
       | 1. Sonnet 3.7 is a mid-level web developer at least
       | 
       | 2. DeepResearch is about as good an analyst as an MBA from a
       | school ranked 50+ nationally. Not lower than that. EY, not
       | McKinsey
       | 
       | 3. Grok 3/GPT-4.5 are good enough as $0.05/word article writers
       | 
       | It's not replacing the A-players, but it's good enough to
       | replace B players and definitely better than C and D players.
        
         | tcoff91 wrote:
         | A midlevel web developer should do a whole lot more than just
         | respond to chat messages and do exactly what they are told to
         | do and no more.
        
           | danielbln wrote:
           | When I use LLMs, that's what they do: spawn commands, edit
           | files, run tests, evaluate outputs, and iterate on solutions
           | under my guidance.
        
             | weweersdfsd wrote:
             | The key here is "under your guidance". LLMs are a major
             | productivity boost for many kinds of jobs, but can LLM-
             | based agents be trusted to act fully autonomously for
             | tasks with real-world consequences? I think the answer is
             | still no, and will be for a long time. I wouldn't trust an
             | LLM to even order my groceries without review, let alone
             | push code into production.
             | 
             | To reach anything close to the definition of AGI, LLM
             | agents should be able to independently talk to customers,
             | iteratively develop requirements, produce and test
             | solutions, and push them to production once customers are
             | happy. After that, they should be able to fix any issues
             | arising in production - all of this reliably, without
             | babysitting, review, or guidance from human devs.
        
         | id00 wrote:
         | I'd expect a mid-level developer to show more understanding
         | and better reasoning. So far it looks like a junior dev who
         | has read a lot of books and is good at copy-pasting from Stack
         | Overflow.
         | 
         | (Based on my everyday experience with Sonnet and Cursor.)
        
       | roenxi wrote:
       | This seems to be ignoring the major force driving AI right now -
       | hardware improvements. We've barely seen a new hardware
       | generation since ChatGPT was released to the market; we'd
       | certainly expect progress to plateau fairly quickly on fixed
       | hardware.
       | My personal experience of AI models is going to be a series of
       | step changes every time the VRAM on my graphics card doubles. Big
       | companies are probably going to see something similar each time a
       | new more powerful product hits the data centre. The algorithms
       | here aren't all that impressive compared to the creeping FLOPS/$
       | metric.
       | 
       | Bear cases always welcome. This wouldn't be the first time in
       | computing history that progress just falls off the exponential
       | curve suddenly. Although I would bet money on there being a few
       | years left on the curve, and on AGI being achieved.
        
         | notTooFarGone wrote:
         | Hardware improvements don't strike me as the horse to bet on.
         | 
         | LLM progress seems to be linear while the compute needed is
         | exponential, and I don't see exponential hardware improvements
         | coming, barring some new technology (which we should not bet
         | on arriving anytime soon).
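         | 
         | A toy illustration of that shape (assuming a Chinchilla-style
         | power law; the exponent here is invented for illustration):
         | 
         |       # loss ~ C ** -alpha: each constant step of improvement
         |       # costs a multiplicative (exponential) jump in compute
         |       alpha = 0.05
         |       for c in (1e21, 1e22, 1e23, 1e24):
         |           print(f"{c:.0e} FLOPs -> loss {c ** -alpha:.3f}")
         |       # 10x more compute per line, yet loss only inches down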
        
           | redlock wrote:
           | Moore's law is exponential
        
             | jimbokun wrote:
             | Was.
        
         | greazy wrote:
         | > Although I would bet money on there being a few years left
         | and AGI is achieved.
         | 
         | Yeah? I'll take you up on that offer. $100 AUD says AGI won't
         | happen this decade.
        
       | gmt2027 wrote:
       | The typical AI economic discussion always focuses on job loss,
       | but that's only half the story. We won't just have corporations
       | firing everyone while AI does all the work - who would buy their
       | products then?
       | 
       | The disruption goes both ways. When AI slashes production costs
       | by 10-100x, what's the value proposition of traditional capital?
       | If you don't need to organize large teams or manage complex
       | operations, the advantage of "being a capitalist" diminishes
       | rapidly.
       | 
       | I'm betting on the rise of independents and small teams. The idea
       | that your local doctor or carpenter needs VC funding or an IPO
       | was always ridiculous. Large corps primarily exist to organize
       | labor and reduce transaction costs.
       | 
       | The interesting question: when both executives and frontline
       | workers have access to the same AI tools, who wins? The manager
       | with an MBA or the person with practical skills and domain
       | expertise? My money's on the latter.
        
         | randomNumber7 wrote:
         | Idk where you live, but in my world "being a capitalist"
         | requires you to own capital. And you know what, AI makes it
         | even better to own capital. Now you have these fancy machines
         | doing stuff for you and you don't even need any annoying
         | workers.
        
           | gmt2027 wrote:
           | By "capitalist," I'm referring to investors whose primary
           | contribution is capital, not making a political statement
           | about capitalism itself.
           | 
           | Capital is crucial when tools and infrastructure are
           | expensive. Consider publishing: pre-internet, starting a
           | newspaper required massive investment in printing presses,
           | materials, staff, and distribution networks. The web reduced
           | these costs dramatically, allowing established media to cut
           | expenses and focus on content creation. However, this also
           | opened the door for bloggers and digital news startups to
           | compete effectively without the traditional capital
           | requirements. Many legacy media companies are losing this
           | battle.
           | 
           | Unless AI systems remain prohibitively expensive (which seems
           | unlikely given current trends), large corporations will face
           | a similar disruption. When the tools of production become
           | accessible to individuals and small teams, the traditional
           | advantage of having deep pockets diminishes significantly.
        
       | HenryBemis wrote:
       | My predictions on the matter:
       | 
       | - LLMs are already super useful. They do all my coding and
       | scripting for me at home, and most of the coding and scripting
       | at the workplace.
       | 
       | - They create 'fairly good' checklists for work (not perfect,
       | but they turn a 4-hour effort into 25 minutes - the "Pro" is
       | still needed to make this or that checklist usable; I call this
       | a win: you need both the tech AND the human).
       | 
       | - If/when you train an 'in-house' LLM it can score some easy
       | wins (at mega-big companies with 100k staff, people can get
       | quick answers to "which policy covers XYZ, which department can
       | I talk to about ABC", etc.).
       | 
       | - We won't have "AGI"/Skynet anytime soon, and when one does
       | exist the company (let's use OpenAI for example) will split in
       | two: half will sell LLMs to the masses at $100 per month, while
       | the "Skynet" goes to the DOD and we never hear about it again,
       | except as a rumor on the Joe Rogan podcast.
       | 
       | - It is a great idea generator (search engine and results
       | aggregator): give me a list of 10 things I can do _that_
       | weekend in _city_I_will_be_traveling_to, so if/when I go to
       | e.g. London: here are the cool concerts, theatrical
       | performances, parks, and so on.
        
       | audessuscest wrote:
       | > It seems to me that "vibe checks" for how smart a model feels
       | are easily gameable by making it have a better personality.
       | 
       | I don't buy that at all. Most of my use cases don't involve the
       | model's personality; if anything, I usually instruct it to skip
       | any commentary and give only the expected result. I'm sure most
       | people using AI models seriously would agree.
       | 
       | > My guess is that it's most of the reason Sonnet 3.5.1 was so
       | beloved. Its personality was made much more appealing, compared
       | to e. g. OpenAI's corporate drones.
       | 
       | I would actually guess it's mostly because it was good at code,
       | which doesn't involve much personality.
        
       | orangebread wrote:
       | I think the author provides an interesting perspective on the
       | AI hype; however, I think he is really downplaying the
       | effectiveness of what you can do with the current models we
       | have.
       | 
       | If you've been using LLMs effectively to build agents or AI-
       | driven workflows you understand the true power of what these
       | models can do. So in some ways the author is being a little
       | selective with his confirmation bias.
       | 
       | I promise you that if you do your due diligence in exploring the
       | horizon of what LLMs can do you will understand what I'm saying.
       | If y'all want a more detailed post, I can get into the AI
       | systems I have been building. Don't sleep on AI.
        
         | aetherson wrote:
         | I don't think he is downplaying the effectiveness of what you
         | can do with the current models. Rather, he's in a milieu
         | (LessWrong), which is laser-focused on "transformative" AI,
         | AGI, and ASI.
         | 
         | Current AI is clearly economically valuable, but if we freeze
         | everything at the capabilities it has today it is also clearly
         | not going to result in mass transformation of the economy from
         | "basically being about humans working" to "humans are
         | irrelevant to the economy." Lots of LW people believe that in
         | the next 2-5 years humans will become irrelevant to the
         | economy. He's arguing against that belief.
        
         | mbil wrote:
         | I agree with you. I recently wrote up my perspective here:
         | https://news.ycombinator.com/item?id=43308912
        
       | lackoftactics wrote:
       | >GPT-5 will be even less of an improvement on GPT-4.5 than
       | GPT-4.5 was on GPT-4. The pattern will continue for GPT-5.5 and
       | GPT-6, the ~1000x and 10000x models they may train by 2029 (if
       | they still have the money by then). Subtle quality-of-life
       | improvements and meaningless benchmark jumps, but nothing
       | paradigm-shifting.
       | 
       | It's easy to spot people who secretly hate LLMs and feel
       | threatened by them these days. GPT-5 will be a unified model,
       | very different from 4o or 4.5. Throwing around numbers related to
       | scaling laws shows a lack of proper research. Look at what
       | DeepSeek accomplished with far fewer resources; their paper is
       | impressive.
       | 
       | I agree that we need more breakthroughs to achieve AGI. However,
       | these models increase productivity, allowing people to focus more
       | on research. The number of highly intelligent people currently
       | working on AI is astounding, considering the number of papers and
       | new developments. In conclusion, we will reach AGI. It's a race
       | with high stakes, and history shows that these types of races
       | don't stop until there is a winner.
        
         | contagiousflow wrote:
         | > In conclusion, we will reach AGI
         | 
         | I'm a little confused by this confidence? Is there more
         | evidence aside from the number of smart people working on it?
         | We have a lot of smart people working on a lot of big problems,
         | that doesn't guarantee a solution nor a timeline.
        
         | rscho wrote:
         | It's also easy to spot irrational zealots. Your statement is no
         | more plausible than OP's. No one knows whether we'll achieve
         | AGI, especially since the definition is very blurry.
        
         | radioactivist wrote:
         | Some hard problems have remained unsolved in basically every
         | field of human interest for decades/centuries/millennia --
         | despite the number of intelligent people and/or resources that
         | have been thrown at them.
         | 
         | I really don't understand the level of optimism that seems to
         | exist for LLMs. And speculating that people "secretly hate
         | LLMs" and "feel threatened by them" isn't an answer (frankly,
         | when I see arguments that start with attacks like that alarm
         | bells start going off in my head).
        
         | ath3nd wrote:
         | I logged in to specifically downvote this comment, because it
         | attacks the OP's position with unjustified and unsubstantiated
         | confidence in the reverse.
         | 
         | > It's easy to spot people who secretly hate LLMs and feel
         | threatened by them these days.
         | 
         | I don't think OP is threatened or hates LLM, if anything, OP is
         | on the position that LLM are so far away from intelligence that
         | it's laughable to consider it threatening.
         | 
         | > In conclusion, we will reach AGI
         | 
         | The same way we "cured" cancer and Alzheimer's, two arguably
         | much more important endeavors than a glorified text
         | predictor/energy guzzler. But I like the confidence; it's
         | almost as much as OP's confidence that nothing substantial will
         | happen.
         | 
         | > It's a race with high stakes, and history shows that these
         | types of races don't stop until there is a winner.
         | 
         | So is the existential threat to humanity in the race to phase
         | out fossil fuels/stop global warming, and so far I don't see
         | anyone "winning".
         | 
         | > However, these models increase productivity, allowing people
         | to focus more on research
         | 
         | The same way the invention of the computer, the car, the vacuum
         | cleaner and all the productivity increasing inventions in the
         | last centuries allowed us to idle around, not have a job, and
         | focus on creative things.
         | 
         | > It's easy to spot people who secretly hate LLMs and feel
         | threatened by them these days
         | 
         | It's easy to spot e/acc bros feeling threatened that all the
         | money they sunk into crypto, AI, the metaverse, web3 are gonna
         | go to waste and try to fan the hype around it so they can cash
         | in big. How does that sound?
        
           | lackoftactics wrote:
           | I appreciate the pushback and acknowledge that my earlier
           | comment might have conveyed too much certainty--skepticism
           | here is justified and healthy.
           | 
           | However, I'd like to clarify why optimism regarding AGI isn't
           | merely wishful thinking. Historical parallels such as
           | heavier-than-air flight, Go, and protein folding illustrate
           | how sustained incremental progress combined with competition
           | can result in surprising breakthroughs, even where previous
           | efforts had stalled or skepticism seemed warranted. AI isn't
           | just a theoretical endeavor; we've seen consistent and
           | measurable improvements year after year, as evidenced by
           | Stanford's AI Index reports and emergent capabilities
           | observed at larger scales.
           | 
           | It's true that smart people alone don't guarantee success.
           | But the continuous feedback loop in AI research--where
           | incremental progress feeds directly into further research--
           | makes it fundamentally different from fields characterized by
           | static or singular breakthroughs. While AGI remains ambitious
           | and timelines uncertain, the unprecedented investment,
           | diversity of research approaches, and absence of known
           | theoretical barriers suggest the odds of achieving
           | significant progress (even short of full AGI) remain strong.
           | 
           | To clarify, my confidence isn't about exact timelines or
           | certainty of immediate success. Instead, it's based on
           | historical lessons, current research dynamics, and the
           | demonstrated trajectory of AI advancements. Skepticism is
           | valuable and necessary, but history teaches us to stay open
           | to possibilities that seem improbable until they become
           | reality.
           | 
           | P.S. I apologize if my comment particularly triggered you and
           | compelled you to log in and downvote. I am always open to
           | debate, and I admit again that I started too strongly.
        
             | ath3nd wrote:
             | I am with you that when smart people combine their efforts
             | and build on previous research and learnings, nothing is
             | impossible.
        
               | lackoftactics wrote:
               | I started the conversation off on the wrong foot.
               | Commenting with "ad hominem" shuts down open discussion.
               | 
               | I hope we can have a nice talk in future conversations.
        
       | mark_l_watson wrote:
       | I have used neural networks for engineering problems since the
       | 1980s. I say this as context for my opinion: I cringe at most
       | applications of LLMs that attempt mostly autonomous behavior, but
       | I love using LLMs as 'side kicks' as I work. If I have a bug in
       | my code, I will add a few printout statements where I think my
       | misunderstanding of my code is, show an LLM my code and output,
       | explain the error: I very often get useful feedback.
       | 
       | I also like practical tools like NotebookLM where I can pose
       | some questions, upload PDFs, and get a summary based on my
       | questions.
       | 
       | My point is: my brain and experience are often augmented in
       | efficient ways by LLMs.
       | 
       | So far I have addressed practical aspects of LLMs. I am retired
       | so I can spend time on non practical things: currently I am
       | trying to learn how to effectively use code generated by gemini
       | 2.0 flash at runtime; the gemini SDK supports this fairly well so
       | I am just trying to understand what is possible (before this I
       | spent two months experimenting with writing my own
       | tools/functions in Common Lisp and Python.)
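       | 
       | For the curious, the shape of that experiment is roughly as
       | follows (a sketch using the google-generativeai Python SDK; the
       | prompt and the bare exec are simplifications of what I actually
       | run, not a recommendation):
       | 
       |       import google.generativeai as genai
       |       
       |       genai.configure(api_key="YOUR_KEY")  # placeholder key
       |       model = genai.GenerativeModel("gemini-2.0-flash")
       |       
       |       resp = model.generate_content(
       |           "Write a Python function mean(xs) that returns the "
       |           "average of a list of numbers. Reply with only the "
       |           "code, no backticks.")
       |       
       |       ns = {}
       |       exec(resp.text, ns)           # run the generated code (a
       |                                     # toy; sandbox in real use)
       |       print(ns["mean"]([1, 2, 3]))  # 2.0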
       | 
       | I "wasted" close to two decades of my professional life on old
       | fashioned symbolic AI (but I was well paid for the work) but I am
       | interested in probabilistic approaches, such as in a book I
       | bought yesterday "Causal AI" that was just published.
       | 
       | Lastly, I think some of the recent open source implementations of
       | new ideas from China are worth carefully studying.
        
         | hangonhn wrote:
         | I'll add this in case it's helpful to anyone else: LLMs are
         | really good at regex and undoing various encodings/escaping,
         | especially nested ones. I would go so far to say that it's
         | better than a human at the latter.
         | 
         | I once spent over an hour trying to unescape JSON containing
         | UTF-8 values that had been escaped prior to being written to
         | AWS's CloudWatch Logs for MySQL audit logs. It was a horrific
         | level of pain until I just asked ChatGPT to do it; it figured
         | out the whole series of escapes and encodings immediately and
         | gave me the steps to reverse them all.
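         | 
         | The fix it found boils down to peeling one layer at a time -
         | something like this toy example (not the actual log format):
         | 
         |       import json
         |       
         |       # a JSON object that was JSON-string-encoded once more
         |       # before being written to the log
         |       raw = '"{\\"user\\": \\"caf\\u00e9\\", \\"ok\\": true}"'
         |       inner = json.loads(raw)     # undo the escaping layer
         |       record = json.loads(inner)  # parse the actual object
         |       print(record)               # {'user': 'café', 'ok': True}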
         | 
         | LLM as a sidekick has saved me so much time. I don't really use
         | it to generate code but for some odd tasks or API look up, it's
         | a huge time saver.
        
       | carlosdp wrote:
       | > At some point there might be massive layoffs due to ostensibly
       | competent AI labor coming onto the scene, perhaps because OpenAI
       | will start heavily propagandizing that these mass layoffs must
       | happen. It will be an overreaction/mistake. The companies that
       | act on that will crash and burn, and will be outcompeted by
       | companies that didn't do the stupid.
       | 
       | Um... I don't think companies are going to perform mass layoffs
       | because "OpenAI said they must happen". If that were to happen
       | it'd be because they are genuinely able to automate a ton of jobs
       | using LLMs, which would be a bull case (not for AGI necessarily,
       | but for the increased usefulness of LLMs)
        
         | AlotOfReading wrote:
         | I don't think LLMs need to be able to genuinely fulfill the
         | duties of a job to replace the human. Think call center workers
         | and insurance reviewers where the point is to meet metrics
         | without regard for the quality of the work performed. The main
         | thing separating those jobs from say, HR (or even programmers)
         | is how much the company cares about the quality of the work.
         | It's not hard to imagine a situation where misguided people try
         | to replace large numbers of federal employees with LLMs, as an
         | entirely hypothetical example.
        
       | Timber-6539 wrote:
       | AI has no meaningful input to real world productivity because it
       | is a toy that is never going to become the real thing that every
       | person who has naively bought the AI hype expects it to be. And
       | the end result of all the hype looks almost too predictable:
       | similar to how the once-promising crypto & blockchain
       | technology turned out.
        
       | mcintyre1994 wrote:
       | > Scaling CoTs to e. g. millions of tokens or effective-
       | indefinite-size context windows (if that even works) may or may
       | not lead to math being solved. I expect it won't.
       | 
       | > (If math is solved, though, I don't know how to estimate the
       | consequences, and it might invalidate the rest of my
       | predictions.)
       | 
       | What does it mean for math to be solved in this context? Is it
       | the idea that an AI will be able to generate any mathematical
       | proof? To take a silly example, would we get a proof of whether
       | P=NP from an AI that had solved math?
        
         | daveguy wrote:
         | I think "math is solved" refers more to AI performing math
         | studies at the level of a mathematics graduate student.
         | Obviously "math" won't ever be "solved" but the problem of AI
         | getting to a certain math proficiency level could be. No matter
         | how good an AI is, if P != NP it won't be able to prove P=NP.
         | 
         | Regardless I don't think our AI systems are close to a
         | proficiency breakthrough.
         | 
         | Edit: it is odd that "math is solved" is never explained, but
         | "proficient enough to do math research" makes the most sense
         | to me.
        
       | Imnimo wrote:
       | > Test-time compute/RL on LLMs:
       | 
       | > It will not meaningfully generalize beyond domains with easy
       | verification.
       | 
       | To me, this is the biggest question mark. If you could get good
       | generalized "thinking" from just training on math/code problems
       | with verifiers, that would be a huge deal. So far, generalization
       | seems to be limited. Is this because of a fundamental limitation,
       | or because the post-training sets are currently too small (or
       | otherwise deficient in some way) to induce good thinking
       | patterns? If the latter, is that fixable?
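       | 
       | For concreteness, "easy verification" means a programmatic
       | checker can grade the output exactly, giving RL a clean reward.
       | A toy sketch (illustrative only):
       | 
       |       # reward for arithmetic problems: an exact checker exists
       |       def reward(problem: str, answer: str) -> float:
       |           expected = eval(problem)  # e.g. problem = "17 * 23"
       |           try:
       |               return float(float(answer) == expected)
       |           except ValueError:
       |               return 0.0
       |       
       |       print(reward("17 * 23", "391"))  # 1.0
       |       # no comparable checker exists for "write a persuasive
       |       # memo" - hence the open question about generalization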
        
         | trashtester wrote:
         | > Is this because of a fundamental limitation, or because the
         | post-training sets are currently too small (or otherwise
         | deficient in some way) to induce good thinking patterns?
         | 
         | "Thinking" isn't a singular thing. Humans learn to think in
         | layer upon layer of understandig the world, physical, social
         | and abstract, all at many different levels.
         | 
         | Embodiment will allow them to use RL on the physical world, and
         | this in combination with access to not only means of
         | communication but also interacting in ways where there is skin
         | in the game, will help them navigate social and digital spaces.
        
       | bilsbie wrote:
       | I have times when I use an LLM and it's completely brain dead and
       | can't handle the simplest questions.
       | 
       | Then other times it blows me away. Even figuring out things that
       | can't possibly have been in its training data.
       | 
       | I think there are groups of people that have either had all of
       | the first experience or all of the latter. And that's why we see
       | overly optimistic and overly pessimistic takes (like this one).
       | 
       | I think the reality is current LLMs are better than he realizes,
       | and even if we plateau I really don't see how we don't make more
       | breakthroughs in the next few years.
        
       | klik99 wrote:
       | This is almost exactly what I've been saying while everyone
       | else was saying we're on the path to AGI in the next couple of
       | years: we're an innovation, tweak, or paradigm shift away from
       | AGI. His estimate that it could happen in the 2030s is plausible
       | but optimistic - you can't time new techniques, you can only
       | time iterative progress.
       | 
       | This is all the standard timeline for new technology - we enter
       | the diminishing returns period, investment slows down a year or
       | so afterwards, layoffs, contraction of industry, but when the
       | hype dies down the real utilitarian part of the cycle begins. We
       | start seeing it get integrated into the use cases it actually
       | fits well with, and within five years' time it's standard
       | practice.
       | 
       | This is a normal process for any useful technology (notably
       | crypto never found sustainable use cases so it's kind of the
       | exception, it's in superposition of lingering hype and complete
       | dismissal), so none of this should be a surprise to anyone. It's
       | funny that I've been saying this for so long that I've been
       | pegged an AI skeptic, but in a couple of years when everyone is
       | burnt out on AI hype it'll sound like a positive view. The truth
       | is, hype serves a purpose for new technology, since it kicks off
       | a wide search for every crazy use case, most of which won't work.
       | But the places where it does work will stick around
        
       | JTbane wrote:
       | Anyone else feel like AI is a trap for developers? I feel like
       | I'm alone in the opinion it decreases competence. I guess I'm a
       | mid-level dev (5 YOE at one company) and I tend to avoid it.
        
       ___________________________________________________________________
       (page generated 2025-03-10 23:02 UTC)