[HN Gopher] A bear case: My predictions regarding AI progress
___________________________________________________________________
A bear case: My predictions regarding AI progress
Author : suryao
Score : 176 points
Date : 2025-03-10 04:20 UTC (18 hours ago)
(HTM) web link (www.lesswrong.com)
(TXT) w3m dump (www.lesswrong.com)
| stego-tech wrote:
| > At some point there might be massive layoffs due to ostensibly
| competent AI labor coming onto the scene, perhaps because OpenAI
| will start heavily propagandizing that these mass layoffs must
| happen. It will be an overreaction/mistake. The companies that
| act on that will crash and burn, and will be outcompeted by
| companies that didn't do the stupid.
|
| We're already seeing this with tech doing RIFs and not
| backfilling domestically for developer roles (the whole, "we're
| not hiring devs in 202X" schtick), though the not-so-quiet secret
| is that a lot of those roles just got sent overseas to save on
| labor costs. The word from my developer friends is that they are
| _sick and tired_ of having to force a (often junior /outsourced)
| colleague to explain their PR or code, only to be told "it works"
| and for management to overrule their concerns; this is embedding
| AI slopcode into products, which I'm sure won't have _any lasting
| consequences_.
|
| My bet is that software devs who've been keeping up with their
| skills will have another year or two of tough times, then back
| into a cushy Aeron chair with a sparkling new laptop to do what
| they do best: _write readable, functional, maintainable code_ ,
| albeit in more targeted ways since - and I hate to be _that
| dinosaur_ - LLMs produce passable code, provided a competent
| human is there to smooth out its rougher edges and rewrite it to
| suit the codebase and style guidelines (if any).
| dartharva wrote:
| One could argue that's not strictly "AI labor", just cheap (but
| real) labor using shortcuts because they're not paid enough to
| give a damn.
| stego-tech wrote:
| Oh, no, you're 100% right. One of these days I will pen my
| essay on the realities of outsourced labor.
|
| Spoiler alert: they are giving just barely enough to not get
| prematurely fired, because they know if you're cheap enough
| to outsource in the first place, you'll give the contract to
| whoever is cheapest at renewal anyway.
| carlosdp wrote:
| I'll take that bet, easily.
|
| There's absolutely no way that we're not going to see a massive
| reduction in the need for "humans writing code" moving forward,
| given how good LLMs are getting at writing code.
|
| That doesn't mean people won't need devs! I think there's a
| real case where increased capabilities from LLMs leads to
| bigger demand for people that know how to direct the tools
| effectively, of which most would probably be devs. But thinking
| we're going back to humans "writing readable, functional,
| maintainable code" in two years is cope.
| rahimnathwani wrote:
| increased capabilities from LLMs leads to bigger demand for
| people that know how to direct the tools effectively
|
| This is the key thing.
| crote wrote:
| > There's absolutely no way that we're not going to see a
| massive reduction in the need for "humans writing code"
| moving forward, given how good LLMs are getting at writing
| code.
|
| Sure, but in the same way that Squarespace and Wix killed web
| development. LLMs are going to replace a decent bunch of low-
| hanging fruit, but those jobs were always at risk of being
| outsourced to the lowest bidder over in India anyways.
|
| The real question is, what's going to happen to the interns
| and the junior developers? If 10 juniors can create the same
| output as a single average developer equipped with a LLM,
| _who 's going to hire the juniors_? And if nobody is hiring
| juniors, how are we supposed to get the next generation of
| seniors?
|
| Similarly, what's going to happen to outsourcing? Will it be
| able to compete on quality and price? Will it secretly turn
| into nothing more than a proxy to some LLM?
| torginus wrote:
| Hate to be the guy to bring it up but Jevons paradox - in my
| experience, people are much more eager to build software in
| the LLM age, and projects are getting started (and done!)
| that were considered 'too expensive to build' or people
| didn't have the necessary subject matter expertise to build
| them.
|
| Just a simple crud-ish project needs frontend, backend,
| infra, cloud, ci/cd experience, and people who could build
| that as one man shows were like unicorns - a lot of people
| had a general how most of this stuff worked, but lacked the
| hands on familiarity with them. LLMs made that knowledge easy
| and accessible. They certainly did for me.
|
| I've shipped more software in the past 1-2 years than the 5
| years before that. And gained tons of experience doing it.
| LLMs helped me figure out the necessary software, and helped
| me gain a ton of experience, I gained all those skills, and I
| feel quite confident in that I could rebuild all these apps,
| but this time without the help of these LLMs, so even the
| fearmongering that LLMs will ;make people forget how to code'
| doesn't seem to ring true.
| fragmede wrote:
| What lasting consequences? Crowdstrike and the 2017 Equifax
| hack that leaked all our data didn't stop them. The shares of
| crowdstrike after it happened I bought are up more than the
| SP500. Elon went through Twitter and fired everybody but it
| hasn't collapsed. A carpenter has a lot of opinions about the
| woodworking used on cheap IKEA cabinets, but mass manufacturing
| and plastic means that building a good solid high quality chair
| is no longer the craft it used to be.
| bloomingkales wrote:
| Let's imagine that we all had a trillion dollars. Then we would
| all sit around and go "well dang, we have everything, what should
| we do?". I think you'll find that just about everyone would
| agree, "we oughta see how far that LLM thing can go". We could be
| in nuclear fallout shelters for decades, and I think you'll still
| see us trying to push the LLM thing underground, through duress.
| We dream of this, so the bear case is wrong in spirit. There's no
| bear case when the spirit of the thing is that strong.
| mola wrote:
| Wdym all of us? I certainly would find much better usages for
| the money.
|
| What about reforming democracy? Use the corrupt system to buy
| the votes, then abolish all laws allowing these kind of
| donations that allow buying votes.
|
| I'll litigate the hell out of all the oligarchs now that they
| can't out pay justice.
|
| This would pay off more than a moon shot. I would give a bit of
| money for the moon shot, why not, but not all of it.
| jayemar wrote:
| "So, after Rome's all yours you just give it back to the
| people? Tell me why."
| bookofjoe wrote:
| leave dang out of this
| liuliu wrote:
| I think all these articles begging the question: what's author's
| credential to claim these things.
|
| Be careful about consuming information from chatters, not doers.
| There is only knowledge from doing, not from pondering.
| wavemode wrote:
| I'm generally more skeptical when reading takes and predictions
| from people working at AI companies, who have a financial
| interest in making sure the hype train continues.
|
| To make an analogy - most people who will tell you not to
| invest in cryptocurrency are not blockchain engineers. But does
| that make their opinion invalid?
| JackYoustra wrote:
| The crypto people have no coherent story about why crypto is
| fundamentally earth-shaking more than a story about either
| gambling or regulatory avoidance, whereas the story for AI,
| if you believe it, is a second industrial revolution and
| labor automation where, to at least some small extent, it is
| undeniable.
| liuliu wrote:
| Of course I trust people who working on L2 chains to tell me
| how to scale Bitcoin and people who working on cryptography
| to walk me through the ETH PoS algorithms.
|
| You cannot lead to truth by learning from people who don't
| know. People who know can be biased, sure, so the best way to
| learn is to learn the knowledge, not the "hot-takes" or
| "predictions".
| friendzis wrote:
| > Be careful about consuming information from chatters, not
| doers
|
| The doers produce a new javascript framework every week,
| claiming it finally solves all the pains of previous
| frameworks, whereas the chatters pinpoint all the deficiencies
| and pain points.
|
| One group has an immensely better track record than the other.
| liuliu wrote:
| I would listen to people who used the previous frameworks
| about the deficiencies and pain points, not people who just
| casually browse the documentation about their high-flying
| ideas why these have deficiencies and pain points.
|
| One group has an immensely more convincing power to me.
| usaar333 wrote:
| LW isn't a place that cares about credentialism.
|
| He has tons of links for the objective statements. You either
| accept the interpretation or you don't.
| NitpickLawyer wrote:
| > He has tons of links for the objective statements.
|
| I stopped at this quote
|
| > LLMs still seem as terrible at this as they'd been in the
| GPT-3.5 age.
|
| This is so plainly, objectively and quantitatively wrong that
| I need not bother. I get hyperbole, but this isn't it. This
| shows a doubling-down on biases that the author has, and no
| amount of proof will change their mind. Not an article /
| source for me, then.
| viccis wrote:
| Regarding "AGI", is there any evidence of true synthetic a priori
| knowledge from an LLM?
| cheevly wrote:
| Produce true synthetic a priori knowledge of your own, and ill
| show you an automated LLM workflow that can arrive at the same
| outcome without hints.
| viccis wrote:
| Build an LLM on a corpus with all documents containing
| mathematical ideas removed. Not a single one about numbers,
| geometry, etc. Now figure out how to get it to tell you what
| the shortest path between two points in space is.
| dartharva wrote:
| >At some point there might be massive layoffs due to ostensibly
| competent AI labor coming onto the scene, perhaps because OpenAI
| will start heavily propagandizing that these mass layoffs must
| happen. It will be an overreaction/mistake. The companies that
| act on that will crash and burn, and will be outcompeted by
| companies that didn't do the stupid.
|
| (IMO) Apart from programmer assistance (which is already
| happening), AI agents will find the most use in secretarial,
| ghostwriting and customer support roles, which generally have a
| large labor surplus and won't immediately "crash and burn"
| companies even if there are failures. Perhaps if it's a new
| startup or a small, unstable business on shaky grounds this could
| become a "last straw" kind of a factor, but for traditional
| corporations with good leeway I don't think just a few mistakes
| about AI deployment can do too much harm. The potential benefits,
| on the other hand, far outmatch the risk taken.
| hattmall wrote:
| I see engineering, not software, but the other technical areas
| that have the biggest threat. High paid, knowledge based
| fields, but not reliant on interpersonal communication.
| Secretarial and customer support less so, they aren't terribly
| high paid and anything that relies on interacting with people
| is going to meet a lot of pushback. US based call centers is
| already a big selling point for a lot of companies and chat
| bots have been around for years in customer support and people
| hate them and there's a long way to go to change that
| perception.
| readthenotes1 wrote:
| LLMs seem less hyped than block chains were back in the day
| kfarr wrote:
| Agreed and unlike blockchain people actually use this product
| randomNumber7 wrote:
| Some people use blockchain to buy drugs...
| n_ary wrote:
| Hmm, I didn't read the article but from the gist of other
| comments, we seem to have bought into Sama's "agents so good, you
| don't need developers/engineers/support/secretaries/whatever
| anymore". Issue is, it is almost same as claiming, pocket
| calculators so good, we don't need accountants anymore, even
| computers so good, we don't need accountants anymore. This AI
| seems to claim to be that motor car moment when horse cart got
| replaced. But a horse cart got replaced with a Taxi(and they also
| have unions protecting them!). With AI, all these "to be
| replaced" people are like accountants, more productive, same as
| with higher level languages compared to assembly, many new devs
| are productive. Despite cars replacing the horse carts of the
| long past, we still fail to have self driving cars and still
| someone needs to learn to drive that massive hunk of metal, same
| as whoever plans to deploy LLM to layoff devs must learn to drive
| those LLMs and know what it is doing.
|
| I believe it is high time we come out this madness and reveal the
| lies of the marketers and grifters of AI for what it is. If AI
| can replace anyone, it should begin with doctors, they work with
| rote knowledge and service based on explicit(though ambiguous)
| inputs, same as an LLM needs, but I still have doctors and wait
| for hours on end in the waiting room to get prescribed a cough
| hard candy only to later comeback again because it was actually
| covid and my doctor had a brain fart.
| a-dub wrote:
| > LLMs are not good in some domains and bad in others. Rather,
| they are incredibly good at some specific tasks and bad at other
| tasks. Even if both tasks are in the same domain, even if tasks A
| and B are very similar, even if any human that can do A will be
| able to do B.
|
| i think this is true of ai/ml systems in general. we tend to
| anthropomorphise their capability curves to match the cumulative
| nature of human capabilities, where often times the capability
| curve of the machine is discontinuous and has surprising gaps.
| worik wrote:
| > It blows Google out of the water at being Google
|
| That is enough for me.
| mandevil wrote:
| I sincerely wonder how long that will be true. Google was
| amazing and didn't have more than small, easily ignorable ads
| in 1999, and they weren't really tracking you the way they are
| today, just an all-around better experience than Google
| delivers today.
|
| I'm not sure that it's a technology difference that makes LLM a
| better experience than search today, it's that the VC's are
| still willing to subsidize user experience today, and won't
| start looking for return on their investment for a few more
| years. Give OpenAI 10 years to pull all the levers to pay back
| the VC investment and what will it be like?
| timmy-turner wrote:
| They will sell "training data slots". So that when I'm
| looking for a butter cookie recipe, ChatGPT says I'll have to
| use 100g of "Brand (TM) Butter" instead of just "Butter".
|
| Ask it how to deploy an app to the cloud and it will insist
| you need to deploy it to Azure.
|
| These ads would be easily visible though. You can probably
| sell far more malicious things.
| andsoitis wrote:
| This poetic statement by the author sums it up for me:
|
| _"People are extending LLMs a hand, hoping to pull them up to
| our level. But there 's nothing reaching back."_
| blitzar wrote:
| When you (attempt to) save a person from drowning there is
| ridiculously high chance of them drowning you.
| nakedneuron wrote:
| Haha.
|
| Shame on you for making me laugh. That was very
| inappropriate.
| csomar wrote:
| > LLMs still seem as terrible at this as they'd been in the
| GPT-3.5 age. Software agents break down once the codebase becomes
| complex enough, game-playing agents get stuck in loops out of
| which they break out only by accident, etc.
|
| This has been my observation. I got into Github Copilot as early
| as it launched back when GPT-3 was the model. By that time (late
| 2021) copilot can already write tests for my Rust functions, and
| simple documentation. _This_ was revolutionary. We didn 't have
| another similar moment since then.
|
| The Github copilot vim plugin is always on. As you keep typing,
| it keeps suggesting in faded text the rest of the context.
| Because it is always on, I kind of can read into the AI "mind".
| The more I coded, the more I realized it's just search with
| structured results. The results got better with 3.5/4 but after
| that only slightly and sometimes not quite (ie: 4o or o1).
|
| I don't care what anyone says, as yesterday I made a comment that
| truth has essentially died:
| https://news.ycombinator.com/item?id=43308513 If you have a
| revolutionary intelligence product, why is it not working for me?
| kiratp wrote:
| You're not using the best tools.
|
| Claude Code, Cline, Cursor... all of them with Claude 3.7.
| csomar wrote:
| Nope. I try the latest models as they come and I have a self-
| made custom setup (as in a custom lua plugin) in Neovim. What
| I am not, is selling AI or AI-driven solutions.
| hattmall wrote:
| Similar experience, I try so hard to make AI useful, and
| there are some decent spots here and there. Overall though
| I see the fundamental problem being that people need
| information. Language isn't strictly information, and the
| LLMs are very good at language, but they aren't great at
| information. I think anything more than the novelty of
| "talking" to the AI is very over hyped.
|
| There is some usefulness to be had for sure, but I don't
| know if the usefulness is there with the non-subsidized
| models.
| fragmede wrote:
| what does subsidization have to do with your use of a
| thing?
| cheevly wrote:
| Perhaps we could help if you shared some real examples of
| what walls you're hitting. But it sounds like you've
| already made up your mind.
| RamtinJ95 wrote:
| Do you mean that you have successfully managed to get the
| same experience in cursor but in neovim? I have been
| looking for something like that to move back to my neovim
| setup instead of using cursor. Any hints would be greatly
| appreciated!
| csomar wrote:
| Start with Avante or CopilotChat. Create your own Lua
| config/plugin (easy with Claude 3.5 ;) ) and then use
| their chat window to run copilot/models. Most of my
| custom config was built with Claude 3.5 and some
| trial/error/success.
| demosthanos wrote:
| It's worth actually trying Cursor, because it _is_ a
| valuable step change over previous products and you might
| find it 's better in some ways than your custom setup. The
| processes they use for creating the context seems to be
| really good. And their autocomplete is far better than
| Copilot's in ways that could provide inspiration.
|
| That said, you're right that it's not as overwhelmingly
| revolutionary as the internet would lead you to believe.
| It's a step change over Copilot.
| kiratp wrote:
| The entire wrapped package of tested prompts, context
| management etc. is a whole step change from what you can
| build yourself.
|
| There is a reason Cursor is the fastest startup to $100M in
| revenue, ever.
| roncesvalles wrote:
| The last line has been my experience as well. I only trust what
| I've verified firsthand now because the Internet is just so
| rife with people trying to influence your thoughts in a way
| that benefits them, over a good faith sharing of the truth.
|
| I just recently heard this quote from a clip of Jeff Bezos:
| "When the data and the anecdotes disagree, the anecdotes are
| usually right.", and I was like... wow. That quote is the
| zeitgeist.
|
| If it's so revolutionary, it should be immediately obvious to
| me. I knew Uber, Netflix, Spotify were revolutionary the first
| time I used them. With LLMs for coding, it's like I'm groping
| in the dark trying to find what others are seeing, and it's
| just not there.
| roenxi wrote:
| > I knew Uber, Netflix, Spotify were revolutionary the first
| time I used them.
|
| Maybe re-tune your revolution sensor. None of those are
| revolutionary companies. Profitable and well executed, sure,
| but those turn up all the time.
|
| Uber's entire business model was running over the legal
| system so quickly that taxi licenses didn't have time to
| catch up. Other than that it was a pretty obvious idea. It is
| a taxi service. The innovations they made were almost
| completely legal ones; figuring out how to skirt employment
| and taxi law.
|
| Netflix was anticipated online by and is probably inferior to
| YouTube except for the fact that they have a pretty
| traditional content creator lab tacked on the side to do
| their own programs. And torrenting had been a thing for a
| long time already showing how to do online distribution of
| video content.
| roncesvalles wrote:
| They were revolutionary as product genres, not necessary
| individual companies. Ordering a cab without making a phone
| call was revolutionary. Netflix at least with its initial
| promise of having all the world's movies and TV was
| revolutionary, but it didn't live up to that. Spotify
| because of how cheap and easy it was to have access to
| _all_ the music, this was the era when people were paying
| 99c per song on iTunes.
|
| I've tried some AI code completion tools and none of them
| hit me that way. My first reaction was "nobody is actually
| going to use this stuff" and that opinion hasn't really
| changed.
|
| And if you think those 3 companies weren't revolutionary
| then AI code completion is even less than that.
| xnx wrote:
| > Ordering a cab without making a phone call was
| revolutionary.
|
| With the power of AI, soon you'll be able to say "Hey
| Siri, get me an Uber to the airport". As easy as making a
| phone call.
| jemmyw wrote:
| And end up at an airport in an entirely different city.
| roncesvalles wrote:
| There was a gain in precision going from phone call to
| app. There is a loss of precision going from app to
| voice. The tradeoff of precision for convenience is
| rarely worth it.
|
| Because if it were, Uber would just make a widget asking
| "Where do you want to go?" and you'd enter "Airport" and
| that would be it. If a widget of some action is a bad
| idea, so is the voice command.
| esafak wrote:
| Easier, because you don't have to search for a phone
| number.
| alabastervlog wrote:
| And they'll be able to tack an extra couple dollars onto
| the price because that's a good signal you're not gonna
| comparison shop.
|
| Innovation!
| nitwit005 wrote:
| You can book a flight or a taxi with a personal assistant
| app like Siri today. People don't seem very interested in
| doing so.
|
| Barring some sort of accessibility issue, it's far easier
| to deal with a visual representation of complex schedule
| information.
| immibis wrote:
| "Do something existing with a different mechanism" is
| innovative, but not revolutionary, and certainly not a
| new "product genre". My parents used to order pizza by
| phone calls, then a website, then an app. It's the same
| thing. (The friction is a little bit less, but maybe
| forcing another human to bring food to you because you're
| feeling lazy _should_ have a little friction. And as a
| side effect, we all stopped being as comfortable talking
| to real people on phone calls!)
|
| Napster came before Spotify.
| HelloMcFly wrote:
| > innovative, but not revolutionary
|
| The experience of Netflix, Spotify, and Uber were
| revolutionary. It felt like the future, and it worked as
| expected. Sure, we didn't realize the poison these
| products were introducing into many creative and labor
| ecosystems, nor did we fully appreciate how they would
| operate as means to widen the income inequality gap by
| concentrating more profits to executives. But they fit
| cleanly into many of our lives immediately.
|
| Debating whether that's "revolutionary" or "innovative"
| or "whatever-other-word" is just a semantic sideshow
| common to online discourse. It's missing the point. I'll
| use whatever word you want, but it doesn't change the
| point.
| immibis wrote:
| Making simple, small improvements _feel_ revolutionary is
| good marketing.
| HelloMcFly wrote:
| "Simple, small" and "good marketing" seem like obvious
| undersells considering the titanic impacts Netflix and
| Spotify (for instance) have had on culture, personal
| media consumption habits, and the economics of
| industries. But if that's the semantic construction that
| works for you, so be it.
| rchaud wrote:
| > They were revolutionary as product genres, not
| necessary individual companies.
|
| Even then, they were evolutionary at best.
|
| Before Netflix and Spotify, streaming movies and music
| were already there as a technology, ask anybody with a
| Megaupload or Sopcast account. What changed was that DMCA
| acquired political muscle and cross-border reach, wiping
| out waves of torrent sites and P2P networks. That left a
| new generation of users with locked-down mobile devices
| no option but to use legitimate apps who had deals in
| place with the record labels and movie studios.
|
| Even the concept of "downloading MP3s" disappeared
| because every mobile OS vendor hated the idea of giving
| their customers access to the filesystem, and iOS didn't
| even have a file manager app until well into the next
| decade (2017).
| _Algernon_ wrote:
| >every mobile OS vendor
|
| Maybe half? Android has consistently had this capability
| since its inception.
| jimbokun wrote:
| > streaming movies and music were already there as a
| technology, ask anybody with a Megaupload or Sopcast
| account.
|
| You can't have a revolution without users. It's the
| ability to reach a large audience, through superior UX,
| superior business model, superior marketing, etc. which
| creates the possibility for revolutionary impact.
|
| Which is why Megaupload and Sopcast didn't revolutionize
| anything.
| Izkata wrote:
| > What changed was that DMCA acquired political muscle
| and cross-border reach, wiping out waves of torrent sites
| and P2P networks.
|
| Half true - that was happening some, but wasn't why music
| piracy mostly died out. DMCA worked on centralized
| platforms like YouTube, but the various avenues for
| downloading music people used back then still exist,
| they're just not used as much anymore. Spotify was proof
| that piracy is mostly a service problem: it was suddenly
| easier for most people to get the music they wanted
| through official channels than through piracy.
| csomar wrote:
| > None of those are revolutionary companies.
|
| Not only Uber/Grab (or delivery app) were revolutionary,
| they are still revolutionary. I could live without LLMs and
| my life will be slightly impacted when coding. If delivery
| apps are not available, my life is _severely_ degraded. The
| other day I was sick. I got medicine and dinner with Grab.
| Delivered to the condo lobby which is as far as I can get.
| That is revolutionary.
| InfiniteTitan wrote:
| Is it revolutionary to order from a screen rather than
| calling a restaurant for delivery? I don't think so.
| Dakizhu wrote:
| Honestly, yes. Calling in an order can result in the
| restaurant botching the order and you have no way to
| challenge it unless you recorded the call. Also, as
| someone who's been on both sides of the transaction, some
| people have poor audio quality or speak accented English,
| which is difficult to understand. Ordering from a screen
| saves everyone valuable time and reduces confusion.
| philwelch wrote:
| I've had app delivery orders get botched, drivers get
| lost on their way to my apartment, food show up cold or
| ruined, etc.
|
| The worst part is that when DoorDash fucks up an order,
| the standard remediation process every other business
| respects--either a full refund or come back, pick up the
| wrong order, and bring you the correct order--is just not
| something they ever do. And if you want to avoid
| DoorDash, you can't because if you order from the
| restaurant directly it often turns out to be white label
| DoorDash.
|
| Some days I wish there was a corporate death penalty and
| that it could be applied to DoorDash.
| fragmede wrote:
| Practically or functionally? Airbnb was invented by
| people posting on craigslist message boards, and even
| existed before the Internet, if you had rich friends with
| spare apartments. But by packaging it up into an online
| platform it became a company with 2.5 billion in revenue
| last year. So you can dismiss ordering from a screen
| instead of looking at a piece of paper and using the
| phone as not being revolutionary, because of you squint,
| they're the same thing, but I can now order take out for
| restaurants I previously would never have ordered from,
| and Uber Eats generated $13.7 billion in revenue last
| year, up from 12.2.
| rlnvlc wrote:
| Were you not able to order food before Uber/Grab?
| csomar wrote:
| I am not in the US and yes, it is not a thing (though
| there was a pizza place that had phone order, but that's
| rather an exception).
| sjsdaiuasgdia wrote:
| Before the proliferation of Uber Eats, Doordash, GrubHub,
| etc, most of the places I've lived had 2 choices for
| delivered food: pizza and Chinese.
|
| It has absolutely massively expanded the kinds of food I
| can get delivered living in a suburban bordering on rural
| area. It might be a different experience in cities where
| the population size made delivery reasonable for many
| restaurants to offer on their own.
| Rediscover wrote:
| FWIW, local Yellow Cab et al, in the U.S., has been doing
| that for /decades/ in the areas I've lived.
|
| Rx medicine delivery used to be quite standard for taxis.
| jimbokun wrote:
| > The innovations they made were almost completely legal
| ones; figuring out how to skirt employment and taxi law.
|
| The impact of this was quite revolutionary.
|
| > except for the fact that they have a pretty traditional
| content creator lab tacked on the side to do their own
| programs.
|
| The way in which they did this was quite innovative, if not
| "revolutionary". They used the data they had from the
| watching habits of their large user base to decide what
| kinds of content to invest in creating.
| fragmede wrote:
| > it's just not there
|
| Build the much maligned Todo app with Aider and Claude for
| yourself. give it one sentence and have it spit out working,
| if imperfect code. iterate. add a graph for completion or
| something and watch it pick and find a library without you
| having to know the details of that library. fine, sure, it's
| just a Todo app, and it'll never work for a "real" codebase,
| whatever that means, but holy shit, just how much programming
| did you need to get down and dirty with to build that
| "simple" Todo app? Obviously building a Todo app before LLMs
| was possible, but abstracted out, the fact that it can be
| generated like that's not a game changer?
| mlsu wrote:
| Revolutionary things are things that change how society
| actually works at a fundamental level. I can think of four
| technologies of the past 40 years that fit that bill:
|
| the personal computer
|
| the internet
|
| the internet connected phone
|
| social media
|
| those technologies are revolutionary, because they caused
| fundamental changes to how people behave. People who behaved
| differently in the "old world" were _forced_ to adapt to a
| "new world" with those technologies, whether they wanted to
| or not. A newer more convenient way of ordering a taxicab or
| watching a movie or music are great consumer product stories,
| and certainly big money makers. They don't cause complex and
| not fully understood changes to way people work, play,
| interact, self-identify, etc. the way that revolutionary
| technologies do.
|
| Language models _feel_ like they have the potential to be a
| full blown sociotechnological phenomenon like the above four.
| They don 't have a convenient consumer product story beyond
| ChatGPT today. But they are slowly seeping into the fabric of
| things, especially on social media, and changing the way
| people apply to jobs, draft emails, do homework, maybe
| eventually communicate and self-identify at a basic level.
|
| I'd almost say that the lack of a smash bang consumer product
| story is even more evidence that the technology is diffusing
| all over the place.
| grumbel wrote:
| While I don't disagree with that observation, it falls into the
| "well, duh!"-category for me. The models are build with no
| mechanism for long term memory and thus suck at tasks that
| require long term memory. There is nothing surprising here.
| There was never any expectation that LLMs magically develop
| long term memory, as that's impossible given the architecture.
| They predict the next word and once the old text moves out of
| the context window, it's gone. The models neither learn as they
| work nor can they remember the past.
|
| It's not even like humans are all that different here. Strip a
| human of their tools (pen&paper, keyboard, monitor, etc.) and
| have them try solving problems with nothing but the power of
| their brain and they'll struggle a hell of a lot too, since our
| memory ain't exactly perfect either. We don't have perfect
| recall, we look things up when we need to, a large part of our
| "memory" is out there in the world around us, not in our head.
|
| The open question is how to move forward. But calling AI
| progress a dead end before we even started exploring long term
| memory, tool use and on-the-fly learning is a tad little
| premature. It's like calling quits on the development of the
| car before you put the wheels on.
| _huayra_ wrote:
| Ultimately, every AI thing I've tried in this era seems to want
| to make me happy, even if it's wrong, instead of helping me.
|
| I describe it like "an eager intern who can summarize a 20-min
| web search session instantly, but ultimately has insufficient
| insight to actually help you". (Note to current interns: I'm
| mostly describing myself some years ago; you may be fantastic
| so don't take it personally!)
|
| Most of my interactions with it via text prompt or builtin code
| suggestions go like this:
|
| 1. Me: I want to do X in C++. Show me how to do it only using
| stdlib components (no external libraries).
|
| 2. LLM: Gladly! Here is solution X
|
| 3. Me: Remove the undefined behavior from foo() and fix the
| methods that call it
|
| 4. LLM: Sure! Here it is (produces solution X again)
|
| 5. Me: No you need to remove the use of uninitialized variables
| as the out parameters.
|
| 6. LLM: Oh certainly! Here is the correct solution (produces a
| completely different solution that also has issues)
|
| 7. Me: No go back to the first one
|
| etc
|
| For the ones that suggest code, it can at least suggest some
| very simple boilerplate very easily (e.g. gtest and gmock stuff
| for C++), but asking it to do anything more significant is a
| real gamble. Often I end up spending more time scrutinizing the
| suggested code than writing a version of it myself.
| rchaud wrote:
| The difference is that interns can learn, and can benefit
| from reference items like a prior report, whose format and
| structure they can follow when working on the revisions.
|
| AI is just AI. You can upload a reference file for it to
| summarize, but it's not going to be able to look at the
| structure of the file and use that as a template for future
| reports. You'll still have to spoon-feed it constantly.
| red-iron-pine wrote:
| interns can generally also tell me "tbh i have no damn idea",
| while AI just talks out it's virtual ass, and I can't read
| from it's voice or behavior that maybe it's not sure.
|
| interns can also be clever and think outside the box. this is
| mostly not good, but sometimes they will surprise you in a
| good way. the AI by definition can only copy what someone
| else has done.
| yifanl wrote:
| 7 is the worst part about trying to review my coworker's code
| that I'm 99% confident is copilot output - and to be clear, I
| don't really care how someone chooses to write their code,
| I'll still review it as evenly as I can.
|
| I'll very rarely ask someone to completely rewrite a patch,
| but so often a few minor comments get addressed with an
| entire new block of code that forces me to do a full re-
| review, and I can't get it across to him that that's not what
| I'm asking for.
| kledru wrote:
| github copilot is a bit outdated technology to be fair...
| colonCapitalDee wrote:
| Yeah, I'd buy it. I've been using Claude pretty intensively as a
| coding assistant for the last couple months, and the limitations
| are obvious. When the path of least resistance happens to be a
| good solution, Claude excels. When the best solution is off the
| beaten track, Claude struggles. When all the good solutions lay
| off the beaten track, Claude falls flat on its face.
|
| Talking with Claude about design feels like talking with that one
| coworker who's familiar with every trendy library and framework.
| Claude knows the general sentiment around each library and has
| gone through the quickstart, but when you start asking detailed
| technical questions Claude just nods along. I wouldn't bet money
| on it, but my gut feeling is that LLMs aren't going to be a
| straight or even curved shot to AGI. We're going to see plenty
| more development in LLMs, but it'll be just be that. Better LLMs
| that remain LLMs. There will be areas where progress is fast and
| we'll be able to get very high intelligence in certain
| situations, but there will also be many areas where progress is
| slow, and the slow areas will cripple the ability of LLMs to
| reach AGI. I think there's something fundamentally missing, and
| finding what that "something" is is going to take us decades.
| randomNumber7 wrote:
| Yes, but on the other hand I don't understand why people think
| something that you can train something on pattern matching and
| it magically becomes intelligent.
| danielbln wrote:
| We don't know what exactly makes us humans as intelligent as
| we are. And while I don't think that LLMs will be general
| intelligent without some other advancements, I don't get the
| confident statements that "clearly pattern matching can't
| lead to intelligence" when we don't really know what leads to
| intelligence to begin with.
| nyrikki wrote:
| We can't even define what intelligence is.
|
| We know or have strong hints at the limits of
| math/computation related to LLMs + CoT
|
| Note how PARITY and MEDIAN is hard here:
|
| https://arxiv.org/abs/2502.02393
|
| We also know HALT == open frame == symbol grounding ==
| system identification problems.
|
| The definition of AGI is also not well defined, but given
| the following:
|
| > Strong AI, also called artificial general intelligence,
| refers to machines possessing generalized intelligence and
| capabilities on par with human cognition.
|
| We know enough for _any mechanical methods_ with either
| current machines or even quantum machines, what is needed
| is impossible with the above definition.
|
| Walter Pitts drank himself to death, in part because of the
| failure of the perceptron model.
|
| Humans and machines are better at different things, and
| while ANNs are inspired by biology, they are very
| different.
|
| There are some hints that the way biological neurons work
| is incompatible with math as we know it.
|
| https://arxiv.org/abs/2311.00061
|
| Computation and machine learning are incredibly powerful
| and useful, but are fundamentally different, and that
| different is both a benefit and a limit.
|
| There are dozens of 'no effective procedure', 'no
| approximation', etc .. results that demonstrate that ML as
| we know it today is possible of most definitions of AGI.
|
| That is why particular C* types shift the goal post,
| because we know that the traditional definition of strong
| AI is equivalent to solving HALT.
|
| https://philarchive.org/rec/DIEEOT-2
|
| There is another path following PAC Learning as compression
| an NP being about finding parsimonious reductions (P being
| in NP)
| zero_bias wrote:
| Humans can't solve NP-hard problems either, so definition
| of intelligence shouldn't lie here, and these particular
| limits shouldn't matter too
| throw4847285 wrote:
| This is the difference between the scientific approach and
| the engineering approach. Engineers just need results. If
| humans had to mathematically model gravity first, there would
| be no pyramids. Plus, look up how many psychiatric
| medications are demonstrated to be very effective, but the
| action mechanisms are poorly understood. The flip side is
| Newton doing alchemy or Tesla claiming to have built an
| earthquake machine.
|
| Sometimes technology far predates science and other times you
| need a scientific revolution to develop new technology. In
| this case, I have serious doubts that we can develop
| "intelligent" machines without understanding the scientific
| and even philosophical underpinnings of human intelligence.
| But sometimes enough messing around yields results. I guess
| we'll see.
| danielbln wrote:
| A tip: ask Claude to put a critical hat on. I find the output
| afterwards to be improved.
| mehphp wrote:
| Do you have an example?
| Paradigma11 wrote:
| I am not so sure about that. Using Claude yesterday it gave me
| a correct function that returned an array. But the algorithm it
| used did not return the items sorted in one pass so it had run
| a separate sort at the end. The fascinating thing is that it
| realized that, commented on it and went on and returned a
| single pass function.
|
| That seems a pretty human thought process and shows that
| fundamental improvements might not depend as much on the
| quality of the LLM itself but on the cognitive structure it is
| embedded.
| jemmyw wrote:
| I've been writing code that implements tournament algorithms
| for games. You'd think an LLM would excel at this because it
| can explain the algorithms to me. I've been using cline on
| lots of other tasks to varying success. But it just totally
| failed with this one: it kept writing edge cases instead of a
| generic implementation. It couldn't write coherent enough
| tests across a whole tournament.
|
| So I wrote tests thinking it could implement the code from
| the tests, and it couldn't do that either. At one point it
| went so far with the edge cases that it just imported the
| test runner into the code so it could check the test name to
| output the expected result. It's like working with a VW
| engineer.
|
| Edit: I ended up writing the code and it wasn't that hard, I
| don't know why it struggled with this one task so badly. I
| wasted far more time trying to make the LLM work than just
| doing it myself.
| gymbeaux wrote:
| Yeah agree 100%. LLMs are overrated. I describe them as the "Jack
| of all, master of none" of AI. LLMs are that jackass guy we all
| know who has to chime in to every topic like he knows everything,
| but in reality he's a fraud with low self-esteem.
|
| I've known a guy since college who now has a PhD in something
| niche, supposedly pulls a $200k/yr salary. One of our first
| conversations (in college, circa 2014) was how he had this clever
| and easy way to mint money- by selling Minecraft servers
| installed on Raspberry Pis. Some of you will recognize how
| asinine this idea was and is. For everyone else- back then,
| Minecraft only ran on x86 CPUs (and I doubt a Pi would make a
| good Minecraft server today, even if it were economical). He had
| no idea what he was talking about, he was just spewing shit like
| he was God's gift. Actually, the problem wasn't that he had _no_
| idea- it was that he knew a tiny bit- enough to sound smart to an
| idiot (remind you of anyone?).
|
| That's an LLM. A jackass with access to Google.
|
| I've had great success with SLMs (small language models), and
| what's more I don't need a rack of NVIDIA L40 GPUs to train and
| use them.
| usaar333 wrote:
| Author also made a highly upvoted and controversial comment about
| o3 in the same vein that's worth reading:
| https://www.lesswrong.com/posts/Ao4enANjWNsYiSFqc/o3?comment...
|
| Oh course lesswrong, being heavily AI doomers, may be slightly
| biased against near term AGI just from motivated reasoning.
|
| Gotta love this part of the post no one has yet addressed:
|
| > At some unknown point - probably in 2030s, possibly tomorrow
| (but likely not tomorrow) - someone will figure out a different
| approach to AI. Maybe a slight tweak to the LLM architecture,
| maybe a completely novel neurosymbolic approach. Maybe it will
| happen in a major AGI lab, maybe in some new startup. By default,
| everyone will die in <1 year after that
| demaga wrote:
| I would expect similar doom predictions in the era of nuclear
| weapon invention, but we've survived so far. Why do people
| assume AGI will be orders of magnitude more dangerous than what
| we already have?
| amoss wrote:
| Nuclear weapons are not self-improving or self-replicating.
| colonial wrote:
| Self-improvement (in the "hard takeoff" sense) is hardly a
| given, and hostile self-replication is nothing special in
| the software realm (see: worms.)
|
| Any technically competent human knows the foolproof
| strategy for malware removal - pull the plug, scour the
| platter clean, and restore from backup. What makes an out-
| of-control pile of matrix math any different from WannaCry?
|
| AI doom scenarios _seem_ scary, but most are premised on
| the idea that we can create an uncontainable, undefeatable
| "god in a box." I reject such premises. The whole idea is
| silly - Skynet Claude or whatever is not going to last very
| long once I start taking an axe to the nearest power pole.
| dsign wrote:
| You have a point that a powerful malicious AI can still
| be unplugged, if you are close to each and every power
| cord that would feed it, and react and do the right thing
| each and every time. Our world is far too big and too
| complicated to guarantee that.
| colonial wrote:
| Again, that's the "god in a box" premise. In the real
| world, you wouldn't need a perfectly timed and
| coordinated response, just like we haven't needed one for
| human-programmed worms.
|
| Any threat can be physically isolated case-by-case at the
| link layer, neutered, and destroyed. Sure, it could cause
| some destruction in the meantime, but our digital
| infrastructure can take a _lot_ of heat and bounce back -
| the CrowdStrike outages didn 't destroy the world, now
| did they?
| usaar333 wrote:
| More ability to kill everyone. That's harder to do with
| nukes.
|
| That said, the actual forecast odds on metaculus are pretty
| similar for nuclear and AI catastrophies:
| https://possibleworldstree.com/
| randomNumber7 wrote:
| Most people are just ignorant and dumb, dont listen to it.
| HelloMcFly wrote:
| Was that comment intended seriously? I thought it was a wry
| joke.
| usaar333 wrote:
| I think so. Thane is aligned with the high p doom folks.
|
| 1 year may be slightly exaggerated, but it aligns with his
| view
| gwern wrote:
| I never thought I'd see the day that LessWrong would be accused
| of being biased _against_ near-term AGI forecasts (and for none
| of the 5 replies to question this description either). But here
| we are. Indeed do many things come to pass.
| cglace wrote:
| The thing I can't wrap my head around is that I work on extremely
| complex AI agents every day and I know how far they are from
| actually replacing anyone. But then I step away from my work and
| I'm constantly bombarded with "agents will replace us".
|
| I wasted a few days trying to incorporate aider and other tools
| into my workflow. I had a simple screen I was working on for
| configuring an AI Agent. I gave screenshots of the expected
| output. Gave a detailed description of how it should work. Hours
| later I was trying to tweak the code it came up with. I scrapped
| everything and did it all myself in an hour.
|
| I just don't know what to believe.
| hattmall wrote:
| There are some fields though where they can replace humans in
| significant capacity. Software development is probably one of
| the least likely for anything more than entry level, but A LOT
| of engineering has a very very real existential threat. Think
| about designing buildings. You basically just need to know a
| lot of rules / tables and how things interact to know what's
| possible and the best practices. A purpose built AI could
| develop many systems and back test them to complete the design.
| A lot of this is already handled or aided by software, but a
| main role of the engineer is to interface with the non-
| technical persons or other engineers. This is something where
| an agent could truly interface with the non-engineer to figure
| out what they want, then develop it and interact with the
| design software quite autonomously.
|
| I think though there is a lot of focus on AI agents in software
| development though because that's just an early adopter market,
| just like how it's always been possible to find a lot of
| information on web development on the web!
| drysine wrote:
| >a main role of the engineer is to interface with the non-
| technical persons or other engineers
|
| The main role of the engineer is being responsible for the
| building not collapsing.
| randomNumber7 wrote:
| ChatGPT will probably take more responsibility than Boeing
| for their airplane software.
| tobr wrote:
| I keep coming back to this point. Lots of jobs are
| fundamentally about taking responsibility. Even if AI were
| to replace most of the work involved, only a human can
| meaningfully take responsibility for the outcome.
| dogmayor wrote:
| I think about this a lot when it comes to self-driving
| cars. Unless a manufacturer assumes liability, why would
| anyone purchase one and subject themselves to potential
| liability for something they by definition did not do?
| This issue will be a big sticking point for adoption.
| arkh wrote:
| > just
|
| In my experience this word means you don't know whatever
| you're speaking about. "Just" almost always hide a ton of
| unknown unknowns. After being burned enough times nowadays
| when I'm going to use it I try to stop and start asking more
| questions.
| fragmede wrote:
| It's a trick of human psychology. Asking "why don't you
| just..." leads to one reaction, when asking "what are the
| road blocks to completing..." leads to a different but same
| answer. But thinking "just" is good when you see it as a
| learning opportunity.
| gerikson wrote:
| Most engineering fields are _de jure_ professional, which
| means they can and probably will enforce limitations on the
| use of GenAI or its successor tech before giving up that kind
| of job security. Same goes for the legal profession.
|
| Software development does not have that kind of protection.
| red-iron-pine wrote:
| for ~3 decades IT could pretend it didn't need unions
| because wages and opportunities were good. now the pendulum
| is swinging back -- maybe they do need those kinds of
| protections.
|
| and professional orgs are more than just union-ish cartels,
| they exist to ensure standards, and enforce responsibility
| on their members. you do shitty unethical stuff as a lawyer
| and you get disbarred; doctors lose medical licenses, etc.
| ForHackernews wrote:
| Good freaking luck! The inconsistencies of the software world
| pale in comparison to trying to construct any real world
| building: http://johnsalvatier.org/blog/2017/reality-has-a-
| surprising-...
| seanhunter wrote:
| > "you basically just need to know a lot of rules..."
|
| This comment commits one of the most common fallacies that I
| see really often in technical people, which is to assume that
| any subject you don't know anything about must be really
| simple.
|
| I have no idea where this comment comes from, but my father
| was a chemical engineer and his father was mechanical
| engineer. A family friend is a structural engineer. I don't
| have a perspective about AI replacing people's jobs in
| general that is any more valuable than anyone elses, but I
| can say with a great deal of confidence that in those three
| engineering disciplines specifically literally none of any of
| their jobs are about knowing a bunch of rules and best
| practices.
|
| Don't make the mistake of thinking that just because you
| don't know what someone does, that their job is easy and/or
| unnecessary or you could pick it up quickly. It may or may
| not be true but assuming it to be the case is unlikely to
| take you anywhere good.
| spaceman_2020 wrote:
| You're biased because if you're here, you're likely an A-tier
| player used to working with other A-tier players.
|
| But the vast majority of the world is not A players. They're B
| and C players
|
| I don't think the people evaluating AI tools have ever worked
| in wholly mediocre organizations - or even know how many
| mediocre organizations exist
| code_for_monkey wrote:
| wish this didnt resonate with me so much. Im far from a 10x
| developer, and im in an organization that feels like a giant,
| half dead whale. Sometimes people here seem like they work on
| a different planet.
| cheevly wrote:
| I promise the amount of time, experiments and novel approaches
| you've tested are .0001% of what others have running in stealth
| projects. Ive spent an average of 10 hours per day constantly
| since 2022 working on LLMs, and I know that even what I've
| built pales in comparison to other labs. (And im well beyond
| agents at this point). Agentic AI is what's popular in the
| mainstream, but it's going to be trounced by at least 2 new
| paradigms this year.
| cglace wrote:
| So what is your prediction?
| handfuloflight wrote:
| Say more.
| zanfr wrote:
| seems like OP ran out of tokens
| aaronbaugher wrote:
| It kind of reminds me of the Y2K scare. Leading up to that,
| there were a _lot_ of people in groups like
| comp.software.year-2000 who claimed to be doing Y2K fixes at
| places like the IRS and big corporations. They said they were
| just doing triage on the most critical systems, and that most
| things wouldn 't get fixed, so there would be all sorts of
| failures. The "experts" who were closest to the situation,
| working on it in person, turned out to be completely wrong.
|
| I try to keep that in mind when I hear people who work with
| LLMs, who usually have an emotional investment in AI and often
| a financial one, speak about them in glowing terms that just
| don't match up with my own small experiments.
| naasking wrote:
| > But then I step away from my work and I'm constantly
| bombarded with "agents will replace us".
|
| An assembly language programmer might have said the same about
| C programming at one point. I think the point is, that once you
| depend on a more abstract interface that permits you to ignore
| certain details, that permits decades of improvements to that
| backend without you having to do anything. People are still
| experimenting with what this abstract interface is and how it
| will work with AI, but they've already come leaps and bounds
| from where they were only a couple of years ago, and it's only
| going to get better.
| tibbar wrote:
| LLMs make it very easy to cheat, both academically and
| professionally. What this looks like in the workplace is a junior
| engineer not understanding their task or how to do it but
| stuffing everything into the LLM until lint passes. This breaks
| the trust model: there are many requirements that are a little
| hard to verify than an LLM might miss, and the junior engineer
| can now represent to you that they "did what you ask" without
| really certifying the work output. I believe that this kind of
| professional cheating is just as widespread as academic cheating,
| which is an epidemic.
|
| What we really need is people who can certify that a task was
| done correctly, who can use LLMs as an aid. LLMs simply cannot be
| responsible for complex requirements. There is no way to hold
| them accountable.
| swazzy wrote:
| I see no reason to believe the extraordinary progress we've seen
| recently will stop or even slow down. Personally, I've benefited
| so much from AI that it feels almost alien to hear people
| downplaying it. Given the excitement in the field and the sheer
| number of talented individuals actively pushing it forward, I'm
| quite optimistic that progress will continue, if not accelerate.
| danielbln wrote:
| I hear you, I feel constantly bewildered by comments like "LLMs
| haven't changed really since GPT3.5.", I mean really? It went
| from an exciting novelty to a core pillar of my daily work,
| it's allowed me and my entire (granted , quote senior) org to
| be incredibly more productive and creative with our solutions.
|
| And the I stumble across a comment where some LLM hallucinated
| a library that means clearly AI is useless.
| Workaccount2 wrote:
| If LLM's are bumpers on a bowling lane, HN is a forum of pro
| bowlers.
|
| Bumpers are not gonna make you a pro bowler. You aren't going
| to be hitting tons of strikes. Most pro bowlers won't notice
| any help from bumpers, except in some edge cases.
|
| If you are an average joe however, and you need to knock over
| pins with some level of consistency, then those bumpers are a
| total revolution.
| esafak wrote:
| That is not a good analogy. They are closer to assistants to
| me. If you know how and what to delegate, you can increase
| your productivity.
| spaceman_2020 wrote:
| The impression I get from using all cutting edge AI tools:
|
| 1. Sonnet 3.7 is a mid-level web developer at least
|
| 2. DeepResearch is about as good an analyst as an MBA from a
| school ranked 50+ nationally. Not lower than that. EY, not
| McKinsey
|
| 3. Grok 3/GPT-4.5 are good enough as $0.05/word article writers
|
| Its not replacing the A-players but its good enough to replace B
| players and definitely better than C and D players
| tcoff91 wrote:
| A midlevel web developer should do a whole lot more than just
| respond to chat messages and do exactly what they are told to
| do and no more.
| danielbln wrote:
| When I use LLMs that what it does. Spawns commands, edits
| files, runs tests, evaluates outputs, iterates and solutions
| under my guidance.
| weweersdfsd wrote:
| The key here is "under your guidance". LLM's are a major
| productivity boost for many kinds of jobs, but can LLM-
| based agents be trusted to act fully autonomously for tasks
| with real world consequence? I think the answer is still
| no, and will be for a long time. I wouldn't trust LLM to
| even order my groceries without review, let alone push code
| into production.
|
| To reach anything close to definition of AGI, LLM agents
| should be able to independently talk to customers,
| iteratively develop requirements, produce and test
| solutions, and push them to production once customers are
| happy. After that, they should be able to fix any issues
| arising in production. All this without babysitting /
| review / guidance from human devs, reliably
| id00 wrote:
| I'd expect mid-level developer to show more understanding and
| better reasoning. So far it looks like a junior dev who read a
| lot of books and good at copy pasting from stackoverflow.
|
| (Based on my everyday experience with Sonet and Cursor)
| roenxi wrote:
| This seems to be ignoring the major force driving AI right now -
| hardware improvements. We've barely seen a new hardware
| generation since ChatGPT was released to the market, we'd
| certainly expect it to plateau fairly quickly on fixed hardware.
| My personal experience of AI models is going to be a series of
| step changes every time the VRAM on my graphics card doubles. Big
| companies are probably going to see something similar each time a
| new more powerful product hits the data centre. The algorithms
| here aren't all that impressive compared to the creeping FLOPS/$
| metric.
|
| Bear cases always welcome. This wouldn't be the first time in
| computing history that progress just falls off the exponential
| curve suddenly. Although I would bet money on there being a few
| years left and AGI is achieved.
| notTooFarGone wrote:
| hardware improvements don't strike me as the horse to bet on.
|
| LLM Progression seems to be linear and compute needed
| exponential. And I don't see exponential hardware improvements
| besides some new technology (that we should not bet on coming
| ayntime soon).
| redlock wrote:
| Moore's law is exponential
| jimbokun wrote:
| Was.
| greazy wrote:
| > Although I would bet money on there being a few years left
| and AGI is achieved.
|
| Yeah? I'll take you up on that offer. $100AUD AGI won't happen
| this decade.
| gmt2027 wrote:
| The typical AI economic discussion always focuses on job loss,
| but that's only half the story. We won't just have corporations
| firing everyone while AI does all the work - who would buy their
| products then?
|
| The disruption goes both ways. When AI slashes production costs
| by 10-100x, what's the value proposition of traditional capital?
| If you don't need to organize large teams or manage complex
| operations, the advantage of "being a capitalist" diminishes
| rapidly.
|
| I'm betting on the rise of independents and small teams. The idea
| that your local doctor or carpenter needs VC funding or an IPO
| was always ridiculous. Large corps primarily exist to organize
| labor and reduce transaction costs.
|
| The interesting question: when both executives and frontline
| workers have access to the same AI tools, who wins? The manager
| with an MBA or the person with practical skills and domain
| expertise? My money's on the latter.
| randomNumber7 wrote:
| Idk where you live, but in my world "being a capitalist"
| requires you to own capital. And you know what, AI makes it
| even better to own capital. Now you have these fancey machines
| doing stuff for you and you dont even need any annoying
| workers.
| gmt2027 wrote:
| By "capitalist," I'm referring to investors whose primary
| contribution is capital, not making a political statement
| about capitalism itself.
|
| Capital is crucial when tools and infrastructure are
| expensive. Consider publishing: pre-internet, starting a
| newspaper required massive investment in printing presses,
| materials, staff, and distribution networks. The web reduced
| these costs dramatically, allowing established media to cut
| expenses and focus on content creation. However, this also
| opened the door for bloggers and digital news startups to
| compete effectively without the traditional capital
| requirements. Many legacy media companies are losing this
| battle.
|
| Unless AI systems remain prohibitively expensive (which seems
| unlikely given current trends), large corporations will face
| a similar disruption. When the tools of production become
| accessible to individuals and small teams, the traditional
| advantage of having deep pockets diminishes significantly.
| HenryBemis wrote:
| My predictions on the matter: LLMs are already
| super useful. It does all my coding and scripting for me
| @home It does most of the coding and scripting at the
| workplace It creates 'fairly good' checklists for work (not
| perfect, but it takes a 4 hour effort and makes it 25mins - but
| the "Pro" is still needed to make this or that checklist usable -
| I call this a win)(need both the tech AND the human)
| If/when you train an 'in-house' LLM it can make some easy wins
| (on mega-big-companies with 100k staff they can get quick answers
| on "which Policy writes about XYZ, which department can I talk to
| about ABC, etc.) We won't have the "AGI"/Skynet anytime
| soon, and when one will exist the company (let's use OpenAI for
| example) will split in two. Half will give LLMs for the masses at
| $100 per month, the "Skynet" will go to the DOD and we will never
| hear about it again, except in the Joe Rogan podcast as a rumor.
| It is a great 'idea generator' (search engine and results
| aggregator): give me a list of 10 things I can do _that_ weekend
| in _city_I_will_be_traveling_to so if/when I go to (e.g. London):
| here are the cool concerts, theatrical performances, parks, blah
| blah blah
| audessuscest wrote:
| > It seems to me that "vibe checks" for how smart a model feels
| are easily gameable by making it have a better personality.
|
| I don't buy that at all, most of my use cases don't involve
| model's personality, if anything I usually instruct to skip any
| commentary and give the result excepted only. I'm sure most
| people using AI models seriously would agree.
|
| > My guess is that it's most of the reason Sonnet 3.5.1 was so
| beloved. Its personality was made much more appealing, compared
| to e. g. OpenAI's corporate drones.
|
| I would actually guess it's mostly because it was good at code,
| which doesn't involve much personnality
| orangebread wrote:
| I think the author provides an interesting perspective to the AI
| hype, however, I think he is really downplaying the effectiveness
| of what you can do with the current models we have.
|
| If you've been using LLMs effectively to build agents or AI-
| driven workflows you understand the true power of what these
| models can do. So in some ways the author is being a little
| selective with his confirmation bias.
|
| I promise you that if you do your due diligence in exploring the
| horizon of what LLMs can do you will understand what I'm saying.
| If ya'll want a more detailed post I can get into the AI systems
| I have been building. Don't sleep on AI.
| aetherson wrote:
| I don't think he is downplaying the effectiveness of what you
| can do with the current models. Rather, he's in a milieu
| (LessWrong), which is laser-focused on "transformative" AI,
| AGI, and ASI.
|
| Current AI is clearly economically valuable, but if we freeze
| everything at the capabilities it has today it is also clearly
| not going to result in mass transformation of the economy from
| "basically being about humans working" to "humans are
| irrelevant to the economy." Lots of LW people believe that in
| the next 2-5 years humans will become irrelevant to the
| economy. He's arguing against that belief.
| mbil wrote:
| I agree with you. I recently wrote up my perspective here:
| https://news.ycombinator.com/item?id=43308912
| lackoftactics wrote:
| >GPT-5 will be even less of an improvement on GPT-4.5 than
| GPT-4.5 was on GPT-4. The pattern will continue for GPT-5.5 and
| GPT-6, the ~1000x and 10000x models they may train by 2029 (if
| they still have the money by then). Subtle quality-of-life
| improvements and meaningless benchmark jumps, but nothing
| paradigm-shifting.
|
| It's easy to spot people who secretly hate LLMs and feel
| threatened by them these days. GPT-5 will be a unified model,
| very different from 4o or 4.5. Throwing around numbers related to
| scaling laws shows a lack of proper research. Look at what
| DeepSeek accomplished with far fewer resources; their paper is
| impressive.
|
| I agree that we need more breakthroughs to achieve AGI. However,
| these models increase productivity, allowing people to focus more
| on research. The number of highly intelligent people currently
| working on AI is astounding, considering the number of papers and
| new developments. In conclusion, we will reach AGI. It's a race
| with high stakes, and history shows that these types of races
| don't stop until there is a winner.
| contagiousflow wrote:
| > In conclusion, we will reach AGI
|
| I'm a little confused by this confidence? Is there more
| evidence aside from the number of smart people working on it?
| We have a lot of smart people working on a lot of big problems,
| that doesn't guarantee a solution nor a timeline.
| rscho wrote:
| It's also easy to spot irrational zealots. Your statement is no
| more plausible than OP's. No one knows whether we'll achieve
| AGI, especially since the definition is very blurry.
| radioactivist wrote:
| Some hard problems have remain unsolved in basically every
| field of human interest for decades/centuries/millennia --
| despite the number of intelligent people and/or resources that
| have been thrown at them.
|
| I really don't understand the level optimism that seems to
| exist for LLMs. And speculating that people "secretly hate
| LLMs" and "feel threatened by them" isn't an answer (frankly,
| when I see arguments that start with attacks like that alarm
| bells start going off in my head).
| ath3nd wrote:
| I logged in to specifically downvote this comment, because it
| attacks the OP's position with unjustified and unsubstantiated
| confidence in the reverse.
|
| > It's easy to spot people who secretly hate LLMs and feel
| threatened by them these days.
|
| I don't think OP is threatened or hates LLM, if anything, OP is
| on the position that LLM are so far away from intelligence that
| it's laughable to consider it threatening.
|
| > In conclusion, we will reach AGI
|
| The same way we "cured" cancer and Alzheimer's, two arguably
| much more important inventions than a glorified text
| predictor/energy guzzler. But I like the confidence, it's
| almost as much as OP's confidence that nothing substantial will
| happen.
|
| > It's a race with high stakes, and history shows that these
| types of races don't stop until there is a winner.
|
| So is the existential threat to humanity in the race to phase
| out fossil fuels/stop global warming, and so far I don't see
| anyone "winning".
|
| > However, these models increase productivity, allowing people
| to focus more on research
|
| The same way the invention of the computer, the car, the vacuum
| cleaner and all the productivity increasing inventions in the
| last centuries allowed us to idle around, not have a job, and
| focus on creative things.
|
| > It's easy to spot people who secretly hate LLMs and feel
| threatened by them these days
|
| It's easy to spot e/acc bros feeling threatened that all the
| money they sunk into crypto, AI, the metaverse, web3 are gonna
| go to waste and try to fan the hype around it so they can cash
| in big. How does that sound?
| lackoftactics wrote:
| I appreciate the pushback and acknowledge that my earlier
| comment might have conveyed too much certainty--skepticism
| here is justified and healthy.
|
| However, I'd like to clarify why optimism regarding AGI isn't
| merely wishful thinking. Historical parallels such as
| heavier-than-air flight, Go, and protein folding illustrate
| how sustained incremental progress combined with competition
| can result in surprising breakthroughs, even where previous
| efforts had stalled or skepticism seemed warranted. AI isn't
| just a theoretical endeavor; we've seen consistent and
| measurable improvements year after year, as evidenced by
| Stanford's AI Index reports and emergent capabilities
| observed at larger scales.
|
| It's true that smart people alone don't guarantee success.
| But the continuous feedback loop in AI research--where
| incremental progress feeds directly into further research--
| makes it fundamentally different from fields characterized by
| static or singular breakthroughs. While AGI remains ambitious
| and timelines uncertain, the unprecedented investment,
| diversity of research approaches, and absence of known
| theoretical barriers suggest the odds of achieving
| significant progress (even short of full AGI) remain strong.
|
| To clarify, my confidence isn't about exact timelines or
| certainty of immediate success. Instead, it's based on
| historical lessons, current research dynamics, and the
| demonstrated trajectory of AI advancements. Skepticism is
| valuable and necessary, but history teaches us to stay open
| to possibilities that seem improbable until they become
| reality.
|
| P.S. I apologize if my comment particularly triggered you and
| compelled you to log in and downvote. I am always open to
| debate, and I admit again that I started too strongly.
| ath3nd wrote:
| I am with you that when smart people combine their efforts
| together and build on previous research + learnings,
| nothing is impossible.
| lackoftactics wrote:
| I started the conversation off on the wrong foot.
| Commenting with "ad hominem" shuts down open discussion.
|
| I hope we can have a nice talk in future conversations.
| mark_l_watson wrote:
| I have used neural networks for engineering problems since the
| 1980s. I say this as context for my opinion: I cringe at most
| applications of LLMs that attempt mostly autonomous behavior, but
| I love using LLMs as 'side kicks' as I work. If I have a bug in
| my code, I will add a few printout statements where I think my
| misunderstanding of my code is, show an LLM my code and output,
| explain the error: I very often get useful feedback.
|
| I also like practical tools like NotebookLM where I can pose some
| questions, upload PDFs, and get a summary based in what my
| questions.
|
| My point is: my brain and experience are often augmented in
| efficient ways by LLMs.
|
| So far I have addressed practical aspects of LLMs. I am retired
| so I can spend time on non practical things: currently I am
| trying to learn how to effectively use code generated by gemini
| 2.0 flash at runtime; the gemini SDK supports this fairly well so
| I am just trying to understand what is possible (before this I
| spent two months experimenting with writing my own
| tools/functions in Common Lisp and Python.)
|
| I "wasted" close to two decades of my professional life on old
| fashioned symbolic AI (but I was well paid for the work) but I am
| interested in probabilistic approaches, such as in a book I
| bought yesterday "Causal AI" that was just published.
|
| Lastly, I think some of the recent open source implementations of
| new ideas from China are worth carefully studying.
| hangonhn wrote:
| I'll add this in case it's helpful to anyone else: LLMs are
| really good at regex and undoing various encodings/escaping,
| especially nested ones. I would go so far to say that it's
| better than a human at the latter.
|
| I once spend over an hour trying to unescape JSON containing
| UTF8 values that's been escaped prior to being written to AWS's
| Cloudwatch Logs for MySQL audit logs. It was a horrific level
| of pain until I just asked ChatGPT to do it and it figured out
| all the series of escapes and encoding immediately and gave me
| the step to reverse them all.
|
| LLM as a sidekick has saved me so much time. I don't really use
| it to generate code but for some odd tasks or API look up, it's
| a huge time saver.
| carlosdp wrote:
| > At some point there might be massive layoffs due to ostensibly
| competent AI labor coming onto the scene, perhaps because OpenAI
| will start heavily propagandizing that these mass layoffs must
| happen. It will be an overreaction/mistake. The companies that
| act on that will crash and burn, and will be outcompeted by
| companies that didn't do the stupid.
|
| Um... I don't think companies are going to perform mass layoffs
| because "OpenAI said they must happen". If that were to happen
| it'd be because they are genuinely able to automate a ton of jobs
| using LLMs, which would be a bull case (not for AGI necessarily,
| but for the increased usefulness of LLMs)
| AlotOfReading wrote:
| I don't think LLMs need to be able to genuinely fulfill the
| duties of a job to replace the human. Think call center workers
| and insurance reviewers where the point is to meet metrics
| without regard for the quality of the work performed. The main
| thing separating those jobs from say, HR (or even programmers)
| is how much the company cares about the quality of the work.
| It's not hard to imagine a situation where misguided people try
| to replace large numbers of federal employees with LLMs, as an
| entirely hypothetical example.
| Timber-6539 wrote:
| AI has no meaningful input to real world productivity because it
| is a toy that is never going to become the real thing that every
| person who has naively bought the AI hype expects it to be. And
| the end result of all the hype looks almost too predictable
| similar to how the also once promising crypto & blockchain
| technology turned out.
| mcintyre1994 wrote:
| > Scaling CoTs to e. g. millions of tokens or effective-
| indefinite-size context windows (if that even works) may or may
| not lead to math being solved. I expect it won't.
|
| > (If math is solved, though, I don't know how to estimate the
| consequences, and it might invalidate the rest of my
| predictions.)
|
| What does it mean for math to be solved in this context? Is it
| the idea that an AI will be able to generate any mathematical
| proof? To take a silly example, would we get a proof of whether
| P=NP from an AI that had solved math?
| daveguy wrote:
| I think "math is solved" refers more to AI performing math
| studies at the level of a mathematics graduate student.
| Obviously "math" won't ever be "solved" but the problem of AI
| getting to a certain math proficiency level could be. No matter
| how good an AI is, if P != NP it won't be able to prove P=NP.
|
| Regardless I don't think our AI systems are close to a
| proficiency breakthrough.
|
| Edit: it is odd that "math is solved" is never explained. But
| "proficient to do math research" makes the most sense to me.
| Imnimo wrote:
| >Test-time compute/RL on LLMs: >It will not meaningfully
| generalize beyond domains with easy verification.
|
| To me, this is the biggest question mark. If you could get good
| generalized "thinking" from just training on math/code problems
| with verifiers, that would be a huge deal. So far, generalization
| seems to be limited. Is this because of a fundamental limitation,
| or because the post-training sets are currently too small (or
| otherwise deficient in some way) to induce good thinking
| patterns? If the latter, is that fixable?
| trashtester wrote:
| > Is this because of a fundamental limitation, or because the
| post-training sets are currently too small (or otherwise
| deficient in some way) to induce good thinking patterns?
|
| "Thinking" isn't a singular thing. Humans learn to think in
| layer upon layer of understandig the world, physical, social
| and abstract, all at many different levels.
|
| Embodiment will allow them to use RL on the physical world, and
| this in combination with access to not only means of
| communication but also interacting in ways where there is skin
| in the game, will help them navigate social and digital spaces.
| bilsbie wrote:
| I have times when I use an LLM and it's completely brain dead and
| can't handle the simplest questions.
|
| Then other times it blows me away. Even figuring out things that
| can't possibly have been in its training data.
|
| I think there are groups of people that have either had all of
| the first experience or all of the latter. And that's why we see
| over optimistic and over pessimistic takes (like this one)
|
| I think the reality is current LLM's are better than he realizes
| and even if we plateau I really don't see how we don't make more
| breakthroughs in the next few years.
| klik99 wrote:
| This almost exactly what I've been saying while everyone was
| saying we're on the path to AGI in the next couple of years.
| We're an innovation / tweak / or paradigm shift away from AGI.
| His estimate in the 2030s that could happen is possible but
| optimistic- you can't time new techniques, you can only time
| progress on iterative progress.
|
| This is all the standard timeline for new technology - we enter
| the diminishing returns period, investment slows down a year or
| so afterwards, layoffs, contraction of industry, but when the
| hype dies down the real utilitarian part of the cycle begins. We
| start seeing it get integrated into the use cases it actually
| fits well with and by five years time its standard practice.
|
| This is a normal process for any useful technology (notably
| crypto never found sustainable use cases so it's kind of the
| exception, it's in superposition of lingering hype and complete
| dismissal), so none of this should be a surprise to anyone. It's
| funny that I've been saying this for so long that I've been
| pegged an AI skeptic, but in a couple of years when everyone is
| burnt out on AI hype it'll sound like a positive view. The truth
| is, hype serves a purpose for new technology, since it kicks off
| a wide search for every crazy use case, most of which won't work.
| But the places where it does work will stick around
| JTbane wrote:
| Anyone else feel like AI is a trap for developers? I feel like
| I'm alone in the opinion it decreases competence. I guess I'm a
| mid-level dev (5 YOE at one company) and I tend to avoid it.
___________________________________________________________________
(page generated 2025-03-10 23:02 UTC)