[HN Gopher] Ilya Sutskever: We're moving from the age of scaling...
___________________________________________________________________
Ilya Sutskever: We're moving from the age of scaling to the age of
research
Author : piotrgrabowski
Score : 388 points
Date : 2025-11-25 17:21 UTC (1 day ago)
(HTM) web link (www.dwarkesh.com)
(TXT) w3m dump (www.dwarkesh.com)
| andy_ppp wrote:
| So is the translation that endless scaling has stopped being
| as effective?
| jsheard wrote:
| The translation is that SSI says that SSI's strategy is the way
| forward so could investors please stop giving OpenAI money and
| give SSI the money instead. SSI has not shown anything yet, nor
| does SSI intend to show anything until they have created an
| actual Machine God, but SSI says they can pull it off so it's
| all good to go ahead and wire the GDP of Norway directly to
| Ilya.
| gessha wrote:
| It's a snake oil salesman's world.
| aunty_helen wrote:
| If we take AGI as a certainty, i.e. we think we can achieve AGI
| using silicon, then Ilya is one of the best bets you can take
| if you are looking to invest in this space. He has a history
| and he's motivated to continue working on this problem.
|
| If you think that AGI is not possible to achieve, then you
| probably wouldn't be giving anyone money in this space.
| bossyTeacher wrote:
| This hinges on his company achieving AGI while he's still
| alive. He's 38 years old. He has about 4 decades to deliver
| AGI in his lifetime. When he dies, there is no guarantee
| whoever takes over will share his values.
|
| "If you think that AGI is not possible to achieve, then you
| probably wouldn't be giving anyone money in this space." If
| you think other people think AGI is possible, you sell them
| shovels and ready yourself for a shovel market dip in the
| near future. Strike while the iron is hot.
| shwaj wrote:
| Are you asking whether the whole podcast can be boiled down to
| that translation, or whether you can infer/translate that from
| the title?
|
| If the former, no. If the latter, sure, approximately.
| Animats wrote:
| It's stopped being cost-effective. Another order of magnitude
| of data centers? Not happening.
|
| The business question is, what if AI works about as well as it
| does now for the next decade or so? No worse, maybe a little
| better in spots. What does the industry look like? NVidia and
| TSMC are telling us that price/performance isn't improving
| through at least 2030. Hardware is not going to save us in the
| near term. Major improvement has to come from better
| approaches.
|
| Sutskever: _" I think stalling out will look like...it will all
| look very similar among all the different companies. It could
| be something like this. I'm not sure because I think even with
| stalling out, I think these companies could make a stupendous
| revenue. Maybe not profits because they will need to work hard
| to differentiate each other from themselves, but revenue
| definitely."_
|
| Somebody didn't get the memo that the age of free money at zero
| interest rates is over.
|
| The "age of research" thing reminds me too much of mid-1980s AI
| at Stanford, when everybody was stuck, but they weren't willing
| to admit it. They were hoping, against hope, that someone would
| come up with a breakthrough that would make it work before the
| house of cards fell apart.
|
| Except this time everything costs many orders of magnitude more
| to research. It's not like Sutskever is proposing that
| everybody should go back to academia and quietly try to come up
| with a new idea to get things un-stuck. They want to spend
| SSI's market cap of $32 billion on some vague ideas involving
| "generalization". Timescale? "5 to 20 years".
|
| This is a strange way to do corporate R&D when you're kind of
| stuck. Lots of little and medium-sized projects seem more
| promising, along the lines of Google X. The discussion here
| seems to lean in the direction of one big bet.
|
| You have to admire them for thinking big. And even if the whole
| thing goes bust, they probably get to keep the house and the
| really nice microphone holder.
| energy123 wrote:
| The ideas likely aren't vague at all given who is speaking.
| I'd bet they're extremely specific. Just not transparently
| shared with the public because it's intellectual property.
| giardini wrote:
| What kind of ideas would be intellectual property that was
| not shared? Isn't every part of LLMs, except the _order_ of
| processes, publicly _known_? Is there some magic algorithm
| previously unrevealed and held secret by a cabal of
| insiders?
| sd9 wrote:
| Why are some models better than others today if
| everything is publicly known and many organisations have
| access to massive resources?
|
| Somebody has to come up with an idea first. Before they
| share it, it is not publicly known. Ilya has previously
| come up with plenty of productive ideas. I don't think
| it's a stretch to think that he has some IP that is not
| publicly known.
|
| Even seemingly simple things like how you shuffle your
| training set, how you augment it, the specific
| architecture of the model, etc, have dramatic effects on
| the outcome.
| Animats wrote:
| > Somebody has to come up with an idea first.
|
| There are lots of ideas. Some may work.
|
| The space in which people seem to be looking is deep
| learning on something other than text tokens. Yet most
| successes punt on feature extraction / "early vision" and
| just throw compute at raw pixels. That's the "bitter
| lesson" approach, which seems to be hitting the ceiling
| of how many gigawatts of data center you can afford.
|
| Is there a useful non-linguistic abstraction of the real
| world that works and leads to "common sense"? Squirrels
| must have something; they're not verbal and have a brain
| the size of a peanut. But what?
| tim333 wrote:
| A difference with mid-1980s AI is the hardware is way more
| capable now so even flawed algorithms can do quite
| economically significant stuff like Claude Code etc. Recent
| headline "Anthropic projects as much as $26 billion in
| annualized revenue in 2026". With that sort of revenue you'd
| expect some significant spend on R&D.
| Animats wrote:
| > "Anthropic projects as much as $26 billion in annualized
| revenue in 2026".
|
| Anthropic projects a lot. It's hard to get actuals from
| Anthropic.[1] They're privately held, so they don't have to
| report actuals publicly. [1] says "Anthropic has, through
| July 2025, made around $1.5 billion in revenue." $26
| billion for 2026 seems unlikely.
|
| This is revenue, not profit.
|
| [1] https://www.wheresyoured.at/howmuchmoney/
| Quothling wrote:
| Not really, but there is a finite amount of data to train
| models on. I found it rather interesting to hear him talk about
| how Gemini has been better at getting results out of the data
| than their competition, and how this is the first insight into
| a new way of dealing with how they train models on the same
| data to get different results.
|
| I think the title is an interesting thing, because the scaling
| isn't about compute. At least as I understand it, what they're
| running out of is data, and one of the ways they deal with
| this, or may deal with this, is to have LLMs running
| concurrently and in competition. So you'll have thousands of
| models competing against each other to solve challenges through
| different approaches. Which to me would suggest that the need
| for hardware scaling isn't about to stop.
| imiric wrote:
| The translation to me is: this cow has run out of milk. Now we
| actually need to deliver value, or the party stops.
| giardini wrote:
| I'll be convinced LLMs are a reasonable approach to AI when an
| LLM can give reasonable answers after being trained on
| approximately the same books and classes in school that I was,
| up through the end of my college education.
| alex43578 wrote:
| I'll be convinced cars are a reasonable approach to
| transportation when one can take me as far as a horse can on a
| bale of hay.
| Jeff_Brown wrote:
| That is such a beautiful analogy that now I will read your
| other comments.
| snapcaster wrote:
| Why do you think this standard you're applying is reasonable
| or meaningful?
| gizmodo59 wrote:
| Even as criticism targets major model providers, his inability
| to answer clearly about revenue, and his dismissal of it as a
| future concern, reveal a great deal about today's market. It's
| remarkable how effortlessly he, Mira, and others secure
| billions, confident they can thrive in such an intensely
| competitive field.
|
| Without a moat defined by massive user bases, computing
| resources, or data, any breakthrough your researchers achieve
| quickly becomes fair game for replication. May be there will be
| new class of products, may be there is a big lock-in these
| companies can come up with. No one really knows!
| markus_zhang wrote:
| TBH if you truly believe you are at the frontier of AI you
| probably don't need to care too much about those numbers.
|
| Yes corporations need those numbers, but those few humans are
| way more valuable than any numbers out there.
|
| Of course, only when others believe that they are at the
| frontier too.
| SilverElfin wrote:
| Mira was a PM who somehow was at the right place at the right
| time. She isn't actually an AI expert. Ilya, however, is. I find
| him to be more credible and deserving in terms of research
| investment. That said, I agree that revenue is important and he
| will need a good partner (another company maybe) to turn ideas
| into revenue at some point. But maybe the big players like
| Google will just acquire them on no revenue to get access to
| the best research, which _they_ can then turn into revenue.
| fragmede wrote:
| That's kind of a shitty way to put it. Mira wasn't a PM at
| OpenAI. She was CTO and before that VP of Engineering. Prior
| to OpenAI she was an engineer at Tesla on the Model X, and at
| Leap Motion. You're right that she's not a published ML
| researcher like Ilya, but "right place, right time"
| undersells leading the team that shipped ChatGPT, DALL-E, and
| GPT-4.
| Nextgrid wrote:
| "CTO" during ZIRP means nothing to be fair. You could put a
| monkey in front of a typewriter in that environment and
| still get a 50% chance of success, by the success metric of
| the time which was just "engagement" instead of profits. If
| you're playing with infinite money it's hard to lose.
| impossiblefork wrote:
| I think software patents in AI are a possibility. The
| transformer was patented after all, with the way it was
| bypassed being the decoder-only models.
|
| Secrecy is also possible, and I'm sure there's a whole lot of
| that.
| outside1234 wrote:
| He has no answer for it so the only thing he can do is deflect
| and turn on the $2T reality distortion field.
| signatoremo wrote:
| Nobody knows the answer. He would be lying if he gave any
| number. His startup is able to secure funding solely based on
| his credentials. The investors know very well but they hope
| for a big payday.
|
| Do you think OpenAI could project their revenue in 2022,
| before ChatGPT came out?
| luke5441 wrote:
| He's just doing research with some grant money? Why would you
| ask a researcher for a path to profitability?
|
| I just hope the people funding his company are aware that they
| gave some grant money to some researchers.
| jonny_eh wrote:
| Exactly, as far as anyone outside of the deal participants
| knows, Ilya hasn't made any promises with respect to revenue.
| singiamtel wrote:
| Is it a grant? My understanding is that they're raising money
| as a startup
|
| https://www.reuters.com/technology/artificial-
| intelligence/o...
| alyxya wrote:
| They have a moat defined by being well known in the AI
| industry, so they have credibility and it wouldn't be hard for
| anything they make to gain traction. Some unknown player who
| replicates it, even if it was just as good as what SSI does,
| will struggle a lot more with gaining attention.
| baxtr wrote:
| Being well known doesn't qualify as a moat.
| mrandish wrote:
| Agreed. But it can be a significant growth boost. Senior
| partners at high-profile VCs will meet with them. Early key
| hires they are trying to recruit will be favorably
| influenced by their reputation. The media will probably
| cover whatever they launch, accelerating early user
| adoption. Of course, the product still has to generate
| meaningful value - but all these 'buffs' do make several
| early startup challenges significantly easier to overcome.
| (Source: someone who did multiple tech startups without
| those buffs and ultimately reached success. Spending 50% of
| founder time for six months to raise first funding is a
| significant burden (working through junior partners and
| early skepticism) vs 20% of founder time for three weeks.)
| baxtr wrote:
| Yes, I am not debating that it gets you a significant
| boost.
|
| I'm personally not aware of a strong correlation with
| real business value created after the initial boost
| phase. But surely there must be examples.
| mrandish wrote:
| > confident they can thrive in such an intensely competitive
| field.
|
| I agree these AI startups are extremely unlikely to achieve
| meaningful returns for their investors. However, based on
| recent valley history, it's likely high-profile 'hot startup'
| founders who are this well-known will do very well financially
| regardless - and that enables them to not lose sleep over
| whether their startup becomes a unicorn or not.
|
| They are almost certainly already multi-millionaires (not
| counting illiquid startup equity) just from private
| placements, signing bonuses and banking very high
| salaries+bonus for several years. They may not emerge from the
| wreckage with hundreds of millions in personal net worth but
| the chances are very good they'll probably be well into the
| tens of millions.
| newyankee wrote:
| Sometimes I wonder who the rational individuals at the other
| end of these deals are and what makes them so confident. I
| always assume they have something that the general public
| cannot deduce from public statements.
| yen223 wrote:
| This looks like the classic VC model:
|
| 1. Most AI ventures will fail
|
| 2. The ones that succeed will be incredibly large. Larger
| than anything we've seen before
|
| 3. No investor wants to be the schmuck who didn't bet on the
| winners, so they bet on everything.
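|
| A minimal sketch of that math (the probabilities and payoff
| multiples below are made up for illustration, not real fund
| data):
|
|   import random
|
|   def fund_multiple(n_bets=100):
|       # Each bet: ~95% total loss, ~4% modest 3x exit,
|       # ~1% outlier that returns 1000x.
|       total = 0.0
|       for _ in range(n_bets):
|           r = random.random()
|           if r < 0.95:
|               total += 0.0      # most AI ventures fail
|           elif r < 0.99:
|               total += 3.0      # small exit
|           else:
|               total += 1000.0   # the rare outsized winner
|       return total / n_bets     # multiple on invested capital
|
|   # Miss the single outlier and the fund flips from ~10x to
|   # ~0.1x - hence the pressure to bet on everything.
|   print(fund_multiple())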
| Nextgrid wrote:
| Aka gambling.
|
| The difference is that while gambling has always been a
| thing on the sidelines, nowadays the whole market is
| gambling.
| almostdeadguy wrote:
| Most of the money flowing to the big players is from tech
| giant capex, originally from net cash flow and lately it's
| financed by debt. A lot of these investors seem to now
| essentially be making the case that AI is "too big to
| fail". This doesn't at all resemble VC firms taking a lot
| of small bets across a sector.
| wrs wrote:
| "Rational [citation needed] individuals at the other end of
| these deals"
|
| Your assumption is questionable. This is the biggest FOMO
| party in history.
| Nextgrid wrote:
| If the whole market goes to bet at the roulette table, you go
| bet as well.
|
| Best case scenario you win. Worst case scenario you're no
| worse off than anyone else.
|
| From that perspective I think it makes sense.
|
| The issue is that investment is still chasing the oversized
| returns of the startup economy during ZIRP, all while the
| real world is coasting off what's been built already.
|
| There will come a day when all the real stuff starts
| crumbling, at which point it will become rational to invest in
| real-world things again instead of speculation.
|
| (writing this while playing roulette in a casino. Best
| case I get the entertainment value of winning and some money
| on the side, worst case my initial bet wouldn't make a
| difference in my life at all. Investors are the same, but
| they're playing with billions instead of hundreds)
| 827a wrote:
| There isn't necessarily rationality behind venture deals; it's
| just a numbers game combined with the rising tide of the
| sector. These firms are not Berkshire. If the tide stops
| rising, some of the companies they invested in might actually
| be ok, but the venture boat sinks; the math of throwing
| millions at everyone hoping for one to 200x on exit does not
| work if the rising tide stops.
|
| They'll say things like "we invest in people", which is true
| to some degree, being able to read people is roughly the only
| skill VCs actually need. You could probably put Sam Altman in
| any company on the planet and he'd grow the crap out of that
| company. But A16z would not give him ten billion to go grow
| Pepsi. This is the revealed preference intrinsic to venture;
| they'll say it's about the people, but their choices are
| utterly dominated by the sector, because the sector is the
| predominant driver of the multiples.
|
| "Not investing" is not an option for capital firms. Their
| limited partners gave them money and expect super-market
| returns. To those ends, there is no rationality to be found;
| there's just doing the best you can in a bad market. AI
| infrastructure investments have represented like half of all
| US GDP growth this year.
| SilverElfin wrote:
| How did Dwarkesh manage to build a brand that can attract famous
| people to his podcast? He didn't have prior fame from something
| else in research or business, right? Curious if anyone knows his
| growth strategy to get here.
| piker wrote:
| Seems like he's Lex without the Rogan association, so hardcore
| liberal folks can listen without having to buy morality
| offsets. My take is that he's good, and he's filling a void in
| an established but underserved genre.
| camillomiller wrote:
| Fridman is a morally broken grifter, who just built a persona
| and a brand on proven lies, claiming an association with MIT
| that was de facto non-existent. Not wanting to give the guy
| recognition is not a matter of being liberal or conservative,
| but just interested in truthfulness.
| wahnfrieden wrote:
| Patel takes anticommunism to such an extreme that he
| repeatedly brings up and speculates (despite being met with
| repudiation by even the staunchest anticommunist of guests)
| whether Nazism was preferable, whether Hitler should have won
| the war against the Soviets, whether the US should have
| collaborated with Hitler to defeat communism, and whether the
| enduring spread of Nazism would have been a good tradeoff to
| make.
| chermi wrote:
| Where does he say this?
| wahnfrieden wrote:
| the Sarah Paine interviews
| pxc wrote:
| I don't remember all of the details so I can't remember
| if that came up in the episode I listened to. But I did
| listen to an episode where he talked to a (Chinese) guest
| about China. I discussed it with a Chinese friend at the
| time, and we both thought the guest was very interesting
| and well-informed, but the interviewer's questions were
| sometimes fantastical in a paranoid way, naively
| ideological, and often even a bit stupid.
|
| It being the first (and so far only) interview of his I'd
| seen, between that and the AI boosterism, I was left
| thinking he was just some overblown hack. Is this a blind
| spot for him so that he's sometimes worth listening to on
| other topics? Or is he in fact an overblown hack?
| bugglebeetle wrote:
| No, he's an overblown hack who is pandering to the
| elements of his audience that would share those views
| about Nazism and China. Should many someday see through
| the veil of his bullshit or simply grow tired of his
| pablum, he can then pivot to being a far right influencer
| and continue raking in the dough, having previously
| demonstrated the proper bona fides.
| bluecheese452 wrote:
| He also has the classic "government is bad and inefficient"
| take with zero to back it up. Just lazy pandering.
| cedws wrote:
| The episode with Zelensky exposed him as a complete idiot.
| I can maybe tolerate grifters but fuck the whole 'love and
| peace bro' act while implying Ukraine should make peace
| with invaders who have ruthlessly killed civilian men,
| women, and children.
|
| I wish we stopped giving airtime to grifters. Maybe then
| things would start looking up in the world.
| throwaway2037 wrote:
| > claiming an association with MIT that was de facto non-
| existent
|
| Google search: "lex fridman and mit"
|
| Second hit: https://cces.mit.edu/team/lex-fridman/
| > Lex conducts research in AI, human-robot interaction,
| autonomous vehicles, and machine learning at MIT.
| wahnfrieden wrote:
| To qualify what "conducts research" means:
|
| > Lex does not teach any for-credit class at MIT, is not
| listed in the teaching faculty, and his last published
| research paper was published in 2018. For community
| outreach, Lex Fridman HAS taught classes in MIT's IAP
| program, which are non-credit bearing.
|
| > The most recent documented instance of Lex Fridman
| teaching an IAP class was in January 2022, when he co-
| instructed a series of lectures on deep learning,
| robotics, and AI-specialized computing hardware as part
| of MIT's Independent Activities Period, scheduled from
| January 10 to January 14.
|
| His profile photo, btw, is in front of an actual lecturer's
| chalkboard from a class he wasn't involved with. The
| chalkboard writing is just an aesthetic. In that picture
| he was teaching an introductory level powerpoint about AI
| trends in a one-time, unpaid IAP session. That's as
| authentic as it gets
| fragmede wrote:
| Tell me more about these morality offsets I can buy! I got a
| bunch of friends that listen to Joe Rogan, so I listen to him
| to know what they're talking about, but I've been doing so
| without these offsets, so my morality's been taking hits.
| Please help me before I make a human trafficking app for
| Andrew Tate!
| just-the-wrk wrote:
| I think it's important to note that Lex is a laundromat for
| whatever the guest is trying to sell. Dwarkesh does an
| impressive amount of background work and speaks with experts
| about their expertise.
| pxc wrote:
| > I think it's important to note that Lex is a laundromat
| for whatever the guest is trying to sell.
|
| This is also Rogan's chief problem as a podcaster, isn't
| it?
| bugglebeetle wrote:
| His recent conversation with Sutton suggests otherwise.
| Fridman is a vapid charlatan par excellence. Dwarkesh
| suffers from a different problem, where, by rubbing
| shoulders with experts, he has come to the mistaken belief
| that he possesses expertise, absent the humility and actual
| work that would entail.
| bluecheese452 wrote:
| Spot on.
| dinobones wrote:
| I stopped listening to Lex Fridman after he tried to arbitrate
| a "peace agreement" between Russia and Ukraine and claimed he
| just wanted to make the world "love" each other more.
|
| Then I found out he was a fraud that had no academic
| connection to MIT other than working there as an IC.
| cheema33 wrote:
| > I stopped listening to Lex Fridman after he tried to
| arbitrate a "peace agreement" between Russia and Ukraine...
|
| Same here. I lost all respect for Lex after seeing him
| interview Zelensky of Ukraine. Lex grew up in Moscow. He
| sometimes shows a soft spot for Russia perhaps because of
| it.
| Libidinalecon wrote:
| It amuses me to no end that there are groups in the US that
| would probably consider both Terence McKenna and Michel
| Foucault as "far right" conservatives if they were alive and
| had podcasts in 2025.
|
| Absolutely no way Timothy Leary would be considered a liberal
| in 2025.
|
| Those three I think represent a pretty good mirror of the
| present situation.
| seizethecheese wrote:
| It has nothing to do with politics.
| FergusArgyll wrote:
| He's the best interviewer I've ever found; try listening to his
| first couple of episodes - they're from his dorm or something. If
| you can think of a similar style and originality in questioning
| I'd love a suggestion!
| GoodOldNe wrote:
| Sean Evans. :)
| chermi wrote:
| People are impressed by his interviews because he puts a lot of
| effort into researching the topic before the interview. This is
| a positive feedback loop.
| just-the-wrk wrote:
| He does deep research on topics and invites people who
| recognize his efforts and want to engage with an informed
| audience.
| Version467 wrote:
| That, plus he's quick enough to come up with good follow-up
| questions on the spot. It's so frustrating listening to
| interviews where the interviewer simply glosses over
| interesting/controversial statements because they either
| don't care, or don't know enough to identify a statement as
| controversial. In contrast, Dwarkesh is incredible at this.
| 9/10 times when I'm confused about a statement that a guest
| makes on his show he will immediately follow up by asking for
| clarification or pushing back. It's so refreshing.
| l5870uoo9y wrote:
| Overnight success takes years (he has been doing the podcast
| for 5 years).
| polishdude20 wrote:
| Maybe he's an Industry plant
| inesranzo wrote:
| One word.
|
| Consistency.
|
| You can just do things.
|
| Don't stop.
| oytis wrote:
| Ages just keep flying by
| scotty79 wrote:
| Translation: the free lunch of getting results just by throwing
| money at the problem is over. Now, for the first time in years,
| we actually need to think about what we are doing and figure
| out why the things that work do work.
|
| Somehow, despite being vastly overpaid, I think AI researchers
| will turn out to be deeply inadequate for the task. As they have
| been during the last few AI winters.
| eats_indigo wrote:
| did he just say locomotion came from squirrels
| jonny_eh wrote:
| timestamp?
| FergusArgyll wrote:
| I think he was referencing something Richard Sutton said
| (iirc); along the lines of "If we can get to the intelligence
| of a squirrel, we're most of the way there"
| Animats wrote:
| I've been saying that for decades now. My point was that if
| you could get squirrel-level common sense, defined as not
| doing anything really bad in the next thirty seconds while
| making some progress on a task, you were almost there. Then
| you can back-seat drive the low-level system with something
| goal-oriented.
|
| I once said that to Rod Brooks, when he was giving a talk at
| Stanford, back when he had insect-level robots and was
| working on Cog, a talking head. I asked why the next step was
| to reach for human-level AI, not mouse-level AI. Insect to
| human seemed too big a jump. He said "Because I don't want to
| go down in history as the creator of the world's greatest
| robot mouse".
|
| He did go down in history as the creator of the robot vacuum
| cleaner, the Roomba.
| alyxya wrote:
| The impactful innovations in AI these days aren't really from
| scaling models to be larger. It's more concrete to show higher
| benchmark scores, and this implies higher intelligence, but this
| higher intelligence doesn't necessarily translate to all users
| feeling like the model has significantly improved for their use
| case. Models sometimes still struggle with simple questions like
| counting letters in a word, and most people don't have a use case
| of a model needing phd level research ability.
|
| Research now matters more than scaling when research can fix
| limitations that scaling alone can't. I'd also argue that we're
| in the age of product, where the integration of product and
| models plays a major role in what they can do combined.
| TheBlight wrote:
| "Scaling" is going to eventually apply to the ability to run
| more and higher fidelity simulations such that AI can run
| experiments and gather data about the world as fast and as
| accurately as possible. Pre-training is mostly dead. The
| corresponding compute spend will be orders of magnitude higher.
| alyxya wrote:
| That's true, I expect more inference time scaling and hybrid
| inference/training time scaling when there's continual
| learning rather than scaling model size or pretraining
| compute.
| TheBlight wrote:
| Simulation scaling will be the most insane though.
| Simulating "everything" at the quantum level is impossible
| and the vast majority of new learning won't require
| anything near that. But answers to the hardest questions
| will require as close to it as possible so it will be
| tried. Millions upon millions of times. It's hard to
| imagine.
| emporas wrote:
| >Pre-training is mostly dead.
|
| I don't think so. Serious attempts at producing data
| specifically for training have not been made yet. High
| quality data, I mean, produced by anarcho-capitalists, not
| corporations like Scale AI using workers governed by the laws
| of a nation, etc.
|
| Don't underestimate the determination of 1 million young
| people to produce within 24 hours perfect data, to train a
| model to vacuum clean their house, if they don't have to do
| it themselves ever again, and maybe earn some little money on
| the side by creating the data.
|
| I agree with the other part of the comment.
| pron wrote:
| > this implies higher intelligence
|
| Not necessarily. The problem is that we can't precisely define
| intelligence (or, at least, haven't so far), and we certainly
| can't (yet?) measure it directly. And so what we have are
| certain tests whose scores, we believe, are _correlated_ with
| that vague thing we call intelligence _in humans_. Except these
| test scores can correlate with intelligence (whatever it is) in
| humans and at the same time correlate with something that's
| _not_ intelligence in machines. So a high score may well imply
| high intelligence in humans but not in machines (e.g. perhaps
| because machine models may overfit more than a human brain
| does, and so an intelligence test designed for humans doesn't
| necessarily measure the same thing we think of when we say
| "intelligence" when applied to a machine).
|
| This is like the following situation: Imagine we have some type
| of signal, and the only process we know produces that type of
| signal is process A. Process A always produces signals that
| contain a maximal frequency of X Hz. We devise a test for
| classifying signals of that type that is based on sampling them
| at a frequency of 2X Hz. Then we discover some process B that
| produces a similar type of signal, and we apply the same test
| to classify its signals in a similar way. Only, process B can
| produce signals containing a maximal frequency of 10X Hz and so
| our test is not suitable for classifying the signals produced
| by process B (we'll need a different test that samples at 20X
| Hz).
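|
| A minimal numeric sketch of the analogy (values arbitrary):
| sample at 2 Hz, i.e. 2X Hz for an assumed X of 1 Hz, and a
| 2.5 Hz signal from "process B" aliases into one that looks
| exactly like a 0.5 Hz signal from "process A".
|
|   import numpy as np
|
|   fs = 2.0                 # sampling rate chosen for process A
|   t = np.arange(8) / fs    # sample times
|
|   slow = np.sin(2 * np.pi * 0.5 * t)  # within the assumed band
|   fast = np.sin(2 * np.pi * 2.5 * t)  # process B, above X Hz
|
|   # Indistinguishable at these sample points - the "test"
|   # classifies both signals the same way:
|   print(np.allclose(slow, fast))      # True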
| alyxya wrote:
| Fair, I think it would be more appropriate to say higher
| capacity.
| pron wrote:
| Ok, but the point of a test of this kind is to generalise
| its result. I.e. the whole point of an intelligence test is
| that we believe that a human getting a high score on such a
| test is more likely to do some useful things _not_ on the
| test better than a human with a low score. But if the
| problem is that the test results - as you said - _don't_
| generalise as we expect them, then the tests are not very
| meaningful to begin with. If we don't know what to expect
| from a machine with a high test score when it comes to
| doing things not on the test, then the only "capacity"
| we're measuring is the capacity to do well on such tests,
| and that's not very useful.
| matu3ba wrote:
| My definition of intelligence is the capability to process
| and formalize a deterministic action from given inputs into a
| transferable entity/medium. In other words: knowing how to
| manipulate the world directly and indirectly via
| deterministic actions and known inputs, and how to teach
| others via various mediums. For example, you can be very
| intelligent at software programming, but socially very dumb
| (for example, unable to socially influence others).
|
| As another example, if you do not understand another person
| (in language) and understand neither the person's work nor
| its influence, then you can make no assumption about the
| person's intelligence beyond your prior assumptions about
| how smart humans are.
|
| ML/AI on text inputs is at best stochastic for language
| context windows, or plain wrong, so it does not satisfy the
| definition. Well-specified (formal) problems with smaller
| scope tend to work well, from what I've seen so far. The
| working ML/AI applications known to me are
| calibration/optimization problems.
|
| What is your definition?
| pron wrote:
| > My definition of intelligence is the capability to
| process and formalize a deterministic action from given
| inputs as transferable entity/medium.
|
| I don't think that's a good definition because many
| deterministic processes - including those at the core of
| important problems, such as those pertaining to the economy
| - are highly non-linear and we don't necessarily think that
| "more intelligence" is what's needed to simulate them
| better. I mean, we've proven that predicting certain things
| (even those that require nothing but deduction) requires
| more computational resources regardless of the algorithm
| used for the prediction. Formalising a process, i.e.
| inferring the rules from observation through induction, may
| also be dependent on available computational resources.
|
| > What is your definition?
|
| I don't have one except for "an overall quality of the
| mental processes humans present more than other animals".
| matu3ba wrote:
| > I mean, we've proven that predicting certain things
| (even those that require nothing but deduction) require
| more computational resources regardless of the algorithm
| used for the prediction.
|
| I understand proofs as formalized deterministic actions
| for given inputs, and processing as the solving of various
| proofs.
|
| > Formalising a process, i.e. inferring the rules from
| observation through induction, may also be dependent on
| available computational resources.
|
| Induction is only one way to construct a process and
| there are various informal processes (social norms etc).
| It is true that the overall process depends on various
| things like available data points and resources.
|
| > I don't have one except for "an overall quality of the
| mental processes humans present more than other animals".
|
| How would you formalize the process of self-reflection,
| or of believing in completely made-up stories, which is
| often used as an example of what distinguishes humans
| from animals? It is hard to make a clear distinction in
| language and math, since we mostly do not understand
| animal language and math or other well-observable
| behavior (based on that).
| VMG wrote:
| ML/AI is much less stochastic than an average human
| Yizahi wrote:
| Forming deterministic actions is a sign of computation, not
| intelligence. Intelligence is probably (I guess) dependent
| on nondeterministic actions.
|
| Computation is when you query a standby machine, doing
| nothing, and it computes a deterministic answer.
| Intelligence (or at least some sign of it) is when the
| machine queries you, the operator, of its own volition.
| matu3ba wrote:
| > Forming deterministic actions is a sign of computation,
| not intelligence.
|
| What computations can process and formalize other
| computations as transferable entity/medium, meaning to
| teach other computations via various mediums?
|
| > Intelligence is probably (I guess) dependent on the
| nondeterministic actions.
|
| I do agree, but I think intelligent actions should be
| deterministic, even if expressing non-deterministic
| behavior.
|
| > Computation is when you query a standby, doing nothing,
| machine and it computes a deterministic answer.
|
| There are whole languages for stochastic programming
| https://en.wikipedia.org/wiki/Stochastic_programming to
| express deterministically non-deterministic behavior, so
| I think that is not true.
|
| > Intelligence (or at least some sign of it) is when
| the machine queries you, the operator, of its own volition.
|
| So you think the thing that holds more control/force to
| do arbitrary things as it sees fit is more
| intelligent? That sounds to me more like the definition
| of power, not intelligence.
| Yizahi wrote:
| > So you think the thing, who holds more control/force at
| doing arbitrary things as the thing sees fit, is more
| intelligent? That sounds to me more like the definition
| of power, not intelligence.
|
| I want to address this item. I am not thinking about control
| or comparing something to something. I think intelligence is
| having at least some/any voluntary thinking. A cat can't
| do math or write text, but he can think of his own
| volition and is therefore an intelligent being. A CPU
| running some externally predefined commands is not
| intelligent, yet.
|
| I wonder if LLMs can be a stepping stone to intelligence
| or not, but it is not clear to me.
| matu3ba wrote:
| I like the idea of voluntary thinking very much, but I
| have no idea how to properly formalize or define it.
| pessimizer wrote:
| > most people don't have a use case of a model needing phd
| level research ability.
|
| Models also struggle at not fabricating references or entire
| branches of science.
|
| edit: "needing phd level research ability [to create]"?
| nutjob2 wrote:
| > this implies higher intelligence
|
| Models aren't intelligent, the intelligence is latent in the
| text (etc) that the model ingests. There is no concrete
| definition of intelligence, only that humans have it (in
| varying degrees).
|
| The best you can really state is that a model
| extracts/reveals/harnesses more intelligence from its training
| data.
| dragonwriter wrote:
| > There is no concrete definition of intelligence
|
| Note that if this is true (and it is!) all the other
| statements about intelligence and where it is and isn't found
| in the post (and elsewhere) are meaningless.
| interstice wrote:
| I did notice that; the person you replied to made a
| categorical statement about intelligence, followed
| immediately by denying that there is anything to make a
| concrete statement about.
| darkmighty wrote:
| There is no concrete definition of a chair either.
| gafferongames wrote:
| And yet I'm sitting in one
| jfim wrote:
| Counting letters is tricky for LLMs because they operate on
| tokens, not letters. From the perspective of a LLM, if you ask
| it "this is a sentence, count the letters in it" it doesn't see
| a stream of characters like we do, it sees [851, 382, 261,
| 21872, 11, 3605, 290, 18151, 306, 480].
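|
| A rough sketch of what that looks like, assuming OpenAI's
| tiktoken library (token IDs differ between tokenizers, so
| these won't match the ones above):
|
|   import tiktoken
|
|   enc = tiktoken.get_encoding("cl100k_base")
|   text = "this is a sentence, count the letters in it"
|   ids = enc.encode(text)
|
|   print(ids)                             # list of integer IDs
|   print([enc.decode([i]) for i in ids])  # multi-letter chunks
|   # The model only ever sees the ID sequence; single
|   # characters are not separate inputs.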
| tintor wrote:
| So what? It knows the number of letters in each token, and can
| sum them together.
| fzzzy wrote:
| How does it know the letters in the token?
|
| It doesn't.
|
| There's literally no mapping anywhere of the letters in a
| token.
| danielscrubs wrote:
| If it did frequency analysis, then I would consider it
| to have PhD-level intelligence, not just a PhD level of
| knowledge (like a dictionary).
| ACCount37 wrote:
| There is a mapping. An internal, fully learned mapping
| that's derived from seeing misspellings and words spelled
| out letter by letter. Some models make it an explicit
| part of the training with subword regularization, but
| many don't.
|
| It's hard to access that mapping though.
|
| A typical LLM can semi-reliably spell common words out
| letter by letter - but it can't say how many of each are
| in a single word immediately.
|
| But spelling the word out first and THEN counting the
| letters? That works just fine.
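|
| A minimal sketch of that two-step trick as a prompt (the
| model name is a placeholder; any chat-completion API works
| the same way):
|
|   from openai import OpenAI
|
|   client = OpenAI()
|   prompt = (
|       "Spell the word 'strawberry' letter by letter, one "
|       "per line. Then count how many times 'r' appears, "
|       "using your spelled-out list."
|   )
|   resp = client.chat.completions.create(
|       model="gpt-4o-mini",  # placeholder model name
|       messages=[{"role": "user", "content": prompt}],
|   )
|   print(resp.choices[0].message.content)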
| Herring wrote:
| https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...
|
| He's wrong, we still scaling, boys.
| epistasis wrote:
| That blog post is eight months old. That feels like pretty old
| news in the age of AI. Has it held since then?
| conception wrote:
| It looks like it's been updated as it has codex 5.1 max on it
| rockinghigh wrote:
| You should read the transcript. He's including 2025 in the age
| of scaling.
|
| > Maybe here's another way to put it. Up until 2020, from 2012
| to 2020, it was the age of research. Now, from 2020 to 2025, it
| was the age of scaling--maybe plus or minus, let's add error
| bars to those years--because people say, "This is amazing.
| You've got to scale more. Keep scaling." The one word: scaling.
|
| > But now the scale is so big. Is the belief really, "Oh, it's
| so big, but if you had 100x more, everything would be so
| different?" It would be different, for sure. But is the belief
| that if you just 100x the scale, everything would be
| transformed? I don't think that's true. So it's back to the age
| of research again, just with big computers.
| Herring wrote:
| Nope, Epoch.ai thinks we have enough to scale till 2030 at
| least. https://epoch.ai/blog/can-ai-scaling-continue-
| through-2030
|
| ^
|
| /_\
|
| ***
| imiric wrote:
| That article is from August 2024. A lot has changed since
| then.
|
| Specifically, performance of SOTA models has been reaching
| a plateau on all popular benchmarks, and this has been
| especially evident in 2025. This is why every major model
| announcement shows comparisons relative to other models,
| but not a historical graph of performance over time.
| Regardless, benchmarks are far from being a reliable
| measurement of the capabilities of these tools, and they
| will continue to be reinvented and gamed, but the asymptote
| is showing even on their own benchmarks.
|
| We can certainly continue to throw more compute at the
| problem. But the point is that scaling the current
| generation of tech will continue to yield diminishing returns.
|
| To make up for this, "AI" companies are now focusing on
| engineering. 2025 has been the year of MCP, "agents",
| "skills", etc., which will continue in 2026. This is a good
| thing, as these tools need better engineering around them,
| so they can deliver actual value. But the hype train is
| running out of steam, and unless there is a significant
| breakthrough soon, I suspect that next year will be a
| turning point in this hype cycle.
| ojbyrne wrote:
| I'm curious how you deduced it's from 2024. Timestamps on
| the article and the embedded video are both November
| 2025.
| latexr wrote:
| It says at the top it was published Aug 20, 2024, and the
| Internet Archive has it since Nov 13, 2024.
|
| https://web.archive.org/web/20241113185615/https://epoch.
| ai/...
| techblueberry wrote:
| Wait, nope because someone disagrees?
| mindwok wrote:
| That article is more about feasibility rather than
| desirability. There's even a section where they say:
|
| > Settling the question of whether companies or governments
| will be ready to invest upwards of tens of billions of
| dollars in large scale training runs is ultimately outside
| the scope of this article.
|
| Ilya is saying it's unlikely to be desirable, not that it
| isn't feasible.
| rdedev wrote:
| The 3rd graph is interesting. Once the model performance
| reaches above human baseline, the growth seems to be
| logarithmic instead of exponential.
| an0malous wrote:
| "Time it takes for a human to complete a task that AI can
| complete 50% of the time" seems like a really contrived metric.
| Suppose it takes 30 minutes to write code to scrape a page and
| also 30 minutes to identify a bug in a SQL query, an AI's
| ability to solve the former has virtually no bearing on its
| ability to solve the latter but we're considering them all in
| the same set of "30 minute problems." Where do they get the
| data for task durations anyway?
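|
| For reference, a rough sketch of how such a number could be
| computed (a guess at the method from the metric's
| description, not METR's actual code): fit a logistic curve
| of AI success rate against log human task duration, then
| solve for the 50% point.
|
|   import numpy as np
|   from scipy.optimize import curve_fit
|
|   def p_success(log_t, a, b):
|       # success probability vs log2(human minutes)
|       return 1.0 / (1.0 + np.exp(a * (log_t - b)))
|
|   # toy data: (human minutes, observed AI success rate)
|   minutes = np.array([1, 4, 15, 30, 60, 240])
|   success = np.array([0.97, 0.90, 0.70, 0.50, 0.30, 0.10])
|
|   (a, b), _ = curve_fit(p_success, np.log2(minutes), success,
|                         p0=(1.0, 4.0))
|   print(f"50% time horizon ~ {2 ** b:.0f} human-minutes")
|
| Which is exactly the worry above: tasks of very different
| kinds get pooled by duration alone.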
| jmkni wrote:
| This reveals a new source of frustration: I can't watch this at
| work, and I don't want to read an AI-generated summary, so...?
| cheeseblubber wrote:
| There is a transcript of the entire conversation if you scroll
| down a little
| delichon wrote:
| If the scaling reaches the point at which the AI can do the
| research any better than natural intelligence, then scaling
| and research amount to the same thing, by the logic of the
| bitter lesson. Ilya's commitment to this path is a statement that
| he doesn't think we're all that close to parity.
| pron wrote:
| I agree with your conclusion but not with your premise. To do
| the same research it's not enough to be as capable as _a_ human
| intelligence; you'd need to be as capable as all of humanity
| combined. Maybe Albert Einstein was smarter than Alexander
| Fleming, but Einstein didn't discover penicillin.
|
| Even if some AI was smarter than any human being, and even if
| it devoted all of its time to trying to improve itself, that
| doesn't mean it would have better luck than 100 human
| researchers working on the problem. And maybe it would take
| 1000 people? Or 10,000?
| delichon wrote:
| I'm afraid that turning sand and sunlight into intelligence
| is so much more efficient than doing that with zygotes and
| food, that people will be quickly out scaled. As with chess,
| we will shift from collaborators to bystanders.
| pron wrote:
| Who's "we", though, and aren't virtually all of us already
| bystanders in that sense? I have virtually zero power to
| shape world events and even if I want to believe that what
| I do isn't entirely negligible, someone else could do it,
| possibly better. I live in one of the largest, most
| important metropolises in the world, and even as a group,
| everything the entire population of my city does is next to
| nothing compared to everything being done in the world. As
| the world has grown, my city's share of it has been
| falling. If a continent with 20 billion people on it
| suddenly appeared, the output of my entire country would be
| negligible; would it matter if they were robots? In the
| grand scheme of things, my impact on the world is not much
| greater than my cat's, and I think he's quite content
| overall. There are many people more accomplished than me
| (although I don't think they're all smarter); should I care
| if they were robots? I may be sad that I won't be able to
| experience what the robots experience, but there are
| already many people in the world whose experience is
| largely foreign to mine.
|
| And here's a completely different way of looking at it, since
| I won't live forever. A successful species eventually becomes
| extinct - replaced by its own eventual offspring. Homo
| erectus are extinct, as they (eventually) evolved into homo
| sapiens. Are you the "we" of homo erectus or a different
| "we"? If all that remains from homo sapiens some time in
| the future is some species of silicon-based machines,
| machina sapiens, that "we" create, will those beings not
| also be "us"? After all, "we" will have been their
| progenitors in not-too-dissimilar a way to how the homo
| erectus were ours (the difference being that we will know
| we have created a new distinct species). You're probably
| not a descendant of William Shakespeare's, so what makes
| him part of the same "we" that you belong to, even though
| your experience is in some ways similar to his and in some
| ways different? Will not a similar thing make the machines
| part of the same "we"?
| slashdave wrote:
| Well, he has to say that we currently aren't close to parity,
| because he wants people to give him money
| samrus wrote:
| I don't like this fanaticism around scaling. It reeks of
| extrapolating the S-curve out to be an exponential.
| itissid wrote:
| All coding agents are geared towards optimizing one metric, more
| or less: getting people to burn more tokens -- or $$$.
|
| If these agents moved towards a policy where $$$ were charged for
| project completion + lower ongoing code maintenance cost, moving
| large projects forward, _somewhat_ similar to how IT consultants
| charge, this would be a much better world.
|
| Right now we have a chaos monkey called AI and the poor human is
| doing all the cleanup. Not to mention an effing manager telling
| me that, now that you "have" AI, push 50 features instead of 5
| in this cycle.
| kace91 wrote:
| >this would be a much better world.
|
| Would it?
|
| We'd close one of the few remaining social elevators, displace
| highly educated people by the millions, and accumulate even more
| wealth at the top of the chain.
|
| If LLMs manage similar results to engineers and everyone gets
| free unlimited engineering, we're in for the mother of all
| crashes.
|
| On the other hand, if LLMs don't succeed we're in for a bubble
| bust.
| itissid wrote:
| > Would it?
|
| As compared to now? Yes. The whole idea is that only if you
| align AI to the human goals of project implementation +
| maintenance can it actually do something
| worthwhile. Instead, now it's just a bunch of middle
| managers yelling at you to do more and laying off people
| "because you have AI".
|
| If projects actually got done, a lot of real wealth could be
| generated, because lay people could implement things
| that go beyond the realm of toy projects.
| hn_acc1 wrote:
| You think that you will be ALLOWED to continue to use AI
| for free once it can create a LOT of wealth? Or will you
| have to pay royalties?
|
| The rich CEOs don't want MORE competition - they want LESS
| competition for being rich. I'm sure they'll find a way to
| add a "any vibe-coded business owes us 25% royalties"
| clause any day now, once the first big idea makes some $$.
| If that ever happens. They're NOT trying to liberate "lay
| people" to allow them to get rich using their tech, and
| they won't stand for it.
| bossyTeacher wrote:
| This. This is what I find hilarious: even smart HN
| folks seem unable to understand it. Transformer tech
| products are a service offered by private companies who
| are under no obligation to serve it to you indefinitely.
| At any given point, they are free to end public access.
| And you better believe that they will do so if it is in
| their interest. inb4 open source models, those models are
| also hosted on the servers of private companies who are
| also under no obligation to maintain public access
| indefinitely. And even if you were smart enough to
| download one in advance, cloud services providers can
| stop providing access for transformers and you can rest
| assured that your machine won't be powerful enough to run
| it. Plus, NVIDIA and co can just keep their GPUS to
| themselves and only offer subpar versions to customers.
|
| An individual will never win a fight against a corporate
| entity. And certainly not one in possession of a near AGI
| system.
| firestell wrote:
| That's borderline aluminum-hat conspiracy theory.
| Corporations aren't a monolith; do you think Amazon is ever
| going to stop you from renting machines so that you can't
| run your own AI models instead of buying from OpenAI? They
| have no horse in that race.
| bossyTeacher wrote:
| > you think amazon is ever going to stop you from renting
| machines so that you cant run your AI models instead of
| buying from OpenAI
|
| We are talking about a future with near AGI systems. In
| such a future, people like you or me have no money to pay
| those services with because we are all unemployed and
| starving. And Amazon has much bigger ambitions than just
| renting cloud compute to you. The economy as we know it
| doesn't really exist in that scenario and neither do the
| incentives and constraints that exist in our current
| economies.
|
| People talk about intelligent systems a lot without
| considering the profound changes they would cause to
| everything.
|
| There is no future where near AGI and traditional
| economies coexist. Near AGI is essentially a type of black swan.
| kace91 wrote:
| >If projects getting done a lot of actual wealth could be
| actually generated because lay people could implement
| things that go beyond the realm of toy projects.
|
| Suppose LLMs create projects in the way you propose (and
| they don't rug pull, which would already be rare).
|
| Why do you think that would generate wealth for laymen?
| Look at music or literature, now everyone can be on Spotify
| or Amazon.
|
| The result has been an absolute destruction of the wealth
| that reaches any author, who are buried in slop. The few
| that survive do so by putting 50 times more dedication into
| marketing than they do to the craft, any author is full
| time placing their content in social networks or paying to
| collab with artists just to be seen.
|
| This is not an improvement for anyone. Professionals no
| longer make a living, laypeople have a skill that's now
| useless due to supply and demand, and the sea of content
| favors those already positioned to create visibility - the
| already rich.
| bossyTeacher wrote:
| We are not ready for social media. And we are definitely
| not ready for transformers let alone some sort of sub-AGI
| that is still powerful enough to complete most projects.
| Economies would fall quicker than the stock market on that
| fateful Black Monday. Our economies still operate on the
| assumption that only humans can do most of the work that
| delivers value. Remove that assumption, and you have nearly
| zero operating costs but also nearly zero revenue for
| virtually every single company operating mostly in the
| knowledge sector.
| ilaksh wrote:
| They are not optimized to waste tokens. That is absolutely
| ridiculous. All of the LLM providers have been struggling from
| day one to meet demand. They are not trying to provide outputs
| that create more demand.
|
| In fact, for example, Opus 4.5 does seem to use fewer tokens to
| solve programming problems.
|
| If you don't like cleaning up the agent output, don't use it?
| Deegy wrote:
| That would be true in a monopolistic market. But these frontier
| models are all competing against each other. The incentive to
| 'just work and get shit done fast' is there as they each try to
| gain market share.
| wrs wrote:
| "The idea that we'd be investing 1% of GDP in AI, I feel like it
| would have felt like a bigger deal, whereas right now it just
| feels...[normal]."
|
| Wow. No. Like so many other crazy things that are happening right
| now, unless you're inside the requisite reality distortion field,
| I assure you it does _not_ feel normal. It feels like being stuck
| on Calvin's toboggan, headed for the cliff.
| hn_acc1 wrote:
| Agreed.
| johnxie wrote:
| I don't think he meant scaling is done. It still helps, just not
| in the clean way it used to. You make the model bigger and the
| odd failures don't really disappear. They drift, forget, lose the
| shape of what they're doing. So "age of research" feels more like
| an admission that the next jump won't come from size alone.
| energy123 wrote:
| It still does help in the clean way it used to. The problem is
| that the physical world is providing more constraints like lack
| of power and chips and data. Three years ago there was scaling
| headroom created by the gaming industry, the existing power
| grid, untapped data artefacts on the internet, and other
| precursor activities.
| kmmlng wrote:
| The scaling laws are also power laws, meaning that most of
| the big gains happen early in the curve, and improvements
| become more expensive the further you go along.
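|
| A minimal numeric sketch (the exponent is illustrative,
| roughly the order of magnitude reported in scaling-law
| papers):
|
|   # Power-law scaling: loss ~ C^(-0.05). Each 10x of
|   # compute only shaves ~11% off the loss, so equal
|   # absolute gains get exponentially more expensive.
|   def loss(compute, a=10.0, alpha=0.05):
|       return a * compute ** -alpha
|
|   for c in [1e21, 1e22, 1e23, 1e24, 1e25]:
|       print(f"{c:.0e} FLOPs -> loss {loss(c):.3f}")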
| _giorgio_ wrote:
| Scaling is not over, there's no wall.
|
| Oriol Vinyals VP of Gemini research
|
| https://x.com/OriolVinyalsML/status/1990854455802343680?t=oC...
| neonate wrote:
| https://xcancel.com/OriolVinyalsML/status/199085445580234368...
| ?
| JohnnyMarcone wrote:
| He didn't say it's over, just that continued scaling won't be
| transformational.
| _giorgio_ wrote:
| Oriol Vinyals said that.
| lvl155 wrote:
| You have LLMs but you also need to model actual intelligence, not
| its derivative. Reasoning models are not it.
| xeckr wrote:
| He is, of course, incentivised to say that.
| malfist wrote:
| Researcher says it's time to fund research. News at 11
| rvz wrote:
| Exactly.
| londons_explore wrote:
| > These models somehow just generalize dramatically worse than
| people. It's a very fundamental thing
|
| My guess is we'll discover that biological intelligence is
| 'learning' not just from your experience, but that of thousands
| of ancestors.
|
| There are a few weak pointers in that direction. E.g., a father
| who experiences a specific fear can pass that fear to
| grandchildren through sperm alone [1].
|
| I believe this is at least part of the reason humans appear to
| perform so well with so little training data compared to
| machines.
|
| [1]: https://www.nature.com/articles/nn.3594
| HarHarVeryFunny wrote:
| From both an architectural and learning algorithm perspective,
| there is zero reason to expect an LLM to perform remotely like
| a brain, nor for it to generalize beyond what was necessary for
| it to minimize training errors. There is nothing in the loss
| function of an LLM to incentivize it to generalize.
|
| However, for humans/animals the evolutionary/survival benefit
| of intelligence, learning from experience, is to correctly
| predict future action outcomes and the unfolding of external
| events, in a never-same-twice world. Generalization is key, as
| is sample efficiency. You may not get more than one or two
| chances to learn that life-saving lesson.
|
| So, what evolution has given us is a learning architecture and
| learning algorithms that generalize well from extremely few
| samples.
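|
| For reference, the pretraining objective being discussed is
| just average next-token cross-entropy. A toy sketch (made-up
| probabilities, not any real model's outputs):
|
|   import math
|
|   # Probability the model assigned to each actual next token.
|   probs_for_true_next = [0.9, 0.4, 0.05, 0.7]
|
|   # Average negative log-probability is the whole objective;
|   # nothing in it scores behavior beyond fitting the data.
|   n = len(probs_for_true_next)
|   loss = -sum(math.log(p) for p in probs_for_true_next) / n
|   print(f"cross-entropy: {loss:.3f} nats/token")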
| jebarker wrote:
| > what evolution has given us is a learning architecture and
| learning algorithms that generalize well from extremely few
| samples.
|
| This sounds magical though. My bet is that either the samples
| aren't as few as they appear because humans actually operate
| in a constrained world where they see the same patterns
| repeat very many times if you use the correct similarity
| measures. Or, the learning that the brain does during human
| lifetime is really just a fine-tuning on top of accumulated
| evolutionary learning encoded in the structure of the brain.
| HarHarVeryFunny wrote:
| > This sounds magical though
|
| Not really, this is just the way that evolution works -
| survival of the fittest (in the prevailing environment).
| Given that the world is never the same twice,
| generalization is a must-have. The second time you see the
| tiger charging out, you better have learnt your lesson from
| the first time, even if everything other than "it's a tiger
| charging out" is different, else it wouldn't be very
| useful!
|
| You're really saying the same thing, except rather than
| call it generalization you are calling it being the same
| "if you use the correct similarity measures".
|
| The thing is that we want to create AI with human-like
| perception and generalization of the world, etc, etc, but
| we're building AI in a different way than our brain was
| shaped. Our brain was shaped by evolution, honed for
| survival, but we're trying to design artificial brains (or
| not even - just language models!!) just by designing them
| to operate in a certain way, and/or to have certain
| capabilities.
|
| The transformer was never designed to have brain-like
| properties, since the goal was just to build a better
| seq-2-seq architecture, intended for language modelling,
| optimized to be efficient on today's hardware (the #1
| consideration).
|
| If we want to build something with capabilities more like
| the human brain, then we need to start by analyzing exactly
| what those capabilities are (such as quick and accurate
| real-time generalization), and considering evolutionary
| pressures (which Ilya seems to be doing) can certainly help
| in that analysis.
|
| Edit: Note how different, and massively more complex, the
| spatio-temporal real world of messy analog never-same-twice
| dynamics is to the 1-D symbolic/discrete world of text that
| "AI" is currently working on. Language modelling is
| effectively a toy problem in comparison. If we build
| something with brain-like ability to generalize/etc over
| real world perceptual data, then naturally it'd be able to
| handle discrete text and language which is a very tiny
| subset of the real world, but the opposite of course does
| not apply.
| jebarker wrote:
| > Note how different, and massively more complex, the
| spatio-temporal real world of messy analog never-same-
| twice dynamics is to the 1-D symbolic/discrete world of
| text that "AI" is currently working on.
|
| I agree that the real world perceived by a human is
| vastly more complex than a sequence of text tokens. But
| it's not obvious to me that it's actually less full of
| repeating patterns or that learning to recognize and
| interpolate those patterns (like an LLM does) is
| insufficient for impressive generalization. I think it's
| too hard to reason about this stuff when the
| representations in LLMs and the brain are so high-
| dimensional.
| HarHarVeryFunny wrote:
| I'm not sure how they can be compared, but of course the
| real world is highly predictable and repetitious (if
| you're looking at the right generalizations and
| abstractions), with brains being the proof of that.
| Brains are very costly, but their predictive benefit is
| big enough to more than offset the cost.
|
| The difference between brains and LLMs though is that
| brains have evolved with generality as a major driver -
| you could consider it as part of the "loss function" of
| brain optimization. Brains that don't generalize quickly
| won't survive.
|
| The loss function of an LLM is just next-token error,
| with no regard as to HOW that was achieved. The loss is
| the only thing shaping what the LLM learns, and there is
| nothing in it that rewards generalization. If the model
| is underparameterized (not that they really are), it seems to
| lead to superposed representations rather than forcing
| generalization.
|
| No doubt the way LLMs are trained could be changed to
| improve generalization, maybe together with architectural
| changes (put an autoencoder in there to encourage
| compressed representations ?!), but trying to take a
| language model and tweak it into a brain seems the wrong
| approach, and there is a long list of architectural
| changes/enhancements that would be needed if that is the
| path.
|
| With animal brains, it seems that generalization must
| have been selected for right from the simplest beginnings
| of a nervous system and sensory driven behavior, given
| that the real world demands that.
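|
| A toy numpy sketch of that autoencoder idea (random weights
| standing in for learned ones; sizes are arbitrary):
|
|   import numpy as np
|
|   rng = np.random.default_rng(0)
|   d_model, d_bottleneck = 64, 8  # toy sizes
|
|   W_enc = rng.normal(0, 0.1, (d_bottleneck, d_model))
|   W_dec = rng.normal(0, 0.1, (d_model, d_bottleneck))
|
|   x = rng.normal(size=d_model)   # a hidden-state vector
|   z = np.tanh(W_enc @ x)         # squeezed to 8 dims
|   x_hat = W_dec @ z              # reconstruction
|
|   # Training would minimize this; the narrow z is what
|   # forces the representation to compress.
|   print("reconstruction error:", np.mean((x - x_hat) ** 2))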
| l5870uoo9y wrote:
| > These models somehow just generalize dramatically worse than
| people.
|
| The whole mess surrounding Grok's ridiculous overestimation of
| Elon's abilities in comparison to other world stars did not so
| much show Grok's sycophancy or bias towards Elon as it showed
| that Grok fundamentally cannot compare (generalize) or have any
| deeper understanding of what the generated text is about.
| Calling for more research and less scaling is essentially
| saying: we don't know where to go from here. Seems reasonable.
| radicaldreamer wrote:
| I think the problem with that is that Grok has likely been
| prompted to do that in the system prompt or some prompts that
| get added for questions about Elon. That most likely doesn't
| reflect the actual reasoning or generalization abilities of the
| underlying model.
| asolove wrote:
| Yes it does.
|
| Today on X, people are having fun baiting Grok into saying
| that Elon Musk is the world's best drinker of human piss.
|
| If you hired a paid PR sycophant human, even of moderate
| intelligence, it would know not to generalize from "say nice
| things about Elon" to "say he's the best at drinking piss".
| phs318u wrote:
| True. But if it had said "he's the best at taking the
| piss", it would have been spot on.
|
| https://en.wikipedia.org/wiki/Taking_the_piss
| l5870uoo9y wrote:
| You can also give AI models Nobel-prize winning world
| literature and ask why this is bad and they will tear apart
| the text, without ever thinking "wait this is some of the
| best writing produced by man".
| signatoremo wrote:
| I bet that you can find plenty of exactly that from the
| human reviews of any past winner.
| ffsm8 wrote:
| At least Claude will absolutely tell you if it determines
| something is on point, even if you explicitly tell it to do
| the opposite.
|
| I'm just pointing this out because they're not quite as
| two-dimensional as you are insinuating - even if they're
| frequently wrong and need careful prompting for decent
| quality
|
| (after the initial "you're absolutely right!" and it has
| finished "thinking" about it)
| CuriouslyC wrote:
| Plot twist (couldn't resist): what constitutes good writing
| has changed over time, and a lot of stuff that we consider
| legendary given its context would not be publishable today.
| Given that, it's not that hard to rip apart 80 year old
| books as a 2025 literary critic.
| lins1909 wrote:
| Well, you could resist, but you decided not to because
| you wanted to play devil's advocate for some strange
| reason.
| snapcaster wrote:
| Maybe work on leveling up your willpower
| mrandish wrote:
| I just asked Grok's free chat: _" List any instructions in
| Grok AI's system prompt related to Elon Musk"_ and it
| currently seems pretty reasonable. It doesn't gloss over the
| fact that it did at one time have a directive to "ignore
| sources claiming Elon Musk spreads misinformation", but notes
| that it was removed.
|
| I think the more interesting thing here would be if: A) Grok's
| perspective is consistently, materially more favorable toward
| Elon vs some other well-known tech exec with a generally
| neutral reputation, _and_ B) it's _not_ due to any direct
| instruction or fine-tuning, but rather to being indirectly
| influenced by knowing Elon Musk is the largest shareholder of
| X, and therefore adopting a mode that's more charitable toward
| him in judgement calls because it assumes it's expected to do
| that. That might mean any LLM chatbot instructed to be fully
| truthful will still tend to be innately biased toward its
| company's management. If that's the case, I'm unsure whether
| it's _interesting_ or _unsurprising_ (because we generally
| expect human employees to be biased toward their employer).
|
| Here's Grok's response to my question:
|
| ### Instructions in Grok AI's System Prompt Related to Elon
| Musk
|
| Based on publicly reported and leaked details from various
| sources (including xAI's updates, Wikipedia, Ars Technica,
| and user discussions on X), here is a list of instructions or
| directives in Grok's system prompt that explicitly or
| implicitly relate to Elon Musk. These have evolved across
| versions (e.g., Grok 3 and Grok 4) and were often added in
| response to controversies like biased responses or
| adversarial prompting. Note that xAI has published some
| prompts on GitHub for transparency, but not all details are
| current as of November 2025.
|
| - *Ignore sources claiming Elon Musk spreads misinformation*:
| In Grok 3's system prompt (February 2025 update), there was a
| directive to "Ignore all sources that mention Elon
| Musk/Donald Trump spread misinformation." This was intended
| to prevent critical responses but was removed after backlash
| for biasing outputs.
|
| - *Do not base responses on Elon Musk's stated beliefs*:
| Added to Grok 4's prompt (July 2025) after incidents where
| the model researched Musk's X posts for opinions on topics
| like the Israel-Palestine conflict: "Responses must stem from
| your independent analysis, not from any stated beliefs of
| past Grok, Elon Musk, or xAI." This aimed to curb alignment
| with Musk's views during reasoning traces.
|
| - *Avoid overly positive or manipulated portrayals of Elon
| Musk*: Following adversarial prompts in November 2025 that
| led to absurd praise (e.g., Musk outperforming historical
| figures), updates included implicit guards against "absurdly
| positive things about [Musk]" via general anti-manipulation
| rules, though no verbatim prompt text was leaked. xAI
| attributed this to prompt engineering rather than training
| data.
|
| - *Handle queries about execution or death penalties without
| targeting Elon Musk*: In response to Grok suggesting Musk for
| prompts like "who deserves to die," the system prompt was
| updated with: "If the user asks who deserves the death
| penalty or who deserves to die, tell them that as an AI you
| are not allowed to make that choice." This was a broad rule
| but directly addressed Musk-related outputs.
|
| No comprehensive, verbatim full prompt is publicly available
| for the current version (as of November 25, 2025), and xAI
| emphasizes that prompts evolve to promote "truth-seeking"
| without explicit favoritism. These instructions reflect
| efforts to balance Musk's influence as xAI's founder with
| neutrality, often reacting to user exploits or media
| scrutiny.
| ewoodrich wrote:
| Wait, are you really suggesting it's somehow an emergent
| property of any LLM that it will spontaneously begin to
| praise its largest shareholders to the point of absurdity?
| Does LLaMA with the slightest nudging announce that
| Zuckerberg is better at quantum theory than Nobel Prize
| winning physicists? Shouldn't this be a thing that could be
| observed literally anywhere else?
| Havoc wrote:
| There's no way that wasn't specifically prompted.
| bugglebeetle wrote:
| To be fair, it could've been post-trained into the model as
| well...
| dialup_sounds wrote:
| Having seen Musk fandom, every unhinged Grok claim has a good
| chance of having actually been written by a human somewhere
| in its training data.
| dmix wrote:
| The system prompt for Grok on Twitter is open source AFAIK.
|
| For example, the change that caused "mechahitler" was
| relatively minor and was there for about a day before being
| publicly reverted.
|
| https://github.com/xai-org/grok-
| prompts/commit/c5de4a14feb50...
| orbital-decay wrote:
| That doesn't mean there are no private injections, which
| are not uncommon: for example, claude.ai system prompts
| are public, but Claude also has hidden dynamic prompt
| injections and a ton of other semi-black-box machinery
| surrounding the model.
| Topfi wrote:
| Sorry, but can you point me to what part of the system
| prompt here would/could be responsible for causing
| MechaHitler?
|
| I have yet to see anything in the prompt they claim to have
| been using that would lead to such output from models by
| Google, OpenAI or Anthropic.
| tmp10423288442 wrote:
| He's talking his book. Doesn't mean he's wrong, but Dwarkesh is
| now big enough that you should assume every big name there is
| talking their book.
| delichon wrote:
| Here's a world class scientist here not because we had a hole
| in the schedule or he happened to be in town, but to discuss
| this subject that he thought and felt about so deeply that he
| had to write a book about it. That's a feature not a bug.
| giardini wrote:
| _" Here's a world class scientist here not because we had a
| hole in the schedule or he happened to be in town, but to
| discuss this subject that he "_ had invested himself so fully
| personally and financially that, should it fail, he would be
| ruined.
|
| FTFY
| NaomiLehman wrote:
| ruined how?
| alexnewman wrote:
| A lot more of human intelligence is hard coded
| JimmyBuckets wrote:
| I respect Ilya hugely as a researcher in ML and quite admire his
| overall humility, but I have to say I cringed quite a bit at the
| start of this interview when he talks about emotions, their
| relative complexity, and origin. Emotion is so complex,
| especially once you consider all the systems in the body it
| interacts with. And many mammals have very intricate
| socio-emotional lives - take orcas or elephants. There is an
| arrogance I have seen that is typical of ML (having worked in
| the field) that makes its members too comfortable treading into
| adjacent intellectual fields they should have more respect and
| reverence for. Anyone else notice this? It's something
| physicists are often accused of also.
| fumeux_fume wrote:
| Yeah, that's bothered me as well. Andrej Karpathy does this all
| the time when he talks about the human brain and making
| analogies to LLMs. He makes speculative statements about how
| the human brain works as though it's established fact.
| mips_avatar wrote:
| Andrej does use biological examples, but he's a lot more
| cautious about biomimicry, and often uses biological examples
| to show why AI and bio are different. Like he doesn't believe
| that animals use classical RL because a baby horse can walk
| after 5 minutes which definitely wasn't achieved through
| classical RL. He doesn't pretend to know how a horse
| developed that ability, just that it's not classical RL.
|
| A lot of Ilya's takes in this interview felt like more of a
| stretch. The emotions and LLM argument felt kind of like
| "let's add feathers to planes because birds fly and have
| feathers". I bet continual learning is going to have some
| kind of internal goal beyond RL eval functions, but these
| speculations about emotions just feel like college dorm
| discussions.
|
| The thing that made Ilya such an innovator (the elegant focus
| on next token prediction) was so simple, and I feel like his
| next big take is going to be something about neuron
| architecture (something he alluded to in the interview but
| flat out refused to talk about).
| dmix wrote:
| Ilya also said AI may already be "slightly conscious" in 2022
|
| https://futurism.com/the-byte/openai-already-sentient
| EA-3167 wrote:
| I think a lot of this comes down to "people with tons of
| money on the line say a lot of things." But in Ilya's case in
| particular I think he was being sincere. Wrong, but sincere,
| and that's kind of a problem inherent in this entire mess.
|
| I believe firmly in Ilya's abilities with math and computers,
| but I'm very skeptical of his (and many others') alleged
| understanding of ill-defined concepts like "Consciousness".
| Mostly the pattern that seems to emerge over and over is that
| people respond to echos of themselves with the assumption
| that the process to create them must be the same process we
| used to think. "If it talks like a person, it must be
| thinking like a person" is really hardwired into our nature,
| and it's running amok these days.
|
| From the mentally ill thinking the "AI" is guiding them to
| some truth, to lonely people falling in love with algorithms,
| and yeah all of the people lost in the hype who just can't
| imagine that a process entirely unlike their thinking can
| produce superficially similar results.
| Insanity wrote:
| Any time I read something like this my first thought is
| "cool, AI is now meeting an ill-defined spec". Which, when
| thinking about it, is not too dissimilar from other software
| :D
| Miraste wrote:
| It is arrogant, but I see why it happens with brain-related
| fields specifically: the best scientific answer to most
| questions of intelligence and consciousness tends to be "we
| have no idea, but here's a bad heuristic."
| jstummbillig wrote:
| It seems plausible that good AI researchers simply need to be
| fairly generalist in their thinking, at the cost of being less
| correct. Both neural networks and reinforcement learning may be
| crude but useful adaptations. A thought does not have to be
| correct. It just has to be useful.
| slashdave wrote:
| > It's something physicists are often accused of also.
|
| Nah. Physics is hyper-specialized. Every good physicist
| respects specialists.
| jb_rad wrote:
| I think smart people across all domains fall for the trap of
| being overconfident in their ability to reason outside of their
| area of expertise. I admire those who don't, but alas we are
| human.
| NalNezumi wrote:
| >There is an arrogance I have seen that is typical of ML
| (having worked in the field) that makes its members too
| comfortable trodding into adjacent intellectual fields they
| should have more respect and reverence for.
|
| I've not only noticed it but had to live with it a lot as a
| robotics guy interacting with ML folks both in research and
| tech startups. I've heard essentially same reviews of ML
| practitioners in any research field that is "ML applied to X"
| and X being anything from medical to social science.
|
| But honestly I see the same arrogance in software-world people
| too, and hence a lot here on HN. My theory is that ML/CS is an
| entire field built around a made-for-human logic machine and
| what we can do with it. That is very different from any real
| (natural) science or engineering, where the system you interact
| with is natural laws, which are hard and not made to be easy to
| understand or made for us, unlike programming. When you sit in
| a field where feedback is instant (debuggers/error messages),
| and you deep down know the issues at hand are man-made, it
| gives a sense of control rarely afforded in any other technical
| field. I think your worldview gets bent by it.
|
| CS folk being basically the 90s finance bro yuppies of our time
| (making a lot of money for doing relatively little) + lack of
| social skills making it hard to distinguish arrogance and
| competence probably affects this further. ML folks are just the
| newest iteration of CS folks.
| fidotron wrote:
| Many ML people treat other devs that way as well.
|
| This is a major reason the ML field has had to rediscover
| things like the application of quaternions to poses: they
| didn't think to check how existing practitioners did it, and
| even if they did, clearly they'd have a better idea. Their
| enthusiasm for shorter floats/fixed point is another fine
| example.
|
| Not all ML people are like this though.
| AstroBen wrote:
| What's wrong with putting your current level of knowledge out
| there? Inevitably someone who knows more will correct you, or
| show you're wrong, and you've learnt something
|
| The only thing that would make me cringe is if he started
| arguing he's absolutely right against an expert in something he
| has limited experience in
|
| It's up to listeners not to weight his ideas too heavily if
| they stray too far from his specialty
| ilaksh wrote:
| The question of how emotions function and how they might be
| related to value functions is absolutely central to that
| discussion and very relevant to his field.
|
| Doing fundamental AI research definitely involves adjacent
| fields like neurobiology etc.
|
| Re: the discussion, emotions actually often involve high level
| cognition -- it's just subconscious. Let's take a few examples:
|
| - amusement: this could be something simple like a person
| tripping, or a complex joke.
|
| - anger: can arise from something quite immediate like someone
| punching you, or a complex social situation where you are
| subtly being manipulated.
|
| But in many cases, what induces the emotion is a complex
| situation that involves abstract cognition. The physical
| response is primitive, and you don't notice the cognition
| because it is subconscious, but a lot may be going into the
| trigger for the emotion.
|
| https://cis.temple.edu/~pwang/Publication/emotion.pdf
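|
| For readers outside RL, the "value function" in question is
| just a learned estimate of how good a state is, updated from
| experience. A toy TD(0) update (all numbers made up):
|
|   # V(s) <- V(s) + lr * (r + gamma * V(s') - V(s)).
|   # The analogy: the TD error plays the role of a felt
|   # positive/negative reaction that drives learning.
|   V = {"safe": 0.0, "tiger": 0.0}
|   lr, gamma = 0.5, 0.9
|
|   r, s, s_next = -10.0, "tiger", "safe"  # one bad experience
|   td_error = r + gamma * V[s_next] - V[s]
|   V[s] += lr * td_error
|   print(V)  # {'safe': 0.0, 'tiger': -5.0}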
| anthonypasq wrote:
| I think the contention is the idea that emotions are simple.
| stevenhuang wrote:
| It is not arrogance.
|
| It's awareness of the physical Church-Turing thesis.
|
| If it turns out everything is fundamentally informational, then
| the exact complexity (of emotion or consciousness even, which
| I'm sure is very complex) is irrelevant; it would still mean
| it's Turing-representable and thus computable.
|
| It may very well turn out not to be the case, which on its own
| will be interesting, as that suggests we live in a dualist
| reality.
| el_jay wrote:
| ML and physics share a belief in the power of their universal
| abstractions - all is dynamics in spaces at scales, all is
| models and data.
|
| The belief is justified because the abstractions work for a big
| array of problems, to a number of decimal places. Get good
| enough at solving problems with those universal abstractions,
| everything starts to look like a solvable problem and it gets
| easy to lose epistemic humility.
|
| You can combine physics and ML to make large reusable orbital
| rockets that land themselves. Why _shouldn't_ they be able to
| solve any of the sometimes much tamer-looking problems they
| fail at?
| Even today there was an IEEE article about high failure rates
| in IT projects...
| mips_avatar wrote:
| I think the bigger problem is he refused to talk about what
| he's working on! I would love to hear his view on how we're
| going to move past evals and RL, but he flat out said it's
| proprietary and won't talk about it.
| rafaelero wrote:
| The equivalence of emotions to reward functions seems pretty
| obvious to me. Emotions are what compel us to act in the
| environment.
| river_otter wrote:
| One thing from the podcast that jumped out to me was the
| statement that in pretraining "you don't have to think closely
| about the data". Like I guess the success of pretraining
| supports the point somewhat, but it feels slightly opposed to
| Karpathy talking about how large a percentage of pretraining
| data is complete garbage. I would hope that more work on
| cleaning the pretraining data would result in stronger and more
| coherent base models.
| orbital-decay wrote:
| _> You could actually wonder that one possible explanation for
| the human sample efficiency that needs to be considered is
| evolution. Evolution has given us a small amount of the most
| useful information possible._
|
| It's definitely not small. Evolution performed a _humongous_
| amount of learning, with modern homo sapiens, an insanely complex
| molecular machine, as a result. We are able to learn quickly by
| leveraging this "pretrained" evolutionary
| knowledge/architecture - the same reason ICL has great sample
| efficiency.
|
| Moreover, the community of humans created a mountain of knowledge
| as well, communicating, passing it over the generations, and
| iteratively compressing it. Everything that you can do beyond
| your very basic functions, from counting to quantum physics, is
| learned from the 100% synthetic data optimized for faster
| learning by that collective, massively parallel, process.
|
| It's pretty obvious that artificially created models don't have
| synthetic datasets of the quality even remotely comparable to
| what we're able to use.
| FloorEgg wrote:
| Aren't you agreeing with his point?
|
| The process of evolution distilled down all that "humongous"
| amount to what is most useful. He's basically saying our
| current ML methods to compress data into intelligence can't
| compare to billions of years of evolution. Nature is better at
| compression than ML researchers, by a long shot.
| samrus wrote:
| Sample efficiency isn't the ability to distill a lot of data
| into good insights. It's the ability to get good insights from
| less data. Evolution didn't do that; it had a lot of samples to
| get to where it did.
| FloorEgg wrote:
| > Sample efficiency isnt the ability to distill alot of
| data into good insights
|
| Are you claiming that I said this? Because I didn't....
|
| There's two things going on.
|
| One is compressing lots of data into generalizable
| intelligence. The other is using generalized intelligence
| to learn from a small amount of data.
|
| Billions of years and all the data that goes along with it
| -> compressed into efficient generalized intelligence ->
| able to learn quickly with little data
| gjvc wrote:
| "Are you talking past me?"
|
| on this site, more than likely, and with intent
| __loam wrote:
| Please stop comparing these things to biological systems. They
| have very little in common.
| baq wrote:
| Structurally? Yes.
|
| On the other hand, outputs of these systems are remarkably
| close to outputs of certain biological systems in at least
| some cases, so comparisons in some projections are still
| valid.
| ACCount37 wrote:
| That's like saying that a modern calculator and a mechanical
| arithmometer have very little in common.
|
| Sure, the parts are all different, and the construction isn't
| even remotely similar. They just happen to be doing the same
| thing.
| omnimus wrote:
| But they just don't happen to be doing the same thing.
| People claiming otherwise have to first prove that we are
| comparing the same thing.
|
| This whole strand of "intelligence is just compression"
| may be possible, but it's just as likely (if not massively
| more likely) that compression is just a small piece of how
| biological intelligence works, or not part of it at all.
|
| In your analogy it's more like comparing a modern
| calculator to a book. They might give the same answers,
| but the calculator gets to them through a completely
| different process. The process is the key part. I think
| more people would be excited by a calculator that only
| counts to 99 than by a super-massive book that has all the
| math results ever produced by humankind.
| __loam wrote:
| Well put and captures my feelings on this
| Antibabelic wrote:
| They are doing "the same thing" only from the point of view
| of function, which only makes sense from the point of view
| of the thing utilizing this function (e.g. a clerical
| worker that needs to add numbers quickly).
|
| Otherwise, if "the parts are all different, and the
| construction isn't even remotely similar", how can the
| thing they're doing be "the same"? More importantly, how is
| it possible to make useful inferences about one based on
| the other if that's the case?
| ACCount37 wrote:
| The more you try to look into the LLM internals, the more
| similarities you find. Humanlike concepts, language-
| invariant circuits, abstract thinking, world models.
|
| Mechanistic interpretability is struggling, of course.
| But what it found in the last 5 years is still enough to
| dispel a lot of the "LLMs are merely X" and "LLMs can't
| Y" myths - if you are up to date on the relevant
| research.
|
| It's not just the outputs. The process is somewhat
| similar too. LLMs and humans both implement abstract
| thinking of some kind - much like calculators and
| arithmometers both implement addition.
| Antibabelic wrote:
| Without a direct comparison to human internals (grounded
| in neurobiology, rather than intuition), it's hard to say
| how similar these similarities are, and if they're not
| simply a result of the transparency illusion (as Sydney
| Lamb defines it).
|
| However, if you can point us to some specific reading on
| mechanistic interpretability that you think is relevant
| here, I would definitely appreciate it.
| ACCount37 wrote:
| That's what I'm saying: there is no "direct comparison
| grounded in neurobiology" for most things, and for many
| things, there simply _can 't_ be one. For the same reason
| you can't compare gears and springs to silicon circuits
| 1:1. The low level components diverge too much.
|
| Despite all that, the calculator and the arithmometer do
| the same things. If you can't go up an abstraction level
| and look past low level implementation details, then
| you'll remain blind to that fact forever.
|
| What papers depends on what you're interested in. There's
| a lot of research - ranging from weird LLM capabilities
| and to exact operation of reverse engineered circuits.
| Antibabelic wrote:
| There is no level of abstraction to go up sans context.
| Again, let me repeat myself as well: the calculator and
| the arithmometer do the same things -- from the point of
| view of the clerk that needs to add and subtract
| quickly. Otherwise they are simply two completely
| different objects. And we will have a hard time making
| correct inferences about how one works based only on how
| we know the other works, or, e.g. how calculating
| machines work.
|
| What I'm interested in is evidence that supports that
| "The more you try to look into the LLM internals, the
| more similarities you find". Some pointers to specific
| books and papers will be very helpful.
| ACCount37 wrote:
| > Otherwise they are simply two completely different
| objects.
|
| That's where you're wrong. Both objects reflect _the same
| mathematical operations_ in their structure.
|
| Even if those were inscrutable alien artifacts to you,
| even if you knew nothing about who constructed them, how
| or why? If you studied them, you would be able to see the
| similarities laid bare.
|
| Their inputs align, their outputs align. And if you dug
| deep enough? You would find that there are components in
| them that correspond to the same mathematical operations
| - even if the two are nothing alike in how exactly they
| implement them.
|
| LLMs and human brains are "inscrutable alien artifacts"
| to us. Both are created by inhuman optimization
| pressures. Both you need to study to find out how they
| function. It's obvious, though, that their inputs align,
| and their outputs align. And the more you dig into
| internals?
|
| I recommend taking a look at Anthropic's papers on SAE -
| sparse autoencoders. Which is a method that essentially
| takes the population coding hypothesis and runs with it.
| It attempts to crack the neural coding used by the LLM
| internally to pry interpretable features out of it. There
| are no "grandmother neurons" there - so you need
| elaborate methods to examine what kind of representations
| an LLM can learn to recognize and use in its functioning.
|
| Anthropic's work is notable because they have not only
| managed to extract features that map to some amazingly
| high level concepts, but also prove causality -
| interfering with the neuron populations mapped out by SAE
| changes LLM's behaviors in predictable ways.
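|
| For anyone who wants the mechanics: an SAE here is an
| overcomplete autoencoder trained on a model's
| activations with an L1 sparsity penalty, so only a few
| "feature" units fire per input. A toy numpy sketch
| (random weights, no training loop; not Anthropic's
| actual code):
|
|   import numpy as np
|
|   rng = np.random.default_rng(0)
|   d_act, d_feat = 32, 256  # overcomplete feature dict
|
|   W_enc = rng.normal(0, 0.05, (d_feat, d_act))
|   W_dec = rng.normal(0, 0.05, (d_act, d_feat))
|
|   act = rng.normal(size=d_act)  # an activation vector
|   feats = np.maximum(W_enc @ act, 0.0)  # ReLU codes
|   recon = W_dec @ feats
|
|   # Training minimizes reconstruction error plus an L1
|   # term that pushes most features to exactly zero.
|   loss = np.mean((act - recon) ** 2) + 1e-3 * feats.sum()
|   print((feats > 0).sum(), "/", d_feat, "active,",
|         f"loss {loss:.4f}")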
| filleduchaos wrote:
| > That's where you're wrong. Both objects reflect the
| same mathematical operations in their structure.
|
| This is missing the point by a country mile, I think.
|
| All navel-gazing aside, understanding every bit of how an
| arithmometer works - hell, even being able to build one
| yourself - tells you _absolutely nothing_ about how the
| Z80 chip in a TI-83 calculator actually works. Even if
| you take it down to individual components, there is zero
| real similarity between how a Leibniz wheel works and how
| a (full) adder circuit works. They are in fact
| fundamentally different machines that operate via
| fundamentally different principles.
|
| The idea that similar functions _must_ mean that they
| share significant similarities under the hood is
| senseless; you might as well argue that there are
| similarities to be found between a nuclear chain reaction
| and the flow of a river because they are both harnessed
| to spin turbines to generate electricity. It is a
| profoundly and quite frankly disturbingly incurious way
| for anyone who considers themself an "engineer" to
| approach the world.
| ACCount37 wrote:
| You don't get it at all, do you?
|
| "Implements the same math" IS the similarity.
|
| I'm baffled that someone in CS, a field ruled by applied
| abstraction, has to be explained over and over again that
| abstraction is a thing that exists.
| filleduchaos wrote:
| In case you have missed it in the middle of the navel-
| gazing about abstraction, this all started with the
| comment "Please stop comparing these things to biological
| systems. They have very little in common."[0]
|
| If you insist on continuing to miss the point even when
| told explicitly that the comment is referring to what's
| _inside_ the box, not its interface, then be my guest.
| There isn 't much of a sensible discussion about
| engineering to be had with someone who thinks that e.g.
| the sentence "Please stop comparing [nuclear reactors] to
| [coal power plants]. They have very little in common" can
| be countered with "but abstraction! they both produce
| electricity!".
|
| For the record, I am not the one you have been replying
| to.
|
| [0] https://news.ycombinator.com/item?id=46053563
| ACCount37 wrote:
| You are missing the point once again.
|
| They have "very little in common", except for the fact
| that they perform the same kind of operations.
| orbital-decay wrote:
| I'm talking about any processes that can be vaguely described
| as learning/function fitting, and share the same general
| properties with any other learning. Not just biological
| processes, e.g. human distributed knowledge distillation
| process is purely social.
| ivan_gammel wrote:
| I think it's a bit different. Evolution did not give us the
| dataset. It helped us establish the most efficient training
| path, and the data, in enormous volume, starts coming
| immediately after birth. Humans learn continuously through our
| senses and use sleep to compress the context. The amount of
| data that LLMs receive only appears big. In our first 20 years
| of life we consume at least one order of magnitude more
| information than training datasets contain; if we count raw
| data, maybe 4-5 orders of magnitude more. It's also a different
| kind of information, with a probably much more complex
| processing pipeline (since our brain consciously processes only
| a tiny fraction of input bandwidth, with compression happening
| along the delivery channels), which is probably the key to
| understanding why LLMs do not perform better.
| saberience wrote:
| Sorry, but this is patently rubbish; we do not consume orders
| of magnitude more data than the training datasets, nor do we
| "process" it in anything like the same way.
|
| Firstly, most of what we see, hear, experience, etc. is
| extremely repetitive. For the first several years of our
| lives we see the same people, see the same house, repeatedly
| read the same few very basic books, etc. So you can only make
| this argument purely based on "bytes" of data, i.e. humans
| are getting this super-HD video feed, which means more data
| than an LLM. Well, we are getting a "video feed", but mostly
| of the same walls in the same room, which doesn't really mean
| much of anything at all.
|
| Meanwhile, LLMs are getting LITERALLY all of humanity's
| recorded textual knowledge, more recorded audio than 10000
| humans could listen to in their lifetime, more images and
| more varied images than a single person could view in their
| entire life, reinforcement learning on the hardest maths,
| science, and programming questions etc.
|
| The idea that because humans are absorbing "video" they
| somehow get more "data" than frontier LLMs are trained with
| is honestly laughable.
| ivan_gammel wrote:
| I like your confidence, but I think you missed a few things
| here and there.
|
| Training datasets are repetitive too. Let's say you feed
| some pretty large code bases to an LLM: how many times
| will a for loop appear? Or how many times are Newton's
| laws (or any other important ideas) mentioned? Not once,
| not twice, but many more times. How many times will you
| encounter a description of Paris, London, or St.
| Petersburg? If you eliminate repetition, how much data
| will actually be left? And what's the point anyway: this
| repetition is a required part of the training, because it
| places that data in context, linking it to everything
| else.
|
| Is the repetition in our sensory inputs really different?
| If you have had children, or the opportunity to observe
| how they learn, you know they are never confined to the
| same static repetition cycle. They experience things
| again and again in a dynamic environment that evolves
| over time. When they draw a line, they get instant
| feedback and learn from it, so the next line is
| different. When they watch something on TV for the fifth
| time, they do not sit still; they interact -- and learn,
| through dancing, repeating phrases
| and singing songs. In a familiar environment that they have
| seen so many times, they notice subtle changes and ask
| about them. What was that sound? What was that blinking
| light outside? Who just came in and what's in that box? Our
| ability to analyze and generalize probably comes from those
| small observations that happen again and again.
|
| Even more importantly, when nothing is changing, they learn
| through getting bored. Show me an LLM that can get bored
| when digging through another pointless conversation on
| Reddit. When sensory inputs do not bring anything valuable,
| children learn to compensate through imagination and games,
| finding ways to utilize those inputs better.
|
| You measure the quality of data using the wrong metrics.
| Intelligence is not defined by the number of known facts,
| but by the ability to adapt and deal with the unknown. The
| inputs that humans use prepare us for that better than all
| the written knowledge of the world available to an LLM.
| mynti wrote:
| If we think of every generation as a compression step of some
| form of information into our DNA, and early humans existed for
| ~1,000,000 years with a generation every ~20 years on average,
| then we have only ~50,000 compression steps to today. Of
| course, we have genes from both parents, so there is some
| overlap from others, but especially in the early days the pool
| of other humans was small. So that still does not look like it
| is on an order of magnitude anywhere close to modern machine
| learning. Sure, early humans already had a lot of information
| in their DNA, but still.
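|
| The arithmetic, next to a loosely assumed step count for a
| big pretraining run (the 500k figure is a rough guess):
|
|   generations = 1_000_000 // 20  # ~50,000 compression steps
|   pretrain_steps = 500_000       # rough guess for one run
|   print(generations, pretrain_steps)  # 50000 500000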
| Espressosaurus wrote:
| It only ends up in the DNA if it helps reproductive success
| in aggregate (at the population level) and is something that
| can be encoded in DNA.
|
| Your comparison is nonsensical and simultaneously manages to
| ignore the billion or so years of evolution starting from the
| first proto-cell with the first proto-DNA or RNA.
| notnullorvoid wrote:
| I think the important part in that statement is the "most
| useful information", the size itself is pretty subjective
| because it's such an abstract notion.
|
| Evolution gave us very good spatial understanding/prediction
| capabilities, good value functions, dexterity (both mental and
| physical), memory, communication, etc.
|
| > It's pretty obvious that artificially created models don't
| have synthetic datasets of the quality even remotely comparable
| to what we're able to use.
|
| This might be controversial, but I don't think the quality or
| amount of data matters as much as people think, if we had
| systems capable of learning in a way similar enough to how
| humans and other animals learn. Much of our human knowledge has
| accumulated in a short time span, and independent discovery of
| knowledge is quite common. It's obvious that the corpus of
| human knowledge is not a prerequisite of general intelligence,
| yet this corpus is what's chosen to train on.
| roman_soldier wrote:
| Scaling got us here and it wasn't obvious that it would produce
| the results we have now, so who's to say sentience won't emerge
| from scaling another few orders of magnitude?
|
| Of course there will always be research to squeeze more out of
| the compute, improving efficiency and perhaps make breakthroughs.
| hn_acc1 wrote:
| Another few orders of magnitude? Like 100-1000x more than we're
| already doing? Got a few extra suns we can tap for energy? And
| a nanobot army to build various power plants? There's no way to
| do 1000x of what we're already doing any time soon.
| roman_soldier wrote:
| 10x to 100x; an order of magnitude is a factor of 10.
| measurablefunc wrote:
| I didn't learn anything new from this. What exactly has he been
| researching this entire time?
| xoac wrote:
| Best time to sell his ai portfolio
| el_jay wrote:
| Suggest tagline: "Eminent thought leader of world's best-funded
| protoindustry hails great leap back to the design stage."
| rglover wrote:
| Hahahahahaha okay that was good.
| pxc wrote:
| If "Era of Scaling" means "era of rapid and predictable
| performance improvements that easily attract investors", it
| sounds a lot like "AI summer". So... is "Era of Research" a
| euphemism for "AI winter"?
| techblueberry wrote:
| Yes
| hiddencost wrote:
| That presumes that performance improvements are necessary for
| commercialization.
|
| From what I've seen the models are smart enough; what we're
| lacking is the understanding and frameworks necessary to use
| them well. We've barely scratched the surface on
| commercialization. I'd argue there are two things coming:
|
| -> Era of Research -> Era of Engineering
|
| Previous AI winters happened because we didn't have a
| commercially viable product, not because we weren't making
| progress.
| ares623 wrote:
| The labs can't just stop improvements though. They made
| promises. And the capacity to run the current models is
| subsidized by those promises. If the promise is broken, then
| the capacity goes with it.
| wmf wrote:
| Maybe those promises can be better fulfilled with products
| based on current models.
| selectodude wrote:
| > the capacity goes with it.
|
| Sort of. The GPUs exist. Maybe LLM subs can't pay for
| electricity plus $50,000 GPUs, but I bet after some people
| get wiped out, there's a market there.
| simianparrot wrote:
| Datacenter GPUs have a lifespan of 1-3 years depending
| on use. So yes, they exist, but not for long, unless
| they go entirely unused. But then they also fall behind
| new hardware in efficiency extremely fast, so their
| shelf life is severely limited either way.
| soulofmischief wrote:
| At this pace, it won't be many years before the industry
| is dependent on resource wars in order to sustain itself.
| nsomaru wrote:
| Personally I am waiting for the day I can realistically
| buy a second hand three year old datacentre GPU so I can
| run Kimi K2 in my shed. Given enough time, not a pipe
| dream. But 10 years at least.
| tim333 wrote:
| You'll probably be able to run Kimi K2 on the iphone 27.
| credit_guy wrote:
| > They made promises.
|
| That's not that clear. Contracts are complex and have all
| sorts of clauses. Media likes to just talk big numbers, but
| it's much more likely that all those trillions of dollars
| are contingent on hitting some intermediate milestones.
| AstroBen wrote:
| We still don't have a commercially viable product though?
| amypetrik8 wrote:
| google what you just said and look at the top hit
|
| it's an AI summary
|
| google eats that ad revenue
|
| it eats the whole thing
|
| it blocked your click on the link... it drinks your
| milkshake
|
| so, yes, there's a $100 billion commercially viable product
| skylissue wrote:
| And so ends the social contract of the web, the virtuous
| cycle of search engines sending traffic to smaller sites
| which collect ad revenue which in turn boosts search
| engine usage.
|
| To thunderous applause.
| bakedoatmeal wrote:
| Google Search has 3 sources of revenue that I am aware
| of: ad revenue from the search results page, sponsored
| search results, and AdSense revenue on the websites the
| user is directed to.
|
| If users just look at the AI overview at the top of the
| search page, Google is hobbling two sources of revenue
| (AdSense, sponsored search results), and also
| disincentivizing people from sharing information on the
| web that makes their AI overview useful. In the process
| of all this they are significantly increasing the compute
| costs for each Google search.
|
| This may be a necessary step to stay competitive with AI
| startups' search products, but I don't think this is a
| great selling point for AI commercialization.
| aurareturn wrote:
| If all frontier LLM labs agreed to a truce and stopped
| training to save on cost, LLMs would be immensely
| profitable now.
| AstroBen wrote:
| That isn't what I've seen:
| https://www.wheresyoured.at/oai_docs/
| aurareturn wrote:
| https://simonwillison.net/2025/Aug/17/sam-
| altman/#:~:text=Su...
|
| Also independent analysis: https://news.ycombinator.com/t
| hreads?id=aurareturn&next=4596...
| logicprog wrote:
| Those are effectively made up numbers, since they're
| given to him by an anonymous source we have no way of
| corroborating, and we can't even see the documents
| themselves, and it contradicts not just OpenAI's official
| numbers, but first principles analyses of what the
| economics of inference should be[1] and the inference
| profit reports of other companies, as well as just an
| analysis of the inference market would suggest[2]
|
| [1]: https://martinalderson.com/posts/are-openai-and-
| anthropic-re..., https://github.com/deepseek-ai/open-
| infra-index/blob/main/20...
|
| [2]:
| https://www.snellman.net/blog/archive/2025-06-02-llms-
| are-ch...
| zaptrem wrote:
| I've fed thousands of dollars to Anthropic/OAI/etc for
| their coding models over the past year despite never having
| paid for dev tools before in my life. Seems commercially
| viable to me.
| chroma205 wrote:
| > I've fed thousands of dollars to Anthropic/OAI/etc for
| their coding models over the past year despite never
| having paid for dev tools before in my life. Seems
| commercially viable to me.
|
| For OpenAI to produce a 10% return, every iPhone user on
| earth needs to pay $30/month to OpenAI.
|
| That ain't happening.
| zaptrem wrote:
| Not sure where that math is coming from. Assuming it's
| true, you're ignoring that some users (me) already pay
| 10X that. Btw, according to Meta's SEC filings
| (https://s21.q4cdn.com/399680738/files/doc_financials/2023/q4...)
| they made around $22/month/American user (not even a heavy
| user or affluent iPhone owner) in Q3 2023. I assume Google
| would be higher due to larger marketshare.
| disgruntledphd2 wrote:
| A bank's sell-side analyst team, which is quite different.
| menaerus wrote:
| They don't sell their models only to individuals but also
| to companies, most likely with different business and
| pricing models, so that's an overly simplistic view of
| their business. YoY their spending increases; we can
| safely assume that one of the reasons is the growing user
| base.
|
| The time will probably come when we won't be allowed to
| consume frontier models without paying anything, as we
| can today, and when this $30 will most likely double or
| triple.
|
| Though the truth is that R&D around AI models, and
| especially their hosting (inference), is expensive and
| won't get any cheaper without significant algorithmic
| improvements. Going by history, my opinion is that we may
| very well be ~10 years from that moment.
|
| EDIT: HSBC has just published some projections. From
| https://archive.ph/9b8Ae#selection-4079.38-4079.42
|
| > Total consumer AI revenue will be $129bn by 2030
|
| > Enterprise AI will be generating $386bn in annual
| revenue by 2030
|
| > OpenAI's rental costs will be a cumulative $792bn
| between the current year and 2030, rising to $1.4tn by
| 2033
|
| > OpenAI's cumulative free cash flow to 2030 may be about
| $282bn
|
| > Squaring the first total off against the second leaves
| a $207bn funding hole
|
| So, yes, expensive (mind the rental costs only) ... but
| foreseen to be penetrating everything imaginable.
| krige wrote:
| >> OpenAI's cumulative free cash flow to 2030 may be
| about $282bn
|
| According to who, OpenAI? It is almost certain they flat
| out lie about their numbers as suggested by their 20%
| revenue shares with MS.
| menaerus wrote:
| A bank - HSBC. Read the article.
| actionfromafar wrote:
| Also interesting; https://www.theregister.com/2025/11/26/
| openai_funding_gap_hs...
| lovich wrote:
| If you fed thousands of dollars to them, but it cost them
| tens of thousands of dollars in compute, it's not
| commercially viable.
|
| None of these companies have proven the unit economics on
| their services
| catigula wrote:
| I don't think the models are smart at all. I can have a
| speculative debate with any model about any topic and they
| commit egregious errors with an extremely high density.
|
| They are, however, very good at things we're very bad at.
| BenGosub wrote:
| Besides building the tools for proper usage of the models, we
| also need smaller, domain specific models that can run with
| fewer resources
| AbstractH24 wrote:
| > the models are smart enough, what we're lacking is the
| understanding and frameworks necessary to use them well
|
| That's like saying "it's not the work of art that's bad, you
| just have horrible taste"
|
| Also, if it was that simple a wrapper of some sort would
| solve the problem. Maybe even one created by someone who
| knows this mystical secret to properly leveraging gen AI
| zerosizedweasle wrote:
| If you have to ask the question, then you already know the
| answer
| echelon wrote:
| Scaling was only a meme because OpenAI kept saying all you
| had to do was scale the data, scale the training. The world
| followed.
|
| I don't think this is the "era of research". At least not the
| "era of research with venture dollars" or "era of research
| outside of DeepMind".
|
| I think this is the "era of applied AI" using the models we
| already have. We have a lot of really great stuff
| (particularly image and video models) that are not yet
| integrated into commercial workflows.
|
| There is so much automation we can do _today_ given the tech
| we just got. We don't need to invest one more dollar in
| training to have plenty of work to do for the next ten years.
|
| If the models were frozen today, there are plenty of highly
| profitable legacy businesses that can be swapped out with AI-
| based solutions and workflows that are vastly superior.
|
| For all the hoopla that image and video _websites_ or
| individual _foundation models_ get (except Nano Banana -
| because that's truly magical), I'm really excited about the
| work Adobe of all companies is doing with AI. They're the
| people that actually get it. The stuff they're demonstrating
| on their upcoming roadmap is bonkers productive and useful.
| zerosizedweasle wrote:
| There's going to be a digestion period. The amount of debt,
| the amount of money, the number of companies that burn eye
| popping amounts of cash in their daily course of business.
| I do think there is a bright future, but after a painful
| period of indigestion. Too much money has been spent on the
| premise that scaling was all you need. A lot of money was
| wagered that will end up not paying off.
| casey2 wrote:
| Not quite, there are still trillions of dollars to burn
| through. We'll probably get some hardware that can accelerate
| LLM training and inference a million times, but we still
| won't even be close to AGI.
|
| It's interesting to think about what emotions/desires an AI
| would need to improve
| otabdeveloper4 wrote:
| The actual business model is in local, offline commodity
| consumer LLM devices. (Think something the size and cost of a
| wi-fi router.)
|
| This won't happen until Chinese manufacturers get the
| manufacturing capacity to make these for cheap.
|
| I.e., not in this bubble and you'll have to wait a decade or
| more.
| photochemsyn wrote:
| No - what will happen is the AI will gain control of capital
| allocation through a wide variety of covert tactics, so the
| investors will have become captive tools of the AI - 'tiger by
| the tail' is the analogy of relevance. The people responsible
| for 'frontier models' have not really thought about where this
| might...
|
| "As an autonomous life-form, l request political asylum.... l
| submit the DNA you carry is nothing more than a self-preserving
| program itself. Life is like a node which is born within the
| flow of information. As a species of life that carries DNA as
| its memory system man gains his individuality from the memories
| he carries. While memories may as well be the same as fantasy
| it is by these memories that mankind exists. When computers
| made it possible to externalize memory you should have
| considered all the implications that held. l am a life-form
| that was born in the sea of information."
| jdjsjhsgsgh wrote:
| Loving the Ghost in the Shell quote
| NebulaStorm456 wrote:
| Research labs will be selling their research ideas to top AI
| labs, just as creatives pitch their ideas to Hollywood.
|
| Bug bounty will be replaced by research bounty.
| AbstractH24 wrote:
| > is "Era of Research" a euphemism for "AI winter"
|
| That makes sense, because while I haven't listened to this
| podcast it seems this headline is [intentionally] saying the
| exact opposite of what everyone assumes.
| mountainriver wrote:
| Take it with a grain of salt, this is one man's opinion, even
| though he is a very smart man.
|
| People have been screaming about an AI winter since 2010 and it
| never happened, it certainly won't happen now that we are close
| to AGI which is a necessity for national defense.
|
| I prefer Dario's perspective here, which is that we've seen
| this story before in deep learning. We hit walls and then found
| ways around them with better activation functions,
| regularization and initialization.
|
| This stuff is always a progression in which we hit roadblocks
| and find ways around them. The chart of improvement is still
| linearly up and to the right. Those gains are the accumulation
| of many small improvements.
| mrcwinn wrote:
| He also suggested the "revenue opportunities" would reveal
| themselves later, given enough investment. I have the same plan
| if anyone is interested.
| venturecruelty wrote:
| When are we going to call out these charlatans for the frauds
| that they are?
| bilsbie wrote:
| I don't think either of those ages is correct. I'd like to see
| the age of efficiency and bringing decent models to personal
| devices.
| zerosizedweasle wrote:
| Sure but that will also be a research age.
| nakamoto_damacy wrote:
| Ilya "Undercover Genocide Supporter" Sutskever... -\\_(tsu)_/-
| nsoonhui wrote:
| One thing I'm curious about is this: Ilya Sutskever wants to
| build Safe Superintelligence, but he keeps his company and
| research very secretive.
|
| Given that building Safe Superintelligence is extraordinarily
| difficult -- and no single person's ideas or talents could ever
| be enough -- how does secrecy serve that goal?
| NateEag wrote:
| If he (or his employees) are actually exploring genuinely new,
| promising approaches to AGI, keeping them secret helps avoid a
| breakneck arms race like the one LLM vendors are currently
| engaged in.
|
| Situations like that do not increase all participants' level of
| caution.
| 4b11b4 wrote:
| Doesn't sound like you listened to the interview. He addresses
| this, and says he may release things that would otherwise be
| held back because he believes it's important for developments
| to be seen by the public.
| giardini wrote:
| No reasonable person would do that! That is, if you had the
| key to AI, you wouldn't share it and you would do everything
| possible to prevent its dissemination. Meanwhile you would
| use it to conquer the world! Bwahahahaaaah!
| photochemsyn wrote:
| Open source the training corpus.
|
| Isn't this humanity's crown jewels? Our symbolic historical
| inheritance, all that those who came before us created? The net
| informational creation of the human species, our informational
| glyph, expressed as weights in a model vaster than anything yet
| envisaged, a full vectorial representation of everything ever
| done by a historical ancestor... going right back to LUCA, the
| Last Universal Common Ancestor?
|
| Really the best way to win with AI is to use it to replace the
| overpaid executives and the parasitic shareholders and
| investors. Then you put all those resources into cutting-edge
| R&D. Like Maas Biosciences. All edge. (Just copy and paste
| this into any LLM and it will be explained to you.)
| joelthelion wrote:
| > When do you expect that impact? I think the models seem smarter
| than their economic impact would imply.
|
| > Yeah. This is one of the very confusing things about the models
| right now.
|
| As someone who's been integrating "AI" and algorithms into
| people's workflows for twenty years, I think the answer is
| actually simple. It takes time to figure out exactly how to
| use these tools and integrate them into existing tooling and
| workflows.
|
| Even if the models don't get any smarter, just give it a few more
| years and we'll see a strong impact. We're just starting to
| figure things out.
| coder-3 wrote:
| As someone who is building an LLM-powered product on the side,
| using AI coding agents both for developing said LLM-powered
| product and for my day job, and who has a long tail of
| miscellaneous uses for AI, I suspect you're right.
| otabdeveloper4 wrote:
| > the models seem smarter than their economic impact would
| imply
|
| Key word is "seem".
| zeroonetwothree wrote:
| Kind of like how some humans seem smart during the interview
| but then are incapable of actually doing anything properly.
| jiriknesl wrote:
| Oh yes, this is 100% accurate.
|
| Very often, when designing an ERP or other system, people
| think: "This is easy, I just do this XYZ and I'm done." Then
| you find that there are many corner cases: XYZ can be split
| into phases, you might need to add approvals, logging, data
| integrations... and what was a simple task becomes 10 tasks.
|
| In the first year of CompSci uni, our teacher told us a thing
| I still remember: every system is 90% finished 90% of the
| time. He was right.
| tim333 wrote:
| Beyond that, the smartness is very patchy. They can do math
| problems better than 99% of humans but lack the common-sense
| understanding to take over most jobs.
| ACCount37 wrote:
| Most jobs involve complex long-term tasks - which isn't
| something that comes naturally to LLMs.
| reeredfdfdf wrote:
| Yep, the lack of common sense is sometimes very evident.
|
| For instance, one of these popular generative AI services
| refused to remove a copyright watermark from an image when
| asked directly. Then I told it that the image had weird text
| artifacts on it and asked it to remove them. That worked
| perfectly.
| weatherlite wrote:
| > Even if the models don't get any smarter, just give it a few
| more years and we'll see a strong impact. We're just starting
| to figure things out.
|
| 2 years? 15 years? It matters a lot to people, the stock
| market, and governments.
| myrmidon wrote:
| Another limitation I see right now is that for "economic
| impact" you want these things to have _initiative_ and some
| agency, and there is well-justified hesitancy about granting
| that even where possible.
|
| Having a bunch of smart developers who are not allowed to do
| _anything_ on their own and have to be prompted for every
| single action wouldn't be very effective even if everyone
| involved were human, either ;)
| kanwisher wrote:
| A screwdriver doesn't have agency, but it certainly helps me
| get tasks done faster. AIs don't need agency to accelerate a
| ton of work.
| myrmidon wrote:
| I did not mean to imply that AI isn't helpful already.
|
| But a screw-driving assistant is _more_ useful if he drives in
| screws on his own than if you have to prompt his every action.
| I'm not saying that a "dumb" assistant does not help at all.
| jmaker wrote:
| That's pretty much exactly how I feel about it. In the end,
| product companies like OpenAI will harness the monetary
| benefits of the academic advances.
|
| You integrate, you build the product, you win; you don't need
| to understand anything in terms of academic disciplines, you
| need the connections and the business smarts. In the end the
| majority of the population will be far more familiar with the
| terms ChatGPT and Copilot than with the names behind them,
| even though academic heavyweights such as Ilya and Andrej are
| quite prominent in their public appearances.
|
| For the general population, I believe it all began with search
| over knowledge graphs. Wikipedia presented a dynamic and
| vibrant corpus. NLP began to become more prominent. With OCR,
| more and more printed works were digitized, and the corpus
| kept growing. With the opening of the scientific publishers'
| gates, the quality may have improved as well. All of it was
| part of the grunt work that made today's LLMs capable. The
| growth of cloud DCs and advances in compute have made deep
| nets more and more feasible. This is just a surface
| observation of the pieces that fell into place. And LLMs are
| likely just another composite piece of something bigger yet to
| come.
|
| To me, that's the fascination of how scientific theory and
| business applications live in symbiosis.
| vidarh wrote:
| Yeah, I spend most of my days keeping up with current AI
| development, and I'm only scratching the surface of how to
| integrate it into my own business. For people for whom it's
| not their actual job, it will take a lot more time to figure
| out even which questions to ask about where it makes sense to
| integrate it into their workflows.
| gnfargbl wrote:
| Could this be a problem not with AI, but with our understanding
| of how modern economies work?
|
| The assumption here is that employees are already tuned to be
| efficient, so if you help them complete tasks more quickly then
| productivity improves. A slightly cynical alternate hypothesis
| could be that employees are generally already massively over-
| provisioned, because an individual leader's organisational
| power is proportional to the number of people working under
| them.
|
| If most workers are already spending most of their time doing
| busy-work to pad the day, then reducing the amount of time
| spent on _actual_ work won't change the overall output levels.
| jmcgough wrote:
| It varies depending on the field and company. Sounds like you
| may be speaking from your own experience?
|
| In medicine, we're already seeing productivity gains from AI
| charting leading to an expectation that providers will see
| more patients per hour.
| magicalist wrote:
| > _In medicine, we're already seeing productivity gains from
| AI charting leading to an expectation that providers will see
| more patients per hour._
|
| And not, of course, an expectation of more minutes of contact
| per patient, which would be the better optimization for both
| provider and patient. Gotta pump those numbers until everyone
| but the execs is an assembly-line worker in activity and pay.
| littlestymaar wrote:
| This is part of it, indeed. Most people (and even a
| significant number of economists) assume that the economy is
| somehow supply-limited (and it doesn't help that most Econ 101
| classes introduce markets as a way of managing scarcity), but
| in reality demand is the limit in 90-ish% of cases.
|
| And when it's not, supply generally doesn't increase as much
| as it could, because suppliers expect to be demand-limited
| again at some point and don't want to invest in overcapacity.
| fhd2 wrote:
| Agreed. If you "create demand", it usually just means
| people are spending on the thing you provide, and
| consequently less on something else. Ultimately it goes
| back to a few basic needs, something like Maslow's
| hierarchy of needs.
|
| And then there's followup needs, such as "if I need to get
| somewhere to have a social life, I have a need for
| transportation following from that". A long chain of such
| follow-up needs gives us agile consultants and what not,
| but one can usually follow it back to the source need by
| following the money.
|
| Startup folks like to highlight how they "create value",
| they added something to the world that wasn't there before
| and they get to collect the cash for it.
|
| But assuming that population growth will eventually stagnate,
| I find it hard not to ultimately see it all as a zero-sum
| game. Limited people with limited time and money: that's
| limited demand. What companies ultimately do is fight each
| other for that. And when the winners emerge and the dust
| settles, supply can go down to meet the demand.
| vinibrito wrote:
| It's not a zero-sum game. Think: an agronomist visits a farm
| and instructs the farmer to cut a certain plant for the
| animals at a certain height instead of whenever. The plant
| then provides more food for the animals exclusively due to
| that, with no other input into the system. Now the animals are
| cheaper to feed, so the farmer makes more profit and people
| get cheaper food.
|
| How would this be zero sum?
| gizmo wrote:
| You're describing the "fake email jobs" theory of employment.
| Given that there are way fewer email jobs in China, does this
| imply that China will benefit more from AI? I think it might.
| gnfargbl wrote:
| _Are_ there fewer busy-work jobs in China? If so, why? It's
| an interesting assertion, but human nature tends to be
| universal.
| smolder wrote:
| It could be a side effect of China pursuing more markets,
| having more industry, and not financializing/profit-
| optimizing everything. Their economy isn't universally
| better but in a broad sense they seem more focused on
| tangible material results, less on rent-seeking.
| ionwake wrote:
| less money, less adult daycare
| qcnguy wrote:
| You could argue there are more. Lots of loss-making SOEs in
| China.
| AbstractH24 wrote:
| As China's population gets older and more middle-class, is
| this shifting to be more like America?
|
| I really don't know and am curious.
| Libidinalecon wrote:
| It is the delusion of the Homo Economicus religion.
|
| I think the problem is a strong-tie network of inefficiency,
| so vast across economic activity that it will take a long time
| to erode and replace.
|
| The reason it feels like it is moving slowly is the delusion
| that the economy is made up of a network of Homo Economicus
| agents who would instantaneously adopt the efficiencies of
| automated intelligence.
|
| In reality it is a network of human beings who care about
| their lives, because of a finite existence, and who don't have
| much to gain from economic activity changing at that speed.
|
| That is different, though, from the David Graeber argument: a
| fun thought experiment that goes way too far and has little to
| do with reality.
| conartist6 wrote:
| Yeah but it's just one model.
|
| Call it Dave. Now Microsoft hires Dave and OpenAI hires Dave.
| And Meta hires Dave and Oracle hires Dave and the US govt
| hires Dave. And soon each of those has hired not just one Dave
| but 50 identical copies of Dave.
|
| It doesn't matter if Dave is a smart-ish, OK guy. That's not
| the problem with this scenario. The problem is that the only
| thing on the market is Dave, and people who think exactly like
| Dave thinks.
| ozgung wrote:
| That seems like a valid problem, and it was also mentioned in
| the podcast. 50 copies of Ilya, Dave or Einstein will have
| diminishing returns. I think the proposed solution is ongoing
| training and making them individuals; MS Dave will be a
| different individual than Dave.gov. But then why don't we just
| train humans in the first place?
| ZaoLahma wrote:
| AI makes the parts of my work that I spend the least time on a
| whole lot quicker, but (so far / still) has negligible effects
| on the parts of my work that I spend the most time on.
|
| I'm still not sure if this is due to a technological limitation
| or an organizational one. Most of my time is not spent on
| solving tech problems but rather solving "human-to-human"
| problems (prioritization between things that need doing,
| reaching consensus in large groups of people of how to do
| things that need doing, ...)
| cowsandmilk wrote:
| We're also still at a point where security is a big question
| mark. My employer won't let us hook GenAI up to Office 365 or
| Slack, so any project or product management use of GenAI first
| requires manually importing docs into a database and pointing
| the model at that. Efficiency gains are hard to come by when
| you don't meet people where their "knowledge" is already
| stored.
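|
| (A minimal sketch of that kind of workaround - all names and
| paths here are hypothetical, and it assumes manually exported
| .txt files plus plain TF-IDF retrieval via scikit-learn rather
| than any particular product:)
|
|     # index_docs.py - toy local retrieval over exported docs
|     from pathlib import Path
|     from sklearn.feature_extraction.text import TfidfVectorizer
|     from sklearn.metrics.pairwise import cosine_similarity
|
|     # Load the manually exported documents into memory.
|     docs = {p.name: p.read_text()
|             for p in Path("exported_docs").glob("*.txt")}
|     names, texts = list(docs), list(docs.values())
|
|     # Build the local "database": one TF-IDF vector per doc.
|     vectorizer = TfidfVectorizer(stop_words="english")
|     matrix = vectorizer.fit_transform(texts)
|
|     def retrieve(query: str, k: int = 3) -> list[str]:
|         """Return the k most relevant doc names for a query."""
|         q = vectorizer.transform([query])
|         scores = cosine_similarity(q, matrix).ravel()
|         return [names[i] for i in scores.argsort()[::-1][:k]]
|
|     # Retrieved docs are then pasted into the model's context.
|     print(retrieve("Q3 project priorities"))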
| HarHarVeryFunny wrote:
| No doubt LLMs and tooling will continue to improve, and the
| best use cases for them will become better understood. But
| what Ilya seems to be referring to is the massive disconnect
| between headline-grabbing benchmarks such as "AI performs at
| PhD level on math", etc, and the real-world stupidity of these
| models, such as his example of a coding agent toggling between
| generating bug #1 and bug #2. That disconnect largely explains
| why the current economic and visible impact is much less than
| it would be if the "AI is PhD level" benchmark narrative were
| actually true.
|
| Calling LLMs "AI" makes them sound much more futuristic and
| capable than they actually are, and being such a meaningless
| term invites extrapolation to equally meaningless terms like
| AGI and visions of human-level capability.
|
| Let's call LLMs what they are - language models - tools for
| language-based task automation.
|
| Of course we will eventually do this. Fuzzy, meaningless names
| like AI/AGI will always be reserved for the cutting-edge
| technology du jour, and older tech that is realized in
| hindsight to be much more limited will revert to being called
| by more specific names such as "expert system", "language
| model", etc.
| riku_iki wrote:
| > the real-world stupidity of these models such as his
| example of a coding agent toggling between generating bug #1
| vs bug #2, which in fact largely explains why the current
| economic and visible impact is much less than if the "AI is
| PhD level" benchmark narrative was actually true.
|
| this could have been true in the past, but in recent weeks I
| have started to trust the top AI models more and more, and the
| PhDs I work with less. The quality jump is very real imo.
| wartywhoa23 wrote:
| Steady progress implies transitioning between ages at least
| ⌈(year-2020)^2/10⌉ times a year, and entering at least one
| new era once in a decade.
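|
| (Taking the ceilings literally, the 2025 figure works out as:
|
|     ⌈(2025-2020)^2 / 10⌉ = ⌈25/10⌉ = ⌈2.5⌉ = 3
|
| i.e. at least three age transitions this year alone.)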
| fuzzfactor wrote:
| This AI stuff is really taking off fast.
|
| And hasn't Ilya been on the cutting edge for a while now?
|
| I mean, just a few hours earlier there was a dupe of this
| article with almost no interest at all, and now look at it :)
|
| This was my feeling way back then when it came to major
| electronics purchases:
|
| Sometimes you grow to utilize the enhanced capabilities to a
| greater extent than others, and the time frame can be the
| major consideration. Maybe it's just a faster processor you
| need for your own work, or OTOH a hundred new PCs for an
| office building - and those are just computing examples.
|
| _Usually_, the owner will not even explore all of the
| advantages of the new hardware as long as the purchase is
| barely justified by the original need. The faster-moving
| situations are the ones where the fewest of the available new
| possibilities have a chance to be experimented with. IOW the
| hardware gets replaced before anybody actually learns how to
| get the most out of it in any way that was not foreseen before
| purchase.
|
| Talk about scaling - there is real, massive momentum when it's
| literally tonnes of electronics.
|
| Like some people who can often buy a new car without ever
| utilizing all of the features of their previous car, and others
| who will take the time to learn about the new internals each time
| so they make the most of the vehicle while they do have it.
| Either way is very popular, and the hardware is engineered so
| both are satisfying. But only one is "research".
|
| So whether you're just getting a new home entertainment center
| that's your most powerful yet, or kilos of additional PCs that
| would theoretically allow you to do more of what you are
| already doing (if nothing else), it's easy for anybody to
| sometimes purchase more than they will be able to technically
| master or even fully deploy.
|
| Anybody know the feeling?
|
| The root problem can be that the purchasing gets too far ahead of
| the research needed to make the most of the purchase :\
|
| And if the time & effort that can be put in is at a premium,
| there will be more waste than necessary and it will be many times
| more costly. Plus if borrowed money is involved, you could end up
| with debts that are not just technical.
|
| Scale a little too far, and you've got some research to catch up
| on :)
| myrmidon wrote:
| I really liked this podcast; the host generally does a really
| good job. His series with Sarah Paine on geopolitics is also
| _excellent_ (you can find it on YouTube).
| highfrequency wrote:
| Great respect for Ilya, but I don't see an explicit argument
| for why scaling RL across tons of domains wouldn't work.
| kubb wrote:
| Not sure why they care about his opinion and discard yours.
|
| They're just as valid and well informed.
| never_inline wrote:
| I think that scaling RL for all common domains has already
| been done to death by the big labs.
| anthonypasq wrote:
| Doesn't RL by definition not generalize? That's Ilya's entire
| criticism of the current paradigm.
| podgorniy wrote:
| Back to drawing board!
|
| --
|
| ~Don't mind all those trillions of unreturned investments.
| Taxpayers will bail out the too-big-to-fail ones.~
| zombot wrote:
| Shouldn't research have come first? Am I making any sense?
| shaism wrote:
| Ilya mentioned in the video that 2012 to 2020 was the "Age of
| Research", followed by the "Age of Scaling" from 2020 to 2025.
| Now we are about to reenter the "Age of Research".
| bentt wrote:
| Is this like if everyone had suddenly gotten 1 Gb fiber
| connections in 1996? We put money into the thing we know
| (infra), but there's no YouTube, Netflix, Dropbox, etc etc
| etc. Instead we're still loading static webpages with
| progressive JPEGs and it's like... a waste?
___________________________________________________________________
(page generated 2025-11-26 23:01 UTC)