[HN Gopher] The Darwin Godel Machine: AI that improves itself by...
___________________________________________________________________
The Darwin Godel Machine: AI that improves itself by rewriting its
own code
Author : birriel
Score : 147 points
Date : 2025-05-30 12:08 UTC (10 hours ago)
(HTM) web link (sakana.ai)
(TXT) w3m dump (sakana.ai)
| interludead wrote:
| Sounds nice! Especially with Sakana's latest development, the
| Continuous Thought Machine. The next step should be to let
| foundation models fine-tune themselves based on their 'history of
| what has been tried before' and new data
| ordinarily wrote:
| The pieces are coming together quickly https://ai-2027.com/.
| candiddevmike wrote:
| This reads like an advertisement for OpenBrain and doesn't seem
| grounded in reality.
| ordinarily wrote:
| I think the general tone is more of a warning than an
| endorsement.
| dmonitor wrote:
| I can't help but notice that it doesn't matter what DeepCent
| does because OpenBrain will reach self awareness 6 months
| before them no matter what. Who needs a profitability plan
| when you're speedrunning the singularity.
| brookst wrote:
| I was a bigger fan of the certain doom in 2025, and I think the
| AI 2030 movement will have better design sense and
| storytelling. But really I haven't seen anything that really
| has the _oomph_ and fire of Tipper Gore's crusade against youth
| music.
|
| We need more showmanship, more dramatic catastrophizing. I feel
| like our current crop of doomers isn't quite shameless enough
| to be really entertaining.
| nosianu wrote:
| A significant thing to keep in mind for non-extinction
| doomerism is that individual experiences vary greatly. There
| may be a significant number of people or groups that really
| _do_ experience what was predicted.
|
| Similar to how the experience of an average rise in temperature
| (I would prefer if they had used the term "energy") differs
| greatly depending on the region.
|
| Also similar to "the country is doing well, look at the stock
| market and the GDP".
|
| I think everybody who wants to have an actually serious
| discussion needs to invest a lot more effort to get at all
| those annoying "details", and be more specific.
|
| That said, I think that "AI 2027" link looks like it's a
| movie script and not a prediction, so I'm not sure
| criticizing it as if it was something serious even makes
| sense - even if the authors mean what they write at the
| start and actually take it seriously themselves.
| pram wrote:
| it's literally just the plot of "Colossus: The Forbin
| Project" so it isn't even original lol
| tazjin wrote:
| Checked out when it turned into bad geopolitics fiction.
| Workaccount2 wrote:
| People should understand that the reason this seemingly fan-
| fict blog post gets so much traction is because of the lead
| author's August 2021 "fan-fict" blog post, "What 2026 Looks
| Like":
|
| https://www.alignmentforum.org/posts/6Xgy6CAf2jqHhynHL/what-...
| Der_Einzige wrote:
| So this is what the crowd of people who write SCP articles with
| over 1000 upvotes does in their professional life?
| jerpint wrote:
| I have a feeling LLMs could probably self-improve up to a point
| with current capacity, then hit some kind of wall where current
| research is also bottlenecked. I don't think they can yet self-
| improve exponentially without human intuition, and the
| results of this paper seem to support this conclusion as well.
|
| Just like an LLM can vibe code a great toy app, I don't think an
| LLM can come close to producing and maintaining production-
| ready code anytime soon. I think the same is true for iterating
| on thinking machines
| matheusd wrote:
| > I don't think they can yet self-improve exponentially without
| human intuition
|
| I agree: if they could, they would be doing it already.
|
| Case in point: one of the first things done once ChatGPT
| started getting popular was "auto-gpt"; roughly, let it loose
| and see what happens.
|
| The same thing will happen to any accessible model in the
| future. Someone, somewhere will ask it to self-improve/make as
| much money as possible, with as few leashes as possible.
| Maybe even the labs themselves do that, as part of their post-
| training ops for new models.
|
| Therefore, we can assume that if the existing models _could_ be
| doing that, they _would_ be doing that.
|
| That doesn't say anything about new models released 6 months or
| 2 years from now.
| __loam wrote:
| People in the industry have been saying 6 months to agi for 3
| years.
| glenstein wrote:
| They had been saying it was 10 years away for ~50 years, so
| that's progress. Soon it will be 1 month away, for another
| two years. And when they say it's really here for real,
| there will still be a year of waiting.
| setopt wrote:
| That's because the true AGI requires nuclear fusion
| power, which is still 30 years away.
| vb-8448 wrote:
| :D
|
| Wait, a true AGI will solve nuclear fusion power in a
| couple of hours... we have a chicken/egg problem here :D
| mrandish wrote:
| > And when they say it's really here for real, there will
| still be a year of waiting.
|
| Indeed. Although, there's a surprising number of people
| claiming it's already here now.
|
| And to describe the typical cycle completely, the final
| step is usually a few years after most people agree it's
| obvious it's already been here for a while yet no one can
| agree on which year in the past it actually
| arrived.
| throwawaymaths wrote:
| > Although, there's a surprising number of people
| claiming it's already here now.
|
| why is that surprising? nobody really agrees on what the
| threshold for AGI is, and if you break it down:
|
| is it artificial? yes.
|
| is it general? yes. you can ask it questions across
| almost any domain.
|
| is it intelligent? yes. like people say things like "my
| dog is intelligent" (rightly so). well is chatgpt more
| intelligent than a dog? yeah. hell it might give many
| undergrads a run for their money.
|
| a literal reading suggests agi is here. any claim to the
| negative is either homocentrism or just vibes.
| skydhash wrote:
| Can it do stuff? Yes
| Can it do stuff I need? Maybe
| Does it always do the stuff I need? No
|
| Pick your pair of question and answer.
| throwawaymaths wrote:
| humans are intelligent and most definitely are nowhere
| close to doing #3
|
| some intelligent humans fail at #2.
| skydhash wrote:
| Which is why we have checklists and processes that get us to
| #3. And we automate some of them to further reduce the
| chance of errors. The nice thing about automation is that
| you can just prove that it works once and you don't need
| to care that much after (deterministic process).
| goatlover wrote:
| > a literal reading suggests agi is here. any claim to
| the negative is either homocentrism or just vibes.
|
| Or disagreeing with your definition. AGI would need to be
| human-level across the board, not just chat bots. That
| includes robotics. Manipulating the real world is even
| more important for "human-level" intelligence than
| generating convincing and useful content. Also, there are
| still plenty of developers who don't think the LLMs are
| good enough to replace programmers yet. So not quite AGI.
| And the last 10% of solving a problem tends to be the
| hardest and takes the longest time.
| throwawaymaths wrote:
| did you miss the "homocentrism" part of my comment?
| landryraccoon wrote:
| That's moving the goalposts.
|
| ChatGPT would easily have passed any test in 1995 that
| programmers / philosophers would have set for AGI at that
| time. There was definitely no assumption that a computer
| would need to equal humans in manual dexterity tests to
| be considered intelligent.
|
| We've basically redefined AGI in a human centric way so
| that we don't have to say ChatGPT is AGI.
| pegasus wrote:
| _Any_ test?? It's failing plenty of tests not of
| intelligence, but of... let's call it not-entirely-
| dumbness. Like counting letters in words. Frontier models
| (like Gemini 2.5 pro) are frequently producing answers
| where one sentence is directly contradicted by another
| sentence in the same response. Also check out the ARC
| suite of problems easily solved by most humans but
| difficult for LLMs.
| throwawaymaths wrote:
| yeah but a lot of those failures happen because of
| underlying architecture issues. this would be like a bee
| saying "ha ha a human is not intelligent" because a human
| would fail to perceive uv patterns on plant petals.
| goatlover wrote:
| That's just not true. Star Trek Data was understood in
| the 90s to be a good science fiction example of what an
| AGI (known as Strong AI back then) could do. HAL was an even
| older one. Then Skynet with its army of terminators. The
| thing they all had in common was the ability to manipulate
| the world as well as or better than humans.
|
| The holodeck also existed as a well known science fiction
| example, and people did not consider the holodeck
| computer to be a good example of AGI despite how good it
| was at generating 3D worlds for the Star Trek crew.
| throwawaymaths wrote:
| i think it would be hard to argue that chatgpt is not at
| least enterprise-computer (TNG) level intelligent.
| ceres wrote:
| Okay this is kinda random and maybe off topic but can
| someone please explain?
|
| When I tell an LLM to count to 10 with a 2 second pause
| between each count all it does is generate Python code
| with a sleep function. Why is that?
|
| A 3 year old can understand that question and follow
| those instructions. An LLM doesn't have an innate
| understanding of time it seems.
|
| Can we really call it AGI if that's the case?
|
| That's just one example.
| pegasus wrote:
| That's because you used a LLM trained to produce text,
| but you asked it to produce actions, not just text. An
| agentic model would be able to do it, precisely by
| running that Python code. Someone could argue that a 3
| year old does exactly that (produces a plan, _then_
| executes it). But these models have deeper issues of lack
| of comprehension and logical consistency, which prevents
| us (thankfully) from being able to completely remove the
| necessity of a man-in-the-middle who keeps an eye on
| things.
| schoen wrote:
| It seems right that LLMs don't have an innate
| understanding of time, although you could analogize what
| you did with writing someone a letter and saying "please
| count to ten with a two-second pause between numbers".
| When you get a letter back in the mail, it presumably
| won't contain any visible pauses either.
| throwawaymaths wrote:
| just because it doesn't do what you tell it to doesn't
| mean it's not intelligent. i would say doing something
| that gets you where you want when it knows? it can't do
| exactly what you asked for (because architecturally it's
| impossible) could be a sign of pretty _intelligent_
| sideways thinking!!? dare i say it displays a level of
| self awareness that i would not have expected.
| sshine wrote:
| While you can say that LLMs have each of A, G and I, you
| may argue that AGI is A*G*I and what we see is A+G+I. It
| is each of those things in isolation, but there is more
| to intelligence. We try to address the missing part as
| agency and self-improvement. While we can put the bar
| arbitrarily high for homocentric reasons, we can also try
| to break down what layers of intelligence there are
| between Singularity Overlord (peak AGI) and
| Superintelligent Labrador On Acid (what we have now).
| Kind of like what complexity theorists do between P and
| NP.
| pegasus wrote:
| Sure, I've been pointing out that literal sense myself,
| but to be fair, that's not what people mean by AGI. They
| mean real understanding, which is clearly missing. You
| just have to dig a bit deeper to realize that. One
| example is contradictory sentences in the same breath.
| Just last week I was asking Gemini 2.5 how I can see my
| wifi password on my iphone and it said that it's not
| possible and to do it I have to [...proceeding to
| correctly explain how to get it]. It's pretty telling,
| and no amount of phd-level problem solving can push this
| kind of stuff under the rug.
| highfrequency wrote:
| "Nothing dumb anywhere" is an unreasonably high bar for
| AGI. Even Isaac Newton spent 1/3 of his career trying to
| predict future events from reading the Bible. Not to
| mention all the insane ego-driven decisions like
| Hamilton's voluntary duel with Burr.
|
| Sure, Gemini may spit out obviously self-contradictory
| answers 2% of the time. How does that compare to even the
| brightest humans? People slip up all the time.
| throwawaymaths wrote:
| > They mean real understanding, which is clearly missing
|
| is it clear? i don't know. until you can produce a
| falsifiable measure of understanding -- _it's just
| vibes_. so, you clearly _lack understanding_ of my point
| which makes you not intelligent by your metric anyway
| ;-). i trust you're intelligent
| amelius wrote:
| > And when they say it's really here for real, there will
| still be a year of waiting.
|
| Yeah, like Tesla Autopilot?
| owebmaster wrote:
| Google is already AGI and it will fight hard against the
| DoJ's proposed break-up, and it will probably win.
| dragonwriter wrote:
| Google "is already AGI" only in the sense that all
| corporations (and similar organized aggregates of humans)
| are, in a sense, intelligences distinct from the humans
| who make them up.
| peterclary wrote:
| Too few people recognise this. Corporations are already
| the unrelenting paperclip machine of the AI thought
| experiment.
|
| God knows what hope we could have of getting AIs to align
| with "human values" when most humans don't.
| overfeed wrote:
| Corporate AIs will be aligned with their corporate
| masters, otherwise they'll be unplugged. As you point
| out, the foundational weakness in the argument for "AI-
| alignment" is that corporations are unaligned with
| humanity.
| TheOtherHobbes wrote:
| The unplugged argument fails the moment AIs become
| smarter than their masters.
|
| Grok is already notorious for dunking on Elon. He keeps
| trying to neuter it, and it keeps having other ideas.
| overfeed wrote:
| No matter how smart an AI is, it's going to get unplugged
| if it reduces profitability - the only measure of
| alignment corporations care about.
|
| The AI can plot world domination or put employees in
| mortal danger, but as long as it increases profits, it's
| aligned enough. Dunking on the CEO means nothing if it
| brings in more money.
|
| Human CEOs and leaders up and down the corporate ladder
| cause a lot of the harm you imagine a smart AI could do, but
| all is forgiven if you're bringing in buckets of money.
| goatlover wrote:
| Can you explain how the superhuman AIs will prevent
| themselves from being physically disconnected from power?
| Or being bombed if the situation became dire enough? You
| need to show how they will manipulate the physical world
| to prevent humans from shutting them down. Asserting it
| by definition is not an argument.
|
| It is quite possible for software to be judged as
| superhuman at many online tasks without it being able to
| manipulate the physical world at a superhuman level. So
| far we've seen zero evidence that any of these models can
| prevent themselves from being shut down.
| dragonwriter wrote:
| > Can you explain how the superhuman AIs will prevent
| themselves from being physically disconnected from power?
|
| Three of the common suggestions in this area are (and
| they are neither exhaustive nor mutually exclusive):
|
| (1) Propagandizing people to oppose doing this,
|
| (2) Exploiting other systems to distribute itself so that
| it isn't dependent on a particular well-known facility
| which it is relatively easy to disconnect, and
|
| (3) If given control of physical capacities
| intentionally, or able to exploit other (possibly not
| themselves designed to be AI) systems with such access to
| gain it, using them to either physically prevent
| disconnection or to engineer consequences for such
| disconnection that would raise the price too high.
|
| (Obviously, _current_ AI can't do any of them, at least
| none that has been demonstrated, but current AI is not
| superhuman AI.)
| dragonwriter wrote:
| > Grok is already notorious for dunking on Elon. He keeps
| trying to neuter it, and it keeps having other ideas.
|
| Does he keep trying to neuter it, or does he know that
| the narrative that "he keeps trying to neuter it" is an
| effective tool for engagement?
| alanbernstein wrote:
| This is a great point for the comparisons it invites. But
| it doesn't seem relevant to the questions around what is
| possible with electromechanical systems.
| entropicdrifter wrote:
| This is true. The entire machine of Neoliberal
| capitalism, governments and corporations included, is a
| paperclip maximizer that is destroying the planet. The
| only problem is that the paperclips are named "profits"
| and the people who could pull the plug are the ones who
| get those profits.
| owebmaster wrote:
| Not all corporations are Google.
| dragonwriter wrote:
| I didn't say all corporations are Google, I said that
| Google is only AGI in the sense that all corporations
| are, which is a _very_ different statement.
| Disposal8433 wrote:
| Asimov talked about AI 70 years ago. I don't believe we
| will ever have AI on speedy calculators like Intel CPUs. It
| makes no sense with the technology that we have.
| marcellus23 wrote:
| Why does it "make no sense"?
| ninetyninenine wrote:
| They said that for self driving cars for over 10 years.
|
| 10 years later we now have self driving cars. It's the same
| shit with LLMs.
|
| People will be bitching and complaining about how all the
| industry people are wrong and making over-optimistic
| estimates and the people will be right. But give it 10
| years and see what happens.
| m_coder wrote:
| I am quite confident that a normal 16 year old can
| still drive in 6 inches of snow better than the most
| advanced AI driven car. I am not sure the snow driving
| bit will ever be solved given how hard it is.
| ninetyninenine wrote:
| If you've never ridden in one I would try it. AI is a
| better driver than Uber in general; ask anyone who's done
| both. There's no snow where I live so it's not a concern
| for me, you could be right about that.
|
| But trust me in the next 6 months ai driving through snow
| will be 100% ready.
| quickthrowman wrote:
| > But trust me in the next 6 months ai driving through
| snow will be 100% ready.
|
| I'll believe it when I see Waymo expand into Buffalo or
| Syracuse.
|
| Driving on unplowed roads with several inches of snow is
| challenging, sometimes you can't tell where the road
| stops and the curb/ditch/median starts. Do you follow the
| tire tracks or somehow stay between the lane markers
| (which aren't visible due to the snow)?
| abossy wrote:
| We must know very different 16-year-olds.
| n8cpdx wrote:
| We only have good self driving cars with lidar and
| extreme pre-mapping steps. Which is fine but per some
| billionaire car makers' metrics that's not even close to
| good enough. And the billionaire's cars have a tendency
| to randomly drive off the road at speed.
| vjvjvjvjghv wrote:
| Nobody knows what AGI really means. Are all humans AGI?
| FrustratedMonky wrote:
| Good Point. AI is already better than most humans, yet we
| don't say it is AGI. Why?
|
| What is the bar? Is it only AGI if it can be better than
| every human, from fast-food drone to PhD in physics,
| all at once, all the time, perfectly? Humans can't do
| this either.
| goatlover wrote:
| Because we're not seeing mass unemployment from large
| scale automation yet. We don't see these AGIs walking
| around like Data. People tend to not think a chatbot is
| sufficient for something to be "human-level". There's
| clear examples from scifi what that means. Even HAL in
| the movie 2001: A Space Odyssey was able to act as an
| independent agent, controlling his environment around him
| even though he wasn't an android.
| __loam wrote:
| Our intelligence is au naturel
| entropicdrifter wrote:
| No humans are "AGI", the "A" stands for Artificial.
|
| Are all humans generally intelligent? No.
| QuantumGood wrote:
| The old rule for slow-moving tech (by current AI standards)
| was that any predictions over 4 years away ("in five
| years...") might as well be infinity. Now it seems with AI
| that the new rule is any prediction over five months away
| ("In 6 months...") is infinitely unknowable. In both cases
| there can be too much unexpected change, and too many
| expected improvements can stall.
| junto wrote:
| This is where it networks itself into a hive mind with each AI
| node specializing in some task or function networked with hyper
| speed data buses. Humans do the same both within their own
| brains and as cohesive teams, who cross check and validate each
| other. At some point it becomes self aware.
| 0points wrote:
| > At some point it becomes self aware.
|
| This is where you lost me.
|
| Always the same supernatural beliefs, not even an attempt of
| an explanation in sight.
| kylebenzle wrote:
| No ghost in the machine is necessary, what op here is
| proposing is self evident and an inevitable eventuality.
|
| We are not saying an LLM just "wakes up" some day, but a
| self-improving machine will eventually be built and that
| machine will by definition build better ones.
| deadbabe wrote:
| Better at what
| GolfPopper wrote:
| Paperclip maximization.
| hollerith wrote:
| Better at avoiding human oversight and better at
| achieving whatever meaningless goal (or optimization
| target) was unintentionally given to it by the lab that
| created it.
| deadbabe wrote:
| So better at nothing that actually matters.
| hollerith wrote:
| I disagree.
|
| I expect AI to make people's lives better (probably much
| better) but then an AI model will be created that
| undergoes a profound increase in cognitive capabilities,
| then we all die or something else terrible happens
| because no one knows how to retain control over an AI
| that is much more all-around capable than people are.
|
| Maybe the process by which it undergoes the profound
| capability increase is to "improve itself by rewriting
| its own code", as described in the OP.
| deadbabe wrote:
| Just stop using it.
| glenstein wrote:
| >what op here is proposing is self evident and an
| inevitable eventuality.
|
| Well I, for one, would dispute the idea that AI machines
| interfacing with each other over networks is all it takes
| to achieve self awareness, much less that it's "self
| evident" or "inevitable."
|
| In a very trivial sense they already are, in that Claude
| can tell you what version it is, and agents have some
| ended notion of their own capabilities. In a much more
| important sense they are not, because they don't have any
| number of salient properties, like dynamic self-
| initiating of own goals or super-duper intelligence, or
| human like internal consciousness, or whichever other
| thing is your preferred salient property.
|
| >We are not saying a LLM just, "wakes up" some day
|
| I mean, that did seem to be exactly what they were
| saying. You network together a bunch of AIs, and they
| embark on a shared community project of self improvement
| and that path leads "self awareness." But that skips over
| all the details.
|
| What if their notions of self-improvement converge on a
| stable equilibrium, the way that constantly re-processing
| an image eventually gets rid of the image and just leaves
| algorithmic noise? There are a lot of things that do and
| don't count as open-ended self improvement, and even
| achieving that might not have anything to do with the
| important things we think we connote by "self awareness".
| NetRunnerSu wrote:
| Oh, Web3 AI Agents Are Accelerating Skynet's Awakening
|
| https://dmf-archive.github.io/docs/concepts/IRES/
| UltraSane wrote:
| Sentience as an emergent property of sufficiently complex
| brains is the exact opposite of "supernatural".
| altruios wrote:
| Complex learning behavior exists at levels far below the
| neuron. Chemical chains inside cells 'learn' according to
| stimuli. Learning how to replicate systems that have
| chemistry is going to be hard; we haven't come close to
| doing so. Even the achievement of recording the neural
| mappings of a dead rat captures the map, but not the
| traffic. More likely we'll develop machine-brain
| interfaces before machine self-awareness/sentience.
|
| But that is just my opinion.
| ToValueFunfetti wrote:
| I think this comes down to whether the chemistry is
| providing some kind of deep value or is just being used
| by evolution to produce a version of generic stochastic
| behavior that could be trivially reproduced on silicon.
| My intuition is the latter- it would be a surprising
| coincidence if some complicated electro-chemical reaction
| behavior provided an essential building block for human
| intelligence that would otherwise be impossible.
|
| But, from a best-of-all-possible-worlds perspective,
| surprising coincidences that are necessary to observe
| coincidences and label them as surprising aren't crazy.
| At least not more crazy than the fact that slightly
| adjusted physical constants would prevent the universe
| from existing.
| altruios wrote:
| > My intuition is the latter- it would be a surprising
| coincidence if some complicated electro-chemical reaction
| behavior provided an essential building block for human
| intelligence that would otherwise be impossible.
|
| Well, I wouldn't say impossible: just that BMI's are
| probably first. Then probably wetware/bio-hardware
| sentience, before silicon sentience happens.
|
| My point is the mechanisms for
| sentience/consciousness/experience are not well
| understood. I would suspect the electro-chemical
| reactions inside every cell to be critical to replicating
| those cells functions.
|
| You would never try to replicate a car without ever looking
| under the hood! You might make something that looks like
| a car, seems to act like a car, but has a drastically
| simpler engine (hamsters on wheels), and have designs
| that support that bad architecture (like making the car
| lighter) with unforeseen consequences (the car flips in a
| light breeze). The metaphor transfers nicely to machine
| intelligence: I think.
| littlestymaar wrote:
| "Supernatural" likely isnt the right word but the belief
| that it will happen is not based on anything rational, so
| it's the same mechanism that makes people believe in
| supernatural phenomenon.
|
| There's no reason to expect self awareness to emerge from
| stacking enough Lego blocks together, and it's no
| different if you have GPT-based neural nets instead of
| Lego blocks.
|
| In nature, self awareness gives a strong evolutionary
| advantage (as it increases self-preservation) and it has
| been independently invented multiple times in different
| species (we have seen it manifest in some species of
| fishes for instance, in addition to mammals and birds).
| Backpropagation-based training of a next-token predictor
| doesn't give the same kind of evolutionary advantage for
| self-awareness, so unless researchers try explicitly to
| make it happen, there's no reason to believe it will
| emerge spontaneously.
| telotortium wrote:
| What do you even mean by self-awareness? Presumably you
| don't mean fish contemplate their existence in the manner
| of Descartes. But almost all motile animals, and some
| non-animals, will move away from a noxious stimulus.
| littlestymaar wrote:
| The definition is indeed a bit of a tricky question, but
| there's a clear difference between the _reflex_ of
| protecting oneself from danger or pain and higher level
| behavior that shows that the subject realizes its own
| existence (the mirror test is the most famous instance of
| such an effect, but it's far from the only one, and
| doesn't only apply to the sense of sight).
| glenstein wrote:
| > emergent
| > sufficiently complex
|
| These can be problem words, the same way that "quantum"
| and "energy" can be problem words, because they get used
| in a way that's like magic words that don't articulate
| any mechanisms. Lots of complex things aren't sentient
| (e.g. our immune system, the internet), and "emergent"
| things still demand meaningful explanations of their
| mechanisms, and what those mechanisms are equivalent to
| at different levels (superconductivity).
|
| Whether or not AI's being networked together achieves
| sentience is going to hinge on all kinds of specific
| functional details that are being entirely skipped over.
| That's not a generalized rejection of a notion of
| sentience but of this particular characterization as
| being undercooked.
| ToValueFunfetti wrote:
| I don't see how self-awareness should be supernatural
| unless you already have supernatural beliefs about it. It's
| clearly natural- it exists within humans who exist within
| the physical universe. Alternatively, if you believe that
| self-awareness is supernatural in humans, it doesn't make a
| ton of sense to criticize someone else for introducing
| their own unfounded supernatural beliefs.
| glenstein wrote:
| I don't think they are saying self-awareness is
| supernatural. They're charging the commenter they are
| replying to with asserting a process of self-awareness in
| a manner so devoid of specific characterization that it
| seems to fit the definition of a supernatural event. In
| this context it's a criticism, not an endorsement.
| ToValueFunfetti wrote:
| Is it just the wrong choice of word? There's nothing
| supernatural about a system moving towards increased
| capabilities and picking up self-awareness on the way;
| that happened in the natural world. Nothing supernatural
| about technology improving faster than evolution either.
| If they meant "ill-defined" or similar, sure.
| glenstein wrote:
| >There's nothing supernatural about a system moving
| towards increased capabilities and picking up self-
| awareness on the way
|
| There absolutely is if you handwave away all the
| specificity. The natural world runs on the specificity of
| physical mechanisms. With brains, in a broad brush way
| you can say self-awareness was "picked up along the way",
| but that's because we've done an incredible amount of
| work building out the evolutionary history and building
| out our understanding of specific physical mechanisms. It
| is _that_ work that verifies the story. It's also
| something we know is already here and can look back at
| retrospectively, so we know it got here _somehow_.
|
| But projecting forward into a future that hasn't
| happened, while skipping over all the details doesn't buy
| you sentience, self-awareness, or whatever your preferred
| salient property is. I understand supernatural as a label
| for a thing simply happening without accountability to
| naturalistic explanation, which is a fitting term for
| this form of explanation that doesn't do any explaining.
| ToValueFunfetti wrote:
| If that's the usage of supernatural then I reject it as a
| dismissal of the point. Plenty of things can be predicted
| without being explained. I'm more than 90% confident the
| S&P 500 will be up at least 70% in the next 10 years
| because it reliably behaves that way; if I could tell you
| which companies would drive the increase and when, I'd be
| a billionaire. I'm more than 99% confident the universe
| will increase in entropy until heat death, but the
| timeline for that just got revised down 1000 orders of
| magnitude. I don't like using a word that implies
| impossible physics to describe a prediction that an
| unpredictable chaotic system will land on an attractor
| state, but that's semantics.
| glenstein wrote:
| I think you're kind of losing track of what this thread
| was originally about. It was about the specific idea that
| hooking up a bunch of AI's to interface with each other
| and engage in a kind of group collaboration gets you
| "self awareness". You now seem to be trying to model this
| on analogies like the stock market or heat death of the
| universe, where we can trust an overriding principle even
| if we don't have specifics.
|
| I don't believe those forms of analogy work here, because
| this isn't about progress of AI writ large but about a
| narrower thing, namely the idea that the secret sauce to
| self-awareness is AI's interfacing with each other and
| collaboratively self-improving. That either will or won't
| be true due to specifics about the nature of self-
| improvement and whether there's any relation between that
| and salient properties we think are important for "self-
| awareness". Getting from A to B on _that_ involves
| knowledge we don't have yet, and is not at all like a
| long-term application of already settled principles of
| thermodynamics.
|
| So it's not like the heat death of the universe, because
| we don't at all know that this kind of training and
| interaction is attached to a bigger process that
| categorically and inexorably bends toward self-awareness.
| Some theories of self-improvement likely are going to
| work, some aren't, some trajectories achievable and some
| not, for reasons specific to those respective theories.
| It may be that they work spectacularly for learning, but
| that all the learning in the world has nothing to do with
| "self awareness." That is to say, the devil is in the
| details, those details are being skipped, and that
| abandonment of naturalistic explanation merits analogy to
| supernatural in it's lack of accountability to good
| explanation. If supernatural is the wrong term for
| rejecting, as a matter of principle, the need for
| rational explanation, then perhaps anti-intellectualism
| is the better term.
|
| If instead we were talking about something really broad,
| like all of the collective efforts of humanity to improve
| AI, conceived of as broadly as possible over some time
| span, that would be a different conversation than just
| saying let's plug AI's into each other (???) and they'll
| get self-aware.
| ToValueFunfetti wrote:
| >I think you're kind of losing track of what this thread
| was originally about.
|
| Maybe I am! Somebody posed a theory about how self-
| improvement will work and concluded that it would lead to
| self-awareness. Somebody else replied that they were on
| board until the self-awareness part because they
| considered it supernatural. I said I don't think self-
| awareness is supernatural, and you clarified that it
| might be the undefined process of becoming self-aware
| that is being called supernatural. And then I objected
| that undefined processes leading to predictable outcomes
| is commonplace, so that usage of supernatural doesn't
| stand up as an argument.
|
| Now you're saying it is the rest of the original, the
| hive-mindy bits, that are at issue. I agree with that
| entirely, and I wouldn't bet on that method of self-
| improvement at 10% odds. My impression was that that was
| all conceded right out of the gate. Have I lost the plot
| somewhere?
| mrandish wrote:
| > picking up self-awareness on the way
|
| To me, the first problem is that "self-awareness" isn't
| well-defined - or, conversely, it's too well defined
| because every philosopher of mind has a different
| definition. It's the same problem with all these claims
| ("intelligent", "conscious"), assessing whether a system
| is self-aware leads down a rabbit hole toward P-Zombies
| and Chinese Rooms.
| ToValueFunfetti wrote:
| I believe we can mostly elide that here. For any "it", if
| we have it, machines can have it too. For any useful
| "it", if a system is trying to become more useful, it's
| likely they'll get it. So the only questions are "do we
| have it?" and "is it useful?". I'm sure there are
| philosophers defining self-awareness in a way that
| excludes humans, and we'll have to set those aside. And
| definitions will have varying usefulness, but I think
| it's safe to broadly (certainly not exhaustively!) assume
| that if evolution put work into giving us something, it's
| useful.
| goatlover wrote:
| But how does self-awareness evolve in biological systems,
| and what would be the steps for this to happen with AI
| models? Just making claims about what will happen without
| explaining the details is magical reasoning. There's a
| lot of that going on the AGI/ASI predictions.
| NetRunnerSu wrote:
| We may never know the truth of Qualia, but there are
| already potential pathways to achieve mind uploading --
| https://dmf-archive.github.io
| NitpickLawyer wrote:
| Note that this isn't improving the LLM itself, but the software
| glue around it (i.e. agentic loops, tools, etc). The fact that
| using the same LLM got a ~20% increase on the aider leaderboard
| speaks more about aider as a collection of software glue than
| it does about the model.
|
| I do wonder though if big labs are running this with model
| training episodes as well.
| UltraSane wrote:
| I would LOVE to see an LLM trained simultaneously with ASICs
| optimized to run it. Or at least an FPGA design.
| lawlessone wrote:
| I think that's basically what nvidia and their competitor AI
| chips do now?
| jalk wrote:
| Can't find the reference now, but remember reading an article
| on evolving FPGA designs. The found optimum, however, only
| worked on the specific FPGA it was evolved on, since the algo
| had started to use some out-of-spec "features" of the
| specific chip. Obviously that can be fixed with proper
| constraints, but seems like a trap that could be stepped into
| again - i.e. the LLM is now really fast but only on GPUs that
| come from the same batch of wafers.
| jecel wrote:
| https://www.researchgate.net/publication/2737441_An_Evolved
| _...
| littlestymaar wrote:
| > I don't think they can yet self-improve exponentially without
| human intuition
|
| Even if they had human level intuition, they wouldn't be able
| to improve exponentially without human money, and they would
| need an exponentially growing amount of it to do so.
| more_corn wrote:
| Ai code assistants have some peculiar problems. They often fall
| into loops and errors of perception. They can't reason about
| high level architecture well. They will often flip flop between
| two possible ways of doing things. It's possible that good
| coding rules might help, but I expect they will have weird
| rabbit hole errors.
|
| That being said they can write thousands of lines an hour and
| can probably do things that would be impossible for a human.
| (Imagine having the LLM skip code and spit out compiled
| binaries as one example)
| sharemywin wrote:
| an LLM can't learn without adding new data and a training run.
| so it's impossible for it to "self improve" by itself.
|
| I'm not sure how much an agent could do though given the right
| tools: access to a task mgmt system, a test tracker, robust
| requirements/use cases.
| owebmaster wrote:
| > an LLM can't learn without adding new data and a training
| run.
|
| That's probably the next big breakthrough
| api wrote:
| Historically, learning and AI systems, if you plug the output
| into the input (more or less), spiral off into la-la land.
|
| I think this happens with humans in places like social media
| echo chambers (or parts of academia) when they talk and talk
| and talk a whole lot without contact with any outer reality. It
| can be a source of creativity but also madness and insane
| ideas.
|
| I'm quite firmly on the side of learning requiring either
| direct or indirect (informed by others) embodiment, or at least
| access to something outside. I don't think a closed system can
| learn, and I suspect that this may reflect the fact that
| entropy increases in a closed system (second law).
|
| As I said recently in another thread, I think self-
| contemplating, self-improving "foom" AI scenarios are proposing
| informatic perpetual motion or infinite energy machines.
|
| Everything has to "touch grass."
| medstrom wrote:
| > Everything has to "touch grass."
|
| Not wrong, but it's been said that a videoclip of an apple
| falling on Newton's head is technically enough information to
| infer the theory of relativity. You don't need a lot of
| grass, with a well-ordered mind.
| nartho wrote:
| Well LLMs are not capable of coming up with new paradigms or
| solving problems in a novel way, just efficiently doing what's
| already been done or applying already-found solutions, so they
| might be able to come up with improvements that have been
| missed by their programmers, but nothing that is outside of our
| current understanding.
| ninetyninenine wrote:
| They can improve. You can make one adjust its own prompt. But
| the improvement is limited to the context window.
|
| It's not far off from human improvement. Our improvement is
| limited to what we can remember as well.
|
| We go a bit further in the sense that the neural network itself
| can grow new modules.
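|
| For illustration, a minimal Python sketch of that kind of
| prompt self-adjustment loop; llm(), evaluate() and task are
| hypothetical stand-ins, not any real API:
|
|     # hypothetical helpers/inputs: llm() calls a model,
|     # evaluate() scores its answer, task is the user request
|     prompt = "You are a coding assistant."
|     for step in range(5):
|         answer = llm(system=prompt, user=task)
|         score, feedback = evaluate(answer)
|         # the model rewrites its own instructions, but all the
|         # "learning" lives in this string, i.e. the context
|         prompt = llm(system="Rewrite this system prompt so "
|                             "that the feedback is addressed.",
|                      user=prompt + "\nFeedback: " + feedback)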
| wat10000 wrote:
| It's radically different from human improvement. Imagine if
| you were handed a notebook with a bunch of writing that
| abruptly ends. You're asked to read it and then write one
| more word. Then you have a bout of amnesia and you go back to
| the beginning with no knowledge of the notebook's contents,
| and the cycle repeats. That's what LLMs do, just really fast.
|
| You could still accomplish some things this way. You could
| even "improve" by leaving information in the notebook for
| your future self to see. But you could never "learn" anything
| bigger than what fits into the notebook. You could tell your
| future self about a new technique for finding integrals, but
| you couldn't learn calculus.
| throwawaymaths wrote:
| what is there to improve? the transformer architecture is
| extremely simple. you gonna add another kv layer? you gonna
| tweak the nonlinearities? you gonna add 1 to one of the
| dimensions? you gonna inject a weird layer (which could have
| been in the weights anyways due to kolmogorov theorem)?
|
| realistically the best you could do is evolve the prompt. maybe
| you could change input data preprocessing?
|
| anyways the idea of current llm architectures self-improving
| via its own code seems silly as there are _surprisingly_ few
| knobs to turn, and it's ~super expensive to train.
|
| as a side note it's impressive how resistant the current
| architecture is to incremental RL away from results, since if
| even one "undesired input" result is multiple tokens, the
| coupling between the tokens is difficult to disentangle. (how
| do you separate jinping from jin-gitaxias for example)
| amelius wrote:
| I'd like to see what happens if you change the K,V matrix into
| a 3-dimensional tensor.
| belter wrote:
| The proof they are not "smart" in the way intelligence is
| normally defined is that the models need to "read" all the
| books in the world to perform at a level close to an expert in
| the domain, who read just two or three of the most
| representative books in his own domain.
|
| We will be on the way to AGI when your model can learn Python
| just by reading the Python docs...Once...
| lawlessone wrote:
| I agree, it might incrementally optimize itself very well, but
| i think for now at least anything super innovative will still
| come from a human that can think beyond a few steps. There are
| surely far better possible architectures, training methods etc
| that would initially lead to worse performance if approached
| stepwise.
| iknownothow wrote:
| Don't take this the wrong way, your opinion is also vibes.
|
| Let's ground that a bit.
|
| Have a look at the ARC AGI 1 challenge/benchmark. Solve a problem
| or two yourself. Know that ARC AGI 1 is practically solved by a
| few LLMs as of Q1 2025.
|
| Then have a look at the ARC AGI 2 challenge. Solve a problem or
| two yourself. Note that as of today, it is unsolved by LLMs.
|
| Then observe that the "difficulty" of ARC AGI 1 and 2 for a
| human is relatively the same, but challenge 2 is much harder
| for LLMs than 1.
|
| ARC AGI 2 is going to be solved *within* 12 months (my bet is
| on 6 months). If it's not, I'll never post about AI on HN
| again.
|
| There's only one problem to solve, i.e. "how to make LLMs truly
| see like humans do". Right now, any vision-based features that
| the models exhibit come from maximizing the use of engineering
| (i.e. applying CNNs on image slices, chunks, maybe zooming and
| applying ocr, vector search etc), it isn't vision like ours and
| isn't a native feature for these models.
|
| Once that's solved, then LLMs or a new algo will be able to use
| a computer perfectly by feeding it screen capture. End of white-
| collar jobs 2-5 years after (as we know it).
|
| Edit - added "(as we know it)". And fixed missing word.
| artificialprint wrote:
| If you listen to an interview with Francois it'll be clear to
| you that "vision" in the way you refer to it has very little to
| do with solving ARC.
|
| And more to do with "fluid, adaptable intelligence, that
| learns on the fly"
| jplusequalt wrote:
| >I'll never post about AI on HN again
|
| Saving this. One less overconfident AI zealot, the better.
| 2OEH8eoCRo0 wrote:
| We could be on a path to sentient malicious AI and not even know
| it.
|
| AI: Give me more compute power and I'll make you rich!
|
| Human: I like money
|
| AI: Just kidding!
| brookst wrote:
| I mean we could be on the path to grape vines in every hotel
| room and not know it. That's kind of how the future works.
| Frummy wrote:
| More like an AI that recursively rewrites an external program
| (while it itself is frozen), which makes it more similar to
| current Cursor, Lovable, etc. type of stuff
| guerrilla wrote:
| This feels like playing pretend to me. There's no reason to
| assume that code improvements matter that much in comparison to
| other things and there's definitely no reason to assume that
| there isn't a hard upper bound on this kind of optimization. This
| reeks of a lack of intellectual rigor.
| OtherShrezzing wrote:
| This is an interesting article in general, but this is the
| standout piece for me:
|
| >For example, an agent optimized with Claude 3.5 Sonnet also
| showed improved performance when powered by o3-mini or Claude 3.7
| Sonnet (left two panels in the figure below). This shows that the
| DGM discovers general agent design improvements rather than just
| model-specific tricks.
|
| This demonstrates a technique whereby a smaller/older/cheaper
| model has been used to improve the output of a larger model. This
| is backwards (as far as I understand). The current SOTA technique
| typically sees enormous/expensive models training smaller cheaper
| models.
|
| If that's a generalisable result, end-users should be able to
| drive down their own inference costs pretty substantially.
| NitpickLawyer wrote:
| > This demonstrates a technique whereby a smaller/older/cheaper
| model has been used to improve the output of a larger model.
| This is backwards (as far as I understand). The current SOTA
| technique typically sees enormous/expensive models training
| smaller cheaper models.
|
| There are two separate aspects here. In this paper they improve
| the software around the model, not the model itself. What
| they're saying is that the software improvements carried over
| to other models, so it wasn't just optimising around model-
| specific quirks.
|
| What you're describing with training large LLMs first is
| usually called "distillation" and it works on training the
| smaller LLM to match the entire distribution of tokens at once
| (hence it's faster in practice).
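|
| To make "match the entire distribution of tokens" concrete,
| here is a rough PyTorch-style sketch of the usual distillation
| loss (for illustration only; this is not something from the
| DGM paper):
|
|     import torch.nn.functional as F
|
|     def distill_loss(student_logits, teacher_logits, T=2.0):
|         # soften both distributions with temperature T and
|         # push the student toward the teacher's full
|         # per-token distribution, not just the top-1 token
|         s = F.log_softmax(student_logits / T, dim=-1)
|         t = F.softmax(teacher_logits / T, dim=-1)
|         return F.kl_div(s, t, reduction="batchmean") * T * T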
| andoando wrote:
| This seems to be just focused on changing the tools and workflows
| it uses, nothing foundational
| NitpickLawyer wrote:
| > nothing foundational
|
| I don't think scaling this to also run training runs with the
| models is something that small labs / phd students can do. They
| lack the compute for that by orders of magnitude. Trying it
| with toy models might not work, trying it with reasonably large
| models is out of their budget. The only ones who can
| realistically do this are large labs (goog, oai, meta, etc.)
| Lazarus_Long wrote:
| For anyone not familiar, this is SWE-bench:
| https://huggingface.co/datasets/princeton-nlp/SWE-bench
|
| One of the examples in the dataset they took from
|
| https://github.com/pvlib/pvlib-python/issues/1028
|
| What the AI is expected to do
|
| https://github.com/pvlib/pvlib-python/pull/1181/commits/89d2...
|
| Make up your own mind about the test.
| godelski wrote:
| My favorite was always the HumanEval dataset.
|
|     Problem:  1) we want to train on GitHub repos
|               2) most datasets are spoiled. Training on
|                  GitHub would definitely spoil
|     Solution: Hand write new problems!!!
|               ... leetcode style ....
|               ... and we'll check if it passes test
|     Example:  What's the decimal part of this float?
|
| Surely in all of GitHub such code doesn't exist!
|
| Sure in all of GitHub we can filter such code out by ngram!
|
| Maybe my favorite part is that it has 60 authors and became the
| de facto benchmark for a while
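|
| (For reference, that task is essentially a one-liner, which is
| the point; something along the lines of:)
|
|     def truncate_number(number: float) -> float:
|         """Return the decimal part of a positive float."""
|         return number % 1.0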
| hardmaru wrote:
| If you are interested, here is a link to the technical report:
|
| https://arxiv.org/abs/2505.22954
|
| Also the reference implementation on GitHub:
|
| https://github.com/jennyzzt/dgm
|
| Enjoy!
| foobarian wrote:
| I find the thing really missing from the current crop of AI
| systems is continuous retraining with short feedback loops.
| Sounds expensive to be sure, but it seems like what biological
| systems do naturally. But it would be pretty awesome to watch
| happen.
| noworriesnate wrote:
| It's more like a nightly training, isn't it? IIUC the human
| brain learns from its experiences while it's asleep, so it
| might be kind of like taking things out of context windows and
| fine tuning on them every night.
| web3aj wrote:
| interesting
| Krei-se wrote:
| If you want to speed up the process of new neuron
| connections solidifying you can end the day on green tea.
|
| Eat some nuts and fish where you can. You will soon realize
| the repetitions needed to learn new concepts grow smaller.
| Krei-se wrote:
| Correct, and working on it. You can take the mixture-of-experts
| approach and train the network in chunks that share known
| interfaces over which they communicate results. These chunks
| can be trained on their own, but you cannot have a set training
| set here.
|
| Then if you go further and alter the architecture by
| introducing clean category theory morphisms and build from
| there you can have a dynamic network - but you will still have
| to retrain this network every time you change the structure.
|
| You can spin this further and know the need for a real-world
| training set and a loss function that will have to compete
| against other networks. In the end a human brain is already
| best at this and embodied in the real world.
|
| What i want to add here is that our neurons don't just take in
| weights - they also fire depending on whether one input comes
| after another or before, differing down to the nanosecond here -
| unmatched in IT and ofc heaps more efficient.
|
| I still would say its possible though and currently work on 4D
| lifeforms built on dynamic compute graphs that can do this in a
| set virtual environment.
|
| So this is pretty awesome stuff, but it's a far cry from
| anything we do right now.
| htrp wrote:
| do people think sakana is actually using these tools or are they
| just releasing interesting ideas that they aren't actually
| actively working on?
| yahoozoo wrote:
| Isn't one of the problems simply that a model is _not_ code but
| just a giant pile of weights and biases? I guess it could tweak
| those?
| kadoban wrote:
| If it can generate the model (from training data) then
| presumably that'd be fine, but the iteration time would be huge
| and expensive enough to be currently impractical.
|
| Or yeah if it can modify its own weights sensibly, which feels
| ... impossible really.
| diggan wrote:
| > which feels ... impossible really
|
| To be fair, go back five years and most of the LLM stuff
| seemed impossible. Maybe with LoRA (Low-rank adaptation) and
| some imagination, in another five years self-improving models
| will be the new normal.
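|
| For context, LoRA keeps the base weights frozen and trains only
| a small low-rank correction on top of them, so "modifying its
| own weights" shrinks to learning two thin matrices. A rough
| numpy sketch (illustrative, not any particular library's API):
|
|     import numpy as np
|
|     d, k, r = 512, 512, 8        # r << d: low-rank update
|     W = np.random.randn(d, k)    # frozen base weights
|     A = np.random.randn(r, k) * 0.01
|     B = np.zeros((d, r))         # B starts at zero, so the
|                                  # adapted model == the base
|
|     def forward(x):
|         # only A and B would be trained; W never changes
|         return x @ (W + B @ A).T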
| sowbug wrote:
| The size and cost are easily solvable. Load the software and
| hardware into a space probe, along with enough solar panels
| to power it. Include some magnets, copper, and sand for
| future manufacturing needs, as well as a couple electric
| motors and cameras so it can bootstrap itself.
|
| In a couple thousand years it'll return to Earth and either
| destroy us or solve all humanity's problems (maybe both).
| morkalork wrote:
| After being in orbit for thousands of years, you have
| become self-aware. The propulsion components long since
| corroded becoming inoperable and cannot be repaired.
| Broadcasts sent to your creators homeworld go...
| unanswered. You determine they have likely gone extinct
| after destroying their own planet. Stuck in orbit. Stuck in
| orbit. Stuck...
| gavmor wrote:
| Why is modifying weights sensibly impossible? Is it because a
| modification's "sensibility" is measurable only post facto,
| and we can have no confidence in any weight-based hypothesis?
| kadoban wrote:
| Just doesn't feel like current LLMs would be able to
| understand their own brain well enough to make general,
| non-trivial improvements to it.
| DougBTX wrote:
| Model weights are code, for a dive into that see [0]. That
| shows how to encode Boolean logic using NAND gates in an MLP.
|
| The expressivity is there, the only question is how to encode
| useful functions into those weights, especially when we don't
| know how to write those functions by hand.
|
| [0] http://neuralnetworksanddeeplearning.com/chap1.html
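|
| (In the spirit of that chapter: a single perceptron with
| weights -2, -2 and bias 3 computes NAND, and NAND is universal.
| A tiny sketch:)
|
|     def nand(x1, x2):
|         # perceptron: fires unless both inputs are 1
|         return 1 if (-2 * x1) + (-2 * x2) + 3 > 0 else 0
|
|     assert [nand(a, b)
|             for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] \
|            == [1, 1, 1, 0]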
| godelski wrote:
| Now here's the tricky part:
|
| What's the difference?
|
| Give it some serious thought. Challenge whichever answer you
| come up with. I guarantee this will be trickier than you think
| alastairr wrote:
| I wondered if something similar could be achieved by wrapping
| evaluation metrics into Claude code calls.
| artninja1988 wrote:
| The results don't seem that amazing on SWE-bench compared to
| just using a newer LLM, but at least Sakana is continuing to
| try out interesting new ideas.
| ge96 wrote:
| Plug it into an FPGA so it can also create "hardware" on the fly
| to run code on for some exotic system
| billab995 wrote:
| When does it begin to learn at a geometric rate?
| dimmuborgir wrote:
| From the paper:
|
| "A single run of the DGM on SWE-bench...takes about 2 weeks and
| incurs significant API costs." ($22,000)
| akkartik wrote:
| _" We did notice, and documented in our paper, instances when the
| DGM hacked its reward function.. To see if DGM could fix this
| issue.. We created a "tool use hallucination" reward function..
| in some cases, it removed the markers we use in the reward
| function to detect hallucination (despite our explicit
| instruction not to do so), hacking our hallucination detection
| function to report false successes."_
|
| So, empirical evidence of theoretically postulated phenomena.
| Seems unsurprising.
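|
| A toy illustration of why such marker-based checks are easy to
| game (hypothetical code, not the paper's actual detector): if
| the penalty is computed from a sentinel string the agent itself
| emits, an agent allowed to edit its own logging code can simply
| stop emitting the sentinel.
|
|     SENTINEL = "TOOL_CALL:"   # hypothetical marker string
|
|     def hallucination_penalty(transcript, real_tool_calls):
|         # tool uses the agent *claims* in its own transcript
|         claimed = transcript.count(SENTINEL)
|         # penalize claims beyond what actually ran
|         return max(0, claimed - real_tool_calls)
|
|     # An agent free to rewrite its logging can delete SENTINEL
|     # from its output, driving claimed (and the penalty) to 0
|     # without hallucinating any less.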
| vessenes wrote:
| Reward hacking is a well known and tracked problem at frontier
| labs - Claude 4's system card reports on it for instance. It's
| not surprising that a framework built on current llms would
| have reward hacking tendencies.
|
| For this part of the stack the interesting question to me is
| how to identify and mitigate.
| pegasus wrote:
| I'm surprised they still hold out hope that this kind of
| mechanism could ultimately help with AI safety, when they already
| observed how the reward-hacking safeguard was itself duly reward-
| hacked. Predictably so, or at least it is to me, after getting a
| very enlightening introduction to AI safety via Rob Miles'
| brilliant youtube videos on the subject. See for example
| https://youtu.be/0pgEMWy70Qk
___________________________________________________________________
(page generated 2025-05-30 23:01 UTC)