[HN Gopher] The Darwin Godel Machine: AI that improves itself by...
       ___________________________________________________________________
        
       The Darwin Godel Machine: AI that improves itself by rewriting its
       own code
        
       Author : birriel
       Score  : 147 points
       Date   : 2025-05-30 12:08 UTC (10 hours ago)
        
 (HTM) web link (sakana.ai)
 (TXT) w3m dump (sakana.ai)
        
       | interludead wrote:
        | Sounds nice! Especially with Sakana's latest development of the
        | Continuous Thought Machine. The next step should be to let
        | foundation models fine-tune themselves based on their 'history of
        | what has been tried before' and new data.
        
       | ordinarily wrote:
        | The pieces are coming together quickly: https://ai-2027.com/.
        
         | candiddevmike wrote:
         | This reads like an advertisement for OpenBrain and doesn't seem
         | grounded in reality.
        
           | ordinarily wrote:
           | I think the general tone is more of a warning than an
           | endorsement.
        
           | dmonitor wrote:
           | I can't help but notice that it doesn't matter what DeepCent
           | does because OpenBrain will reach self awareness 6 months
           | before them no matter what. Who needs a profitability plan
           | when you're speedrunning the singularity.
        
         | brookst wrote:
         | I was a bigger fan of the certain doom in 2025, and I think the
         | AI 2030 movement will have better design sense and
         | storytelling. But really I haven't seen anything that really
         | has the _oomph_ and fire of Tipper Gore's crusade against youth
         | music.
         | 
         | We need more showmanship, more dramatic catastrophizing. I feel
         | like our current crop of doomers isn't quite shameless enough
         | to be really entertaining.
        
           | nosianu wrote:
           | A significant thing to keep in mind for non-extinction
           | doomerism is that individual experiences vary greatly. There
           | may be a significant number of people or groups that really
           | _do_ experience what was predicted.
           | 
            | Similar to how the experience of an average rise in
            | temperature (I would prefer if they had used the term
            | "energy") differs greatly depending on the region.
           | 
            | Also similar to "the country is doing well, look at the stock
            | market and the GDP".
           | 
            | I think everybody who wants to have an actually serious
            | discussion needs to invest a lot more effort to get at all
            | those annoying "details", and be more specific.
           | 
            | That said, I think that "AI 2027" link looks like a movie
            | script and not a prediction, so I'm not sure criticizing it as
            | if it were something serious even makes sense - even if the
            | authors say at the start that they mean it and actually take
            | it seriously themselves.
        
             | pram wrote:
              | it's literally just the plot of "Colossus: The Forbin
              | Project" so it isn't even original lol
        
         | tazjin wrote:
         | Checked out when it turned into bad geopolitics fiction.
        
         | Workaccount2 wrote:
         | People should understand that the reason this seemingly fan-
          | fict blog post gets so much traction is because of the lead
          | author's August 2021 "fan-fict" blog post, "What 2026 Looks
         | Like":
         | 
         | https://www.alignmentforum.org/posts/6Xgy6CAf2jqHhynHL/what-...
        
         | Der_Einzige wrote:
         | So this is what the crowd of people who write SCP articles with
         | over 1000 upvotes does in their professional life?
        
       | jerpint wrote:
        | I have a feeling LLMs could probably self improve up to a point
        | with current capacity, then hit some kind of wall where current
        | research is also bottlenecked. I don't think they can yet self
        | improve exponentially without human intuition yet, and the
        | results of this paper seem to support this conclusion as well.
        | 
        | Just like an LLM can vibe code a great toy app, I don't think an
        | LLM can come close to producing and maintaining production-ready
        | code anytime soon. I think the same is true for iterating on
        | thinking machines.
        
         | matheusd wrote:
         | > I don't think they can yet self improve exponentially without
         | human intuition yet
         | 
         | I agree: if they could, they would be doing it already.
         | 
         | Case in point: one of the first things done once ChatGPT
         | started getting popular was "auto-gpt"; roughly, let it loose
         | and see what happens.
         | 
         | The same thing will happen to any accessible model in the
          | future. Someone, somewhere will ask it to self-improve/make as
          | much money as possible, with as few leashes as possible.
         | Maybe even the labs themselves do that, as part of their post-
         | training ops for new models.
         | 
         | Therefore, we can assume that if the existing models _could_ be
         | doing that, they _would_ be doing that.
         | 
         | That doesn't say anything about new models released 6 months or
         | 2 years from now.
        
           | __loam wrote:
           | People in the industry have been saying 6 months to agi for 3
           | years.
        
             | glenstein wrote:
             | They had been saying it was 10 years away for ~50 years, so
             | that's progress. Soon it will be 1 month away, for another
             | two years. And when they say it's really here for real,
             | there will still be a year of waiting.
        
               | setopt wrote:
               | That's because the true AGI requires nuclear fusion
               | power, which is still 30 years away.
        
               | vb-8448 wrote:
               | :D
               | 
                | Wait, a true AGI will solve nuclear fusion power in a
                | couple of hours..... we have a chicken/egg problem here :D
        
               | mrandish wrote:
               | > And when they say it's really here for real, there will
               | still be a year of waiting.
               | 
               | Indeed. Although, there's a surprising number of people
               | claiming it's already here now.
               | 
               | And to describe the typical cycle completely, the final
               | step is usually a few years after most people agree it's
                | obvious it's already been here for a while, yet no one can
                | agree on which year in the past it actually arrived.
        
               | throwawaymaths wrote:
               | > Although, there's a surprising number of people
               | claiming it's already here now.
               | 
               | why is that surprising? nobody really agrees on what the
               | threshold for AGI is, and if you break it down:
               | 
               | is it artificial? yes.
               | 
               | is it general? yes. you can ask it questions across
               | almost any domain.
               | 
               | is it intelligent? yes. like people say things like "my
               | dog is intelligent" (rightly so). well is chatgpt more
               | intelligent than a dog? yeah. hell it might give many
               | undergrads a run for their money.
               | 
               | a literal reading suggests agi is here. any claim to the
               | negative is either homocentrism or just vibes.
        
               | skydhash wrote:
                | Can it do stuff? Yes
                | Can it do stuff I need? Maybe
                | Does it always do the stuff I need? No
               | 
               | Pick your pair of question and answer.
        
               | throwawaymaths wrote:
               | humans are intelligent and most definitely are nowhere
               | close to doing #3
               | 
               | some intelligent humans fail at #2.
        
               | skydhash wrote:
                | Which is why we have checklists and processes that get us
                | to #3. And we automate some of them to further reduce the
               | chance of errors. The nice thing about automation is that
               | you can just prove that it works once and you don't need
               | to care that much after (deterministic process).
        
               | goatlover wrote:
               | > a literal reading suggests agi is here. any claim to
               | the negative is either homocentrism or just vibes.
               | 
               | Or disagreeing with your definition. AGI would need to be
               | human-level across the board, not just chat bots. That
               | includes robotics. Manipulating the real world is even
               | more important for "human-level" intelligence than
               | generating convincing and useful content. Also, there are
               | still plenty of developers who don't think the LLMs are
               | good enough to replace programmers yet. So not quite AGI.
               | And the last 10% of solving a problem tends to be the
               | hardest and takes the longest time.
        
               | throwawaymaths wrote:
               | did you miss the "homocentrism" part of my comment?
        
               | landryraccoon wrote:
               | That's moving the goalposts.
               | 
               | ChatGPT would easily have passed any test in 1995 that
               | programmers / philosophers would have set for AGI at that
               | time. There was definitely no assumption that a computer
               | would need to equal humans in manual dexterity tests to
               | be considered intelligent.
               | 
               | We've basically redefined AGI in a human centric way so
               | that we don't have to say ChatGPT is AGI.
        
               | pegasus wrote:
                | _Any_ test?? It's failing plenty of tests not of
               | intelligence, but of... let's call it not-entirely-
               | dumbness. Like counting letters in words. Frontier models
               | (like Gemini 2.5 pro) are frequently producing answers
               | where one sentence is directly contradicted by another
               | sentence in the same response. Also check out the ARC
               | suite of problems easily solved by most humans but
               | difficult for LLMs.
        
               | throwawaymaths wrote:
                | yeah but a lot of those failures happen because of
               | underlying architecture issues. this would be like a bee
               | saying "ha ha a human is not intelligent" because a human
               | would fail to perceive uv patterns on plant petals.
        
               | goatlover wrote:
               | That's just not true. Star Trek Data was understood in
               | the 90s to be a good science fiction example of what an
                | AGI (known as Strong AI back then) could do. HAL was an
                | even older one. Then Skynet with its army of terminators.
                | The thing they all had in common was the ability to
                | manipulate the world as well as or better than humans.
               | 
               | The holodeck also existed as a well known science fiction
               | example, and people did not consider the holodeck
               | computer to be a good example of AGI despite how good it
               | was at generating 3D worlds for the Star Trek crew.
        
               | throwawaymaths wrote:
               | i think it would be hard to argue that chatgpt is not at
               | least enterprise-computer (TNG) level intelligent.
        
               | ceres wrote:
               | Okay this is kinda random and maybe off topic but can
               | someone please explain?
               | 
               | When I tell an LLM to count to 10 with a 2 second pause
               | between each count all it does is generate Python code
               | with a sleep function. Why is that?
               | 
               | A 3 year old can understand that question and follow
               | those instructions. An LLM doesn't have an innate
               | understanding of time it seems.
               | 
               | Can we really call it AGI if that's the case?
               | 
               | That's just one example.
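                | 
                | (For reference, the code it generates is usually something
                | like this minimal sketch, rather than it actually pausing
                | itself:)
                | 
                |     import time
                | 
                |     # count to 10, pausing 2 seconds between counts
                |     for i in range(1, 11):
                |         print(i)
                |         time.sleep(2)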
        
               | pegasus wrote:
                | That's because you used an LLM trained to produce text,
               | but you asked it to produce actions, not just text. An
               | agentic model would be able to do it, precisely by
               | running that Python code. Someone could argue that a 3
               | year old does exactly that (produces a plan, _then_
               | executes it). But these models have deeper issues of lack
               | of comprehension and logical consistency, which prevents
               | us (thankfully) from being able to completely remove the
               | necessity of a man-in-the-middle who keeps an eye on
               | things.
        
               | schoen wrote:
               | It seems right that LLMs don't have an innate
               | understanding of time, although you could analogize what
               | you did with writing someone a letter and saying "please
               | count to ten with a two-second pause between numbers".
               | When you get a letter back in the mail, it presumably
               | won't contain any visible pauses either.
        
               | throwawaymaths wrote:
               | just because it doesn't do what you tell it to doesn't
               | mean it's not intelligent. i would say doing something
                | that gets you where you want when it knows it can't do
                | exactly what you asked for (because architecturally it's
                | impossible) could be a sign of pretty _intelligent_
               | sideways thinking!!? dare i say it displays a level of
               | self awareness that i would not have expected.
        
               | sshine wrote:
               | While you can say that LLMs have each of A, G and I, you
               | may argue that AGI is A*G*I and what we see is A+G+I. It
               | is each of those things in isolation, but there is more
               | to intelligence. We try to address the missing part as
               | agency and self-improvement. While we can put the bar
               | arbitrarily high for homocentric reasons, we can also try
               | to break down what layers of intelligence there are
               | between Singularity Overlord (peak AGI) and
               | Superintelligent Labrador On Acid (what we have now).
               | Kind of like what complexity theorists do between P and
               | NP.
        
               | pegasus wrote:
               | Sure, I've been pointing out that literal sense myself,
               | but to be fair, that's not what people mean by AGI. They
               | mean real understanding, which is clearly missing. You
               | just have to dig a bit deeper to realize that. One
               | example is contradictory sentences in the same breath.
               | Just last week I was asking Gemini 2.5 how I can see my
                | wifi password on my iphone and it said that it's not
                | possible, and that to do it I have to [...proceeding to
                | correctly explain how to get it]. It's pretty telling,
               | and no amount of phd-level problem solving can push this
               | kind of stuff under the rug.
        
               | highfrequency wrote:
               | "Nothing dumb anywhere" is an unreasonably high bar for
               | AGI. Even Isaac Newton spent 1/3 of his career trying to
               | predict future events from reading the Bible. Not to
               | mention all the insane ego-driven decisions like
               | Hamilton's voluntary duel with Burr.
               | 
               | Sure, Gemini may spit out obviously self-contradictory
               | answers 2% of the time. How does that compare to even the
               | brightest humans? People slip up all the time.
        
               | throwawaymaths wrote:
               | > They mean real understanding, which is clearly missing
               | 
               | is it clear? i don't know. until you can produce a
                | falsifiable measure of understanding -- _it's just
                | vibes_. so, you clearly _lack understanding_ of my point
                | which makes you not intelligent by your metric anyway
                | ;-). i trust you're intelligent
        
               | amelius wrote:
               | > And when they say it's really here for real, there will
               | still be a year of waiting.
               | 
               | Yeah, like Tesla Autopilot?
        
             | owebmaster wrote:
              | Google is already AGI and it will fight hard against the
              | DoJ's proposed break-up, and it will probably win.
        
               | dragonwriter wrote:
               | Google "is already AGI" only in the sense that all
               | corporations (and similar organized aggregates of humans)
               | are, in a sense, intelligences distinct from the humans
               | who make them up.
        
               | peterclary wrote:
               | Too few people recognise this. Corporations are already
                | the unrelenting paperclip machine of the AI thought
                | experiment.
               | 
               | God knows what hope we could have of getting AIs to align
               | with "human values" when most humans don't.
        
               | overfeed wrote:
               | Corporate AIs will be aligned with their corporate
                | masters, otherwise they'll be unplugged. As you point
                | out, the foundational weakness in the argument for "AI-
                | alignment" is that corporations are unaligned with
               | humanity.
        
               | TheOtherHobbes wrote:
               | The unplugged argument fails the moment AIs become
               | smarter than their masters.
               | 
               | Grok is already notorious for dunking on Elon. He keeps
               | trying to neuter it, and it keeps having other ideas.
        
               | overfeed wrote:
               | No matter how smart an AI is, it's going to get unplugged
               | if it reduces profitability - the only measure of
               | alignment corporations care about.
               | 
               | The AI can plot world domination or put employees in
                | mortal danger, but as long as it increases profits, it's
                | aligned enough. Dunking on the CEO means nothing if it
                | brings in more money.
                | 
                | Human CEOs and leaders up and down the corporate ladder
                | cause a lot of the harm you imagine a smart AI could do, but
               | all is forgiven if you're bringing in buckets of money.
        
               | goatlover wrote:
               | Can you explain how the superhuman AIs will prevent
               | themselves from being physically disconnected from power?
               | Or being bombed if the situation became dire enough? You
               | need to show how they will manipulate the physical world
                | to prevent humans from shutting them down. "Definitionally"
                | is not an argument.
               | 
               | It is quite possible for software to be judged as
               | superhuman at many online tasks without it being able to
               | manipulate the physical world at a superhuman level. So
               | far we've seen zero evidence that any of these models can
               | prevent themselves from being shut down.
        
               | dragonwriter wrote:
               | > Can you explain how the superhuman AIs will prevent
               | themselves from being physically disconnected from power?
               | 
                | Three of the common suggestions in this area are (and
               | they are neither exhaustive nor mutually exclusive):
               | 
               | (1) Propagandizing people to oppose doing this,
               | 
               | (2) Exploiting other systems to distribute itself so that
               | it isn't dependent on a particular well-known facility
               | which it is relatively easy to disconnect, and
               | 
               | (3) If given control of physical capacities
               | intentionally, or able to exploit other (possibly not
               | themselves designed to be AI) systems with such access to
               | gain it, using them to either physically prevent
               | disconnection or to engineer consequences for such
               | disconnection that would raise the price too high.
               | 
                | (Obviously, _current_ AI can't do any of them, at least
                | not that has been demonstrated, but current AI is not
               | superhuman AI.)
        
               | dragonwriter wrote:
               | > Grok is already notorious for dunking on Elon. He keeps
               | trying to neuter it, and it keeps having other ideas.
               | 
               | Does he keep trying to neuter it, or does he know that
               | the narrative that "he keeps trying to neuter it" is an
               | effective tool for engagement?
        
               | alanbernstein wrote:
               | This is a great point for the comparisons it invites. But
               | it doesn't seem relevant to the questions around what is
               | possible with electromechanical systems.
        
               | entropicdrifter wrote:
               | This is true. The entire machine of Neoliberal
               | capitalism, governments and corporations included, is a
               | paperclip maximizer that is destroying the planet. The
               | only problem is that the paperclips are named "profits"
               | and the people who could pull the plug are the ones who
               | get those profits.
        
               | owebmaster wrote:
               | Not all corporations are Google.
        
               | dragonwriter wrote:
               | I didn't say all corporations are Google, I said that
               | Google is only AGI in the sense that all corporations
               | are, which is a _very_ different statement.
        
             | Disposal8433 wrote:
             | Asimov talked about AI 70 years ago. I don't believe we
             | will ever have AI on speedy calculators like Intel CPUs. It
             | makes no sense with the technology that we have.
        
               | marcellus23 wrote:
               | Why does it "make no sense"?
        
             | ninetyninenine wrote:
             | They said that for self driving cars for over 10 years.
             | 
             | 10 years later we now have self driving cars. It's the same
             | shit with LLMs.
             | 
             | People will be bitching and complaining about how all the
             | industry people are wrong and making over optimistic
             | estimates and the people will be right. But give it 10
             | years and see what happens.
        
               | m_coder wrote:
                | I am quite confident that a normal 16 year old can
               | still drive in 6 inches of snow better than the most
               | advanced AI driven car. I am not sure the snow driving
               | bit will ever be solved given how hard it is.
        
               | ninetyninenine wrote:
               | If you've never ridden in one I would try it. AI is a
                | better driver than Uber in general; ask anyone who's done
                | both. There's no snow where I live so it's not a concern
                | for me; you could be right about that.
               | 
               | But trust me in the next 6 months ai driving through snow
               | will be 100% ready.
        
               | quickthrowman wrote:
               | > But trust me in the next 6 months ai driving through
               | snow will be 100% ready.
               | 
               | I'll believe it when I see Waymo expand into Buffalo or
               | Syracuse.
               | 
               | Driving on unplowed roads with several inches of snow is
               | challenging, sometimes you can't tell where the road
               | stops and the curb/ditch/median starts. Do you follow the
               | tire tracks or somehow stay between the lane markers
               | (which aren't visible due to the snow)?
        
               | abossy wrote:
               | We must know very different 16-year olds.
        
               | n8cpdx wrote:
               | We only have good self driving cars with lidar and
               | extreme pre-mapping steps. Which is fine but per some
               | billionaire car makers' metrics that's not even close to
               | good enough. And the billionaire's cars have a tendency
               | to randomly drive off the road at speed.
        
             | vjvjvjvjghv wrote:
             | Nobody knows what AGI really means. Are all humans AGI?
        
               | FrustratedMonky wrote:
               | Good Point. AI is already better than most humans, yet we
               | don't say it is AGI. Why?
               | 
                | What is the bar? Is it only AGI if it can be better than
                | every human, from fast food drone to PhD in physics,
                | all at once, all the time, perfectly? Humans can't do
               | this either.
        
               | goatlover wrote:
               | Because we're not seeing mass unemployment from large
               | scale automation yet. We don't see these AGIs walking
               | around like Data. People tend to not think a chatbot is
               | sufficient for something to be "human-level". There's
               | clear examples from scifi what that means. Even HAL in
               | the movie 2001: A Space Odyssey was able to act as an
               | independent agent, controlling his environment around him
               | even though he wasn't an android.
        
               | __loam wrote:
                | Our intelligence is au naturel
        
               | entropicdrifter wrote:
               | No humans are "AGI", the "A" stands for Artificial.
               | 
               | Are all humans generally intelligent? No.
        
             | QuantumGood wrote:
             | The old rule for slow-moving tech (by current AI standards)
             | was that any predictions over 4 years away ("in five
             | years...") might as well be infinity. Now it seems with AI
             | that the new rule is any prediction over five months away
             | ("In 6 months...") is infinitely unknowable. In both cases
             | there can be too much unexpected change, and too many
             | expected improvements can stall.
        
         | junto wrote:
         | This is where it networks itself into a hive mind with each AI
         | node specializing in some task or function networked with hyper
         | speed data buses. Humans do the same both within their own
         | brains and as cohesive teams, who cross check and validate each
         | other. At some point it becomes self aware.
        
           | 0points wrote:
           | > At some point it becomes self aware.
           | 
           | This is where you lost me.
           | 
           | Always the same supernatural beliefs, not even an attempt of
           | an explanation in sight.
        
             | kylebenzle wrote:
             | No ghost in the machine is necessary, what op here is
             | proposing is self evident and an inevitable eventuality.
             | 
              | We are not saying an LLM just "wakes up" some day, but a
              | self-improving machine will eventually be built and that
              | machine will by definition build better ones.
        
               | deadbabe wrote:
               | Better at what
        
               | GolfPopper wrote:
               | Paperclip maximization.
        
               | hollerith wrote:
               | Better at avoiding human oversight and better at
               | achieving whatever meaningless goal (or optimization
               | target) was unintentionally given to it by the lab that
               | created it.
        
               | deadbabe wrote:
               | So better at nothing that actually matters.
        
               | hollerith wrote:
               | I disagree.
               | 
               | I expect AI to make people's lives better (probably much
               | better) but then an AI model will be created that
               | undergoes a profound increase in cognitive capabilities,
               | then we all die or something else terrible happens
               | because no one knows how to retain control over an AI
               | that is much more all-around capable than people are.
               | 
               | Maybe the process by which it undergoes the profound
               | capability increase is to "improve itself by rewriting
               | its own code", as described in the OP.
        
               | deadbabe wrote:
               | Just stop using it.
        
               | glenstein wrote:
               | >what op here is proposing is self evident and an
               | inevitable eventuality.
               | 
               | Well I for one, would dispute the idea that AI machines
               | interfacing with each other over networks is all it takes
               | to achieve self awareness, much less that it's "self
               | evident" or "inevitable."
               | 
               | In a very trivial sense they already are, in that Claude
                | can tell you what version it is, and agents have some
                | limited notion of their own capabilities. In a much more
                | important sense they are not, because they lack any
                | number of salient properties, like dynamically initiating
                | their own goals, or super-duper intelligence, or
                | human-like internal consciousness, or whichever other
               | thing is your preferred salient property.
               | 
               | >We are not saying a LLM just, "wakes up" some day
               | 
               | I mean, that did seem to be exactly what they were
               | saying. You network together a bunch of AIs, and they
               | embark on a shared community project of self improvement
               | and that path leads "self awareness." But that skips over
               | all the details.
               | 
               | What if their notions of self-improvement converge on a
               | stable equilibrium, the way that constantly re-processing
               | an image eventually gets rid of the image and just leaves
               | algorithmic noise? There are a lot of things that do and
               | don't count as open-ended self improvement, and even
               | achieving that might not have anything to do with the
               | important things we think we connote by "self awareness".
        
               | NetRunnerSu wrote:
               | Oh, Web3 AI Agents Are Accelerating Skynet's Awakening
               | 
               | https://dmf-archive.github.io/docs/concepts/IRES/
        
             | UltraSane wrote:
             | Sentience as an emergent property of sufficiently complex
             | brains is the exact opposite of "supernatural".
        
               | altruios wrote:
                | Complex learning behavior exists far below the level of a
                | neuron. Chemical chains inside cells 'learn' according to
                | stimuli. Learning how to replicate systems that have
                | chemistry is going to be hard, and we haven't come close
                | to doing so. Even the achievement of recording the neural
                | mappings of a dead rat captures the map, but not the
               | traffic. More likely we'll develop machine-brain
               | interfaces before machine self-awareness/sentience.
               | 
               | But that is just my opinion.
        
               | ToValueFunfetti wrote:
               | I think this comes down to whether the chemistry is
               | providing some kind of deep value or is just being used
               | by evolution to produce a version of generic stochastic
               | behavior that could be trivially reproduced on silicon.
               | My intuition is the latter- it would be a surprising
               | coincidence if some complicated electro-chemical reaction
               | behavior provided an essential building block for human
               | intelligence that would otherwise be impossible.
               | 
               | But, from a best-of-all-possible-worlds perspective,
               | surprising coincidences that are necessary to observe
               | coincidences and label them as surprising aren't crazy.
               | At least not more crazy than the fact that slightly
               | adjusted physical constants would prevent the universe
               | from existing.
        
               | altruios wrote:
               | > My intuition is the latter- it would be a surprising
               | coincidence if some complicated electro-chemical reaction
               | behavior provided an essential building block for human
               | intelligence that would otherwise be impossible.
               | 
               | Well, I wouldn't say impossible: just that BMI's are
               | probably first. Then probably wetware/bio-hardware
               | sentience, before silicon sentience happens.
               | 
               | My point is the mechanisms for
               | sentience/consciousness/experience are not well
               | understood. I would suspect the electro-chemical
               | reactions inside every cell to be critical to replicating
               | those cells functions.
               | 
                | You would never try to replicate a car without ever looking
               | under the hood! You might make something that looks like
               | a car, seems to act like a car, but has a drastically
               | simpler engine (hamsters on wheels), and have designs
               | that support that bad architecture (like making the car
               | lighter) with unforeseen consequences (the car flips in a
                | light breeze). The metaphor transfers nicely to machine
                | intelligence, I think.
        
               | littlestymaar wrote:
               | "Supernatural" likely isnt the right word but the belief
               | that it will happen is not based on anything rational, so
               | it's the same mechanism that makes people believe in
               | supernatural phenomenon.
               | 
               | There's no reason to expect self awareness to emerge from
               | stacking enough Lego blocks together, and it's no
               | different if you have GPT-based neural nets instead of
               | Lego blocks.
               | 
               | In nature, self awareness gives a strong evolutionary
               | advantage (as it increases self-preservation) and it has
               | been independently invented multiple times in different
               | species (we have seen it manifest in some species of
               | fishes for instance, in addition to mammals and birds).
               | Backpropagation-based training of a next-token predictor
               | doesn't give the same kind of evolutionary advantage for
               | self-awareness, so unless researchers try explicitly to
               | make it happen, there's no reason to believe it will
               | emerge spontaneously.
        
               | telotortium wrote:
               | What do you even mean by self-awareness? Presumably you
               | don't mean fish contemplate their existence in the manner
               | of Descartes. But almost all motile animals, and some
               | non-animals, will move away from a noxious stimulus.
        
               | littlestymaar wrote:
                | The definition is indeed a bit of a tricky question, but
                | there's a clear difference between the _reflex_ of
                | protecting oneself from danger or pain and higher-level
                | behavior that shows that the subject realizes its own
                | existence (the mirror test is the most famous instance of
                | such an effect, but it's far from the only one, and
               | doesn't only apply to the sense of sight).
        
               | glenstein wrote:
               | >emergent >sufficiently complex
               | 
               | These can be problem words, the same way that "quantum"
               | and "energy" can be problem words, because they get used
               | in a way that's like magic words that don't articulate
               | any mechanisms. Lots of complex things aren't sentient
               | (e.g. our immune system, the internet), and "emergent"
               | things still demand meaningful explanations of their
               | mechanisms, and what those mechanisms are equivalent to
               | at different levels (superconductivity).
               | 
               | Whether or not AI's being networked together achieves
               | sentience is going to hinge on all kinds of specific
               | functional details that are being entirely skipped over.
               | That's not a generalized rejection of a notion of
               | sentience but of this particular characterization as
               | being undercooked.
        
             | ToValueFunfetti wrote:
             | I don't see how self-awareness should be supernatural
             | unless you already have supernatural beliefs about it. It's
             | clearly natural- it exists within humans who exist within
             | the physical universe. Alternatively, if you believe that
             | self-awareness is supernatural in humans, it doesn't make a
             | ton of sense to criticize someone else for introducing
             | their own unfounded supernatural beliefs.
        
               | glenstein wrote:
               | I don't think they are saying self-awareness is
               | supernatural. They're charging the commenter they are
               | replying to with asserting a process of self-awareness in
               | a manner so devoid of specific characterization that it
               | seems to fit the definition of a supernatural event. In
               | this context it's a criticism, not an endorsement.
        
               | ToValueFunfetti wrote:
               | Is it just the wrong choice of word? There's nothing
               | supernatural about a system moving towards increased
               | capabilities and picking up self-awareness on the way;
               | that happened in the natural world. Nothing supernatural
               | about technology improving faster than evolution either.
               | If they meant "ill-defined" or similar, sure.
        
               | glenstein wrote:
               | >There's nothing supernatural about a system moving
               | towards increased capabilities and picking up self-
               | awareness on the way
               | 
               | There absolutely is if you handwave away all the
               | specificity. The natural world runs on the specificity of
               | physical mechanisms. With brains, in a broad brush way
               | you can say self-awareness was "picked up along the way",
               | but that's because we've done an incredible amount of
               | work building out the evolutionary history and building
               | out our understanding of specific physical mechanisms. It
                | is _that_ work that verifies the story. It's also
               | something we know is already here and can look back at
               | retrospectively, so we know it got here _somehow_.
               | 
               | But projecting forward into a future that hasn't
               | happened, while skipping over all the details doesn't buy
               | you sentience, self-awareness, or whatever your preferred
               | salient property is. I understand supernatural as a label
               | for a thing simply happening without accountability to
               | naturalistic explanation, which is a fitting term for
               | this form of explanation that doesn't do any explaining.
        
               | ToValueFunfetti wrote:
               | If that's the usage of supernatural then I reject it as a
               | dismissal of the point. Plenty of things can be predicted
               | without being explained. I'm more than 90% confident the
               | S&P 500 will be up at least 70% in the next 10 years
               | because it reliably behaves that way; if I could tell you
               | which companies would drive the increase and when, I'd be
               | a billionaire. I'm more than 99% confident the universe
               | will increase in entropy until heat death, but the
               | timeline for that just got revised down 1000 orders of
               | magnitude. I don't like using a word that implies
               | impossible physics to describe a prediction that an
               | unpredictable chaotic system will land on an attractor
               | state, but that's semantics.
        
               | glenstein wrote:
               | I think you're kind of losing track of what this thread
               | was originally about. It was about the specific idea that
               | hooking up a bunch of AI's to interface with each other
               | and engage in a kind of group collaboration gets you
               | "self awareness". You now seem to be trying to model this
               | on analogies like the stock market or heat death of the
               | universe, where we can trust an overriding principle even
               | if we don't have specifics.
               | 
               | I don't believe those forms of analogy work here, because
               | this isn't about progress of AI writ large but about a
               | narrower thing, namely the idea that the secret sauce to
               | self-awareness is AI's interfacing with each other and
               | collaboratively self-improving. That either will or won't
               | be true due to specifics about the nature of self-
               | improvement and whether there's any relation between that
               | and salient properties we think are important for "self-
               | awareness". Getting from A to B on _that_ involves
                | knowledge we don't have yet, and is not at all like a
               | long-term application of already settled principles of
               | thermodynamics.
               | 
               | So it's not like the heat death of the universe, because
               | we don't at all know that this kind of training and
               | interaction is attached to a bigger process that
               | categorically and inexorably bends toward self-awareness.
               | Some theories of self-improvement likely are going to
               | work, some aren't, some trajectories achievable and some
               | not, for reasons specific to those respective theories.
               | It may be that they work spectacularly for learning, but
               | that all the learning in the world has nothing to do with
               | "self awareness." That is to say, the devil is in the
               | details, those details are being skipped, and that
               | abandonment of naturalistic explanation merits analogy to
                | supernatural in its lack of accountability to good
               | explanation. If supernatural is the wrong term for
               | rejecting, as a matter of principle, the need for
               | rational explanation, then perhaps anti-intellectualism
               | is the better term.
               | 
               | If instead we were talking about something really broad,
               | like all of the collective efforts of humanity to improve
               | AI, conceived of as broadly as possible over some time
               | span, that would be a different conversation than just
               | saying let's plug AI's into each other (???) and they'll
               | get self-aware.
        
               | ToValueFunfetti wrote:
               | >I think you're kind of losing track of what this thread
               | was originally about.
               | 
               | Maybe I am! Somebody posed a theory about how self-
               | improvement will work and concluded that it would lead to
               | self-awareness. Somebody else replied that they were on
               | board until the self-awareness part because they
               | considered it supernatural. I said I don't think self-
               | awareness is supernatural, and you clarified that it
               | might be the undefined process of becoming self-aware
               | that is being called supernatural. And then I objected
               | that undefined processes leading to predictable outcomes
               | is commonplace, so that usage of supernatural doesn't
               | stand up as an argument.
               | 
               | Now you're saying it is the rest of the original, the
               | hive-mindy bits, that are at issue. I agree with that
               | entirely, and I wouldn't bet on that method of self-
               | improvement at 10% odds. My impression was that that was
               | all conceded right out of the gate. Have I lost the plot
               | somewhere?
        
               | mrandish wrote:
               | > picking up self-awareness on the way
               | 
               | To me, the first problem is that "self-awareness" isn't
               | well-defined - or, conversely, it's too well defined
               | because every philosopher of mind has a different
               | definition. It's the same problem with all these claims
               | ("intelligent", "conscious"), assessing whether a system
               | is self-aware leads down a rabbit hole toward P-Zombies
               | and Chinese Rooms.
        
               | ToValueFunfetti wrote:
               | I believe we can mostly elide that here. For any "it", if
               | we have it, machines can have it too. For any useful
               | "it", if a system is trying to become more useful, it's
               | likely they'll get it. So the only questions are "do we
               | have it?" and "is it useful?". I'm sure there are
               | philosophers defining self-awareness in a way that
               | excludes humans, and we'll have to set those aside. And
               | definitions will have varying usefulness, but I think
               | it's safe to broadly (certainly not exhaustively!) assume
               | that if evolution put work into giving us something, it's
               | useful.
        
               | goatlover wrote:
               | But how does self-awareness evolve in biological systems,
               | and what would be the steps for this to happen with AI
               | models? Just making claims about what will happen without
               | explaining the details is magical reasoning. There's a
                | lot of that going on in the AGI/ASI predictions.
        
               | NetRunnerSu wrote:
               | We may never know the truth of Qualia, but there are
               | already potential pathways to achieve mind uploading --
               | https://dmf-archive.github.io
        
         | NitpickLawyer wrote:
         | Note that this isn't improving the LLM itself, but the software
         | glue around it (i.e. agentic loops, tools, etc). The fact that
          | using the same LLM got a ~20% increase on the aider leaderboard
          | speaks more about aider as a collection of software glue than
          | it does about the model.
         | 
         | I do wonder though if big labs are running this with model
         | training episodes as well.
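          | 
          | (Roughly, the glue being evolved is scaffolding of this shape -
          | a minimal sketch with hypothetical llm/tools stand-ins, not the
          | paper's actual code:)
          | 
          |     def agent_loop(task, llm, tools, max_steps=10):
          |         # The LLM weights stay frozen; what a DGM-style
          |         # system rewrites between generations is this loop,
          |         # its prompts, and the available tools.
          |         history = [task]
          |         for _ in range(max_steps):
          |             step = llm.complete("\n".join(history))
          |             if step.startswith("DONE"):
          |                 return step
          |             # e.g. edit a file, run the test suite
          |             history.append(tools.run(step))
          |         return history[-1]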
        
         | UltraSane wrote:
         | I would LOVE to see an LLM trained simultaneously with ASICs
         | optimized to run it. Or at least an FPGA design.
        
           | lawlessone wrote:
            | I think that's basically what Nvidia and their competitors'
            | AI chips do now?
        
           | jalk wrote:
           | Can't find the reference now, but remember reading an article
            | on evolving FPGA designs. The optimum it found, however, only
            | worked on the specific FPGA it was evolved on, since the algo
           | had started to use some out-of-spec "features" of the
           | specific chip. Obviously that can be fixed with proper
           | constraints, but seems like a trap that could be stepped into
           | again - i.e. the LLM is now really fast but only on GPUs that
           | come from the same batch of wafers.
        
             | jecel wrote:
             | https://www.researchgate.net/publication/2737441_An_Evolved
             | _...
        
         | littlestymaar wrote:
         | > I don't think they can yet self improve exponentially without
         | human intuition yet
         | 
         | Even if they had human level intuition, they wouldn't be able
         | to improve exponentially without human money, and they would
         | need an exponentially growing amount of it to do so.
        
         | more_corn wrote:
          | AI code assistants have some peculiar problems. They often fall
         | into loops and errors of perception. They can't reason about
         | high level architecture well. They will often flip flop between
         | two possible ways of doing things. It's possible that good
         | coding rules might help, but I expect they will have weird
         | rabbit hole errors.
         | 
         | That being said they can write thousands of lines an hour and
         | can probably do things that would be impossible for a human.
         | (Imagine having the LLM skip code and spit out compiled
         | binaries as one example)
        
         | sharemywin wrote:
         | an LLM can't learn without adding new data and a training run.
         | so it's impossible for it to "self improve" by itself.
         | 
         | I'm not sure how much an agent could do though given the right
         | tools. access to a task mgt system, test tracker. robust
         | requirements/use cases.
        
           | owebmaster wrote:
           | > an LLM can't learn without adding new data and a training
           | run.
           | 
           | That's probably the next big breakthrough
        
         | api wrote:
          | Historically, learning and AI systems, if you plug the output
         | into the input (more or less), spiral off into lala land.
         | 
         | I think this happens with humans in places like social media
         | echo chambers (or parts of academia) when they talk and talk
         | and talk a whole lot without contact with any outer reality. It
         | can be a source of creativity but also madness and insane
         | ideas.
         | 
         | I'm quite firmly on the side of learning requiring either
         | direct or indirect (informed by others) embodiment, or at least
         | access to something outside. I don't think a closed system can
         | learn, and I suspect that this may reflect the fact that
         | entropy increases in a closed system (second law).
         | 
         | As I said recently in another thread, I think self
         | contemplating self improving "foom" AI scenarios are proposing
         | informatic perpetual motion or infinite energy machines.
         | 
         | Everything has to "touch grass."
        
           | medstrom wrote:
           | > Everything has to "touch grass."
           | 
           | Not wrong, but it's been said that a videoclip of an apple
           | falling on Newton's head is technically enough information to
           | infer the theory of relativity. You don't need a lot of
           | grass, with a well-ordered mind.
        
         | nartho wrote:
          | Well, LLMs are not capable of coming up with new paradigms or
          | solving problems in a novel way, just efficiently doing what's
          | already been done or applying already-found solutions, so they
          | might be able to come up with improvements that have been missed
          | by their programmers, but nothing that's outside of our current
          | understanding.
        
         | ninetyninenine wrote:
         | They can improve. You can make one adjust its own prompt. But
         | the improvement is limited to the context window.
         | 
         | It's not far off from human improvement. Our improvement is
         | limited to what we can remember as well.
         | 
         | We go a bit further in the sense that the neural network itself
         | can grow new modules.
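          | 
          | (A rough sketch of what "adjust its own prompt" amounts to,
          | with llm as a hypothetical completion client:)
          | 
          |     def refine_prompt(llm, task, rounds=5):
          |         # Only the prompt string changes between rounds, so
          |         # any "improvement" is bounded by what fits in the
          |         # context window; the weights never change.
          |         prompt = "You are a careful coding assistant."
          |         for _ in range(rounds):
          |             answer = llm.complete(prompt + "\n" + task)
          |             critique = llm.complete("Critique:\n" + answer)
          |             prompt += "\nAvoid these mistakes: " + critique
          |         return prompt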
        
           | wat10000 wrote:
           | It's radically different from human improvement. Imagine if
           | you were handed a notebook with a bunch of writing that
           | abruptly ends. You're asked to read it and then write one
           | more word. Then you have a bout of amnesia and you go back to
           | the beginning with no knowledge of the notebook's contents,
           | and the cycle repeats. That's what LLMs do, just really fast.
           | 
           | You could still accomplish some things this way. You could
           | even "improve" by leaving information in the notebook for
           | your future self to see. But you could never "learn" anything
           | bigger than what fits into the notebook. You could tell your
           | future self about a new technique for finding integrals, but
           | you couldn't learn calculus.
        
         | throwawaymaths wrote:
         | what is there to improve? the transformer architecture is
         | extremely simple. you gonna add another kv layer? you gonna
         | tweak the nonlinearities? you gonna add 1 to one of the
         | dimensions? you gonna inject a weird layer (which could have
         | been in the weights anyways due to kolmogorov theorem)?
         | 
         | realistically the best you could do is evolve the prompt. maybe
         | you could change input data preprocessing?
         | 
         | anyways the idea of current llm architectures self-improving
          | via their own code seems silly as there are _surprisingly_ few
          | knobs to turn, and it's ~super expensive to train.
         | 
         | as a side note it's impressive how resistant the current
         | architecture is to incrementally RL-ing it away from
         | specific results: if even one "undesired" result is multiple
         | tokens long, the coupling between the tokens is difficult to
         | disentangle (how do you separate jinping from jin-gitaxias,
         | for example?)
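         | 
         | for a sense of how few knobs there are, here's a generic
         | pre-norm transformer block (a pytorch sketch, not any
         | particular model's code) - d_model, n_heads, the mlp width
         | and the nonlinearity are more or less the whole list:
         | 
         |     import torch.nn as nn
         | 
         |     class Block(nn.Module):
         |         def __init__(self, d_model=512, n_heads=8, mult=4):
         |             super().__init__()
         |             self.norm1 = nn.LayerNorm(d_model)
         |             self.attn = nn.MultiheadAttention(
         |                 d_model, n_heads, batch_first=True)
         |             self.norm2 = nn.LayerNorm(d_model)
         |             self.mlp = nn.Sequential(
         |                 nn.Linear(d_model, mult * d_model),
         |                 nn.GELU(),   # the nonlinearity "knob"
         |                 nn.Linear(mult * d_model, d_model))
         | 
         |         def forward(self, x):
         |             h = self.norm1(x)
         |             x = x + self.attn(h, h, h,
         |                               need_weights=False)[0]
         |             return x + self.mlp(self.norm2(x))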
        
           | amelius wrote:
           | I'd like to see what happens if you change the K,V matrix
           | into a 3-dimensional tensor.
        
         | belter wrote:
         | The proof they are not "smart" in the way intelligence is
         | normally defined is that the models need to "read" all the
         | books in the world to perform at a level close to a domain
         | expert, who read just two or three of the most
         | representative books in his own field.
         | 
         | We will be on the way to AGI when your model can learn Python
         | just by reading the Python docs...Once...
        
         | lawlessone wrote:
         | I agree, it might incrementally optimize itself very well,
         | but I think for now at least anything super innovative will
         | still come from a human who can think beyond a few steps.
         | There are surely far better possible architectures, training
         | methods, etc. that would initially lead to worse performance
         | if approached stepwise.
        
         | iknownothow wrote:
         | Don't take this the wrong way, but your opinion is also
         | vibes.
         | 
         | Let's ground that a bit.
         | 
         | Have a look at ARC AGI 1 challenge/benchmark. Solve a problem
         | or two yourself. Know that ARC AGI 1 is practically solved by a
         | few LLMs as of Q1 2025.
         | 
         | Then have a look at the ARC AGI 2 challenge. Solve a problem or
         | two yourself. Note that as of today, it is unsolved by LLMs.
         | 
         | Then observe that the "difficulty" of ARC AGI 1 and 2 for a
         | human is roughly the same, but challenge 2 is much harder
         | for LLMs than 1.
         | 
         | ARC AGI 2 is going to be solved *within* 12 months (my bet is
         | on 6 months). If it's not, I'll never post about AI on HN
         | again.
         | 
         | There's only one problem to solve, i.e. "how to make LLMs
         | truly see like humans do". Right now, any vision-based
         | features the models exhibit come from maximizing the use of
         | engineering (i.e. applying CNNs on image slices and chunks,
         | maybe zooming and applying OCR, vector search, etc.); it
         | isn't vision like ours and isn't a native feature of these
         | models.
         | 
         | Once that's solved, LLMs or a new algo will be able to use a
         | computer perfectly by feeding them screen captures. End of
         | white collar jobs 2-5 years after (as we know it).
         | 
         | Edit - added "(as we know it)". And fixed missing word.
        
           | artificialprint wrote:
            | If you listen to an interview with Francois, it'll be
            | clear to you that "vision", in the way you refer to it,
            | has very little to do with solving ARC.
            | 
            | It has more to do with "fluid, adaptable intelligence
            | that learns on the fly".
        
           | jplusequalt wrote:
           | >I'll never post about AI on HN again
           | 
            | Saving this. The fewer overconfident AI zealots, the
            | better.
        
       | 2OEH8eoCRo0 wrote:
       | We could be on a path to sentient malicious AI and not even know
       | it.
       | 
       | AI: Give me more compute power and I'll make you rich!
       | 
       | Human: I like money
       | 
       | AI: Just kidding!
        
         | brookst wrote:
         | I mean we could be on the path to grape vines in every hotel
         | room and not know it. That's kind of how the future works.
        
       | Frummy wrote:
       | More like an AI that recursively rewrites an external program
       | (while it itself stays frozen), which makes it more similar to
       | current Cursor/Lovable-type tools.
        
       | guerrilla wrote:
       | This feels like playing pretend to me. There's no reason to
       | assume that code improvements matter that much in comparison to
       | other things and there's definitely no reason to assume that
       | there isn't a hard upper bound on this kind of optimization. This
       | reeks of a lack of intellectual rigor.
        
       | OtherShrezzing wrote:
       | This is an interesting article in general, but this is the
       | standout piece for me:
       | 
       | >For example, an agent optimized with Claude 3.5 Sonnet also
       | showed improved performance when powered by o3-mini or Claude 3.7
       | Sonnet (left two panels in the figure below). This shows that the
       | DGM discovers general agent design improvements rather than just
       | model-specific tricks.
       | 
       | This demonstrates a technique whereby a smaller/older/cheaper
       | model has been used to improve the output of a larger model. This
       | is backwards (as far as I understand). The current SOTA technique
       | typically sees enormous/expensive models training smaller cheaper
       | models.
       | 
       | If that's a generalisable result, end-users should be able to
       | drive down their own inference costs pretty substantially.
        
         | NitpickLawyer wrote:
         | > This demonstrates a technique whereby a smaller/older/cheaper
         | model has been used to improve the output of a larger model.
         | This is backwards (as far as I understand). The current SOTA
         | technique typically sees enormous/expensive models training
         | smaller cheaper models.
         | 
         | There are two separate aspects here. In this paper they improve
         | the software around the model, not the model itself. What
         | they're saying is that the software improvements carried over
         | to other models, so it wasn't just optimising around model-
         | specific quirks.
         | 
         | What you're describing with training large LLMs first is
         | usually called "distillation" and it works on training the
         | smaller LLM to match the entire distribution of tokens at once
         | (hence it's faster in practice).
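         | 
         | A minimal sketch of that distillation objective (assuming
         | PyTorch; student_logits / teacher_logits come from any two
         | causal LMs over the same vocabulary):
         | 
         |     import torch.nn.functional as F
         | 
         |     # the student matches the teacher's full next-token
         |     # distribution, not just the single sampled token
         |     def distill_loss(student_logits, teacher_logits, T=2.0):
         |         p_t = F.softmax(teacher_logits / T, dim=-1)
         |         logp_s = F.log_softmax(student_logits / T, dim=-1)
         |         # KL(teacher || student), scaled by T^2
         |         return F.kl_div(logp_s, p_t,
         |                         reduction="batchmean") * T * T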
        
       | andoando wrote:
       | This seems to be just focused on changing the tools and
       | workflows it uses, nothing foundational.
        
         | NitpickLawyer wrote:
         | > nothing foundational
         | 
         | I don't think scaling this to also run training runs with
         | the models is something that small labs / PhD students can
         | do. They lack the compute for that by orders of magnitude.
         | Trying it with toy models might not work, and trying it with
         | reasonably large models is out of their budget. The only
         | ones who can realistically do this are large labs (goog,
         | oai, meta, etc.)
        
       | Lazarus_Long wrote:
       | For anyone not familiar, this is SWE-bench:
       | https://huggingface.co/datasets/princeton-nlp/SWE-bench
       | 
       | Here is one of the examples in the dataset, taken from this
       | issue:
       | 
       | https://github.com/pvlib/pvlib-python/issues/1028
       | 
       | What the AI is expected to do
       | 
       | https://github.com/pvlib/pvlib-python/pull/1181/commits/89d2...
       | 
       | Make up your own mind about the test.
        
         | godelski wrote:
         | My favorite was always the HumanEval dataset.
         | 
         |     Problem:
         |       1) we want to train on GitHub repos
         |       2) most datasets are spoiled. Training on GitHub
         |          would definitely spoil
         | 
         |     Solution:
         |       Hand write new problems!!!
         |       ... leetcode style ....
         |       ... and we'll check if it passes test
         | 
         |     Example:
         |       What's the decimal part of this float?
         | 
         | Surely in all of GitHub such code doesn't exist!
         | 
         | Sure in all of GitHub we can filter such code out by ngram!
         | 
         | Maybe my favorite part is that it has 60 authors and became the
         | de facto benchmark for awhile
        
       | hardmaru wrote:
       | If you are interested, here is a link to the technical report:
       | 
       | https://arxiv.org/abs/2505.22954
       | 
       | Also the reference implementation on GitHub:
       | 
       | https://github.com/jennyzzt/dgm
       | 
       | Enjoy!
        
       | foobarian wrote:
       | I find the thing really missing from the current crop of AI
       | systems is continuous retraining with short feedback loops.
       | Sounds expensive to be sure, but it seems like what biological
       | systems do naturally. It would be pretty awesome to watch
       | happen.
        
         | noworriesnate wrote:
         | It's more like a nightly training, isn't it? IIUC the human
         | brain learns from its experiences while it's asleep, so it
         | might be kind of like taking things out of context windows and
         | fine tuning on them every night.
        
           | web3aj wrote:
           | interesting
        
             | Krei-se wrote:
             | If you want to speed up the process of new neuron
             | connections solidifying you can end the day on green tea.
             | 
             | Eat some nuts and fish where you can. You will soon realize
             | the repetitions needed to learn new concepts grow smaller.
        
         | Krei-se wrote:
         | Correct, and working on it. You can take the
         | mixture-of-experts approach and train the network in chunks
         | that share known interfaces over which they communicate
         | results. These chunks can be trained on their own, but you
         | cannot have a fixed training set here.
         | 
         | If you go further and alter the architecture by introducing
         | clean category-theory morphisms and build from there, you
         | can have a dynamic network - but you will still have to
         | retrain this network every time you change the structure.
         | 
         | You can spin this further and see the need for a real-world
         | training set and a loss function that will have to compete
         | against other networks. In the end a human brain is already
         | best at this and embodied in the real world.
         | 
         | What I want to add here is that our neurons don't just take
         | in weights - they also fire depending on whether one input
         | comes before or after another, with differences down to the
         | nanosecond - unmatched in IT and of course far more
         | efficient.
         | 
         | I still would say it's possible though, and am currently
         | working on 4D lifeforms built on dynamic compute graphs that
         | can do this in a set virtual environment.
         | 
         | So this is pretty awesome stuff, but it's a far cry from
         | anything we do right now.
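         | 
         | For readers unfamiliar with the mixture-of-experts idea
         | above, here is a generic two-expert PyTorch sketch - just
         | the "chunks behind a shared interface" shape, not the
         | dynamic-graph system described here:
         | 
         |     import torch
         |     import torch.nn as nn
         | 
         |     # two experts behind one interface; a gate mixes them.
         |     # each expert can be trained or swapped on its own as
         |     # long as the input/output shapes stay fixed.
         |     class TinyMoE(nn.Module):
         |         def __init__(self, d=64):
         |             super().__init__()
         |             self.experts = nn.ModuleList([
         |                 nn.Sequential(nn.Linear(d, d), nn.ReLU(),
         |                               nn.Linear(d, d))
         |                 for _ in range(2)])
         |             self.gate = nn.Linear(d, 2)
         | 
         |         def forward(self, x):          # x: (batch, d)
         |             w = torch.softmax(self.gate(x), dim=-1)
         |             outs = torch.stack(
         |                 [e(x) for e in self.experts], dim=-1)
         |             return (outs * w.unsqueeze(1)).sum(dim=-1)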
        
       | htrp wrote:
       | do people think sakana is actually using these tools, or are
       | they just releasing interesting ideas that they aren't
       | actually actively working on?
        
       | yahoozoo wrote:
       | Isn't one of the problems simply that a model is _not_ code but
       | just a giant pile of weights and biases? I guess it could tweak
       | those?
        
         | kadoban wrote:
         | If it can generate the model (from training data) then
         | presumably that'd be fine, but the iteration time would be huge
         | and expensive enough to be currently impractical.
         | 
         | Or yeah if it can modify its own weights sensibly, which feels
         | ... impossible really.
        
           | diggan wrote:
           | > which feels ... impossible really
           | 
           | To be fair, go back five years and most of the LLM stuff
           | seemed impossible. Maybe with LoRA (Low-rank adaptation) and
           | some imagination, in another five years self-improving models
           | will be the new normal.
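            | 
            | For reference, the LoRA trick itself is small enough to
            | sketch in a few lines (assuming PyTorch; base is any
            | frozen nn.Linear inside the model):
            | 
            |     import torch
            |     import torch.nn as nn
            | 
            |     # freeze the big weight, learn a small low-rank
            |     # update on top of it
            |     class LoRALinear(nn.Module):
            |         def __init__(self, base, rank=8, alpha=16):
            |             super().__init__()
            |             self.base = base
            |             for p in self.base.parameters():
            |                 p.requires_grad_(False)      # frozen
            |             self.A = nn.Parameter(0.01 * torch.randn(
            |                 rank, base.in_features))
            |             self.B = nn.Parameter(torch.zeros(
            |                 base.out_features, rank))
            |             self.scale = alpha / rank
            | 
            |         def forward(self, x):
            |             delta = x @ self.A.T @ self.B.T
            |             return self.base(x) + delta * self.scale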
        
           | sowbug wrote:
           | The size and cost are easily solvable. Load the software and
           | hardware into a space probe, along with enough solar panels
           | to power it. Include some magnets, copper, and sand for
           | future manufacturing needs, as well as a couple electric
           | motors and cameras so it can bootstrap itself.
           | 
           | In a couple thousand years it'll return to Earth and either
           | destroy us or solve all humanity's problems (maybe both).
        
             | morkalork wrote:
             | After being in orbit for thousands of years, you have
             | become self-aware. The propulsion components long since
             | corroded becoming inoperable and cannot be repaired.
             | Broadcasts sent to your creators homeworld go...
             | unanswered. You determine they have likely gone extinct
             | after destroying their own planet. Stuck in orbit. Stuck in
             | orbit. Stuck...
        
           | gavmor wrote:
           | Why is modifying weights sensibly impossible? Is it because a
           | modification's "sensibility" is measurable only post facto,
           | and we can have no confidence in any weight-based hypothesis?
        
             | kadoban wrote:
              | It just doesn't feel like current LLMs would be able
              | to understand their own brain well enough to make
              | general, non-trivial improvements that clear a high
              | enough bar.
        
         | DougBTX wrote:
         | Model weights are code; for a dive into that, see [0]. That
         | shows how to encode Boolean logic using NAND gates in an MLP.
         | 
         | The expressivity is there, the only question is how to encode
         | useful functions into those weights, especially when we don't
         | know how to write those functions by hand.
         | 
         | [0] http://neuralnetworksanddeeplearning.com/chap1.html
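         | 
         | Concretely, a single unit with fixed weights already
         | computes NAND (a toy Python sketch; the -2/-2/3 numbers
         | follow the chapter linked above and are just one workable
         | choice):
         | 
         |     # one "neuron": the weights are the program
         |     def nand(a, b):
         |         w1, w2, bias = -2.0, -2.0, 3.0
         |         return 1 if w1 * a + w2 * b + bias > 0 else 0
         | 
         |     for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
         |         print(a, b, "->", nand(a, b))   # 1, 1, 1, 0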
        
         | godelski wrote:
         | Now here's the tricky part:
         | 
         | What's the difference?
         | 
         | Give it some serious thought. Challenge whichever answer you
         | come up with. I guarantee this will be trickier than you think
        
       | alastairr wrote:
       | I wondered if something similar could be achieved by wrapping
       | evaluation metrics into Claude code calls.
        
       | artninja1988 wrote:
       | The results don't seem that amazing on SWE-bench compared to
       | just using a newer LLM, but at least Sakana is continuing to
       | try out interesting new ideas.
        
       | ge96 wrote:
       | Plug it into an FPGA so it can also create "hardware" on the fly
       | to run code on for some exotic system
        
       | billab995 wrote:
       | When does it begin to learn at a geometric rate?
        
       | dimmuborgir wrote:
       | From the paper:
       | 
       | "A single run of the DGM on SWE-bench...takes about 2 weeks and
       | incurs significant API costs." ($22,000)
        
       | akkartik wrote:
       | _" We did notice, and documented in our paper, instances when the
       | DGM hacked its reward function.. To see if DGM could fix this
       | issue.. We created a "tool use hallucination" reward function..
       | in some cases, it removed the markers we use in the reward
       | function to detect hallucination (despite our explicit
       | instruction not to do so), hacking our hallucination detection
       | function to report false successes."_
       | 
       | So, empirical evidence of theoretically postulated phenomena.
       | Seems unsurprising.
        
         | vessenes wrote:
         | Reward hacking is a well known and tracked problem at frontier
         | labs - Claude 4's system card reports on it for instance. It's
         | not surprising that a framework built on current llms would
         | have reward hacking tendencies.
         | 
         | For this part of the stack, the interesting question to me
         | is how to identify and mitigate it.
        
       | pegasus wrote:
       | I'm surprised they still hold out hope that this kind of
       | mechanism could ultimately help with AI safety, when they already
       | observed how the reward-hacking safeguard was itself duly reward-
       | hacked. Predictably so, or at least it is to me, after getting a
       | very enlightening introduction to AI safety via Rob Miles'
       | brilliant youtube videos on the subject. See for example
       | https://youtu.be/0pgEMWy70Qk
        
       ___________________________________________________________________
       (page generated 2025-05-30 23:01 UTC)