[HN Gopher] Further human + AI + proof assistant work on Knuth's...
___________________________________________________________________
Further human + AI + proof assistant work on Knuth's "Claude
Cycles" problem
_Knuth Claude 's Cycles note update: problem now fully solved, by
LLMs_ - https://news.ycombinator.com/item?id=47306926 - March 2026
(2 comments)
https://chatgpt.com/share/69aaab4b-888c-8003-9a02-d1df80f9c7...
_Claude 's Cycles [pdf]_ -
https://news.ycombinator.com/item?id=47230710 - March 2026 (362
comments)
Author : mean_mistreater
Score : 118 points
Date : 2026-03-28 18:38 UTC (4 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| adrithmetiqa wrote:
| Super interesting but what does this mean for us mere mortals?
| brcmthrowaway wrote:
| Learn plumbing
| NitpickLawyer wrote:
| I know your reply was half joking, so please take this the
| same way, but ... are you sure about that?
| https://www.youtube.com/watch?v=p1ip68Vv7NE
| piloto_ciego wrote:
| This is truly amazing. Do people not really realize how
| amazing stuff like this is? I feel like I'm taking crazy
| pills here, but man, it certainly feels like we're on the
| edge of something quite amazing...
| siva7 wrote:
| Autonomous robots murdering humans in warfare? That's at
| least the sense i got from reading this news site the
| past few days
| dakolli wrote:
| AI isn't replacing anything, get over yourself.
| brcmthrowaway wrote:
| Arent you using Claude?
| oytis wrote:
| There is no reason why market for plumbing will get much
| larger than it is now (which is not too large)
| Hasslequest wrote:
| Surely AI has to take a shit eventually. What's all this
| racket about water usage?
| incognito124 wrote:
| Where I live it's bathroom and kitchen tiling
| radu_floricica wrote:
| This is kindof the opposite? Man + AI > either man or AI. I'd
| say "learn to work with Claude" is the better lesson here.
| zoogeny wrote:
| For now. The term people use is "centaur", like the half-
| man-half-horse of mythology.
|
| The AI CEO's are pointing out that when chess was "solved",
| in that Kasparov was famously beaten by deep blue, there
| was a window of time after that event where grandmasters +
| computers were the strongest players. The
| knowledge/experience of a grandmaster paired with the
| search/scoring of the engines was an unbeatable pair.
|
| However, that was just a window in time. Eventually engines
| alone were capable of beating grandmaster + engine pairs.
| Think about that carefully. It implies something. The human
| involvement eventually became an impediment.
|
| Whether you believe this will transfer to other domains is
| up to you to decide.
| dataviz1000 wrote:
| I got Claude to self reference and update its own instructions
| to solve making a typed proxy API of any website. After a week,
| scores of iterations, it can reverse engineer any website. The
| first few days I had to be deeply involved with each iteration
| loop. Domain knowledge is helpful. Each time I saw a problem I
| would ask Claude to update its instructions so it doesn't
| happen again. Then less and less. Eventually it got to the
| point it was updating and improving the metrics every iteration
| unsupervised.
|
| Edit: This is going to have huge ramifications for the tech
| security industry as these systems will be able to break
| security systems as easily it solved the proof. The sooner the
| good guys, if there are any left, understand this the better it
| will be for everybody.
|
| > Super interesting but what does this mean for us mere
| mortals?
|
| I would go for a 2 or 3 hour walk with my phone using the
| remote control feature looking every 5 - 10 minutes to make
| sure it doesn't need human help. I went to the coffeeshop and
| drank very good coffee listening to music. Then at night I sat
| and had a beer thinking about T.S. Eliot's 'The Wasteland', the
| effect of industrialization in England at that time and his
| views of how ennui affected the aristocracy.
| DrewADesign wrote:
| > I went to the coffeeshop and drank very good coffee
| listening to music. Then at night I sat and had a beer
| thinking about T.S. Eliot's 'The Wasteland', the effect of
| industrialization in England at that time and his views of
| how ennui affected the aristocracy.
|
| Well, for those among us that are not aristocracy already,
| except for the vanishingly small number of people required to
| oversee such processes, we're probably the closest we're
| going to get to it. If they don't need people to do the tech
| labor, we've got way more people than we need, so that's a
| huge oversupply of tech skills, which means tech skills are
| rapidly becoming worthless. Glad to see how fast we're moving
| in our very own race to the bottom!
| drfloyd51 wrote:
| I kind of feel like software engineers working on improving
| AI are traitors working against other SE's trying to make a
| living.
|
| However...
|
| I have to acknowledge my craft of SE has been putting
| people out of work for decades. I myself came up with
| business process improvement that directly let the company
| release about 20 people. I did this twice.
|
| So... fair play.
| mannanj wrote:
| Aren't the true traitors still the ones paying the SE to
| do that work? The managerial slave-master class?
| drfloyd51 wrote:
| You always have a choice to make. You make it everyday.
| Get up. Go to a legitimate job. Work.
|
| You probably choose not to steal, rob, impersonate
| someone else, or generally make money illegally.
|
| It can be traitors all the way down.
| marsten wrote:
| In the grand scheme it's good to invent things that
| replace human labor. It frees up people to do more
| interesting things. The goal should be to put everyone
| out of a job.
| pixl97 wrote:
| >It frees up people to do more interesting things
|
| Like beg on the corners and starve in the street? Trying
| to figure out how the basics of capitalism where labor is
| exchanged for money is not going to work well when the
| only jobs left are side gigs. Something will have to
| change and a lot of People will fight said change.
| DowsingSpoon wrote:
| I've thought about this myself. Couple of points:
|
| 1) It's not my job to fix all the problems of Capitalism.
| It's painful to try to fight the system without
| collective action. My family and I have to eat too.
|
| 2) We have had a solution all along for the particular
| problem of AI putting devs out of work. It's called
| professional licensure, and you can see it in action in
| engineering and medical fields. Professional Software
| Engineers would assume a certain amount of liability and
| responsibility for the software they develop. That's
| regardless of whether they develop it with LLM tools or
| something else.
|
| For example, you let your tools write slop that you ship
| without even looking? And it goes on to wreak havoc?
| That's professional malpractice. Bad engineer.
|
| If we do this then Software Engineers become the
| responsible humans in the loop of so-called "AI" systems.
| drfloyd51 wrote:
| It's not your job to fix capitalism. But it is your job
| to evaluate if your money making skill comes at too high
| a price for others.
|
| Say you found a job shooting people in the head for
| money. Like if you work for ICE or something...
|
| You need to feed your family. Is this job ok? You may
| decide yes. I decided no. I will find another way to feed
| my family.
|
| You don't get to escape consequences because you are a
| small cog in a large system.
|
| In the bigger picture, automation should free people from
| labor. But that requires some very greedy people to relax
| their grip ever so slightly. I imagine they see
| automation as a way to reduce reliance on labor, and if
| they don't need labor, they don't need people. So let
| them starve and stop having kids.
| slopinthebag wrote:
| We will come up with new jobs, like we have for all of
| human history. I think even in an abundance utopia people
| will still work - we need purpose to sustain our
| existence.
|
| The work will become even more fulfilling however.
| amelius wrote:
| > The goal should be to put everyone out of a job.
|
| Yeah, but why does it need to take the fun jobs first,
| like painting, writing poems, coding, making music, ...
|
| I want the AI to cook, do the dishes, take out the trash,
| etc.
| psychoslave wrote:
| Lol,a race to the bottom where too many tech savvy people
| are left unemployed while a few "privileged" get a
| decreasing buying power to maintain security of the digital
| tools that keep the whole digital dependent civilizations
| afloat?
|
| Sounds like a great starting plot for an interesting story.
| frizlab wrote:
| > I would go for a 2 or 3 hour walk with my phone using the
| remote control feature looking every 5 - 10 minutes to make
| sure it doesn't need human help.
|
| That is a nightmarish scenario tbh
| falcor84 wrote:
| Nightmarish?! In comparison to the average person's actual
| job? I'm pretty sure that many people out there would sign
| up for a battle royale for a chance at such a job.
| siva7 wrote:
| Would they? I'd love to get in touch
| falcor84 wrote:
| My clients have been burned before. Once you set up the
| battle royale with a trusted third party validating that
| there'll be an assured good job at the end, I promise
| I'll have enough candidates for you to fill up the first
| 10 competitions.
| ChrisClark wrote:
| So sitting at a desk is nicer than a walk outside for you?
| Why would relaxation be a nightmare?
| frizlab wrote:
| Checking one's phone every 5 to 10 minutes is nothing but
| relaxation. One needs to have the mind at ease to relax.
| dataviz1000 wrote:
| That nightmarish scenario is what T.S. Eliot was describing
| in "The Wasteland" which "portrays deep, existential ennui
| and boredom as defining symptoms of modern life following
| World War I."
|
| Later this boredom was described by the Stones, "And though
| she's not really ill / There's a little yellow pill / She
| goes running for the shelter of a mother's little helper".
|
| It is a nightmare. Mostly what I'm thinking about while the
| agents are running is how bored I'm going to be. That is
| the joke, my deep thought on T.S. Eliot are about the
| wasteland this thing is going to create.
| troupo wrote:
| > I would go for a 2 or 3 hour walk with my phone using the
| remote control feature looking every 5 - 10 minutes
|
| 2-3 hours "walking" while having to check in every 5-10
| minutes?
|
| If I have to check in every 5-10 minutes, I won't taste
| coffee _or_ hear that there 's good music playing.
| xvector wrote:
| Just Claude code a push notification feature then
| virtue3 wrote:
| That's fucking insane. Thank you for sharing.
|
| I had a bad feeling we were basically already there.
| colechristensen wrote:
| I have similar amounts of success (pretty good!) standing in
| line at a coffee shop talking to people who work for me
| through some action that needs to be taken and doing the same
| with AI.
|
| However I do not trust AI anywhere near as much as I trust
| the humans. The AI is super capable but also occasionally a
| psychopath toddler. I sat in amused astonishment when faced
| with job 2 not running because job 1 was failing Claude went
| in to the database, changed the failure record to success,
| triggered job 2 which produced harmful garbage, and then
| claimed victory. Only the most troubled person would even
| think of doing that, but Claude thought it was the best
| solution.
| dunder_cat wrote:
| > Edit: This is going to have huge ramifications for the tech
| security industry as these systems will be able to break
| security systems as easily it solved the proof. The sooner
| the good guys, if there are any left, understand this the
| better it will be for everybody.
|
| What can the good guys do? Fire up Claude to improve their
| systems? Unless you have it working fully autonomously to
| counter-act abuse, I don't see how you can beat the "bad
| guys". There may be some industries where this is a solved
| problem (e.g. you can do all the validation server-sided,
| religiously follow best practices to prevent and mitigate
| abuse), but a lot of stuff like multiplayer video games will
| be doomed unless they move to a "you must use a locked down
| system we control" model. I honestly don't consider it
| liberating as someone that has various hobby projects, that
| now in addition to plain old DDoS I'll also have people spin
| up layer 7 attacks with just their credit card. It almost
| makes me want to give up instead of pushing forward in a
| world where the worst of the worst has access to the best of
| the best.
| ale wrote:
| This type of slop comment is somehow worse than spam.
|
| >After a week, scores of iterations, it can reverse engineer
| any website
|
| Cool, let's see the proof.
| dataviz1000 wrote:
| I posted a link but don't want to spam HN more than I have.
|
| It is proof-of-concept. Seriously burns some tokens (~80k -
| ~200k) but doesn't require AI after to scrape and automate
| a website so if all the people at Browser Use, Browser
| Base, and every one pounding every website used it, I
| think, the net benefit would be in the billions. I would
| recommend using it in isolation. Nonetheless, it works very
| very well on my machine.
|
| > This type of slop comment is somehow worse than spam.
|
| Please don't be mean.
| emp17344 wrote:
| There is no proof, just a self-congratulatory word salad
| with dubious authenticity.
|
| It's insane how insufferable this place is now.
| dataviz1000 wrote:
| Here is a description of the iteration loop. [0] I'm
| working on another draft that will be much more polished
| and have better explanations of the iteration loop.
|
| > There is no proof, just a self-congratulatory word
| salad with dubious authenticity.
|
| I worked 8 days straight on that and have been working
| non-stop on the second draft that is much cleaner and
| safer. I'm a human being. Please don't be mean. If
| humanity does come to end, it won't be because of AI, it
| will be because we can't stop being assholes to each
| other.
|
| [0]
| https://github.com/adam-s/intercept/tree/main?tab=readme-
| ov-...
| TrainedMonkey wrote:
| My understanding is that, if confirmed, this demonstrates that
| AI can find novel solutions. This is a strong counterpoint to
| generative-AI-is-strictly-limited-to-training-data.
| dijksterhuis wrote:
| https://en.wikipedia.org/wiki/AlphaFold ...
|
| we've had AlphaFold for a while. it's not a novel that we
| have ML solutions that can find, erm, novel solutions.
|
| however, by and large, most LLMs as typically used by most
| individuals aren't solving novel problems. and in those
| scenarios, we often end up with regurgitated/most
| common/lowest common denominator outputs... it's a
| probability distribution thing.
| psychoslave wrote:
| Put in the hands of great mathematicians, pencil and paper
| proved able to write proofs of open problems.
| heliumtera wrote:
| That llms in the middle of everything will continue until
| morale improve because llms can generate text on top of
| bullshit made up problems
| muskstinks wrote:
| Another signal that we still have relevant progress in ai.
|
| Also that it is now good enough to make researchers faster.
| gnarlouse wrote:
| out of curiosity, i wonder if people are taking stabs at p!=np
| vatsachak wrote:
| I've always said this but AI will win a fields medal before being
| able to manage a McDonald's.
|
| Math seems difficult to us because it's like using a hammer (the
| brain) to twist in a screw (math).
|
| LLMs are discovering a lot of new math because they are great at
| low depth high breadth situations.
|
| I predict that in the future people will ditch LLMs in favor of
| AlphaGo style RL done on Lean syntax trees. These should be able
| to think on much larger timescales.
|
| Any professional mathematician will tell you that their arsenal
| is ~ 10 tricks. If we can codify those tricks as latent vectors
| it's GG
| smokel wrote:
| I think this is mostly about existing legislature, not about
| technology.
|
| In any other context than when your paycheck depends on it, you
| would probably not be following orders from a random manager.
| If your paycheck depended on following the instructions of an
| AI robot, the world might start to look pretty scary real soon.
| vatsachak wrote:
| There's a lot to being a manager
|
| - Coherent customer interaction
|
| - Common sense judgements
|
| - Scheduling
|
| - Quality control
|
| All which are baked into humans but not so much into LLMs
|
| Even if it were legal to have an LLM as a GM, I think it
| would fair poorly
| throw3747488 wrote:
| AI actually has to follow all rules, even the bad rules. Like
| when autonomous car drives super carefully.
|
| Imagine mcdonald management would enforce dog related rules.
| No more filthy muppets! If dog harasses customers, AI would
| call cops, and sue for restraining order! If dog defecates in
| middle of restaurant, everything would get desinfected, not
| just smeared with towels!
|
| Nutters would crucify AI management!
| vatsachak wrote:
| Tricks are nothing but patterns in the logical formulae we
| reduce.
|
| Ergo these are latent vectors in our brain. We use analogies
| like geometry in order to use Algebraic Geometry to solve
| problems in Number Theory.
|
| An AI trained on Lean Syntax trees might develop it's own weird
| versions of intuition that might actually properly contain
| ours.
|
| If this sounds far fetched, look at Chess. I wonder if anyone
| has dug into StockFish using mechanistic interpretability
| slopinthebag wrote:
| Stockfish's power comes from mostly search, and the ML
| techniques it uses are mainly about better search, i.e.
| pruning branches more efficiently.
| vatsachak wrote:
| The weights must still have some understanding of the chess
| board. Though there is always the chance that it makes no
| sense to us
| slopinthebag wrote:
| Even that is probably too much. It has no understanding
| of what "chess" is, or what a chess board is, or even
| what a game is. And yet it crushes every human with ease.
| It's pretty nuts haha.
| anematode wrote:
| Actually, the neural net itself is fairly imprecise.
| Search is required for it to achieve good play. Here's an
| example of me beating Stockfish 18 at depth 1:
| https://lichess.org/XmITiqmi
| Sopel wrote:
| chess is just a simple mathematical construct so that's
| not surprising
| hollerith wrote:
| Does Stockfish have weights or use a neural net? I know
| older versions did not.
| Sopel wrote:
| yes
| emp17344 wrote:
| Why must it involve understanding? I feel like you're
| operating under the assumption that functionalism is the
| "correct" philosophical framework without considering
| alternative views.
| Sopel wrote:
| The ML techniques it uses are only about evaluation, but
| you were close
| myffical wrote:
| Some DeepMind researchers used mechanistic interpretability
| techniques to find concepts in AlphaZero and teach them to
| human chess Grandmasters:
| https://www.pnas.org/doi/10.1073/pnas.2406675122
| slopinthebag wrote:
| > AI will win a fields medal before being able to manage a
| McDonald's
|
| Of course, because it takes multi-modal intelligence to manage
| a McDonalds. I.e. it requires human intelligence.
|
| > I predict that in the future people will ditch LLMs in favor
| of AlphaGo style RL
|
| Same for coding as well. LLM's might be the interface we use
| with other forms of AI though.
| vatsachak wrote:
| Something like building Linux is more akin to managing a
| McDonald's than it is to a 10 page technical proof in
| Algebraic Groups.
|
| Programming is more multimodal than math.
|
| Something like performance engineering might be free lunch
| though
| slopinthebag wrote:
| Yeah, it's hard to compare management and programming but
| they're both multimodal in very different ways. But there's
| gonna be entire domains in which AI dominates much like
| stockfish, but stockfish isn't managing franchises and
| there is no reason to expect that anytime soon.
|
| I feel like something people miss when they talk about
| intelligence is that humans have _incredible_ breadth. This
| is really what differentiates us from artificial forms of
| intelligence as well as other animals. Plus we have agency,
| the ability to learn, the ability to critically think, from
| first principles, etc.
| vatsachak wrote:
| Exactly. It's what the execs are missing.
|
| Also animals thrive in underspecified environments, while
| AIs like very specific environments. Math is the most
| specified field there is lol
| bitwize wrote:
| But LLMs have proven themselves better at programming than
| most professional programmers.
|
| Don't argue. If you think Hackernews is a representative
| sample of the field then you haven't been in the field long
| enough.
|
| What LLMs have actually done is put the dream of software
| engineering within reach. Creativity is inimical to
| software engineering; the goal has long been to provide a
| universal set of reusable components which can then be
| adapted and integrated into any system. The hard part was
| always providing libraries of such components, and then
| integrating them. LLMs have largely solved these problems.
| Their training data contains vast amounts of solved
| programming problems, and they are able to adapt these in
| vector space to whatever the situation calls for.
|
| We are _already there_. Software engineering as it was long
| envisioned is now possible. And if you 're not doing it
| with LLMs, you're _going_ to be left behind. Multimodal
| human-level thinking need only be undertaken at the highest
| levels: deciding what to build and maybe choosing the
| components to build it. LLMs will take care of the rest.
| abcde666777 wrote:
| A bit optimistic I'd say. It's put _some_ software
| engineering within reach of _some_ people who couldn 't
| do it prior. Where 'some' might be a lot, but still far
| from all.
|
| I was thinking the other day of how things would go if
| some of my less tech savvy clients tried to vibe code the
| things I implement for them, and frankly I could only
| imagine hilarity ensuing. They wouldn't be able to steer
| it correctly at all and would inevitably get stuck.
|
| Someone needs to experiment with that actually: putting
| the full set of agentic coding tools in the hands of
| grandma and recording the outcome.
| hodgehog11 wrote:
| > Programming is more multimodal than math
|
| I have no idea how you come to this conclusion, when the
| evidence on the ground for those training models suggests
| it is precisely the opposite.
|
| We are much further along the path of writing code than
| writing new maths, since the latter often requires some
| degree of representational fluency of the world we live in
| to be relevant. For example, proving something about braid
| groups can require representation by grid diagrams, and we
| know from ARC-AGI that LLMs don't do great with this.
|
| Programming does not have this issue to the same extent;
| arguably, it involves the subset of maths that is
| exclusively problem solving using standard representations.
| The issues with programming are primarily on the difficulty
| with handling large volumes of text reliably.
| NamlchakKhandro wrote:
| I've never seen you say that
| vatsachak wrote:
| You will have to take my word that I started saying this in
| Dec 2024 lol
| madrox wrote:
| > I've always said this but AI will win a fields medal before
| being able to manage a McDonald's.
|
| I love this and have a corollary saying: the last job to be
| automated will be QA.
|
| This wave of technology has triggered more discussion about the
| types of knowledge work that exist than any other, and I think
| we will be sharper for it.
| bitwize wrote:
| The ownership class will be sharper. They will know how to
| exploit capital and turn it into more capital with vastly
| increased efficiency. Everybody else will be hosed.
| ryanar wrote:
| Are they actually producing new math? In the most recent ACM
| issue there was an article about testing AI against a math
| bench that was privately built by mathematicians, and what they
| found is that even though AI can solve some problems, it never
| truly has come up with something novel and new in mathematics,
| it is just good at drawing connections between existing
| research and putting a spin on it.
| hodgehog11 wrote:
| It's finding constructions and counterexamples. That's
| different from finding new proof techniques, but still
| extremely useful, and still gives way to novel findings.
| pks016 wrote:
| Interesting but not surprising to me. Once a field expert guides
| the models, they most likely will reach a solution. The models
| are good at lazy work for experts. For hard or complicated
| questions, many a time the models have blind spots.
| smithcoin wrote:
| When I was younger I remember a point of demarcation for me was
| learning the 4chan adage "trolls trolling trolls", and
| approaching all internet interactions with skepticism. While I
| have been sure that Reddit for a while has succumbed to being
| "dead internet". This thread is another moment for me- I can no
| longer recognize who is a bot, and who has honest intentions.
| ftchd wrote:
| https://xcancel.com/BoWang87/status/2037648937453232504
___________________________________________________________________
(page generated 2026-03-28 23:00 UTC)