[HN Gopher] AI just proved Erdos Problem #124
___________________________________________________________________
AI just proved Erdos Problem #124
https://twitter.com/vladtenev/status/1994922827208663383
Author : nl
Score : 231 points
Date : 2025-11-30 05:19 UTC (1 day ago)
(HTM) web link (www.erdosproblems.com)
(TXT) w3m dump (www.erdosproblems.com)
| ares623 wrote:
| [flagged]
| bogdan wrote:
| Seriously, blame the investors.
| wasmainiac wrote:
| Nah, everyone is to blame here in the bubble.
| dang wrote:
| " _Please don 't post shallow dismissals, especially of other
| people's work. A good critical comment teaches us something._"
|
| " _Don 't be curmudgeonly. Thoughtful criticism is fine, but
| please don't be rigidly or generically negative._"
|
| https://news.ycombinator.com/newsguidelines.html
| menaerus wrote:
| This seems to be the 2nd proof in a row from the same author
| using AI models. The first time, it was ChatGPT that wrote the
| formal Lean proof for Erdos Problem #340.
|
| https://arxiv.org/html/2510.19804v1#Thmtheorem3
|
| > In over a dozen papers, beginning in 1976 and spanning two
| decades, Paul Erdos repeatedly posed one of his "favourite"
| conjectures: every finite Sidon set can be extended to a finite
| perfect difference set. We establish that {1, 2, 4, 8, 13} is a
| counterexample to this conjecture.
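|
| (A quick sanity check, not from the paper: {1, 2, 4, 8, 13} is at
| least a valid Sidon set, i.e. all pairwise sums are distinct. A
| minimal Python sketch of that check:)
|
|     from itertools import combinations_with_replacement
|
|     def is_sidon(s):
|         # Sidon set: all pairwise sums a + b (a <= b) are distinct
|         sums = [a + b for a, b in
|                 combinations_with_replacement(sorted(s), 2)]
|         return len(sums) == len(set(sums))
|
|     print(is_sidon({1, 2, 4, 8, 13}))  # True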
| NooneAtAll3 wrote:
| that one was vibe-coded
|
| the AI was given a step-by-step, already-found proof and asked
| to "please rewrite it in Lean"
|
| ---
|
| here the AI did the proof itself
| wasmainiac wrote:
| Ok... has this been verified? I see no publication, or even an
| announcement, on Harmonic's webpage. If this were a big deal,
| you'd think it would be treated as one. Or is this just hype?
| singularity2001 wrote:
| verified by Lean, so 99.99% yes
| cluckindan wrote:
| Lean verified a proof of a solution to a problem, but was it
| the same problem as Erdos problem #124?
|
| https://www.erdosproblems.com/forum/thread/124#post-1899
| wasmainiac wrote:
| > My summary is that Aristotle solved "a" version of this
| problem (indeed, with an olympiad-style proof), but not
| "the" version.
|
| > I agree that the [BEGL96] problem is still open (for
| now!), and your plan to keep this problem open by changing
| the statement is reasonable. Alternatively, one could add
| another problem and link them. I have no preference. --
| BorisAlexeev
|
| There we go, so there is hype to some degree.
| aaomidi wrote:
| Is there some good literature to read about Lean? This is the
| first time I'm hearing about it and it seems pretty cool.
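|
| (For the curious: Lean is a proof assistant in which proofs are
| machine-checked programs. A toy Lean 4 example, unrelated to this
| result, just to show the flavor:)
|
|     -- `n + 0` reduces to `n` by definition, so `rfl` closes it.
|     theorem add_zero_example (n : Nat) : n + 0 = n := rfl
|
| The "Theorem Proving in Lean 4" book and the Natural Number Game
| are the usual starting points.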
| adt wrote:
| Related, independent, and verified:
|
| GPT-5 solved Erdos problem #848 (combinatorial number theory):
|
| https://cdn.openai.com/pdf/4a25f921-e4e0-479a-9b38-5367b47e8...
|
| https://lifearchitect.ai/asi/
| ares623 wrote:
| FTL? Seriously?
| demirbey05 wrote:
| I read how GPT-5 contributed to the proof. It was not fully
| solved by GPT-5; it assisted. For more, look here:
| https://www.math.columbia.edu/~msawhney/Problem_848.pdf
| esperent wrote:
| More interesting discussion than on Twitter here:
|
| https://www.erdosproblems.com/forum/thread/124#post-1892
| dang wrote:
| Ok, we've changed the link to that from
| https://twitter.com/vladtenev/status/1994922827208663383, but
| I've included the latter in the toptext. Thanks!
| demirbey05 wrote:
| This is the response from a mathematician: "This is quite
| something, congratulations to Boris and Aristotle!
|
| On one hand, as the nice sketch provided below by tsaf confirms,
| the final proof is quite simple and elementary - indeed, if one
| was given this problem in a maths competition (so therefore
| expected a short simple solution existed) I'd guess that
| something like the below would be produced. On the other hand, if
| something like this worked, then surely the combined talents of
| Burr, Erdos, Graham, and Li would have spotted it.
|
| Normally, this would make me suspicious of this short proof, in
| that there is overlooked subtlety. But (a) I can't see any and
| (b) the proof has been formalised in Lean, so clearly it just
| works!
|
| Perhaps this shows what the real issue in the [BEGL96] conjecture
| is - namely the removal of 1 and the addition of the necessary
| gcd condition. (And perhaps at least some subset of the authors
| were aware of this argument for the easier version allowing 1,
| but this was overlooked later by Erdos in [Er97] and [Er97e],
| although if they were aware then one would hope they'd have
| included this in the paper as a remark.)
|
| At the moment I'm minded to keep this as open, and add the gcd
| condition in the main statement, and note in the remarks that the
| easier (?) version allowing 1 and omitting the gcd condition,
| which was also asked independently by Erdos, has been solved."
|
| The commentator is saying: "I can't believe this famous problem
| was solved so easily. I would have thought it was a fake proof,
| but the computer verified it. It turns out the solution works
| because it addresses a slightly different set of constraints
| (regarding the number 1) than what Erdos originally struggled
| with." (Generated by Gemini)
| aaomidi wrote:
| I think that person is the owner of the website discussing it,
| too.
|
| He was cited here: https://the-decoder.com/leading-openai-researcher-
| announced-...
|
| That's where, a while back, OpenAI made a misleading claim about
| solving some of these problems.
| dang wrote:
| > Generated by Gemini
|
| Please don't post generated comments here. We want HN to be a
| place for humans writing in their own voice.
| ComplexSystems wrote:
| Are you kidding? This is an incredible result. Stuff like this is
| the most important stuff happening in AI right now. Automated
| theorem proving? It's not going too far to say the entire point
| of the technology was to get us to this.
| ares623 wrote:
| This is me being snarky and ignorant, but if it solved one
| problem and it is automated, what's stopping it from solving all
| the others? That's what's ultimately being sold by the tweet,
| right?
| aeve890 wrote:
| >This is me being snarky and ignorant, but if it solved one
| problem and it is automated what's stopping it from solving
| all the others?
|
| Yeah that's the crux of the matter. How do AI did it? Using
| already existing math. If we need new math to prove Collatz,
| Goldbach or Riemman, LLMs are simply SOL. That's what's
| missing and hype boys always avoid to mention.
| dwohnitmok wrote:
| > How do AI did it? Using already existing math. If we need
| new math to prove Collatz, Goldbach or Riemman, LLMs are
| simply SOL.
|
| An unproved theorem now proved is by definition new math.
| Will LLMs get you to Collatz, Goldbach, or Riemann?
| Unclear.
|
| But it's not like there's some magical, entirely unrelated
| to existing math, "new math" that was required to solve all
| the big conjectures of the past. They proceeded, as always,
| by proving new theorems one by one.
| yorwba wrote:
| Yes, "new math" is neither magical nor unrelated to
| existing math, but that doesn't mean any new theorem or
| proof is automatically "new math." I think the term is
| usually reserved for the definition of a new kind of
| mathematical object, about which you prove theorems
| relating it to existing math, which then allows you to
| construct qualitatively new proofs by transforming
| statements into the language of your new kind of object
| and back.
|
| I think eventually LLMs will also be used as part of
| systems that come up with new, broadly useful
| definitions, but we're not there yet.
| aeve890 wrote:
| >An unproved theorem now proved is by definition new
| math.
|
| No. By _new math_ I mean new mathematical constructs and
| theories like (to mention the "newest" ones) category
| theory, information theory, homotopy type theory, etc.
| Something like Cantor inventing set theory, or Shannon
| with information theory, or Euler with graph theory.
|
| AFAIK no new field of mathematics has been created _by_
| AI. Feel free to correct me.
| ComplexSystems wrote:
| Ultimately, the main things that will stop it from solving
| literally "all the others" are things like the impossibility
| of solving the halting problem, considerations like P vs. NP,
| etc. But as we have just seen, despite these impossibility
| theorems, AI systems are still able to make substantive
| progress on important open real-world problems.
| thaumasiotes wrote:
| > But as we have just seen, despite these impossibility
| theorems, AI systems are still able to make substantive
| progress on solving important open real-world problems.
|
| You seem to be smuggling in the assumption that this
| problem was "important".
| adi_kurian wrote:
| Sucks that fuckers dupe suckers. Personally, I find the news
| to be incredible.
| BanditDefender wrote:
| I think it is way too far to say that!
|
| We've had automated theorem proving since the 60s. What we need
| is automated theorem _discovery._ Erdos discovered these
| theorems even if he wasn't really able to prove them. Euler
| and Gauss discovered a ton of stuff they couldn't prove. It is
| weird that nobody considers this to be intelligence. Instead
| intelligence is a little game AI plays with Lean.
|
| AI researchers keep trying to reduce intelligence into
| something tiny and approachable, like automated theorem
| proving. It's easy: you write the theorem you want proven and
| hope you get a proof. It works or it doesn't. Nice and
| benchmarkable.
|
| Automated axiom creation seems a lot harder. How is an LLM
| supposed to know that "between any two points there is a line"
| formalizes an important property of physical space? Or how to
| suggest an alternative to Turing machines / lambda calculus
| that expresses the same underlying idea?
| kvemkon wrote:
| > What we need is automated theorem discovery.
|
| I've been thinking that mathematicians have fun doing math:
| making discoveries, crafting proofs.
|
| Does the Tour de France & Co. make no sense now that small,
| lightweight, and powerful e-bicycles have appeared?
|
| Using a computer as a helper is like using a bicycle; using
| LLMs seems more like an e-bicycle, and that is something else.
| ComplexSystems wrote:
| > We've had automated theorem proving since the 60s.
|
| By that logic, we've had LLMs since the 60s!
|
| > What we need is automated theorem discovery.
|
| I don't see any reason you couldn't train a model to do this.
| You'd have to focus it on generating follow-up questions to
| ask after reading a corpus of literature, playing around with
| some toy examples in Python and making a conjecture out of
| it. This seems much easier than training it to actually
| complete an entire proof.
|
| > Erdos discovered these theorems even if he wasn't really
| able to prove them. Euler and Gauss discovered a ton of stuff
| they couldn't prove. It is weird that nobody considers this
| to be intelligence.
|
| Who says they don't? I wouldn't be surprised if HarmonicMath,
| DeepMind, etc. have also thought about this kind of thing.
|
| > Automated axiom creation seems a lot harder. How is an LLM
| supposed to know that "between any two points there is a
| line" formalizes an important property of physical space?
|
| That's a good question! It would be interesting to see if
| this is an emergent property of multimodal LLMs trained
| specifically on this kind of thing. You would need
| mathematical reasoning, visual information and language
| encoded into some shared embedding space where similar things
| are mapped right next to each other geometrically.
| AdieuToLogic wrote:
| >> We've had automated theorem proving since the 60s.
|
| > By that logic, we've had LLMs since the 60s!
|
| From a bit earlier[0], actually:
|
|     Progressing to the 1950s and 60s: We saw the
|     development of the first language models.
|
| Were those "large"? I'm sure at the time they were thought
| to be so.
|
| 0 - https://ai-researchstudies.com/history-of-large-
| language-mod...
| dang wrote:
| (We detached this subthread from
| https://news.ycombinator.com/item?id=46094763.)
| BanditDefender wrote:
| "Mathematicial superintelligence" is so obnoxious. Why exactly do
| they think it is called an Erdos problem when Erdos didn't find
| the solution? Because Erdos discovered the real mathematics: the
| conjecture!
|
| These people treat math research as if it is a math homework
| assignment. There needs to be an honest discussion about what the
| LLM is doing here. When you bang your head against a math problem
| you blindly try a bunch of dumb ideas that don't actually work.
| It wastes a lot of paper. The LLM automates a lot of that.
|
| It is actually pretty cool that modern AI can help speed this up
| and waste less paper. It is very similar to how classical
| (symbolic) AI sped up math research and wasted less paper. But we
| need to have an honest discussion about how we are using the
| computer as a tool. Instead malicious idiots like Vlad Tenev are
| making confident predictions about mathematical
| superintelligence. So depressing.
| consumer451 wrote:
| For reference, this is from: https://harmonic.fun/
|
| related article:
|
| > Is Math the Path to Chatbots That Don't Make Stuff Up?
|
| https://www.nytimes.com/2024/09/23/technology/ai-chatbots-ch...
| mmmmbbbhb wrote:
| Meanwhile I have to use AI just to understand the problem
| statement.
| lolive wrote:
| Which makes you a top-1% power user of AI. [the other 99% are
| asking for the breast size of whatever starlet appeared on TV]
| sebastiennight wrote:
| That's not what we mean when we're asking for "the release of
| bigger models"
| Dilettante_ wrote:
| 200DD parameters
| lolive wrote:
| Bigger is not always better. [Gemini told me that
| yesterday, during my daily psychotherapy]
| sheepscreek wrote:
| It's crazy that the empathetic AI I sometimes share my life's
| mundane problems with wrote this. It understands such high-level
| maths and speaks in a lexicon with words I can't even pronounce.
| Incredible versatility. If this isn't AGI (or even
| superintelligence), then what is?
|
| Sure, it suffers from amnesia and sometimes cannot get basic
| things right, but the former is a design limitation that can be
| overcome and the latter possibly a problem caused by training
| (we're discovering that an overt focus during training on prompt
| adherence somewhat lobotomizes their general competence).
| vrighter wrote:
| As for the first: everything points towards the problem NOT
| being one that can be overcome with current architectures.
|
| As for the second: we are completely, 100% sure this cannot be
| solved. It is not a training issue; it is the case for any
| statistical machine. No amount of training can ever solve it.
| encyclopedism wrote:
| This is a poor take on what LLMs actually are. They don't
| understand in any real sense.
|
| If you had enough paper and ink, and the patience to go through
| it all, you could take all the training data and manually step
| through and train the same model. Then, once you had trained
| the model, you could use more pen and paper to step through the
| correct prompts to arrive at the answer. All of this would be a
| completely mechanical process.
| magicalist wrote:
| The overhyped tweet from the Robinhood guy raising money for his
| AI startup is nicely brought into better perspective by Thomas
| Bloom (including that #124 is not from the cited paper, "Complete
| sequences of sets of integer powers"/BEGL96):
|
| > _This is a nice solution, and impressive to be found by AI,
| although the proof is (in hindsight) very simple, and the
| surprising thing is that Erdos missed it. But there is definitely
| precedent for Erdos missing easy solutions!_
|
| > _Also this is not the problem as posed in that paper_
|
| > _That paper asks a harder version of this problem. The problem
| which has been solved was asked by Erdos in a couple of later
| papers._
|
| > _One also needs to be careful about saying things like 'open
| for 30 years'. This does not mean it has resisted 30 years of
| efforts to solve it! Many Erdos problems (including this one)
| have just been forgotten about it, and nobody has seriously tried
| to solve it._[1]
|
| And, indeed, Boris Alexeev (who ran the problem) agrees:
|
| > _My summary is that Aristotle solved "a" version of this
| problem (indeed, with an olympiad-style proof), but not "the"
| version._
|
| > _I agree that the [BEGL96] problem is still open (for now!),
| and your plan to keep this problem open by changing the statement
| is reasonable. Alternatively, one could add another problem and
| link them. I have no preference._ [2]
|
| Not to rain on the parade out of spite, it's just that this is
| neat, but not like, unusually neat compared to the last few
| months.
|
| [1] https://twitter.com/thomasfbloom/status/1995083348201586965
|
| [2] https://www.erdosproblems.com/forum/thread/124#post-1899
| NooneAtAll3 wrote:
| I was so happy for this result until I saw your mention of
| robinhood hype :/
| NooneAtAll3 wrote:
| reading the original paper and the Lean statement that got
| proven, it's kinda fascinating what exactly is considered
| interesting and hard in this problem
|
| roughly, what the Lean theorem (and the statement on the
| website) asks is this: take some numbers t_i, for each of them
| form all the powers t_i^j, then combine them all into a multiset
| T. Barring some necessary conditions, prove that you can take a
| subset of T summing to any number you want
|
| what the Erdos problem in the paper asks is to add one more
| step - arbitrarily cut off the beginnings of the t_i^j power
| sequences before merging. Erdos and co. conjectured that only a
| finite number of subset sums stop being possible
|
| "subsets sum to any number" is an easy condition to check
| (that's why "olympiad level" gets mentioned in the discussion)
| - and it's the "arbitrarily cut off" part that the original
| question is all about, while "only a finite number disappear"
| is hard to pin down formulaically
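|
| (to illustrate the easy version - my own sketch, assuming the
| exponents start at 0, i.e. the "allowing 1" reading discussed
| elsewhere in this thread; this is not the actual Lean
| formalization:)
|
|     def reachable_sums(elements, limit):
|         # all subset sums of `elements` up to `limit` (simple DP)
|         sums = {0}
|         for x in elements:
|             sums |= {s + x for s in sums if s + x <= limit}
|         return sums
|
|     # multiset T: powers t^j (j >= 0) of each base, up to `limit`
|     bases, limit = [2, 3], 200
|     T = [t**j for t in bases for j in range(8) if t**j <= limit]
|     print(all(n in reachable_sums(T, limit)
|               for n in range(1, limit + 1)))  # True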
|
| so... overhyped: yes; not actually the Erdos problem proven:
| yes; usual math-olympiad-level problems being solvable by the
| current level of AI, as shown by this year's IMO: also yes
| (just don't get caught by
| https://en.wikipedia.org/wiki/AI_effect on the backlash, since
| olympiads are haaard! really!)
| emp17344 wrote:
| I'd be less skeptical about this year's IMO claims if we had
| any information at all on how it was done.
| chihuahua wrote:
| So in short, it was an easy problem that had already been
| solved thousands of years ago and the proof was so simple that
| it doesn't really count, and the AI used too many em-dashes in
| its response and it totally sucks.
| sebastianz wrote:
| > problem that had already been solved thousands of years ago
|
| If by this you refer to "Aristotle" in the parent post - it's
| not _that_ Aristotle. This is "Aristotle AI" - the name of
| their product.
| bgwalter wrote:
| Also in the thread comments:
|
| "I also wonder whether this 'easy' version of the problem has
| actually appeared in some mathematical competition before now,
| which would of course pollute the training data if Aristotle
| [Ed.: the clanker's name] had seen this solution already
| written up somewhere."
| smcl wrote:
| See, this is one of the reasons I struggle to get on board the
| AI hype train. Any time I've seen some breathless claim about
| its capabilities that feels a bit too good to be true, someone
| with knowledge in the domain takes a closer look and it turns
| out to have been exaggerated, meant to draw eyeballs and
| investors to some fledgling AI company.
|
| I just feel like if we were genuinely on the cusp of an AI
| revolution like it is claimed, we wouldn't need to keep seeing
| this sort of thing. Like I feel like a lot of the industry is
| full of flim-flam men trying to scam people, and if the tech
| was as capable as we keep getting told it is there'd be no need
| for dishonesty or sleight of hand.
| jimkleiber wrote:
| The crypto train kinda ran out of steam, so all aboard the AI
| train.
|
| That being said, I think AI has a lot more immediately useful
| applications than cryptocurrency. But it does feel a bit
| overhyped by people who stand to gain a tremendous amount of
| money.
|
| I might get slammed/downvoted on HN for this, but I'm really
| wondering how much of VC is filled with get-rich-quick
| cheerleading vs. supporting products that will create strong
| and lasting growth.
| _heimdall wrote:
| I don't think you really need to wonder about how much is
| cheerleading. Effectively all public statements by VCs will
| be cheerleading for companies they have already invested in.
|
| The more interesting one is the closed-door conversations.
| Earlier this year, for example, there seemed to be a
| pattern of VCs heavily invested in AI asking the other
| software companies they invested in to figure out how to
| make AI useful for them and report back. I.e. "we invested
| heavily in hype, tell us how to make it real."
| soulofmischief wrote:
| From my perspective, having worked in both industries and
| simply followed my passions and opportunities, all I see
| is that the same two bandwagons who latched onto crypto,
| either to grift or just to egotistically talk shit, have moved
| over to the latest technological breakthrough. Meanwhile,
| those of us silently working on interesting things are
| constantly rolling our eyes at comments from both sides of
| the peanut gallery.
| MangoToupe wrote:
| The thing is, we genuinely _are_ going through an AI
| revolution. I don't even think that's that breathless of a
| claim. The contention is over whether it's about to
| revolutionize our _economy_, which is a far harder claim to
| substantiate and _should_ be largely self-substantiating if
| it is going to happen.
| andrepd wrote:
| If OpenAI saw an imminent path to AGI in 6 months (or in 5
| years for that matter) they would not be pivoting to become a
| banal big tech ad company.
|
| Short AI and tech, and just hope you get the timing right.
| encyclopedism wrote:
| I have commented elsewhere but this bears repeating
|
| If you had enough paper and ink and the patience to go
| through it, you could take all the training data and manually
| step through and train the same model. Then, once you had
| trained the model, you could use even more pen and paper to
| step through the correct prompts to arrive at the answer. All
| of this would be a completely mechanical process. This really
| does bear thinking about. It's amazing the results that LLMs
| are able to achieve. But let's not kid ourselves and start
| throwing about terms like AGI or emergence just yet. It makes
| a mechanical process seem magical (as do computers in
| general).
|
| I should add that it also makes sense why this works: just
| look at the volume of human knowledge (the training data).
| It's the training data, with quite literally the mass of
| mankind's knowledge, genius, logic, inferences, language, and
| intellect, that does the heavy lifting.
| WhyOhWhyQ wrote:
| I predict that the way AI will be useful to science, from the
| perspective of mathematics, is by figuring out combinatorially
| complex solutions to problems that would otherwise not be
| interesting to (or would be far too complex to be comprehended
| by) humans. With such capabilities, one can imagine the AI
| being useful for designing super materials, doing fancy things
| with biology/medicine, and generally finding useful patterns in
| complex systems.
| estebarb wrote:
| However, they would at most be the heuristic function of a
| search mechanism. A good heuristic, but a heuristic at most.
| For search, we need to identify when to abandon a path and
| which other entry point is promising. I'm not sure our current
| techniques are good for this kind of problem.
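|
| (To make that concrete - a hypothetical best-first-search
| skeleton where the model only supplies the scoring; `llm_score`,
| `expand`, and `is_goal` are made-up placeholders:)
|
|     import heapq
|
|     def best_first(start, expand, is_goal, llm_score, budget=10000):
|         # lower score = more promising; note that abandoning a
|         # path happens only implicitly, via queue order + budget
|         frontier = [(llm_score(start), 0, start)]
|         tie = 1  # tie-breaker so states are never compared
|         for _ in range(budget):
|             if not frontier:
|                 return None
|             _, _, state = heapq.heappop(frontier)
|             if is_goal(state):
|                 return state
|             for nxt in expand(state):
|                 heapq.heappush(frontier, (llm_score(nxt), tie, nxt))
|                 tie += 1
|         return None  # budget exhausted: no principled "give up"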
| bluecalm wrote:
| I think it will be even more helpful to know that a simple
| proof doesn't exist because AI has tried for long enough and
| didn't find one. Once people know there is no easy proof of,
| say, Collatz or the Twin Primes Conjecture, those will not be
| as alluring to waste time on.
| Someone wrote:
| But abstract mathematics doesn't care about solutions to
| problems; it cares about understanding problem spaces. I do not
| think current AI helps with that.
|
| Problems like the one discussed aren't interesting to
| applied mathematicians either, for lack of
| applications.
|
| But yes, if this kind of AI produces new materials, cures
| diseases, etc., they will be very useful. We wouldn't care
| whether they arrived at those solutions through valid
| reasoning, though. A sloppy AI that has better
| 'guesses/intuitions' than humans or that can guess and check
| 'guesses/intuitions' for correctness faster would be very
| useful.
| adastra22 wrote:
| And engineers don't care about abstract mathematics: we care
| about math that solves problems. Being able to solve more
| problems with less human-years of effort is a big win.
| newmana wrote:
| "Recently, yet another category of low-hanging fruit has been
| identified as within reach of automated tools: problems which,
| due to a technical flaw in their description, are unexpectedly
| easy to resolve. Specifically, problem #124
| https://www.erdosproblems.com/124 was a problem that was stated
| in three separate papers of Erdos, but in two of them he omitted
| a key hypothesis which made the problem a consequence of a known
| result (Brown's criterion). However, this fact was not noticed
| until Boris Alexeev applied the Aristotle tool to this problem,
| which autonomously located (and formalized in Lean) a solution to
| this weaker version of the problem within hours."
|
| https://mathstodon.xyz/@tao/115639984077620023
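|
| (For reference, Brown's criterion as usually stated: a
| nondecreasing sequence of positive integers is complete - every
| positive integer is a subset sum - iff it starts at 1 and each
| term is at most one more than the sum of all earlier terms. A
| quick sketch of the check, mine rather than Tao's:)
|
|     def satisfies_browns_criterion(seq):
|         # seq: nondecreasing positive integers
|         partial = 0
|         for a in seq:
|             if a > partial + 1:  # gap: partial + 1 is unreachable
|                 return False
|             partial += a
|         return True
|
|     print(satisfies_browns_criterion([1, 2, 4, 8, 16]))  # True
|     print(satisfies_browns_criterion([1, 2, 5]))  # False: 4 missed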
| adastra22 wrote:
| That doesn't seem very fair. The problem was stated, and
| remained unsolved for all this time. You can't take away that
| accomplishment just because the solution seems easy in
| hindsight.
| zozbot234 wrote:
| It's technically true that this version of the problem was
| "low-hanging fruit", so that's an entirely fair assessment.
| Systematically spotting low-hanging fruit that others had
| missed _is_ an accomplishment, but it's quite different from
| solving a genuinely hard problem and we shouldn't conflate
| the two.
| adastra22 wrote:
| My point is stronger than that. Some things only appear to be
| low-hanging fruit in hindsight. My own field of physics is full
| of examples. Saying "oh, that should've been easy" is wrong
| more often than it is right.
| zozbot234 wrote:
| Sure, but unless _all_ solvable problems can be said to
| "appear as low-hanging fruit in hindsight" this doesn't
| detract from Tao's assessment in any way. Solving a
| genuinely complex problem is a different matter than
| spotting simpler solutions that others had missed.
|
| In this case, the solution might have been missed before
| simply because the difference between the "easy" and
| "hard" formulations of the problem wasn't quite clear,
| including perhaps to Erdos, prior to it being restated
| (manually) as a Lean goal to be solved. So this is a
| success story for formalization as much as AI.
| menaerus wrote:
| One of the math academics on that thread says the
| following:
|
| > My point is that basic ideas reappear at many places;
| humans often fail to realize that they apply in a
| different setting, while a machine doesn't have this
| problem! I remember seeing this problem before and
| thinking about it briefly. I admit that I haven't noticed
| this connection, which is only now quite obvious to me!
|
| Doesn't this sound extremely familiar to all of us who
| were taught difficult/abstract topics? Looking at the
| problem, you don't have the slightest idea what it is about,
| but then someone comes along and explains the inner bits
| and it suddenly "clicks" for you.
|
| So, yeah, this is exactly what I think is happening here.
| The solution was there, and it was simple, but nobody
| discovered it until now. And now that we have an
| explanation for it we say, "oh, it was really simple".
|
| The bit which makes it very interesting is that this
| hadn't been discovered before and now it has been
| discovered by an AI model.
|
| Tao challenges this by hypothesizing that it actually was
| done before but never "released" officially, which is
| why the model was able to solve the problem. However,
| there's no evidence (yet) for his hypothesis.
| magicalist wrote:
| Is your argument that Terence Tao says it was a
| consequence of a known result and he categorizes it as
| low-hanging fruit, but to you it _feels_ like one of
| those things that's only obvious in retrospect after
| it's explained to you, and without "evidence" of Tao's
| claim, you're going to go with your vibes?
|
| What exactly would constitute evidence?
| menaerus wrote:
| Known result from where? I failed to see the reference
| where it was proven before. Maybe I missed it. Do you
| have a reference?
| adastra22 wrote:
| "A consequence of a known result" describes literally every
| proof.
| ak_111 wrote:
| It's a combination of the problem appearing to be low-
| hanging fruit in hindsight and the fact that _almost
| nobody actually seemed to have checked whether it was
| low-hanging in the first place_. We know it's the latter
| because the problem was essentially uncited for the last
| three decades, and it didn't seem to have spread by word
| of mouth (spreading by word of mouth is how many
| interesting problems get spread in math).
|
| This is different from the situation you are talking
| about, where a problem genuinely appears difficult,
| attracts sustained attention, and is cited repeatedly as
| many people attempt partial results or variations. Then
| eventually someone discovers a surprisingly simple
| solution to the original problem which basically makes
| all the previous papers look ridiculous in hindsight.
|
| In those cases, the problem only looks "easy" in
| hindsight, and the solution is rightly celebrated because
| there is clear evidence that many competent
| mathematicians tried and failed before.
|
| Is there any evidence that this problem was ever
| attempted by a serious mathematician?
| vatsachak wrote:
| I know the problem the AI solved wasn't that hard, but one
| should consider math toast within the next 10-15 years.
|
| Software engineering is more subtle since you actually care about
| the performance and semantics of a program. Unless you get really
| good at program specification, it would be difficult to fully
| automate.
|
| But with math the only thing you care about is whether the moves
| you made were right.
| emil-lp wrote:
| > one should consider math toast within the next 10-15 years.
|
| I keep a list of ridiculously failed predictions, and this goes
| into it.
|
| Can I ask you to be more concrete? What does "math is toast"
| mean to you?
| n4r9 wrote:
| I'd like to see some other instances from your list.
| harshitaneja wrote:
| The amount of goal-post shifting is so amusing to see. Yes,
| sure, this was probably not an "important" or a particularly
| "challenging" problem, despite having been open for a while.
| Sure, maybe it remained open because it didn't get enough
| eyeballs from the right people who cared to spend time on it.
| Yes, there is too much overhyping and we are all somewhat
| tired of it. Still, if someone had told me 10 years ago that
| we would get "AI" to a stage where it can solve olympiad-level
| problems and win gold medals at the IMO, taking its input not
| in some rigidly structured form but in our complex, messy,
| natural human language, while also interpreting (to various
| degrees of what "interpreting" means) image and video data,
| and doing all of this in almost real time, I would have called
| them nuts and such a timeline sci-fi. So some part of me finds
| it crazy how quickly we have normalized to this new reality.
|
| A reality in which we are debating whether the problem solved
| by the automated model, using formal verification, was too
| easy.
|
| Don't get me wrong, I am not saying any of this means we get
| AGI or anything, or even that we will continue to see
| improvements. But we can still appreciate things. It doesn't
| need to be binary. What a time to be alive!
| Davidzheng wrote:
| Yeah, I agree. As a mathematician, it's easy to say that it's
| not a super hard proof. But then again, the first proofs found
| by running AI on a large bucket of open problems were always
| going to be easy proofs on not-very-worked-on problems.
| The fact that no one did it before them definitely shows real
| progress being made. When you're an expert, it's easy to lose
| track of the fact that things you consider trivial vs. very
| hard may in fact be a very small distance apart in the grand
| scheme of things (relative to entities of different orders of
| magnitude of strength).
| magicalist wrote:
| > _The amount of goal post shifting is so amusing to see_
|
| Can you be specific about the goal posts being shifted? Like
| the specific comments you're referring to here. Maybe I'm just
| falling for the bait, but non-specific claims like this seem
| designed just to annoy while offering nothing specific to
| converse about.
|
| I got to the end of your comment and counting all the claims
| you discounted, the only goal post I see left is that people
| aren't using a sufficiently excited tone while sifting fact
| from hype? A lot of us follow this work pretty closely and
| don't feel the need to start every post with "there is no need
| for excitement to abate, still exciting! but...".
|
| > _I am not saying any of this means we get AGI or something or
| even if we continue to see improvements. We can still
| appreciate things. It doesn 't need to be a binary._
|
| You'll note, however, that the hype guys happily include
| statements like "Vibe proving is here" in their posts with no
| nuance, all binary. Why not call them out?
| BigParm wrote:
| Maybe the Turing test, for one. Maybe others.
| mrguyorama wrote:
| Except nobody ever actually considered the "Turing test" to
| be anything other than a curiosity in the early days of a
| certain branch of philosophy.
|
| If the Turing test is a goal, then we passed it 60 years
| ago and AGI has been here since the LISP days. If the
| Turing test is not a goal (which is the correct
| interpretation), nobody should care what a random nobody
| thinks about an LLM "passing" it.
|
| "LLMs pass the Turing test so they are intelligent (or
| whatever)" is not a valid argument, full stop, because "the
| Turing test" was never a real thing meant to actually
| tell the difference between human intelligence and
| artificial intelligence; it was never formalized, and
| never evaluated for its ability to do so. The entire point
| of the Turing test was to _be part of a conversation about
| thinking machines_ in a world where that was an interesting
| proposition.
|
| The only people who ever took the Turing test as a "goal"
| were the misinformed public. Again, that interpretation of
| the Turing test has been passed by things like ELIZA and
| _markov chain based IRC bots_.
| zug_zug wrote:
| Well, there's a comment here saying "I won't consider it
| 'true' AI until it solves all millennium problems"... That
| goalpost seems to be defining AI as not only human-level but
| superhuman-level (e.g. one-in-a-million-level intellect or
| harder).
| nitwit005 wrote:
| That's the only such comment I found. The amount of goalpost
| moving would seem to be 1.
| Libidinalecon wrote:
| I am rather pro-AI in general, but I can't imagine what I
| would have thought in 2015 if you had told me we would have AI
| that could solve an Erdos problem from natural language but
| couldn't answer my work emails.
|
| It actually doesn't help me at all with anything at my job,
| and I really wish it could.
|
| That isn't really goal-post moving so much as a very strange
| shape for the goal post.
| jvanderbot wrote:
| The most amazing thing about AI is how it's amazing and
| disappointing at the same time.
|
| Everything is amazing and nobody is happy:
| https://www.youtube.com/watch?v=nUBtKNzoKZ4
|
| "How quickly the world owes him something, he knew existed only
| 10 seconds ago"
| kelseyfrog wrote:
| An Erdos problem is great, but all major LLMs still score 0 out
| of 7 on the Millennium Prize Problems[1]. I, and I'm sure others
| like me, won't consider it "true" AI until AI saturates the
| MillenniumPrizeProblemBench.
|
| 1. https://www.claymath.org/millennium-problems/
| jfengel wrote:
| That's interesting, because what I'm looking for from AI is the
| opposite. Rather than looking to do something humans find hard,
| I'm more interested in the problems that humans find easy, like
| driving a car or counting the number of "r"s in a word.
|
| Having it go beyond that to do things that we can't do would be
| a nice bonus. But given the way AI has been going, I'd say
| there's a decent chance that it solves a Millennium Prize
| problem before it solves driving.
___________________________________________________________________
(page generated 2025-12-01 23:02 UTC)