[HN Gopher] AI just proved Erdos Problem #124
___________________________________________________________________
AI just proved Erdos Problem #124
https://twitter.com/vladtenev/status/1994922827208663383
Author : nl
Score : 231 points
Date : 2025-11-30 05:19 UTC (1 day ago)
(HTM) web link (www.erdosproblems.com)
(TXT) w3m dump (www.erdosproblems.com)
| ares623 wrote:
| [flagged]
| bogdan wrote:
| Seriously, blame the investors.
| wasmainiac wrote:
| Nah, everyone is to blame here in the bubble.
| dang wrote:
| " _Please don 't post shallow dismissals, especially of other
| people's work. A good critical comment teaches us something._"
|
| " _Don 't be curmudgeonly. Thoughtful criticism is fine, but
| please don't be rigidly or generically negative._"
|
| https://news.ycombinator.com/newsguidelines.html
| menaerus wrote:
| This seems to be the 2nd proof in a row from the same author
| using AI models. The first time, it was ChatGPT that wrote the
| formal Lean proof for Erdos Problem #340.
|
| https://arxiv.org/html/2510.19804v1#Thmtheorem3
|
| > In over a dozen papers, beginning in 1976 and spanning two
| decades, Paul Erdos repeatedly posed one of his "favourite"
| conjectures: every finite Sidon set can be extended to a finite
| perfect difference set. We establish that {1, 2, 4, 8, 13} is a
| counterexample to this conjecture.
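|
| (A quick sanity check, not from the paper: {1, 2, 4, 8, 13} is at
| least a valid Sidon set, i.e. all pairwise sums are distinct. A
| minimal Python sketch of that check:)
|
|     from itertools import combinations_with_replacement
|
|     def is_sidon(s):
|         # Sidon set: all pairwise sums a + b (a <= b) are distinct
|         sums = [a + b for a, b in
|                 combinations_with_replacement(sorted(s), 2)]
|         return len(sums) == len(set(sums))
|
|     print(is_sidon({1, 2, 4, 8, 13}))  # True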
| NooneAtAll3 wrote:
| that one was vibe-coded
|
| the AI was given a step-by-step, already-found proof and asked
| to "please rewrite it in Lean"
|
| ---
|
| here the AI did the proof itself
| wasmainiac wrote:
| Ok... has this been verified? I see no publication, or even an
| announcement, on Harmonic's webpage. If this were a big deal,
| you'd think it would be treated as one. Or is this just hype?
| singularity2001 wrote:
| verified by Lean, so 99.99% yes
| cluckindan wrote:
| Lean verified a proof of a solution to a problem, but was it
| the same problem as Erdos problem #124?
|
| https://www.erdosproblems.com/forum/thread/124#post-1899
| wasmainiac wrote:
| > My summary is that Aristotle solved "a" version of this
| problem (indeed, with an olympiad-style proof), but not
| "the" version.
|
| > I agree that the [BEGL96] problem is still open (for
| now!), and your plan to keep this problem open by changing
| the statement is reasonable. Alternatively, one could add
| another problem and link them. I have no preference. --
| BorisAlexeev
|
| There we go, so there is hype to some degree.
| aaomidi wrote:
| Is there some good literature to read about Lean? This is the
| first time I'm hearing about it and it seems pretty cool.
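|
| (For the curious: Lean is a proof assistant in which proofs are
| machine-checked programs. A toy Lean 4 example, unrelated to this
| result, just to show the flavor:)
|
|     -- `n + 0` reduces to `n` by definition, so `rfl` closes it.
|     theorem add_zero_example (n : Nat) : n + 0 = n := rfl
|
| The "Theorem Proving in Lean 4" book and the Natural Number Game
| are the usual starting points.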
| adt wrote:
| Related, independent, and verified:
|
| GPT-5 solved Erdos problem #848 (combinatorial number theory):
|
| https://cdn.openai.com/pdf/4a25f921-e4e0-479a-9b38-5367b47e8...
|
| https://lifearchitect.ai/asi/
| ares623 wrote:
| FTL? Seriously?
| demirbey05 wrote:
| I read how GPT-5 contributed to the proof. It was not fully
| solved by GPT-5; it assisted. For more, look here:
| https://www.math.columbia.edu/~msawhney/Problem_848.pdf
| esperent wrote:
| More interesting discussion than on Twitter here:
|
| https://www.erdosproblems.com/forum/thread/124#post-1892
| dang wrote:
| Ok, we've changed the link to that from
| https://twitter.com/vladtenev/status/1994922827208663383, but
| I've included the latter in the toptext. Thanks!
| demirbey05 wrote:
| This is the response from a mathematician: "This is quite
| something, congratulations to Boris and Aristotle!
|
| On one hand, as the nice sketch provided below by tsaf confirms,
| the final proof is quite simple and elementary - indeed, if one
| was given this problem in a maths competition (so therefore
| expected a short simple solution existed) I'd guess that
| something like the below would be produced. On the other hand, if
| something like this worked, then surely the combined talents of
| Burr, Erdos, Graham, and Li would have spotted it.
|
| Normally, this would make me suspicious of this short proof, in
| that there is overlooked subtlety. But (a) I can't see any and
| (b) the proof has been formalised in Lean, so clearly it just
| works!
|
| Perhaps this shows what the real issue in the [BEGL96] conjecture
| is - namely the removal of 1 and the addition of the necessary
| gcd condition. (And perhaps at least some subset of the authors
| were aware of this argument for the easier version allowing 1,
| but this was overlooked later by Erdos in [Er97] and [Er97e],
| although if they were aware then one would hope they'd have
| included this in the paper as a remark.)
|
| At the moment I'm minded to keep this as open, and add the gcd
| condition in the main statement, and note in the remarks that the
| easier (?) version allowing 1 and omitting the gcd condition,
| which was also asked independently by Erdos, has been solved."
|
| The commentator is saying: "I can't believe this famous problem
| was solved so easily. I would have thought it was a fake proof,
| but the computer verified it. It turns out the solution works
| because it addresses a slightly different set of constraints
| (regarding the number 1) than what Erdos originally struggled
| with." (Generated by Gemini)
| aaomidi wrote:
| I think that person is the owner of the website discussing it,
| too.
|
| He was cited here: https://the-decoder.com/leading-openai-researcher-
| announced-...
|
| That's where, a while back, OpenAI made a misleading claim about
| solving some of these problems.
| dang wrote:
| > Generated by Gemini
|
| Please don't post generated comments here. We want HN to be a
| place for humans writing in their own voice.
| ComplexSystems wrote:
| Are you kidding? This is an incredible result. Stuff like this is
| the most important stuff happening in AI right now. Automated
| theorem proving? It's not going too far to say the entire point
| of the technology was to get us to this.
| ares623 wrote:
| This is me being snarky and ignorant, but if it solved one
| problem and it is automated, what's stopping it from solving all
| the others? That's what's ultimately being sold by the tweet,
| right?
| aeve890 wrote:
| >This is me being snarky and ignorant, but if it solved one
| problem and it is automated what's stopping it from solving
| all the others?
|
| Yeah that's the crux of the matter. How do AI did it? Using
| already existing math. If we need new math to prove Collatz,
| Goldbach or Riemman, LLMs are simply SOL. That's what's
| missing and hype boys always avoid to mention.
| dwohnitmok wrote:
| > How do AI did it? Using already existing math. If we need
| new math to prove Collatz, Goldbach or Riemman, LLMs are
| simply SOL.
|
| An unproved theorem now proved is by definition new math.
| Will LLMs get you to Collatz, Goldbach, or Riemann?
| Unclear.
|
| But it's not like there's some magical, entirely unrelated
| to existing math, "new math" that was required to solve all
| the big conjectures of the past. They proceeded, as always,
| by proving new theorems one by one.
| yorwba wrote:
| Yes, "new math" is neither magical nor unrelated to
| existing math, but that doesn't mean any new theorem or
| proof is automatically "new math." I think the term is
| usually reserved for the definition of a new kind of
| mathematical object, about which you prove theorems
| relating it to existing math, which then allows you to
| construct qualitatively new proofs by transforming
| statements into the language of your new kind of object
| and back.
|
| I think eventually LLMs will also be used as part of
| systems that come up with new, broadly useful
| definitions, but we're not there yet.
| aeve890 wrote:
| >An unproved theorem now proved is by definition new
| math.
|
| No. By _new math_ I mean new mathematical constructs and
| theories like (to mention the "newest" ones) category
| theory, information theory, homotopy type theory, etc.
| Something like Cantor inventing set theory, or Shannon
| with information theory, or Euler with graph theory.
|
| AFAIK no new field of mathematics has been created _by_
| AI. Feel free to correct me.
| ComplexSystems wrote:
| Ultimately, the main things that will stop it from solving
| literally "all the others" are things like the impossibility
| of solving the halting problem, considerations like P vs. NP,
| etc. But as we have just seen, despite these impossibility
| theorems, AI systems are still able to make substantive
| progress on important open real-world problems.
| thaumasiotes wrote:
| > But as we have just seen, despite these impossibility
| theorems, AI systems are still able to make substantive
| progress on solving important open real-world problems.
|
| You seem to be smuggling in the assumption that this
| problem was "important".
| adi_kurian wrote:
| Sucks that fuckers dupe suckers. Personally, I find the news
| to be incredible.
| BanditDefender wrote:
| I think it is way too far to say that!
|
| We've had automated theorem proving since the 60s. What we need
| is automated theorem _discovery._ Erdos discovered these
| theorems even if he wasn't really able to prove them. Euler
| and Gauss discovered a ton of stuff they couldn't prove. It is
| weird that nobody considers this to be intelligence. Instead
| intelligence is a little game AI plays with Lean.
|
| AI researchers keep trying to reduce intelligence into
| something tiny and approachable, like automated theorem
| proving. It's easy: you write the theorem you want proven and
| hope you get a proof. It works or it doesn't. Nice and
| benchmarkable.
|
| Automated axiom creation seems a lot harder. How is an LLM
| supposed to know that "between any two points there is a line"
| formalizes an important property of physical space? Or how to
| suggest an alternative to Turing machines / lambda calculus
| that expresses the same underlying idea?
| kvemkon wrote:
| > What we need is automated theorem discovery.
|
| I've been thinking that mathematicians have fun doing math:
| making discoveries, crafting proofs.
|
| Does the Tour de France & Co. make no sense now that small,
| lightweight, and powerful e-bicycles have appeared?
|
| Using a computer as a helper is like using a bicycle; using
| LLMs seems more like an e-bicycle, and that is something else.
| ComplexSystems wrote:
| > We've had automated theorem proving since the 60s.
|
| By that logic, we've had LLMs since the 60s!
|
| > What we need is automated theorem discovery.
|
| I don't see any reason you couldn't train a model to do this.
| You'd have to focus it on generating follow-up questions to
| ask after reading a corpus of literature, playing around with
| some toy examples in Python and making a conjecture out of
| it. This seems much easier than training it to actually
| complete an entire proof.
|
| > Erdos discovered these theorems even if he wasn't really
| able to prove them. Euler and Gauss discovered a ton of stuff
| they couldn't prove. It is weird that nobody considers this
| to be intelligence.
|
| Who says they don't? I wouldn't be surprised if HarmonicMath,
| DeepMind, etc. have also thought about this kind of thing.
|
| > Automated axiom creation seems a lot harder. How is an LLM
| supposed to know that "between any two points there is a
| line" formalizes an important property of physical space?
|
| That's a good question! It would be interesting to see if
| this is an emergent property of multimodal LLMs trained
| specifically on this kind of thing. You would need
| mathematical reasoning, visual information and language
| encoded into some shared embedding space where similar things
| are mapped right next to each other geometrically.
| AdieuToLogic wrote:
| >> We've had automated theorem proving since the 60s.
|
| > By that logic, we've had LLMs since the 60s!
|
| From a bit earlier[0], actually:
|
|     Progressing to the 1950s and 60s: We saw the
|     development of the first language models.
|
| Were those "large"? I'm sure at the time they were thought
| to be so.
|
| 0 - https://ai-researchstudies.com/history-of-large-
| language-mod...
| dang wrote:
| (We detached this subthread from
| https://news.ycombinator.com/item?id=46094763.)
| BanditDefender wrote:
| "Mathematicial superintelligence" is so obnoxious. Why exactly do
| they think it is called an Erdos problem when Erdos didn't find
| the solution? Because Erdos discovered the real mathematics: the
| conjecture!
|
| These people treat math research as if it is a math homework
| assignment. There needs to be an honest discussion about what the
| LLM is doing here. When you bang your head against a math problem
| you blindly try a bunch of dumb ideas that don't actually work.
| It wastes a lot of paper. The LLM automates a lot of that.
|
| It is actually pretty cool that modern AI can help speed this up
| and waste less paper. It is very similar to how classical
| (symbolic) AI sped up math research and wasted less paper. But we
| need to have an honest discussion about how we are using the
| computer as a tool. Instead malicious idiots like Vlad Tenev are
| making confident predictions about mathematical
| superintelligence. So depressing.
| consumer451 wrote:
| For reference, this is from: https://harmonic.fun/
|
| related article:
|
| > Is Math the Path to Chatbots That Don't Make Stuff Up?
|
| https://www.nytimes.com/2024/09/23/technology/ai-chatbots-ch...
| mmmmbbbhb wrote:
| Meanwhile I have to use AI just to understand the problem
| statement.
| lolive wrote:
| Which makes you a top-1% power user of AI. [the other 99% are
| asking for the breast size of whatever starlet appeared on TV]
| sebastiennight wrote:
| That's not what we mean when we're asking for "the release of
| bigger models"
| Dilettante_ wrote:
| 200DD parameters
| lolive wrote:
| Bigger is not always better. [Gemini told me that
| yesterday, during my daily psychotherapy]
| sheepscreek wrote:
| It's crazy that the empathetic AI I sometimes share my life's
| mundane problems with wrote this. It understands such high-level
| maths and speaks in a lexicon with words I can't even pronounce.
| Incredible versatility. If this isn't AGI (or even
| superintelligence), then what is?
|
| Sure, it suffers from amnesia and sometimes cannot get basic
| things right, but the former is a design limitation that can be
| overcome and the latter possibly a problem caused by training
| (we're discovering that an overt focus during training on prompt
| adherence somewhat lobotomizes their general competence).
| vrighter wrote:
| As for the first: everything points towards the problem NOT
| being one that can be overcome with current architectures.
|
| As for the second: we are completely, 100% sure this cannot be
| solved. It is not a training issue; it is the case for any
| statistical machine. No amount of training can ever solve it.
| encyclopedism wrote:
| This is a poor take on what LLMs actually are. They don't
| understand in any real sense.
|
| If you had enough paper and ink, and the patience to go through
| it all, you could take all the training data and manually step
| through and train the same model. Then, once you had trained
| the model, you could use more pen and paper to step through the
| correct prompts to arrive at the answer. All of this would be a
| completely mechanical process.
| magicalist wrote:
| The overhyped tweet from the Robinhood guy raising money for his
| AI startup is nicely brought into better perspective by Thomas
| Bloom (including that #124 is not from the cited paper, "Complete
| sequences of sets of integer powers"/BEGL96):
|
| > _This is a nice solution, and impressive to be found by AI,
| although the proof is (in hindsight) very simple, and the
| surprising thing is that Erdos missed it. But there is definitely
| precedent for Erdos missing easy solutions!_
|
| > _Also this is not the problem as posed in that paper_
|
| > _That paper asks a harder version of this problem. The problem
| which has been solved was asked by Erdos in a couple of later
| papers._
|
| > _One also needs to be careful about saying things like 'open
| for 30 years'. This does not mean it has resisted 30 years of
| efforts to solve it! Many Erdos problems (including this one)
| have just been forgotten about it, and nobody has seriously tried
| to solve it._[1]
|
| And, indeed, Boris Alexeev (who ran the problem) agrees:
|
| > _My summary is that Aristotle solved "a" version of this
| problem (indeed, with an olympiad-style proof), but not "the"
| version._
|
| > _I agree that the [BEGL96] problem is still open (for now!),
| and your plan to keep this problem open by changing the statement
| is reasonable. Alternatively, one could add another problem and
| link them. I have no preference._ [2]
|
| Not to rain on the parade out of spite, it's just that this is
| neat, but not like, unusually neat compared to the last few
| months.
|
| [1] https://twitter.com/thomasfbloom/status/1995083348201586965
|
| [2] https://www.erdosproblems.com/forum/thread/124#post-1899
| NooneAtAll3 wrote:
| I was so happy for this result until I saw your mention of
| robinhood hype :/
| NooneAtAll3 wrote:
| reading the original paper and the Lean statement that got
| proven, it's kinda fascinating what exactly is considered
| interesting and hard in this problem
|
| roughly, what the Lean theorem (and the statement on the
| website) asks is this: take some numbers t_i, for each of them
| form all the powers t_i^j, then combine them all into a multiset
| T. Barring some necessary conditions, prove that you can take a
| subset of T summing to any number you want
|
| what the Erdos problem in the paper asks is to add one more
| step - arbitrarily cut off the beginnings of the t_i^j power
| sequences before merging. Erdos and co. conjectured that only a
| finite number of subset sums stop being possible
|
| "subsets sum to any number" is an easy condition to check
| (that's why "olympiad level" gets mentioned in the discussion)
| - and it's the "arbitrarily cut off" part that the original
| question is all about, while "only a finite number disappear"
| is hard to pin down formulaically
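|
| (to illustrate the easy version - my own sketch, assuming the
| exponents start at 0, i.e. the "allowing 1" reading discussed
| elsewhere in this thread; this is not the actual Lean
| formalization:)
|
|     def reachable_sums(elements, limit):
|         # all subset sums of `elements` up to `limit` (simple DP)
|         sums = {0}
|         for x in elements:
|             sums |= {s + x for s in sums if s + x <= limit}
|         return sums
|
|     # multiset T: powers t^j (j >= 0) of each base, up to `limit`
|     bases, limit = [2, 3], 200
|     T = [t**j for t in bases for j in range(8) if t**j <= limit]
|     print(all(n in reachable_sums(T, limit)
|               for n in range(1, limit + 1)))  # True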
|
| so... overhyped: yes; not actually the Erdos problem proven:
| yes; usual math-olympiad-level problems being solvable by the
| current level of AI, as shown by this year's IMO: also yes
| (just don't get caught by
| https://en.wikipedia.org/wiki/AI_effect on the backlash, since
| olympiads are haaard! really!)
| emp17344 wrote:
| I'd be less skeptical about this year's IMO claims if we had
| any information at all on how it was done.
| chihuahua wrote:
| So in short, it was an easy problem that had already been
| solved thousands of years ago and the proof was so simple that
| it doesn't really count, and the AI used too many em-dashes in
| its response and it totally sucks.
| sebastianz wrote:
| > problem that had already been solved thousands of years ago
|
| If by this you refer to "Aristotle" in the parent post - it's
| not _that_ Aristotle. This is "Aristotle AI" - the name of
| their product.
| bgwalter wrote:
| Also in the thread comments:
|
| "I also wonder whether this 'easy' version of the problem has
| actually appeared in some mathematical competition before now,
| which would of course pollute the training data if Aristotle
| [Ed.: the clanker's name] had seen this solution already
| written up somewhere."
| smcl wrote:
| See, this is one of the reasons I struggle to get on board the
| AI hype train. Any time I've seen some breathless claim about
| its capabilities that feels a bit too good to be true, someone
| with knowledge in the domain takes a closer look and it turns
| out to have been exaggerated, meant to draw eyeballs and
| investors to some fledgling AI company.
|
| I just feel like if we were genuinely on the cusp of an AI
| revolution like it is claimed, we wouldn't need to keep seeing
| this sort of thing. Like I feel like a lot of the industry is
| full of flim-flam men trying to scam people, and if the tech
| was as capable as we keep getting told it is there'd be no need
| for dishonesty or sleight of hand.
| jimkleiber wrote:
| The crypto train kinda ran out of steam, so all aboard the AI
| train.
|
| That being said, I think AI has a lot more immediately useful
| applications than cryptocurrency. But it does feel a bit
| overhyped by people who stand to gain a tremendous amount of
| money.
|
| I might get slammed/downvoted on HN for this, but I'm really
| wondering how much of VC is filled with get-rich-quick
| cheerleading vs. supporting products that will create strong
| and lasting growth.
| _heimdall wrote:
| I don't think you really need to wonder about how much is
| cheerleading. Effectively all public statements by VCs will
| be cheerleading for companies they have already invested in.
|
| The more interesting one is the closed-door conversations.
| Earlier this year, for example, there seemed to be a
| pattern of VCs heavily invested in AI asking the other
| software companies they invested in to figure out how to
| make AI useful for them and report back. I.e. "we invested
| heavily in hype, tell us how to make it real."
| soulofmischief wrote:
| From my perspective, having worked in both industries and
| simply followed my passions and opportunities, all I see
| is that the same two bandwagons who latched onto crypto,
| either to grift or just to egotistically talk shit, have moved
| over to the latest technological breakthrough. Meanwhile,
| those of us silently working on interesting things are
| constantly rolling our eyes at comments from both sides of
| the peanut gallery.
| MangoToupe wrote:
| The thing is, we genuinely _are_ going through an AI
| revolution. I don't even think that's that breathless of a
| claim. The contention is over whether it's about to
| revolutionize our _economy_, which is a far harder claim to
| substantiate and _should_ be largely self-substantiating if
| it is going to happen.
| andrepd wrote:
| If OpenAI saw an imminent path to AGI in 6 months (or in 5
| years for that matter) they would not be pivoting to become a
| banal big tech ad company.
|
| Short AI and tech, and just hope you get the timing right.
| encyclopedism wrote:
| I have commented elsewhere but this bears repeating
|
| If you had enough paper and ink and the patience to go
| through it, you could take all the training data and manually
| step through and train the same model. Then, once you had
| trained the model, you could use even more pen and paper to
| step through the correct prompts to arrive at the answer. All
| of this would be a completely mechanical process. This really
| does bear thinking about. It's amazing the results that LLMs
| are able to achieve. But let's not kid ourselves and start
| throwing about terms like AGI or emergence just yet. It makes
| a mechanical process seem magical (as do computers in
| general).
|
| I should add that it also makes sense why this works: just
| look at the volume of human knowledge (the training data).
| It's the training data, with quite literally the mass of
| mankind's knowledge, genius, logic, inferences, language, and
| intellect, that does the heavy lifting.
| WhyOhWhyQ wrote:
| I predict that the way AI will be useful to science, from the
| perspective of mathematics, is by figuring out combinatorially
| complex solutions to problems that would otherwise not be
| interesting to (or would be far too complex to be comprehended
| by) humans. With such capabilities, one can imagine the AI
| being useful for designing super materials, doing fancy things
| with biology/medicine, and generally finding useful patterns in
| complex systems.
| estebarb wrote:
| However, they would at most be the heuristic function of a
| search mechanism. A good heuristic, but a heuristic at most.
| For search, we need to identify when to abandon a path and
| which other entry point is promising. I'm not sure our current
| techniques are good for this kind of problem.
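|
| (To make that concrete - a hypothetical best-first-search
| skeleton where the model only supplies the scoring; `llm_score`,
| `expand`, and `is_goal` are made-up placeholders:)
|
|     import heapq
|
|     def best_first(start, expand, is_goal, llm_score, budget=10000):
|         # lower score = more promising; note that abandoning a
|         # path happens only implicitly, via queue order + budget
|         frontier = [(llm_score(start), 0, start)]
|         tie = 1  # tie-breaker so states are never compared
|         for _ in range(budget):
|             if not frontier:
|                 return None
|             _, _, state = heapq.heappop(frontier)
|             if is_goal(state):
|                 return state
|             for nxt in expand(state):
|                 heapq.heappush(frontier, (llm_score(nxt), tie, nxt))
|                 tie += 1
|         return None  # budget exhausted: no principled "give up"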
| bluecalm wrote:
| I think it will be even more helpful to know that a simple
| proof doesn't exist because AI has tried for long enough and
| didn't find one. Once people know there is no easy proof of,
| say, Collatz or the Twin Primes Conjecture, those will not be
| as alluring to waste time on.
| Someone wrote:
| But abstract mathematics doesn't care about solutions to
| problems; it cares about understanding problem spaces. I do not
| think current AI helps with that.
|
| Problems like the one discussed aren't interesting to
| applied mathematicians either, for lack of
| applications.
|
| But yes, if this kind of AI produces new materials, cures
| diseases, etc., they will be very useful. We wouldn't care
| whether they arrived at those solutions through valid
| reasoning, though. A sloppy AI that has better
| 'guesses/intuitions' than humans or that can guess and check
| 'guesses/intuitions' for correctness faster would be very
| useful.
| adastra22 wrote:
| And engineers don't care about abstract mathematics: we care
| about math that solves problems. Being able to solve more
| problems with less human-years of effort is a big win.
| newmana wrote:
| "Recently, yet another category of low-hanging fruit has been
| identified as within reach of automated tools: problems which,
| due to a technical flaw in their description, are unexpectedly
| easy to resolve. Specifically, problem #124
| https://www.erdosproblems.com/124 was a problem that was stated
| in three separate papers of Erdos, but in two of them he omitted
| a key hypothesis which made the problem a consequence of a known
| result (Brown's criterion). However, this fact was not noticed
| until Boris Alexeev applied the Aristotle tool to this problem,
| which autonomously located (and formalized in Lean) a solution to
| this weaker version of the problem within hours."
|
| https://mathstodon.xyz/@tao/115639984077620023
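|
| (For reference, Brown's criterion as usually stated: a
| nondecreasing sequence of positive integers is complete - every
| positive integer is a subset sum - iff it starts at 1 and each
| term is at most one more than the sum of all earlier terms. A
| quick sketch of the check, mine rather than Tao's:)
|
|     def satisfies_browns_criterion(seq):
|         # seq: nondecreasing positive integers
|         partial = 0
|         for a in seq:
|             if a > partial + 1:  # gap: partial + 1 is unreachable
|                 return False
|             partial += a
|         return True
|
|     print(satisfies_browns_criterion([1, 2, 4, 8, 16]))  # True
|     print(satisfies_browns_criterion([1, 2, 5]))  # False: 4 missed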
| adastra22 wrote:
| That doesn't seem very fair. The problem was stated, and
| remained unsolved for all this time. You can't take away that
| accomplishment just because the solution seems easy in
| hindsight.
| zozbot234 wrote:
| It's technically true that this version of the problem was
| "low-hanging fruit", so that's an entirely fair assessment.
| Systematically spotting low-hanging fruit that others had
| missed _is_ an accomplishment, but it's quite different from
| solving a genuinely hard problem and we shouldn't conflate
| the two.
| adastra22 wrote:
| My point is stronger than that. Some things only appear to be
| low-hanging fruit in hindsight. My own field of physics is full
| of examples. Saying "oh, that should've been easy" is wrong
| more often than it is right.
| zozbot234 wrote:
| Sure, but unless _all_ solvable problems can be said to
| "appear as low-hanging fruit in hindsight" this doesn't
| detract from Tao's assessment in any way. Solving a
| genuinely complex problem is a different matter than
| spotting simpler solutions that others had missed.
|
| In this case, the solution might have been missed before
| simply because the difference between the "easy" and
| "hard" formulations of the problem wasn't quite clear,
| including perhaps to Erdos, prior to it being restated
| (manually) as a Lean goal to be solved. So this is a
| success story for formalization as much as AI.
| menaerus wrote:
| One of the math academics on that thread says the
| following:
|
| > My point is that basic ideas reappear at many places;
| humans often fail to realize that they apply in a
| different setting, while a machine doesn't have this
| problem! I remember seeing this problem before and
| thinking about it briefly. I admit that I haven't noticed
| this connection, which is only now quite obvious to me!
|
| Doesn't this sound extremely familiar to all of us who
| were taught difficult/abstract topics? Looking at the
| problem, you don't have the slightest idea what it is about,
| but then someone comes along and explains the inner bits
| and it suddenly "clicks" for you.
|
| So, yeah, this is exactly what I think is happening here.
| The solution was there, and it was simple, but nobody
| discovered it until now. And now that we have an
| explanation for it we say, "oh, it was really simple".
|
| The bit which makes it very interesting is that this
| hadn't been discovered before and now it has been
| discovered by an AI model.
|
| Tao challenges this by hypothesizing that it actually was
| done before but never "released" officially, which is
| why the model was able to solve the problem. However,
| there's no evidence (yet) for his hypothesis.
| magicalist wrote:
| Is your argument that Terence Tao says it was a
| consequence of a known result and he categorizes it as
| low-hanging fruit, but to you it _feels_ like one of
| those things that's only obvious in retrospect after
| it's explained to you, and without "evidence" of Tao's
| claim, you're going to go with your vibes?
|
| What exactly would constitute evidence?
| menaerus wrote:
| Known result from where? I failed to see the reference
| where it was proven before. Maybe I missed it. Do you
| have a reference?
| adastra22 wrote:
| "A consequence of a known result" describes literally every
| proof.
| ak_111 wrote:
| It's a combination of the problem appearing to be low-
| hanging fruit in hindsight and the fact that _almost
| nobody actually seemed to have checked whether it was
| low-hanging in the first place_. We know it's the latter
| because the problem was essentially uncited for the last
| three decades, and it didn't seem to have spread by word
| of mouth (spreading by word of mouth is how many
| interesting problems get spread in math).
|
| This is different from the situation you are talking
| about, where a problem genuinely appears difficult,
| attracts sustained attention, and is cited repeatedly as
| many people attempt partial results or variations. Then
| eventually someone discovers a surprisingly simple
| solution to the original problem which basically makes
| all the previous papers look ridiculous in hindsight.
|
| In those cases, the problem only looks "easy" in
| hindsight, and the solution is rightly celebrated because
| there is clear evidence that many competent
| mathematicians tried and failed before.
|
| Is there any evidence that this problem was ever
| attempted by a serious mathematician?
| vatsachak wrote:
| I know the problem the AI solved wasn't that hard, but one
| should consider math toast within the next 10-15 years.
|
| Software engineering is more subtle since you actually care about
| the performance and semantics of a program. Unless you get really
| good at program specification, it would be difficult to fully
| automate.
|
| But with math the only thing you care about is whether the moves
| you made were right.
| emil-lp wrote:
| > one should consider math toast within the next 10-15 years.
|
| I keep a list of ridiculously failed predictions, and this goes
| into it.
|
| Can I ask you to be more concrete? What does "math is toast"
| mean to you?
| n4r9 wrote:
| I'd like to see some other instances from your list.
| harshitaneja wrote:
| The amount of goal-post shifting is so amusing to see. Yes,
| sure, this was probably not an "important" or a particularly
| "challenging" problem, despite having been open for a while.
| Sure, maybe it remained open because it didn't get enough
| eyeballs from the right people who cared to spend time on it.
| Yes, there is too much overhyping and we are all somewhat
| tired of it. Still, if someone had told me 10 years ago that
| we would get "AI" to a stage where it can solve olympiad-level
| problems and win gold medals at the IMO, taking its input not
| in some rigidly structured form but in our complex, messy,
| natural human language, while also interpreting (to various
| degrees of what "interpreting" means) image and video data,
| and doing all of this in almost real time, I would have called
| them nuts and such a timeline sci-fi. So some part of me finds
| it crazy how quickly we have normalized to this new reality.
|
| A reality in which we are debating whether the problem solved
| by the automated model, using formal verification, was too
| easy.
|
| Don't get me wrong, I am not saying any of this means we get
| AGI or anything, or even that we will continue to see
| improvements. But we can still appreciate things. It doesn't
| need to be binary. What a time to be alive!
| Davidzheng wrote:
| Yeah, I agree. As a mathematician, it's easy to say that it's
| not a super hard proof. But then again, the first proofs found
| by running AI on a large bucket of open problems were always
| going to be easy proofs on not-very-worked-on problems.
| The fact that no one did it before them definitely shows real
| progress being made. When you're an expert, it's easy to lose
| track of the fact that things you consider trivial vs. very
| hard may in fact be a very small distance apart in the grand
| scheme of things (relative to entities of different orders of
| magnitude of strength).
| magicalist wrote:
| > _The amount of goal post shifting is so amusing to see_
|
| Can you be specific about the goal posts being shifted? Like
| the specific comments you're referring to here. Maybe I'm just
| falling for the bait, but non-specific claims like this seem
| designed just to annoy while offering nothing specific to
| converse about.
|
| I got to the end of your comment and counting all the claims
| you discounted, the only goal post I see left is that people
| aren't using a sufficiently excited tone while sifting fact
| from hype? A lot of us follow this work pretty closely and
| don't feel the need to start every post with "there is no need
| for excitement to abate, still exciting! but...".
|
| > _I am not saying any of this means we get AGI or something or
| even if we continue to see improvements. We can still
| appreciate things. It doesn 't need to be a binary._
|
| You'll note, however, that the hype guys happily include
| statements like "Vibe proving is here" in their posts with no
| nuance, all binary. Why not call them out?
| BigParm wrote:
| Maybe the Turing test, for one. Maybe others.
| mrguyorama wrote:
| Except nobody ever actually considered the "Turing test" to
| be anything other than a curiosity in the early days of a
| certain branch of philosophy.
|
| If the Turing test is a goal, then we passed it 60 years
| ago and AGI has been here since the LISP days. If the
| Turing test is not a goal (which is the correct
| interpretation), nobody should care what a random nobody
| thinks about an LLM "passing" it.
|
| "LLMs pass the Turing test so they are intelligent (or
| whatever)" is not a valid argument, full stop, because "the
| Turing test" was never a real thing meant to actually
| tell the difference between human intelligence and
| artificial intelligence; it was never formalized, and
| never evaluated for its ability to do so. The entire point
| of the Turing test was to _be part of a conversation about
| thinking machines_ in a world where that was an interesting
| proposition.
|
| The only people who ever took the Turing test as a "goal"
| were the misinformed public. Again, that interpretation of
| the Turing test has been passed by things like ELIZA and
| _markov chain based IRC bots_.
| zug_zug wrote:
| Well, there's a comment here saying "I won't consider it
| 'true' AI until it solves all millennium problems"... That
| goalpost seems to be defining AI as not only human-level but
| superhuman-level (e.g. one-in-a-million-level intellect or
| harder).
| nitwit005 wrote:
| That's the only such comment I found. The amount of goalpost
| moving would seem to be 1.
| Libidinalecon wrote:
| I am rather pro-AI in general, but I can't imagine what I
| would have thought in 2015 if you had told me we would have AI
| that could solve an Erdos problem from natural language but
| couldn't answer my work emails.
|
| It actually doesn't help me at all with anything at my job,
| and I really wish it could.
|
| That isn't really goal-post moving so much as a very strange
| shape for the goal post.
| jvanderbot wrote:
| The most amazing thing about AI is how it's amazing and
| disappointing at the same time.
|
| Everything is amazing and nobody is happy:
| https://www.youtube.com/watch?v=nUBtKNzoKZ4
|
| "How quickly the world owes him something, he knew existed only
| 10 seconds ago"
| kelseyfrog wrote:
| An Erdos problem is great, but all major LLMs still score 0 out
| of 7 on the Millennium Prize Problems[1]. I, and I'm sure others
| like me, won't consider it "true" AI until AI saturates the
| MillenniumPrizeProblemBench.
|
| 1. https://www.claymath.org/millennium-problems/
| jfengel wrote:
| That's interesting, because what I'm looking for from AI is the
| opposite. Rather than looking to do something humans find hard,
| I'm more interested in the problems that humans find easy, like
| driving a car or counting the number of "r"s in a word.
|
| Having it go beyond that to do things that we can't do would be
| a nice bonus. But given the way AI has been going, I'd say
| there's a decent chance that it solves a Millennium Prize
| problem before it solves driving.
___________________________________________________________________
(page generated 2025-12-01 23:02 UTC)