[HN Gopher] 'A-team' of math proves a critical link between addi...
___________________________________________________________________
'A-team' of math proves a critical link between addition and sets
Author : digital55
Score : 137 points
Date : 2023-12-06 15:55 UTC (2 days ago)
(HTM) web link (www.quantamagazine.org)
(TXT) w3m dump (www.quantamagazine.org)
| blowski wrote:
| The maths goes over my head, but this paragraph was very
| interesting:
|
| > Tao then kicked off an effort to formalize the proof in Lean, a
| programming language that helps mathematicians verify theorems.
| In just a few weeks, that effort succeeded. Early Tuesday morning
| of December 5, Tao announced that Lean had proved the conjecture
| without any "sorrys" -- the standard statement that appears when
| the computer can't verify a certain step. This is the highest-
| profile use of such verification tools since 2021, and marks an
| inflection point in the ways mathematicians write proofs in terms
| a computer can understand. If these tools become easy enough for
| mathematicians to use, they might be able to substitute for the
| often prolonged and onerous peer review process, said Gowers.
| robertlagrant wrote:
| > If these tools become easy enough for mathematicians to use
|
| Sick burn, Gowers.
| gowld wrote:
| It's honest constructive criticism. Lean is very hard to use.
| It's even hard to install and run.
| dumbo-octopus wrote:
| Elan makes it actually quite easy. It's among the better
| package managers out there, comparable to cargo.
|
| There's a single bash script to install both the runtime
| and the package manager on a variety of platforms, and the
| integrated build system and native module bundler is both
| powerful and isomorphic, so all you need to learn is Lean.
| Without any prior knowledge of Lean I was able to patch
| existing build scripts to support platform specific C
| extensions in less than half an hour.
| kmill wrote:
| I helped a very little with the formalization (I filled in some
| trivial algebraic manipulations, and I was there for some Lean
| technical support).
|
| It's exciting to see how quickly the work was done, but it's
| worth keeping in mind that a top mathematician leading a
| formalization effort is very exciting, so he could very easily
| scramble a team of around 20 experienced Lean users.
|
| There aren't enough experienced Lean users to go around yet for
| any old project, so Gowers's point about ease of use is an
| important one.
|
| Something that was necessary for the success of this was years
| of development that had already been done for both Lean and
| mathlib. It's reassuring that mathlib is being developed in
| such a way that new mathematics can be formalized using it.
| Like usual though, there was plenty missing. I think this drove
| a few thousand more lines of general probability theory
| development.
| cs702 wrote:
| _> I helped a very little..._
|
| Thank You for helping with the effort, even if it was only "a
| very little."
|
| I love coming across comments like yours on HN.
| ska wrote:
| >Something that was necessary for the success of this was
| years of development that had already been
|
| For projects like this it is often very thankless work in the
| beginning, and can be a real grind.
|
| You need at least one person with a vision of how cool it
| will be in the (poorly defined) future, and a lot of
| determination.
| zozbot234 wrote:
| > Something that was necessary for the success of this was
| years of development that had already been done for both Lean
| and mathlib
|
| Yes but the bulk of the work on this project was background
| as well. So the ease of use problem should be solving itself
| over time as more and more prereqs get filled in.
|
| BTW, I find it interesting that mathlib is apparently also
| getting refactored to accommodate constructive proofs better,
| as part of the porting effort to Lean 4. This might encourage
| more CS- and program-verification minded folks to join the
| effort, and maybe some folks in math-foundations too (though
| Lean suffers there by not being able to work with the
| homotopy-types axioms).
| kmill wrote:
| Yes, sure, and that's how mathlib usually gets developed,
| project by project and PRing missing background material.
| I'm one of the mathlib maintainers, and I wanted to be sure
| there's acknowledgment here for the years of Boubaki-style
| work among hundreds of volunteers to build up the
| mathematics. In a way, now that the paper's been formalized
| it's time for the really hard work, which is figuring out
| how to restructure it (and mathlib) so that it can easily
| support future work. That's writing libraries for you.
|
| Where are you getting the idea mathlib is being refactored
| for constructive proofs? I wouldn't say that's on the road
| map. In fact, the core Lean developers are uninterested in
| pursuing supporting this.
|
| Since you're here, are you able to say why constructive
| proofs are interesting to program verification or (T)CS
| people? I've never heard anyone in TCS even know what
| constructivity is -- Lean supports their case where you
| write a program/algorithm and then (with classical logic)
| prove correctness and other properties. What would
| constructive logic give them?
| kmill wrote:
| I'll mention that the "Lean way" for constructivity is
| that you write `def`s for constructions. These can depend
| on classical reasoning for well-typedness if you want
| (like for being sure a number is in range when indexing
| an array), but there is a computability checker. In
| particular, the compiler sees if there is a way to lower
| the `def` into some simply typed IR, and this IR can be
| further compiled to C. If you trust the compiler, then
| you can trust whether it's real construction.
|
| (Of course, you can also go and prove a definition
| evaluates to some value if you don't want to trust the
| evaluator.)
| zozbot234 wrote:
| > [W]rite a program/algorithm and then (with classical
| logic) prove correctness. What would constructive logic
| give them?
|
| Compared to the usual case of writing the "algorithm"
| (i.e. the construction/decision part) and "proof"
| separately, it basically offers better structuring and
| composability. Imagine writing an "algorithm" that itself
| relies for correctness on non-trivial preconditions, or
| must maintain some invariants etc. Proving correctness of
| a complete development that might involve wiring up
| several of these together is quite possible if you can
| use constructive proofs, but becomes quite tedious if you
| have to work with the two facets separately.
|
| > Where are you getting the idea mathlib is being
| refactored for constructive proofs?
|
| It's not being refactored _for_ constructive proofs, but
| proofs that do use classical reasoning are being marked
| as such in a more fine-grained fashion rather than
| leaving it as the default for the entire development.
| Which makes it at least possible to add more constructive
| content.
| kmill wrote:
| Yeah, I can imagine that, and I've used sigma types to
| carry it out (`Subtype`s in particular, or perhaps custom
| structures). Why does the proof in the sigma type have to
| be constructive?
|
| Certainly this is the road to "dependently typed hell"
| and it's good to avoid mingling proofs with data if you
| can, but avoiding the law of the excluded middle doesn't
| make any of that any easier.
|
| This is also a bit of a distraction since what I really
| mean regarding TCS is that in my experience is that they
| do not see the proofs themselves as being objects-with-
| properties. They're perfectly happy with non-constructive
| reasoning to support constructions. I am looking to be
| convinced that constructive logic is worth it, but this
| mathematician hasn't seen anything compelling beyond work
| on foundations themselves.
|
| Regarding mathlib, could you say what in particular
| you're talking about, since it's not ringing a bell, and
| I follow the changelogs.
| zozbot234 wrote:
| > Regarding mathlib, could you say what in particular
| you're talking about, since it's not ringing a bell, and
| I follow the changelogs.
|
| There's been several commits where general appeals to
| classical reasoning in some development have been
| replaced by more fine-grained assumptions of decidability
| for some objects, or marking uses of classical reasoning
| in some specific proofs. This has not happened throughout
| mathlib of course, just in some specific places. But it
| might still show one way of making the development a bit
| friendlier to alternate foundations.
|
| > ...but avoiding the law of the excluded middle doesn't
| make any of that any easier.
|
| Constructivist developments are not about "avoiding the
| law of excluded middle". They're more about leveraging
| the way we _already_ structure mathematical proofs (and
| mathematical reasoning more generally) to make it easier
| to understand where simple direct reasoning _has_ in fact
| been used, and a theorem can thus be said to be useful
| for such purposes. If all we did was proofs by
| contradiction and excluded middle, there would be no
| point to it - but direct proof is actually rather common.
|
| > Why does the proof in the sigma type have to be
| constructive?
|
| It doesn't, _if_ the constructibility part has been
| stated separately (e.g. stating that some property is
| decidable, or that some object is constructible). That 's
| in fact how one can "write the program/algorithm and the
| logic separately" in a principled way.
| kmill wrote:
| Many times, Decidable assumptions are added simply
| because they appear in the terms in a theorem statement,
| and doing so makes applying such a theorem easier.
| There's the technical annoyance that Decidable instances
| are equal but not necessarily defeq, so if the wrong
| instances are present (such as the classical ones) then
| you need simp-like automation to rewrite instances before
| unifying. Maybe this is what you're seeing?
|
| There have also been attempts to make polynomials
| computable, which peppers everything with decidability,
| but it's not a good data type for real large-scale
| computations, so it's not clear what the future is there.
| Maybe the defeqs are worth it, I don't know.
|
| Re constructibility, this is all at too high of a level
| to really know what you're talking about, or why it's
| beneficial to write proofs themselves in a certain way.
| I'm not really even sure what you mean by "constructive".
| To me, I see no problem with writing a recursive
| definition that requires a non-constructive proof of
| termination by well-founded recursion -- is that
| constructive to you?
| brap wrote:
| Why did the verification step take so long? I imagine just
| verifying proofs is very efficient, no?
|
| Or do they mean formalizing the proof in Lean is what took
| weeks?
| adastra22 wrote:
| They're talking about formalizing the proof (aka writing Lean
| "code").
| spadufed wrote:
| If anybody's interested in learning more about Lean, he's been
| posting his experiences with the project over at
| @tao@mathstodon.xyz
| Affric wrote:
| On your opening sentence against this one from the article:
|
| > feels like really one of the most basic things that we didn't
| understand
|
| Nothing can prepare us for the depth of mathematics.
| owlbite wrote:
| 10 years ago a crack professor unit was defunded by an academic
| tribunal for misconduct they didn't commit. These men promptly
| escaped from a maximum teaching schedule to the research
| underground. Today, still wanted by the activist community, they
| survive as proovers of fortune. If you have a problem that no one
| else can solve, and if you can find them, maybe you can hire The
| A-Team.
| Agingcoder wrote:
| That's funny thanks :-)
| 082349872349872 wrote:
| My favourite bit of every episode was the iconic oplisis when
| Prof. Dr. P.H.D. Baracus first adds structure preserving maps
| to every object in sight then uses the snake lemma to weld them
| together with connecting homomorphisms. "I pity the fool who
| doesn't chase diagrams"
| CobrastanJorji wrote:
| I got a little sick of the recurring joke where Prof. Dr.
| Baracus would complain about flat, two-dimensional, infinite
| surfaces, and then the team would trick him into using one
| anyway.
| jfengel wrote:
| Goddamn that's a long way to go to make that joke.
|
| Worth the journey, but those are brain cells I haven't used
| in decades. I really wish I could garbage-collect some of
| the 1980s pop culture neurons and put them to better use.
| shrimp_emoji wrote:
| I feel like this about Call of Duty maps.
| sumtechguy wrote:
| A lone mathematician battles the forces of evil with the
| help of a virtually indestructible and artificially
| intelligent assistant CGPT.
| gosub100 wrote:
| I love it when a discrete manifold comes together!
| kleiba wrote:
| What do you think, how long until we can give conjectures to a
| specially trained LLM to come up with a proof?
| brap wrote:
| My uneducated hunch is that we're just a few years away from
| "proof search" being a solved problem. But that's only a part
| of mathematical research.
| prmph wrote:
| I think you are much too optimistic.
|
| Proof verification, though still hard to automate, seems at
| least tractable. I'm not a mathematician by any means, but
| something tells me automated proof search, in the general
| case, would require solving the halting problem, which means
| it is impossible even in principle.
| dumbo-octopus wrote:
| In the general case, proof search is obviously impossible,
| for the reason you state. But the case need not be general.
| A proof search machine that takes in a theorem and outputs
| `"proof: ..." | "antiProof: ..." | "idk"` would be quite
| powerful indeed.
| gpm wrote:
| def decider(_): print("idk")
|
| Done ;)
|
| ---
|
| More seriously, the benchmark would be something along
| the lines of "and it is better at finding proofs than
| humans are", like how chess engines haven't solved chess,
| but they are so much better than humans that they always
| win in a competition between the two.
|
| There's no good reason to think that humans are capable
| of finding proofs that are hard to find algorithmically.
| brap wrote:
| >There's no good reason to think that humans are capable
| of finding proofs that are hard to find algorithmically
|
| That's really all I was trying to say, not sure why I'm
| getting downvoted
| dellamonica wrote:
| It doesn't require anything like that.
|
| Math proofs are of NP complexity. If you had access to a
| non deterministic Turing machine you could enumerate all
| possible proofs of a given length and check them all in
| poly time.
|
| That does not say anything about LLMs though. Personally, I
| believe they could be quite helpful to mathematicians in a
| way similar to copilot for software programming.
| explaininjs wrote:
| > you could enumerate all possible proofs of a given
| length
|
| That does not help us with proof search as we do not know
| the target length.
| bawolff wrote:
| A proof longer than the size of the universe is also
| pretty useless and probably not something we need to
| worry about.
|
| Like i guess you are saying we couldn't really use such a
| machine to determine whether a certain conjecture just
| has a very long proof/disproof or is actually
| undecidable. Which sure, but umm i think that is the sort
| of problem most mathematicians would love to have.
|
| The real reason non-deterministic turing machines can't
| help us is that they dont actually exist.
| dellamonica wrote:
| Of course it would, you would enumerate lengths too. If
| the lengths need to be larger than polynomially bounded
| then we can be sure it would never be found by a human
| anyway.
| Jweb_Guru wrote:
| > Math proofs are of NP complexity.
|
| This is only true for very specific kinds of proofs, and
| doesn't apply to stuff that uses CiC like Lean 4. This is
| because many proof steps proceed by conversion, which can
| require running programs of _extremely_ high complexity
| (much higher than exponential) in order to determine
| whether two terms are equal. If this weren 't the case,
| it would be much much easier to prove things like
| termination of proof verification in CiC (which is
| considered a very difficult problem that requires appeal
| to large cardinal axioms!). There _are_ formalisms where
| verifying each proof steps is much lower complexity, but
| these can be proven to have exponentially (or greater)
| longer proofs on some problems (whether these cases are
| relevant in practice is often debated, but I do think the
| amount of real mathematics that 's been formalized in
| CiC-based provers suggests that the extra power can be
| useful).
| zozbot234 wrote:
| Yes, executing a calculation as part of a proof after
| separately proving that it is the "right" calculation to
| solve the problem is quite a common way of proceeding.
| Even common things like algebraic simplifications of all
| kinds (including solving equations, etc.) can be seen as
| instances of this, but this kind of thing has notably
| come up in the solution of famous problems like the
| 4-color problem, or the Kepler conjecture - both of which
| have been computer-verified.
| dellamonica wrote:
| This is all very interesting but it seems that we're just
| taking different views on what is the instance size. If
| it is the length of the theorem statement in some
| suitable encoding and the goal is to find a proof, of any
| possible length, then yeah, this is way too hard.
|
| I'm taking the view that the (max) length of the proof
| can be taken as a parameter for the complexity because
| anything too long would not have any chance of being
| found by a human. It may also not be trusted by
| mathematicians anyway... do you know if the hardware is
| bug free, the compiler is 100% correct and no cosmic
| particle corrupted some part of your exponential length
| proof? It's a tough sell.
| waynecochran wrote:
| Neither RH nor P vs NP will succumb.
| vakkermans wrote:
| Finding a proof is a search in a large space, akin to searching
| from abstractions to concretions. LLMs don't do anything like
| this, and so you're looking at the planning problem again. It's
| not clear to me how framing this particular problem as a
| language problem is helpful in any case.
| thisismyswamp wrote:
| Playing chess & go is also search in a large tree of moves
| leading to particular game states
| pxeger1 wrote:
| But AlphaGo etc don't use any kind of language-based AI, so
| LLMs (which this thread was about) are no good.
| thisismyswamp wrote:
| The next step seems to be applying past advances in
| reinforcement learning with modern transformer based
| models
| mattsan wrote:
| Which multiple teams are working on - OpenAI (Q*), and
| Meta just released a reinforcement learning framework
| greysphere wrote:
| The final state in chess is a single* state which yes, then
| branches out to N checkmate configurations and then N*M
| one-move-from-checkmates, and so on. (*Technically it's
| won/lost/draw.)
|
| The equivalent final state in theorem proving is unique to
| each theorem so such a system would need to handle an
| additional layer-of-generalization.
| sfink wrote:
| Perhaps an LLM could produce a plausible-looking proof that
| is completely and utterly incorrect.
|
| As a party trick.
| generalizations wrote:
| Wouldn't that be the point of pairing it with Lean? You
| wouldn't get false positives.
| Sharlin wrote:
| Doesn't mean you'd get true positives either. Garbage in,
| garbage out.
| bawolff wrote:
| At the same level of success as a mathematician? Never.
|
| At the same level of success as your average high school
| graduate? Probably already can.
| kaba0 wrote:
| I assume an average high school graduate can solve a sudoku.
| LLMs are very interesting but anything "thinking"-related is
| not their strong suit.
| auntienomen wrote:
| Open conjectures are usually solved by inventing new
| mathematical frameworks that embed the context of the existing
| question. Inventing new definitions is trivial, but there's no
| rules that constrain the space of moves. And inventing
| _interesting_ new definitions is extremely hard. I wouldn't bet
| on LLMs doing this ever.
| auntienomen wrote:
| What would be nice is an LLM to aid in translation between
| mathematics and a theorem proving language like Lean. That
| might fuel a revolution in how mathematics is done.
| kaba0 wrote:
| For trivial stuff, it might be able to even now (but those can
| probably be solved algorithmically as well). For more complex
| stuff, they are not even scratching the surface, I would
| believe -- LLMs can't really do long, complex
| thoughts/inferences, which are essential for coming up with
| proofs, or even just to solve a sudoku (which they can't do --
| no, writing a program (which was likely part of its training
| set) and executing that doesn't count).
| llwu wrote:
| I contributed a few trivial proofs to this project, and I tried
| enlisting GPT-4, Copilot, Moogle, and Morph Prover to help out
| (I did not try LLMStep or ReProver).
|
| Out of these:
|
| - GPT-4 answered maybe one or two questions about syntax
|
| - Copilot autocompleted maybe 0.5%
|
| - Moogle was helpful much more often but still often pointed to
| the wrong theorem in the right file; in most of those cases I
| could have just done go-to-def to get to the right file and
| scroll through it
|
| Note that these results are actually very good! And Terence Tao
| seems to have a positive opinion as well. But we're very far
| away from automatically proving conjectures IMO.
|
| I will say that Lean has great metaprogramming facilities to
| accept any type of AI innovation that might emerge. Currently I
| find the most helpful tactics to be `aesop?`, which is a
| classical search tactic that tries a bunch of substitutions and
| simplifications; and `exact?`, when you know there's a trivial
| proof but not sure of the name. But the UX for AI can be as
| simple as, for example, typing the `gptf` or `llmstep` tactic,
| and that is fully hooked into the Lean state (goals,
| hypotheses, etc). So there's a lot of opportunity for
| incremental progress in improving existing workflows (which
| consist of dispatching trivial manipulations with
| search/simplification tactics)
| srcreigh wrote:
| It will never happen. Some open math problems are equivalent to
| the halting problem; and we know that finite computers can't
| solve the halting problem.
| steego wrote:
| Check out this paper:
|
| https://leandojo.org/
|
| People have already trained models to assist suggestion
| tactics. They then linked it up to ChatGPT to interactively
| solve proofs.
|
| In this scenario, ChatGPT asks the model for tactic
| suggestions, applies it to the proof and uses the feedback from
| Lean to then proceed with the next step.
|
| FYI, The programmatic interface to Lean was written by an
| OpenAI employee who was on the Lean team a few years ago.
|
| Also, check out Lean's roadmap. They aspire to position Lean to
| becoming a target for LLMs because it has been designed for
| verification from the ground up.
|
| As math and compsci nerds contribute to mathlib, all of those
| proofs are also building up a huge corpus that will likely be
| leveraged for both verification and optimization.
|
| If AI can make verification a lot easier, then we're likely
| going to see verification change programming similarly to the
| way it changed electronics.
| logtempo wrote:
| Yesterday I was wondering if it's possible to have an AI that can
| "recreate" the mathematics as we know it, and even go further and
| explore unexplored paths.
|
| Though prduced by a video explaining how AI helped to find
| optimization in the computation of matrix multiplication.
| jmj wrote:
| I am working on that for my PhD!
___________________________________________________________________
(page generated 2023-12-08 23:00 UTC)