[HN Gopher] AlphaProof's Greatest Hits
___________________________________________________________________
AlphaProof's Greatest Hits
Author : rishicomplex
Score : 241 points
Date : 2024-11-17 17:20 UTC (1 day ago)
(HTM) web link (rishimehta.xyz)
(TXT) w3m dump (rishimehta.xyz)
| wslh wrote:
| If you were to bet on solving problems like "P versus NP" using
| these technologies combined with human augmentation (or vice
| versa), what time horizon would you put on achieving such a
| solution? I think we should assume that the solution is also
| expressible in the current language of math/logic.
| hiddencost wrote:
| No one is focused on those. They're much more focused on more
| rote problems.
|
| You might find them used to accelerate research math by helping
| mathematicians with lemmas, checking for errors, and formalizing
| proofs. That seems realistic in the next couple of years.
| nybsjytm wrote:
| There are some AI guys like Christian Szegedy who predict
| that AI will be a "superhuman mathematician," solving
| problems like the Riemann hypothesis, by the end of 2026. I
| don't take it very seriously, but that kind of
| prognostication is definitely out there.
| titanomachy wrote:
| I'm sure AI can "solve" the Riemann hypothesis already,
| since a human proved it and the proof is probably in its
| training data.
| nybsjytm wrote:
| No, nobody has proved it.
|
| Side point: there is no existing AI which can prove - for
| example - the Poincaré conjecture, even though that has
| already been proved. The details of the proof are far too
| dense for any present chatbot like ChatGPT to handle, and
| nothing like AlphaProof can do it either, since the scope of
| the proof is well beyond the reach of Lean or any other
| formal theorem-proving environment.
| Davidzheng wrote:
| what does this even mean? Surely an existing AI could
| regurgitate all of Perelman's arXiv papers if we trained
| them to do that. Are you trying to make a case that the
| AI doesn't understand the proof it's giving? Because then
| I think there's no clear goal-line.
| nybsjytm wrote:
| You don't even need AI to regurgitate Perelman's papers,
| you can do that in three lines of python.
|
| What I meant is that there's no AI you can ask to explain
| the details of Perelman's proof. For example, if there's
| a lemma or a delicate point in a proof that you don't
| understand, you can't ask an AI to clarify it.
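|
| A minimal sketch of those three lines, assuming the goal is
| just to dump Perelman's actual arXiv preprint pages:
|
|     # fetch and print the arXiv abstract pages of Perelman's three papers
|     import urllib.request
|     for arxiv_id in ["math/0211159", "math/0303109", "math/0307245"]:
|         print(urllib.request.urlopen(f"https://arxiv.org/abs/{arxiv_id}").read().decode())
|
| Regurgitation is cheap; explanation is the hard part.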
| Davidzheng wrote:
| Link to this prediction? The famous old prediction of
| Szegedy was IMO gold by 2026, and that one is basically
| confirmed, right? Personally, I think 2027/2028 is a
| break-even bet for a superhuman mathematician.
| nybsjytm wrote:
| I consider it unconfirmed until it happens! No idea where
| I saw it but it was probably on twitter.
| zarzavat wrote:
| Probably a bad example: P vs NP is the most likely of the
| millennium problems to be unsolvable, so the answer may be
| "never".
|
| I'll bet the most technical open problems will be the ones to
| fall first. What AIs lack in creativity they make up for in
| ability to absorb a large quantity of technical concepts.
| wslh wrote:
| Thank you for the response. I have a follow-up question:
| Could these AIs contribute to advancements in resolving the P
| vs NP problem? I recall that the solution to Fermat's Last
| Theorem relied on significant progress in elliptic curves.
| Could we now say that these AI systems might play a similar
| role in advancing our understanding of P vs NP?
| ccppurcell wrote:
| Just my guess as a mathematician. But if LLMs are good for
| anything it will be for finding surprising connections and
| applying our existing tools in ways beyond human search.
| There's a huge space of tools and problems, and human
| intuition and brute force searching can only go so far. I
| can imagine that LLMs might start to find combinatorial
| proofs of topological theorems, maybe even novel theorems.
| Or vice versa. But I find it difficult to imagine them
| inventing new tools and objects that are really useful.
| nemonemo wrote:
| Do you have a past example of an already-proven theorem or
| newly invented tool/object that would have been possible
| only for a human and not an AI? Any such example would make
| your argument much more approachable for non-mathematicians.
| meindnoch wrote:
| Ok, then the AI should formally prove that it's "unsolvable"
| (however you meant it).
| less_less wrote:
| Unsolvable would mean that no proof exists, and no disproof
| exists, in whatever axiom system (eg ZF). While in some
| cases you can hack around this by proving eg "no proof and
| no disproof exist in ZF, if ZF is consistent", IIUC that's
| not always possible. It's like asking "when will AI be
| strong enough that it can always decide correctly whether a
| given computer program halts?"
|
| Then again, it's fairly likely that P=?NP is decidable but
| we just don't have any idea how to prove it. In that case
| the question is more or less "what's the time horizon until
| AI is vastly better than humans at formal math?", to which
| the answer is certainly "we don't know, there may be
| obstacles, just wait and see".
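|
| For the halting analogy, the classic diagonalization sketch in
| Python, with halts() as the hypothetical perfect oracle:
|
|     def paradox():
|         if halts(paradox):   # hypothetical oracle says "paradox halts"
|             while True:      # ...so loop forever,
|                 pass
|         # ...otherwise halt immediately.
|
| Whatever halts(paradox) returns is wrong, so no such oracle can
| exist.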
| uptownfunk wrote:
| The hard part is in the creation of new math to solve these
| problems, not in the use of existing mathematics. So new objects
| (groups, rings, fields) etc. have to be theorized, their
| properties understood, and then that new machinery used to
| crack the existing problems. I think we will get to a place
| (around 5 years) where AI will be able to solve these problems
| and create these new objects. I don't think the obstacle is one
| of technology; I think it's more financial. Meaning, there isn't
| much money to be made doing this (try to justify it for
| yourself), and so the lack of focus here. I think this is a red
| herring and there is a gold mine in there somewhere, but it
| will likely take someone with a lot of cash to fund it out of
| passion (Vlad Tenev / Harmonic, or Zuck and Meta AI, or the
| Google / AlphaProof guys) but in the big tech world, they are
| just a minnow project in a sea of competing initiatives. And so
| that leaves us at the mercy of open research, which if it is a
| compute bound problem, is one that may take 10-20 years to
| crack. I hope I see a solution to RH in my lifetime (and in
| language that I can understand).
| wslh wrote:
| I understand that a group of motivated individuals, even
| without significant financial resources, could attempt to
| tackle these challenges, much like the way free and open-
| source software (FOSS) is developed. The key ingredients
| would be motivation and intelligence, as well as a shared
| passion for advancing mathematics and solving foundational
| problems.
| uptownfunk wrote:
| Ok, but how do you get around needing a 10k or 100k H100
| cluster?
| wslh wrote:
| It is well known that cloud services like Google Cloud
| subsidize some projects, and we don't even know what
| improvements may arise in a few years.
| uptownfunk wrote:
| Possible, but unlikely given how much demand there is and
| the pressure to deliver returns to shareholders; still, it
| is possible. Right now search is very inefficient: the
| search space is massive. That is the main problem. You can
| have many sequences of text that sound plausible, but of
| those a much smaller number will be logically valid. This
| is the main challenge. Once we can search efficiently not
| just in semantically valid space but also in what you could
| call syntactically valid space, we will be able to crack
| this.
| Davidzheng wrote:
| I think there are significant financial incentives for big
| tech, given the scarcity of benchmarks for intelligence which
| are not yet saturated.
| empath75 wrote:
| > I think we will get to a place (around 5 years) where AI
| will be able to solve these problems and create these new
| objects.
|
| For all we know, buried deep in AlphaProof's attempts to
| solve these toy problems, it already tried and discarded
| several new ideas.
| sbierwagen wrote:
| More information about the language used in the proofs:
| https://en.wikipedia.org/wiki/Lean_(proof_assistant)
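|
| For a flavor of the language, a toy Lean 4 example (a minimal
| sketch; AlphaProof's actual solutions are much longer):
|
|     -- addition on the naturals commutes, via the core library lemma
|     theorem my_add_comm (a b : Nat) : a + b = b + a :=
|       Nat.add_comm a b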
| sincerely wrote:
| in the first question, why do they even specify ⌊n⌋ (and
| ⌊2n⌋ and so on) when n is an integer?
| rishicomplex wrote:
| Alpha need not be an integer; we have to prove that it is.
| sincerely wrote:
| Should have read more carefully, thank you!
| Robotenomics wrote:
| "Only 5/509 participants solved P6"
| nybsjytm wrote:
| This has to come with an asterisk, which is that participants
| had approximately 90 minutes to work on each problem while
| AlphaProof computed for three days for each of the ones it
| solved. Looking at this problem specifically, I think that many
| participants could have solved P6 without the time limit.
|
| (I think you should be very skeptical of anyone who hypes
| AlphaProof without mentioning this - which is not to suggest
| that there's nothing there to hype)
| auggierose wrote:
| It is certainly interesting information that AlphaProof
| needed three days. But does it matter for evaluating the
| importance of this result? No.
| nybsjytm wrote:
| I agree that the result is important regardless. But the
| tradeoff of computing time/cost with problem complexity is
| hugely important to think about. Finding a proof in a
| formal language is trivially solvable in theory since you
| just have to search through possible proofs until you find
| one ending with the desired statement. The whole practical
| question is how much time it takes.
|
| Three days per problem is, by many standards, a
| 'reasonable' amount of time. However, there are still
| unanswered questions, notably that 'three days' is not
| really meaningful in and of itself. How parallelized was
| the computation; what was the hardware capacity? And how
| optimized is AlphaProof for IMO-type problems (problems
| which, among other things, all have short solutions using
| elementary tools)? These are standard kinds of critical
| questions to ask.
| dash2 wrote:
| Though, if you start solving problems that humans can't
| or haven't solved, then questions of capacity won't
| matter much. A speedup in the movement of the maths
| frontier would be worth many power stations.
| nybsjytm wrote:
| For some time a 'superhuman math AI' could be useful for
| company advertising and getting the attention of VCs. But
| eventually it would be pretty clear that innovative math
| research, with vanishingly few exceptions, isn't very
| useful for making revenue. (I am a mathematician and this
| is meant with nothing but respect for math research.)
| airstrike wrote:
| "Making revenue" is far from being the only metric by
| which we deem something worthy.
| nybsjytm wrote:
| As a mathematician, of course I agree. But in a sentence
| like:
|
| > A speedup in the movement of the maths frontier would
| be worth many power stations
|
| to whom is it 'worth' it? And to what end? I can say with
| some confidence that many (likely most, albeit certainly
| not all) mathematicians do not want data centers and
| power stations to guzzle energy and do their math for
| them. It's largely a vision imposed from without by
| Silicon Valley and Google research teams. What do they
| want it for and why is it (at least for now) "worth" it
| to them?
|
| Personally, I don't believe for a second that they want
| it for the good of the mathematical community. Of course,
| a few of their individual researchers might have their
| own personal and altruistic motivations; however I don't
| think this is so relevant.
| auggierose wrote:
| There is something called applied math, and there is a
| big gulf between applied math and pure math. This new
| technology has the potential to make much more "pure math"
| usable for applied math, unifying much of pure math,
| applied math, and programming. I don't really care about
| the RH; I care about that.
| empath75 wrote:
| The big exception being predicting market movements, and
| I can't imagine how much money the hedge funds are
| spending on this right now.
| auggierose wrote:
| > Finding a proof in a formal language is trivially
| solvable in theory since you just have to search through
| possible proofs until you find one ending with the
| desired statement. The whole practical question is how
| much time it takes.
|
| No, in my experience the whole practical question is, can
| it be found automatically, or can it not be found
| automatically? Because there is an exponential search
| space that conventional automated methods will not be
| able to chew through, it either works, or it doesn't.
| AlphaProof shows that for some difficult IMO problems, it
| works.
| andrepd wrote:
| More or less. Modern theorem provers, even fully automatic
| ones, can prove incredibly difficult problems if given
| enough time. With 3 days and terabytes of memory, perhaps
| they could? It would be interesting to compare AlphaProof
| with a standard theorem prover that is given similarly
| astronomical computing resources.
| auggierose wrote:
| That is an interesting thought. But I doubt that standard
| automated theorem proving techniques would have solved
| this in 100 years, even with immense resources, as all
| they do is (basically) brute force search, and it remains
| an exponential search space. Therefore 3 days does not
| really matter, indeed, 3 months would still be a result
| that 10 years ago I would have thought impossible.
| letitgo12345 wrote:
| I think more is made of this asterisk than necessary. It is
| quite possible that adding 10x more GPUs would have allowed
| it to solve it within the time limit.
| nybsjytm wrote:
| Very plausible, but that would also be noteworthy. As I've
| mentioned in some other comments here, (as far as I know)
| we outside of DeepMind don't know anything about the
| computing power required to run AlphaProof, and the
| tradeoff between computing power required and the
| complexity of problems it can address is really key to
| understanding how useful it might be.
| nybsjytm wrote:
| Why have they still not released a paper aside from a press
| release? I have to admit I still don't know how auspicious it is
| that running Google hardware for three days apiece was able to
| find half-page long solutions, given that the promise has always
| been to solve the Riemann hypothesis with the click of a button.
| But of course I do recognize that it's a big achievement relative
| to previous work in automatic theorem proving.
| whatshisface wrote:
| I don't know why so few people realize this, but by solving any
| of the problems at all, its performance is superhuman for most
| reasonable definitions of human.
|
| Talking about things like solving the Riemann hypothesis in so
| many years assumes a little too much about the difficulty of
| problems that we can't even begin to conceive of a solution
| for. A better question is what can happen when everybody has
| access to above average reasoning. Our society is structured
| around avoiding confronting people with difficult questions,
| except when they are intended to get the answer wrong.
| GregarianChild wrote:
| We know that any theorem that is provable at all (in the
| chosen foundation of mathematics) can be found by patiently
| enumerating all possible proofs. So, in order to evaluate
| AlphaProof's achievements, we'd need to know how much of a
| shortcut AlphaProof achieved. A good proxy for that would be
| the total energy usage for training and running AlphaProof. A
| moderate proxy for that would be the number of GPUs / TPUs
| that were run for 3 days. If it's somebody's laptop, it would
| be super impressive. If it's 1000s of TPUs, then less so.
| Onavo wrote:
| > _We know that any theorem that is provable at all (in the
| chosen foundation of mathematics) can be found by patiently
| enumerating all possible proofs._
|
| Which computer science theorem is this from?
| zeroonetwothree wrote:
| It's just an obvious statement. If a proof exists, you
| will eventually get to it.
| j16sdiz wrote:
| Only if we take AC, I guess?
| Tainnor wrote:
| No, this has nothing to do with choice.
| kadoban wrote:
| It's a direct consequence of the format of a proof.
| They're finite-length sequences of a finite alphabet of
| symbols, so of course they're enumerable (the same
| algorithm you use to count works to enumerate them).
|
| If you want a computer to be able to tell that it found a
| correct proof once it enumerates it, you need a bit more
| than that (really just the existence of automated proof
| checkers is enough for that).
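|
| Concretely, the counting argument as a Python sketch (candidate
| "proofs" are just strings; check_proof stands in for a real
| checker such as Lean's):
|
|     from itertools import count, product
|
|     def all_strings(alphabet):
|         # every string of length 1, then length 2, and so on
|         for n in count(1):
|             for chars in product(alphabet, repeat=n):
|                 yield "".join(chars)
|
|     # for candidate in all_strings(ALPHABET):
|     #     if check_proof(candidate, goal):  # hypothetical checker
|     #         return candidate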
| E_Bfx wrote:
| I guess it is tautological from the definition of
| "provable". A theorem is provable by definition if there
| is a finite sequence of well-formed formulas that has the
| theorem as its consequence
| (https://en.wikipedia.org/wiki/Theorem, paragraph "Theorem
| in logic").
| Xcelerate wrote:
| Not sure it's a tautology. It's not obvious that a
| recursively enumerable procedure exists for arbitrary
| formal systems that will eventually reach all theorems
| derivable via the axioms and transformation rules. For
| example, if you perform depth-first traversal, you will
| not reach all theorems.
|
| Hilbert's program was a (failed) attempt to determine,
| loosely speaking, whether there was a process or
| procedure that could discover all mathematical truths.
| Any theorem depends on the formal system you start with,
| but the deeper implicit question is: where do the axioms
| come from and can we discover all of them (answer:
| "unknown" and "no")?
| Tainnor wrote:
| It's "obvious" in the sense that it's a trivial corollary
| of the completeness theorem (so it wouldn't be true for
| second order logic, for example).
|
| Hilbert's program failed in no contradiction to what GP
| wrote because the language of FOL theorems is only
| _recursively enumerable_ and not _decidable_. It's
| obvious that something is true if you've found a proof,
| but if you haven't found a proof yet, is the theorem
| wrong or do you simply have to wait a little longer?
| Davidzheng wrote:
| The shortcut vs. enumeration gap is definitely enormous,
| right? Just take the average Shannon entropy raised to the
| exponent of the length, for example: on all of human compute
| that will probably take longer than the heat death (or
| whatever death) of the universe. (I'm assuming; I didn't
| bother to check, but it seems intuitively true by a margin.)
| zeroonetwothree wrote:
| Well, multiplying two large numbers instantly is a superhuman
| feat a calculator can do. I would hope we are going for a
| higher bar, like "useful". Let's see if this can provide
| proofs of novel results.
| nybsjytm wrote:
| It's worth emphasizing that it's been possible for years to
| use an automatic theorem prover to prove novel results. The
| whole problem is to get novel interesting results.
| whatshisface wrote:
| The ability to multiply two numbers superhumanly has
| already transformed society.
| rowanG077 wrote:
| That superhuman capability of "multiplying two large
| numbers instantly" has transformed human society like not
| even plumbing has. I really can't see how you could not
| consider this useful.
| nybsjytm wrote:
| > A better question is what can happen when everybody has
| access to above average reasoning. Our society is structured
| around avoiding confronting people with difficult questions,
| except when they are intended to get the answer wrong.
|
| What does this have to do with a hypothetical automatic
| theorem prover?
| whatshisface wrote:
| Logic is pretty much absent from our culture and daily
| life, but that could be due to its limited supply.
| nybsjytm wrote:
| Being logical in social life is pretty much completely
| different from being logical in a mathematical argument,
| especially in a formal theorem proving environment. (Just
| try to write any kind of cultural proposition in a formal
| language!)
| whatshisface wrote:
| That's the way things are now, but this regime came about
| when proving things took intense concentration and
| specialized skills that very few people had. Contrast
| going to look something up in a library with googling
| something during a conversation.
| GregarianChild wrote:
| Google stopped publishing interesting AI work after OpenAI
| took away their AI lead, mostly with tech that Google
| pioneered but did not monetise, like transformers.
|
| I imagine they are under pressure not to make this mistake
| again.
| throwaway713 wrote:
| Anyone else feel like mathematics is sort of the endgame? I.e.,
| once ML can do it better than humans, that's basically it?
| margorczynski wrote:
| I doubt it. Math has the property that you have a way to 100%
| verify that what you're doing is correct, with little cost (as
| is done with Lean). Most problems don't have anything close
| to that.
| exe34 wrote:
| to be fair, humans also have to run experiments to discover
| whether their models fit nature - AI will do it too.
| margorczynski wrote:
| These kind of experiments are many times orders of
| magnitude more costly (time, energy, money, safety, etc.)
| than verifying a mathematical proof with something like
| Lean. That's why many think math will be one of the first
| to crack with AI as there is a relatively cheap and fast
| feedback loop available.
| AlotOfReading wrote:
| Math doesn't have the property that you can verify everything
| you're doing is correct with little cost. Humans simply tend
| to prefer theorems and proofs that are simpler.
| thrance wrote:
| You _can_, in principle, formalize any correct
| mathematical proof and verify its validity procedurally
| with a "simple" algorithm that actually exists (see Coq,
| Lean...). Coming up with the proof is much harder, and
| deciding what to attempt to prove is even harder, though.
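|
| A sketch of what that looks like in practice, assuming a local
| Lean installation: write the candidate proof to a file and let
| the Lean binary type-check it; exit code zero means accepted.
|
|     import pathlib
|     import subprocess
|
|     def check_lean_proof(source: str) -> bool:
|         path = pathlib.Path("candidate.lean")
|         path.write_text(source)
|         # `lean candidate.lean` elaborates and type-checks the file
|         return subprocess.run(["lean", str(path)]).returncode == 0
|
| Verification is mechanical; coming up with `source` is the hard
| part.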
| AlotOfReading wrote:
| You can verify it with a simple algorithm, but that
| verification won't always be cheap. If it were, Curry-
| Howard would be incredibly uninteresting. It only _seems_
| cheap because we usually have little interest in
| exploring the expensive theorems. Sometimes we do, though,
| and get things like the four color theorem, whose proof
| verification amounts to a combinatorial search.
| raincole wrote:
| If "better than humans" means when you give it a real world
| problem, it gives you a mathematical model to describe it (and
| does it better than human experts), then yes, it's the end
| game.
|
| If it just solves a few formalized problems with formalized
| theorems, not so much. You can write a program that solves ALL
| the problems under formalized theorems already. It just runs
| very slowly.
| titanomachy wrote:
| I don't think you can gloss over the importance of
| computational tractability here. A human could also start
| enumerating every possible statement in ZFC, but clearly that
| doesn't make them a mathematician.
| technotony wrote:
| Yes, because if AI can do maths then it can use that to improve
| the efficiency/quality of its algorithms to self-improve...
| nybsjytm wrote:
| The quality of AI algorithms is not based on formal
| mathematics at all. (For example, I'm unaware of even one
| theorem relevant to going from GPT-1 to GPT-4.) Possibly in
| the future it'll be otherwise though.
| SkiFire13 wrote:
| ... or it might prove that it's impossible to self-improve
| given the current constraints
| GuB-42 wrote:
| Not the endgame by far. Maybe the endgame for LLMs, and I am
| not even convinced.
|
| Maths is detached from reality. An AI capable of doing math
| better than humans may not be able to drive a car, as driving a
| car requires a good understanding of the world. It has to
| recognize objects and understand their behavior, for example
| understanding that a tree won't move but a person might, though
| more slowly than another car. It has to understand the
| physics of the car: inertia, grip, control... It may even have
| to understand human psychology and make ethical choices.
|
| Fully autonomous robots would be the endgame.
| ykonstant wrote:
| >An AI capable of doing math better than humans may not be
| able to drive a car,
|
| Noo, but my excuse for being unable to drive a car is
| precisely that I am a quirky mathematician focused on
| research!
| jdietrich wrote:
| Humans are terrible at anything you learn at university and
| incredibly good at most things you learn at trade school. In
| absolute terms, mathematics is much easier than laying bricks
| or cutting hair.
|
| https://en.wikipedia.org/wiki/Moravec%27s_paradox
| youoy wrote:
| I would say that "narrow" mathematics (finding a proof of a
| given statement that we suspect has a proof using a formal
| language) is much easier than "generally" laying bricks or
| cutting hair.
|
| But I cannot see how consistently doing general mathematics
| (as in finding interesting and useful statements to prove,
| and then finding the proofs) is easier than consistently
| cutting hair/driving a car.
|
| We might get LLM level mathematics, but not Human level
| mathematics, in the same way that we can get LLM level films
| (something like Avengers, or the final season of GoT), but we
| are not going to get Human level films.
|
| I suspect that there are no general level mathematics without
| the geometric experience of humans, so for general level
| mathematics one has to go through perceptions and
| interactions with reality first. In that case, general
| mathematics is one level over "laying bricks or cutting
| hair", so more complex. And the paradox is only a paradox for
| superficial reasoning.
| staunton wrote:
| > But I cannot see how consistently doing general
| mathematics (as in finding interesting and useful
| statements to prove, and then finding the proofs) is easier
| than consistently cutting hair/driving a car.
|
| The main "absolute" difficulty there is in understanding
| and shaping what the mathematical audience thinks is
| "interesting". So it's really a marketing problem. Given
| how these tools are being used for marketing, I would have
| high hopes, at least for this particular aspect...
| youoy wrote:
| Is it really marketing in general? I can agree with some
| of it, but for me the existence of the term "low hanging
| fruit" to describe some statements says otherwise...
| looofooo0 wrote:
| Sure, but checking that everything is correctly wired,
| plugged in, cut, etc., and that everything needed has been
| thought of? There are plenty of things an AI could do to
| help a tradesman.
| abrookewood wrote:
| I mean ... calculators can do better at mathematics than most
| of us. I don't think they are going to threaten us anytime
| soon.
| AlexCoventry wrote:
| There are important forms of discernment and judgement which
| aren't captured by mathematics.
| awanderingmind wrote:
| The end of an era perhaps, but not 'the end' - another
| commenter has already mentioned Moravec's paradox:
| https://en.wikipedia.org/wiki/Moravec%27s_paradox
|
| It will be interesting if/when these models start proving major
| open problems, e.g. the Riemann Hypothesis. The sociological
| impact on the mathematical community would certainly be acute,
| and likely lead to a seismic shift in the understanding of what
| research-level mathematics is 'for'. This discussion already
| appears to be in progress. As an outsider I have no idea what
| the timeline is for such things (2 years? 10? 100?).
|
| On the plus side, AlphaProof has the benefit over ordinary LLMs
| in their current form in that it does not pollute our common
| epistemological well, and its output is eminently interrogable
| (if you know Lean, at least).
| empath75 wrote:
| Computers have been better than us at calculation since about a
| week after computers were invented.
|
| If a computer proves the Riemann Hypothesis, someone will say
| "Oh of course, writing a proof doesn't require intelligence,
| it's merely the dumb application of logical rules, but only a
| human could have thought of the conjecture to begin with."
| sega_sai wrote:
| I think the interface of LLMs with formalized languages is
| really the future, because there you can formally verify every
| statement and deal with hallucinations.
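|
| A minimal sketch of that loop (llm_propose is hypothetical;
| check_proof could be a formal checker like Lean's):
|
|     def prove(statement: str, attempts: int = 100):
|         failures = []
|         for _ in range(attempts):
|             candidate = llm_propose(statement, failures)  # hypothetical LLM call
|             if check_proof(candidate):  # formal check filters hallucinations
|                 return candidate
|             failures.append(candidate)  # feed failures back as context
|         return None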
| raincole wrote:
| It's obviously not the future (outside of mathematics
| research). The whole LLM boom we've seen in the past two years
| comes from one single fact: people don't need to learn a new
| language to use it.
| seizethecheese wrote:
| Both comments can be right. People don't need to know HTML to
| use the internet.
| nickpsecurity wrote:
| Natural language -> Formal Language with LLM-assisted
| tactics/functions -> traditional tools (eg provers/planners)
| -> expert-readable outputs -> layperson-readable results.
|
| I can imagine many uses for flows where LLMs can implement
| the outer layers above.
| Groxx wrote:
| The difficulty then will be figuring out if the proof is
| relevant to what you want, or simply a proof of 1=1 in
| disguise.
| est wrote:
| > formalized languages is really the future
|
| Hmm, maybe it's time for symbolism to shine?
| samweb3 wrote:
| I am building Memelang (memelang.net) to help with this as
| well. I'd love your thoughts if you have a moment!
| thesz wrote:
| You are building an SQL in disguise.
|
| First, you need to encode "memes" and relations between them
| at scale. This is not a language problem; it is a data
| handling problem.
|
| Second, at some point you will need to query memes
| and relations between them, again at scale. While expression
| of queries _is_ a language problem, an implementation will
| heavily use what SQL engines use.
|
| And you need to look at Cyc:
| https://en.wikipedia.org/wiki/Cyc
|
| It has been doing what you are trying to do for 40 (forty)
| years now.
| jamilton wrote:
| The PHP implementation works by converting it to SQL, so
| I'm sure they're aware.
| meindnoch wrote:
| Looks like an incomplete Prolog.
| chvid wrote:
| Maybe include a section titled 'how is this different from
| and how is it related to', e.g. relational algebra?
| nybsjytm wrote:
| If this were the case, I don't see why we'd need to wait for an
| AI company to make a breakthrough in math research. The key
| issue instead is how to encode 'real-life' statements in a
| formal language - which to me seems like a ludicrous problem,
| just complete magical thinking.
|
| For example, how might an arbitrary statement like "Scholars
| believe that professional competence of a teacher is a
| prerequisite for improving the quality of the educational
| process in preschools" be put in a lean-like language? What
| about "The theoretical basis of the October Revolution lay in a
| development of Marxism, but this development occurred through
| three successive rounds of theoretical debate"?
|
| Or have I totally misunderstood what people mean when they say
| that developments in automatic theorem proving will solve LLM's
| hallucination problem?
| nine_k wrote:
| You can't talk about formally verifiable truthiness until you
| solve epistemology. This can be achieved formally in
| mathematics, with known principal limitations. Here strict
| theorem-proving, Lean-style, is viable.
|
| It can also be achieved informally and in a fragmentary way in
| barely-mathematical disciplines, like biology, linguistics,
| and even history. We have chains of logical conclusions that
| do not follow strictly, but with various probabilistic
| limitations, and under modal logic of sorts. Several
| contradictory chains follow under the different (modal)
| assumptions / hypotheses, and often both should be
| considered. This is where probabilistic models like LLMs
| could work together with formal logic tools and huge
| databases of facts and observations, being the proverbial
| astute reader.
|
| In some more relaxed semi-disciplines, like sociology,
| psychology, or philosophy, we have a hodgepodge of
| contradictory, poorly defined notions and hand-wavy reasoning
| (I don't speak about Wittgenstein here, but more about Freud,
| Foucault, Derrida, etc.) Here, I think, the current crop of
| LLMs is applicable most directly, with few augmentations.
| Still a much, much wider window of context might be required
| to make it actually productive, by the standards of the
| field.
| nybsjytm wrote:
| Hmmm I think even in something very nominally nearby like
| theoretical physics, there's very little that's similar to
| theorem proving. I don't see how AlphaProof could be a
| stepping stone to anything like what you're describing.
|
| Generally, I think many people who haven't studied
| mathematics don't realize how huge the gulf is between
| "being logical/reasonable" and applying mathematical logic
| as in a complicated proof. Neither is really of any help
| for the other. I think this is actually the orthodox
| position among mathematicians; it's mostly people who might
| have taken an undergraduate math class or two who might
| think of one as a gateway to the other. (However there are
| certainly some basic commonalities between the two. For
| example, the converse error is important to understand in
| both.)
| auggierose wrote:
| I don't think that mathematicians are actually in the
| best position to judge how math/logic might help in
| practical applications, because they are usually not
| interested in practical applications at all (at least for
| the last 70 years or so). Especially, they are also not
| interested in logic at all.
|
| But logic _is_ very relevant to "being
| logical/reasonable", and seeing how mathematicians apply
| logic in their proofs is very relevant, and a starting
| point for more complex applications. I see mathematics as
| the simplest kind of application of logic you can have if
| you use only your brain for thinking, and not also a
| computer.
|
| "Being logical/reasonable" also contains a big chunk of
| intuition/experience, and that is where machine learning
| will make a big difference.
| benlivengood wrote:
| Probabilistic reasoning is possible in a formal setting; It
| produces a probability distribution over answers. To ground
| probabilistic logic itself I'm not aware of much progress
| beyond the initial idea of logical induction[0].
|
| [0] https://arxiv.org/abs/1609.03543
| nybsjytm wrote:
| This takes for granted a formal setting, which is what I'm
| questioning in any of these 'real world' contexts.
| benlivengood wrote:
| > For example, how might an arbitrary statement like
| "Scholars believe that professional competence of a
| teacher is a prerequisite for improving the quality of
| the educational process in preschools" be put in a lean-
| like language? What about "The theoretical basis of the
| October Revolution lay in a development of Marxism, but
| this development occurred through three successive rounds
| of theoretical debate"?
|
| > This takes for granted a formal setting, which is what
| I'm questioning in any of these 'real world' contexts.
|
| A formal model of semantics would likely be a low-level
| physical representation of possible states augmented with
| sound definitions of higher-level concepts and objects. I
| don't think humans are capable of developing a formal
| semantics that would work for your sentences (it's taken
| us hundreds of years to approach formalization of
| particle physics), but I think that an automated prover
| with access to physical experiments and an LLM could
| probably start building a more comprehensive semantics.
| lukeschlather wrote:
| People talk about how LLMs need "more data", but data is
| really sort of a blunt instrument for trying to impart a lot
| of experience to the LLM. Unfortunately data isn't actually
| experience; it's a record of an experience.
|
| Now what we've seen with e.g. Chess and Go, is that when you
| can give a tensor model "real experience" at the speed of
| however many GPUs you have, the model is quickly capable of
| superhuman performance. Automatic theorem proving means you
| can give the model "real experience" without armies of humans
| writing down proofs, you can generate and validate proofs as
| fast as a computer will let you and the LLM has "real
| experience" with every new proof it generates along with an
| objective cost function, was the proof correct?
|
| Now, this might not let us give the LLMs real experience with
| being a teacher or dealing with Marxist revolutionaries, but
| it would be a sea change in the ability of LLMs to do math,
| and it probably would let us give LLMs real experience in
| writing software.
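|
| In RL terms, that objective cost function is about as simple as
| reward signals get (a sketch, assuming some formal checker):
|
|     def reward(conjecture: str, attempt: str) -> float:
|         # check_proof: hypothetical formal verifier, e.g. Lean
|         # binary, objective, machine-checkable: no human labeling needed
|         return 1.0 if check_proof(conjecture, attempt) else 0.0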
| ianhorn wrote:
| This is a thing I'm working on, so I have some potentially
| useful thoughts. tl;dr, it doesn't have to be about encoding
| _arbitrary_ real life statements to be super duper useful
| today.
|
| > how might an arbitrary statement like "Scholars believe
| that professional competence of a teacher is a prerequisite
| for improving the quality of the educational process in
| preschools" be put in a lean-like language?
|
| Totally out of scope in the any near future for me at least.
| But that doesn't prevent it from being super useful for a
| narrower scope. For example:
|
| - How might we take a statement like "(x + 1) (x - 5) = 0"
| and encode it formally?
|
| - Or "(X X^T)^-1 X Y = B"?
|
| - Or "6 Fe_2 + 3 H_2 O -> ?"?
|
| We can't really do this for a huge swath of pretty narrow
| applied problems. In the first, what kind of thing is x? Is
| it an element of an algebraically closed field? In the
| second, are those matrices of real numbers? In the third, is
| that 6 times F times e_2 or 6 2-element iron molecules?
|
| We can't _formally_ interpret those as written, but you and I
| can easily tell what's meant. Meanwhile, current ways of
| formally writing those things up are a massive pain in the
| ass. Anything with a decent amount of common sense can tell
| you what is _probably_ meant formally. We know that we can't
| have an algorithm that's right 100% of the time for a lot of
| relevant things, but 99.99% is pretty useful. If clippy says
| 'these look like matrices, right?' and is almost always
| right, then it's almost always saving you lots of time and
| letting lots more people benefit from formal methods with a
| much lower barrier to entry.
|
| From there, it's easy to imagine coverage and accuracy of
| formalizable statements going up and up and up until so much
| is automated that it looks kinda like 'chatting about real-
| life statements' again. I doubt that's the path, but from a
| 'make existing useful formal methods super accessible' lens
| it doesn't have to be.
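|
| For the first example, one plausible formal reading in Lean 4
| with Mathlib (a sketch; note that "x is a real number" is
| itself the part a human had to guess):
|
|     import Mathlib
|
|     example (x : Real) (h : (x + 1) * (x - 5) = 0) : x = -1 ∨ x = 5 := by
|       cases mul_eq_zero.mp h with
|       | inl h1 => exact Or.inl (by linarith)
|       | inr h2 => exact Or.inr (by linarith)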
| thesz wrote:
| So, you are looking for Cyc [1], practically.
|
| [1] https://en.wikipedia.org/wiki/Cyc
| auggierose wrote:
| Does Cyc have proofs?
| cess11 wrote:
| In a sense, yes, since it has a foundation in Prolog-style
| facts and rules, and supposedly can output its reasoning.
| auggierose wrote:
| Ok, sounds like in principle you could have proofs, but
| in practice, you don't?
|
| Are there any checks for the consistency of all facts?
| cess11 wrote:
| How could there be? Gödel gets in the way.
| auggierose wrote:
| In most interactive theorem provers the check is simple:
| you single out the axioms you believe, and all other
| stuff you do is guaranteed to preserve consistency. That
| works in an ITP system because there are only a few
| axioms.
|
| If Cyc has 100000 axioms/facts, then that's a problem.
| thesz wrote:
| If axioms, facts and rules have types, these types can
| guide choice.
| auggierose wrote:
| Not a believer in types here, sorry. I'd rather have
| simple theorems instead without wading through a swamp of
| types first. And I am sure AI will be able to handle
| theorems just fine.
| cess11 wrote:
| It's not a theorem prover, it's more like a support
| system for decision making.
|
| They can probably isolate 100k facts from some domain for
| you if you pay for it, but their idea is to run inference
| on millions of "common sense" and everyday facts and have
| been for decades.
| auggierose wrote:
| That sounds to me like something that will be subsumed by
| the combination of ITP and AI.
| notTooFarGone wrote:
| You reminded me of my pains with OpenCyc in my student job.
| Thanks :D
| trenchgun wrote:
| >The Cyc knowledge base involving ontological terms was
| largely created by hand axiom-writing
|
| >An inference engine is a computer program that tries to
| derive answers from a knowledge base. The Cyc inference
| engine performs general logical deduction.[8] It also
| performs inductive reasoning, statistical machine learning
| and symbolic machine learning, and abductive reasoning. The
| Cyc inference engine separates the epistemological problem
| from the heuristicproblem. For the latter, Cyc used a
| community-of-agents architecture in which specialized
| modules, each with its own algorithm, became prioritized if
| they could make progress on the sub-problem.
|
| Not really.
| tucnak wrote:
| Grammar sampling has been around for months and has remained
| largely unexplored. Don't fall into the common trap of thinking
| in a fixed language, rather think about a superset of possible
| languages (grammars) and how they could evolve from one
| another. I bet, if there's a breakthrough, it's probably in
| "differential grammars," or whatever it would be called: plug
| that into the backtracking sampler, & you have your System 2.
| gosub100 wrote:
| Ok so Claude says that the Riemann hypothesis is proven and
| gives you 1200 pages of math symbols backing it up. Now what?
|
| You probably say "now a human can verify it" but what if the
| humans are wrong? What is the source of truth?
| chompychop wrote:
| Is it currently possible to reliably limit the knowledge cut-off
| of an LLM (either during training or inference)? An interesting
| experiment would be to feed an LLM mathematical knowledge only up
| to the year of proving a theorem, and then see if it can actually
| come up with the novel techniques used in the proof. For example,
| having only access to papers prior to 1993, can an LLM come up
| with Wiles' proof of FLT?
| ogrisel wrote:
| That should be doable, e.g. by semi-automated curation of the
| pre-training dataset. However, since curating such large
| datasets and running pre-training runs is so expensive, I doubt
| that anybody will run such an experiment, especially since one
| would have to trust that the curation process was correct
| enough for the end result to be meaningful. Checking that the
| curation process is not flawed is probably as expensive as
| running it in the first place.
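|
| The filtering step itself is conceptually trivial, e.g. a date
| cut over corpus metadata (a sketch with a hypothetical corpus
| schema):
|
|     CUTOFF_YEAR = 1993
|     clean = [doc for doc in corpus if doc["year"] < CUTOFF_YEAR]
|
| The hard part is trusting that metadata and catching indirect
| leakage, such as later surveys quoting the proof.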
| n4r9 wrote:
| There are the FrontierMath benchmarks [0], demonstrating that AI
| is currently quite far from human performance at research-level
| mathematics.
|
| [0] https://arxiv.org/abs/2411.04872
| data_maan wrote:
| They didn't demonstrate anything. They haven't even released
| their dataset, nor mentioned how big it is.
|
| It's just hot air, just like the AlphaProof announcement,
| where very little is known about their system.
| chvid wrote:
| Mathematicians have been using computers, programming languages,
| and proof engines for over half a century; however, breakthroughs
| in mathematics are still made by humans in any meaningful sense,
| even though the tools they use and make are increasingly complex.
|
| But as things look now, I would be willing to bet that the next
| major breakthrough in maths will be touted as AI/LLM-driven and
| coming out of one of the big US tech companies rather than some
| German university.
|
| Why? Simply, the money is much bigger. Such an event would pop
| the market value of the company involved by a hundred billion -
| plenty of incentive right there to paint whatever as AI and hire
| whoever.
| kzrdude wrote:
| But, these AI solutions are trying to solve math problems to
| prove their AI capabilities, not because they care about
| mathematics.
| staunton wrote:
| Sure. Why do you say "but"? Solving such a math problem
| (while perhaps massively overstating the role AI actually
| played in the solution) would be great PR for everyone
| involved.
___________________________________________________________________
(page generated 2024-11-18 23:01 UTC)