[HN Gopher] DeepMind AI outdoes human mathematicians on unsolved...
       ___________________________________________________________________
        
       DeepMind AI outdoes human mathematicians on unsolved problem
        
       Author : rntn
       Score  : 64 points
       Date   : 2023-12-14 19:33 UTC (3 hours ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | supermdguy wrote:
       | Code is available! They have a few different discovered
       | solutions.
       | 
       | https://github.com/google-deepmind/funsearch
        
         | westurner wrote:
         | "Mathematical discoveries from program search with large
         | language models" (2023)
         | https://www.nature.com/articles/s41586-023-06924-6 :
         | 
         | > Abstract: _Large Language Models (LLMs) have demonstrated
         | tremendous capabilities in solving complex tasks, from
         | quantitative reasoning to understanding natural language.
         | However, LLMs sometimes suffer from confabulations (or
         | hallucinations) which can result in them making plausible but
         | incorrect statements [1,2]. This hinders the use of current
         | large models in scientific discovery. Here we introduce
         | FunSearch (short for searching in the function space), an
         | evolutionary procedure based on pairing a pre-trained LLM with
         | a systematic evaluator. We demonstrate the effectiveness of
         | this approach to surpass the best known results in important
         | problems, pushing the boundary of existing LLM-based approaches
         | [3]. Applying FunSearch to a central problem in extremal
         | combinatorics -- the cap set problem -- we discover new
         | constructions of large cap sets going beyond the best known
         | ones, both in finite dimensional and asymptotic cases._ This
         | represents the first discoveries made for established open
         | problems using LLMs. _We showcase the generality of FunSearch
         | by applying it to an algorithmic problem,_ online bin packing,
         | _finding new heuristics that improve upon widely used
         | baselines. In contrast to most computer search approaches,_
         | FunSearch searches for programs that describe how to solve a
         | problem, rather than what the solution is. _Beyond being an
         | effective and scalable strategy, discovered programs tend to be
         | more interpretable than raw solutions, enabling feedback loops
         | between domain experts and FunSearch, and the deployment of
         | such programs in real-world applications._
         | 
         | "DeepMind AI outdoes human mathematicians on unsolved problem"
         | (2023) https://www.nature.com/articles/d41586-023-04043-w :
         | 
         | > _Large language model improves on efforts to solve
         | combinatorics problems inspired by the card game Set._
        
         | bnprks wrote:
         | Though there are a couple caveats as to what code is available.
         | Quoting from the github:
         | 
          | > This repository contains an implementation of the
          | evolutionary
         | algorithm, code manipulation routines, and a single-threaded
         | implementation of the FunSearch pipeline. It does not contain
         | language models for generating new programs, the sandbox for
         | executing untrusted code, nor the infrastructure for running
         | FunSearch on our distributed system. This directory is intended
         | to be useful for understanding the details of our method, and
         | for adapting it for use with any available language models,
         | sandboxes, and distributed systems.
        
           | Q6T46nT668w6i3m wrote:
           | I don't care about their sandbox or distributed system. They
           | are irrelevant to the method. The missing language model for
           | program generation is disappointing but I imagine anyone
           | interested in replication, myself included, would prefer to
           | roll their own.
        
       | nyrikki wrote:
        | Bad title: this was a hybrid ML/human effort, not an ML-only
        | achievement.
       | 
       | From the article:
       | 
       | "What's most exciting to me is modelling new modes of human-
       | machine collaboration," Ellenberg adds. "I don't look to use
       | these as a replacement for human mathematicians, but as a force
       | multiplier."
        
         | SirMaster wrote:
         | It's almost like saying human+calculator beats human.
         | 
         | Haven't mathematicians been using complex computer modeling to
         | help solve unsolved math problems since computers have existed?
          | And haven't those computers basically always beaten a human
          | alone?
         | 
         | So isn't this news just that the mathematicians now have a
         | newer and better computer model to help them solve their
         | problems?
         | 
         | Seems like evolution, not revolution.
        
           | fasterik wrote:
           | If I understand correctly, they're using an LLM to write a
           | series of computer programs exploring a large solution space
           | and feeding the output of those programs into a separate
           | validation program written by a human. To take the calculator
           | analogy, it's more like having the human give a set of
           | constraints on a solution and having the calculator decide
           | what buttons to push.
           | 
           | Real progress is always incremental. I wouldn't be surprised
           | if 5-10 years from now we have similar kinds of systems
           | discovering new materials or new candidates for dark matter.
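[Editor's note: a minimal sketch of the generate-and-evaluate loop described above, with the LLM replaced by a stub that randomly rewrites integer literals. All names here are hypothetical; FunSearch instead prompts a code model with its highest-scoring programs and runs candidates in a sandbox.]

```python
import random
import re

def evaluate(program_src):
    """Human-written scorer: run the candidate program and grade its
    answer. The toy objective here is maximizing -(x - 3)**2; a real
    evaluator would score, e.g., the size of a valid cap set."""
    env = {}
    try:
        exec(program_src, env)  # a real system sandboxes this step
        return -(env["solve"]() - 3) ** 2
    except Exception:
        return float("-inf")    # broken programs score worst

def propose_programs(best_src, k=5):
    """Stand-in for the LLM: rewrite each integer literal at random.
    FunSearch instead asks a code LLM for k new program variants."""
    return [re.sub(r"\d+", lambda m: str(random.randint(0, 10)), best_src)
            for _ in range(k)]

def search(seed="def solve():\n    return 0\n", rounds=50):
    """Evolutionary loop: keep whichever program scores best so far."""
    best, best_score = seed, evaluate(seed)
    for _ in range(rounds):
        for candidate in propose_programs(best):
            score = evaluate(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score
```

The FunSearch twist is that the artifact being evolved is source code rather than a raw solution, so a human can read the winning program and learn from it.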
        
         | fasterik wrote:
         | Even Nature has clickbait titles now.
         | 
         | I agree with the quote. The ability of AI to augment human
         | capabilities has way more potential than the more speculative
         | ideas about artificial general intelligence. This is why I'm
         | not very sympathetic to the skepticism toward deep learning and
         | LLMs as "not intelligence", "not real AI", "stochastic
         | parrots", etc. Who cares whether or not these systems are
         | generally intelligent agents, if they have the potential to
         | increase the scientific output of humanity by even 10 or 20%?
        
           | lainga wrote:
           | Nobody's asked me to donate all my money to Yudkowsky or else
           | suffer the scientific output of humanity increasing by 10 or
           | 20%.
        
         | ummonk wrote:
          | No, it's not a hybrid effort (except insofar as LLMs are
         | reliant on human generated data for training). They're simply
         | saying that the code created by the LLM can be examined and
         | potentially understood by humans.
        
           | caddemon wrote:
            | Their best reported results are a hybrid effort, though.
            | Here one of the authors of the paper describes how they used
            | programs generated by the LLM to extract their own insights,
            | which then refined future iterations of their workflow:
           | https://x.com/matejbalog/status/1735331210140819938?s=20
           | 
            | It can work by itself too, but it's unclear at a glance how
            | well, since the main focus of the paper is the new
            | mathematical benchmarks they achieved, i.e. their best
            | results. I'll have to read the paper more closely to say
            | anything with high confidence, but based on their summary I'd
            | guess the human-in-the-loop part was pretty important here.
        
       | metanonsense wrote:
       | I love the Set game mentioned in the article so much. Whenever I
        | make the mistake of installing a digital version of it on my phone,
       | I have to uninstall it a few weeks later because it completely
       | wrecks my productivity.
        
       | caddemon wrote:
       | It seems this is essentially an evolutionary algorithm with an
       | LLM generating the pool of new variations at each step.
       | Definitely a very cool idea, but hard to evaluate the results
       | without knowing more about that field of mathematics. Obviously
       | the problem they chose fit well into the FunSearch framework, but
        | I'm curious whether this is one of the more popular open problems
        | in that space or something more niche.
       | 
       | Namely, what sort of computational resources had been dedicated
       | to the problem before? Because DeepMind suddenly throwing their
       | weight at a problem that was previously the focus of a handful of
       | random math grad students would make it hard to benchmark the ML
       | advance that was made here -- like would it be possible to find a
       | similar solution with a ton of compute and more traditional
       | genetic algorithms?
       | 
        | I wouldn't be surprised if the answer were no. Protein-folding
        | prediction was a pretty big competitive space in biology before
        | AlphaFold absolutely destroyed it. But I also wouldn't be
       | entirely surprised if the answer were yes. The amount of hype
       | around LLMs right now is crazy, probably half the news I see
       | about them turns out to be very exaggerated upon further
       | evaluation.
       | 
        | ETA: the headline here is definitely exaggerated, because there
        | was also a human in the loop refining what the LLM was generating.
       | At a glance the technical article doesn't benchmark enough
       | against alternatives to the LLM component in their workflow IMO.
       | But it is entirely possible they tackled well established enough
       | open problems, such that prior work already handled those control
       | cases decently.
       | 
       | I'd love to know what someone in this space of mathematics thinks
       | about the paper! Would it have generated much buzz if they got
        | these same results like 3 years ago? Would it have been accepted
        | by Nature if they found they could accomplish something similar
        | using their framework with an RNN in the loop?
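[Editor's note: the "more traditional genetic algorithms" baseline the comment asks about is roughly the following: random mutation plus selection over raw candidate solutions, with no language model proposing edits. A toy, mutation-only version on the OneMax problem; all names are illustrative.]

```python
import random

def mutate(bits):
    """Classic GA move: flip one random bit."""
    i = random.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

def genetic_search(fitness, length=20, pop_size=30, generations=100):
    """Mutation-only genetic algorithm with truncation selection:
    keep the top half each generation, mutate survivors to refill."""
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
    return max(population, key=fitness)
```

On OneMax (`fitness=sum`) this reliably finds the all-ones string; the open question raised above is whether enough compute behind a search like this, evolving programs instead of bitstrings, would have matched the LLM-guided results.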
        
       | ummonk wrote:
       | In other words, "LLM writes a computer program that generates new
       | examples which improve the lower bound for the n=8 case of a
       | problem."
       | 
        | I'd like to see how novel the program it generated was, rather
        | than just a brute-force random search or a standard genetic
        | algorithm. I have a suspicion that the big result here is simply
        | that it saved them coding time.
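[Editor's note: the evaluator side of this is easy to pin down independently of the search. In the cap set problem, a candidate is a set of vectors in Z_3^n with no three distinct elements summing to zero componentwise mod 3 (equivalently, no three points on a common line), and the n=8 result is a larger such set than previously known. A small checker written from the problem definition, not from the paper's code:]

```python
from itertools import combinations

def is_cap_set(vectors, n):
    """Check that no three distinct vectors in Z_3^n sum to the zero
    vector mod 3. In Z_3^n, three points are collinear exactly when
    a + b + c == 0 componentwise, so this rules out all lines."""
    vecs = {tuple(v) for v in vectors}
    if any(len(v) != n or any(x not in (0, 1, 2) for x in v) for v in vecs):
        raise ValueError("expected coordinates in {0, 1, 2} of length n")
    return all(any((x + y + z) % 3 != 0 for x, y, z in zip(a, b, c))
               for a, b, c in combinations(vecs, 3))
```

Brute force over triples is O(|S|^3), fine for verifying a reported set; the hard part the search tackles is constructing large sets, e.g. the paper's reported 512-element cap set in n=8.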
        
       | Q6T46nT668w6i3m wrote:
       | I'm personally excited by this paradigm. A few years back I had
       | success using a similar architecture for polynomial root finding.
       | I think it's entirely possible to be really ambitious and reverse
       | engineer new and useful generalized functions.
        
       ___________________________________________________________________
       (page generated 2023-12-14 23:01 UTC)