[HN Gopher] Use Grobner bases to solve polynomial equations
       ___________________________________________________________________
        
       Use Grobner bases to solve polynomial equations
        
       Author : ransac9000
       Score  : 76 points
       Date   : 2023-04-25 17:26 UTC (5 hours ago)
        
 (HTM) web link (jingnanshi.com)
 (TXT) w3m dump (jingnanshi.com)
        
       | umutisik wrote:
        | Grobner basis methods do not scale well as the number of
        | variables increases; the computational complexity becomes too
        | high. Numerical algebraic geometry methods give the correct
        | answer in many cases where there is a practical reason to
        | solve the equations, but without a proof of correctness.
        
         | mathisfun123 wrote:
         | Got a pointer? Paper or code either works.
        
       | Fermat963 wrote:
        | I did my master's thesis on the Gaussian elimination part of
        | Grobner basis computation, making multiple optimizations to
        | the Faugere-Lachartre method (working with Faugere himself),
        | and had the first public C (C++ish) implementation at the
        | time. The thesis can be found here:
        | https://hpac.imag.fr/gbla/data/bib/Mart12.pdf
       | 
       | Many ideas from the thesis ended up in the GBLA library used for
       | fast Gaussian Elimination over these large matrices:
       | https://dl.acm.org/doi/10.1145/2930889.2930914
       | 
       | This is a nice presentation from Christian Eder explaining how
       | the algorithm works and the different optimization techniques
       | implemented in GBLA https://www-user.rhrk.uni-
       | kl.de/~ederc/download/gbla-aca-201...
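        | 
        | To give a feel for why Gaussian elimination shows up at all,
        | here is a toy sketch of the idea (my own example, not from
        | GBLA): reducing S-polynomials can be phrased as row reduction
        | of a "Macaulay" matrix whose columns are monomials and whose
        | rows are monomial multiples of the generators. F4-style
        | algorithms do this at massive scale over sparse matrices.
        | 
        |     # Toy sketch in SymPy. Generators (lex, x > y):
        |     #   f1 = x**2 + y,  f2 = x*y - 1
        |     # Rows are the multiples y*f1 and x*f2; columns are the
        |     # monomials [x**2*y, y**2, x].
        |     from sympy import Matrix
        | 
        |     M = Matrix([[1, 1, 0],    # y*f1 = x**2*y + y**2
        |                 [1, 0, -1]])  # x*f2 = x**2*y - x
        |     print(M.rref()[0])        # [[1, 0, -1], [0, 1, 1]]
        |     # The reduced row [0, 1, 1] is the new basis element
        |     # y**2 + x, obtained purely by linear algebra.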
        
       | ur-whale wrote:
        | Naive question, but gotta ask: can Grobner bases get over the
        | "in general, a polynomial of degree >=5 can't be solved
        | algebraically" hump?
        | 
        | Given that the above is a theorem (Abel-Ruffini, usually
        | proved via Galois theory) I'd assume the answer is no.
        | 
        | And given that any system of polynomial equations in multiple
        | variables - even of low degree - eventually reduces to a
        | univariate polynomial of very high degree, then ... what are
        | Grobner bases actually useful for?
        
         | ransac9000 wrote:
          | If you can't algebraically solve the univariate polynomial,
          | you can still recover solutions to the system by finding
          | roots of the univariate polynomial numerically and back-
          | substituting.
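          | 
          | A rough SymPy sketch of that workflow (a toy system of my
          | own, not from the article): a lex Grobner basis
          | triangularizes the system, leaving a univariate polynomial
          | to solve numerically.
          | 
          |     from sympy import symbols, groebner, Poly, solve
          | 
          |     x, y = symbols('x y')
          |     # Toy system: a circle intersected with a parabola.
          |     G = groebner([x**2 + y**2 - 4, x**2 - y - 1],
          |                  x, y, order='lex')
          |     # With lex order x > y, the last basis element involves
          |     # only y (here: y**2 + y - 3).
          |     for y0 in Poly(G.exprs[-1], y).nroots():
          |         # back-substitute the numeric root into the rest
          |         xs = solve(G.exprs[0].subs(y, y0), x)
          |         print(y0, xs)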
        
         | jjtheblunt wrote:
         | You know how someone walking down the street might give you a
         | collection of vectors over a field, and ask questions about the
         | subspace the vectors generate? You might make an orthonormal
         | basis set of vectors which generate the same subspace, as a
         | means to more easily answer questions about the original
         | vectors.
         | 
          | Groebner bases play a role similar to a minimal orthonormal
          | basis for a set of arbitrary vectors in a vector space,
          | except the set of things to be combined, and about which
          | the combinations will be asked questions, are not tuples of
          | floats (vectors) but instead are polynomials.
         | 
         | https://a.co/d/iuciOR5
         | 
         | is a famous book that's very nice (or was 25 years ago in
         | earlier editions) discussing this topic and the various answers
         | to your question.
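          | 
          | To make the analogy concrete, a rough SymPy sketch (toy
          | example of my own): testing ideal membership via a Grobner
          | basis is the analogue of testing whether a vector lies in
          | the span of an orthonormal basis.
          | 
          |     from sympy import symbols, groebner
          | 
          |     x, y = symbols('x y')
          |     G = groebner([x**2 + y, x*y - 1], x, y, order='lex')
          |     # Divide by G: a zero remainder means the polynomial
          |     # lies in the ideal generated by x**2 + y and x*y - 1.
          |     print(G.contains(x**3 + 2*x*y - 1))  # True
          |     print(G.contains(x + y))             # False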
        
           | _a_a_a_ wrote:
           | > You know how someone walking down the street might give you
           | a collection of vectors over a field, and ask questions about
           | the subspace the vectors generate
           | 
            | What on earth...? I can't make sense of the last 2/3rds,
            | and even the first 1/3rd is just weird. What does any of
            | it mean anyway?
        
           | _dain_ wrote:
           | _> You know how someone walking down the street might give
           | you a collection of vectors over a field, and ask questions
           | about the subspace the vectors generate?_
           | 
           | they can be very insistent but legally you don't have to
           | answer them.
        
       | adamnemecek wrote:
        | Groebner bases are insanely powerful:
        | https://arxiv.org/abs/math/9812097
       | 
       | I have been looking into them lately in the context of ML and I
       | have a hunch that in the future, all ML will be running a Grobner
       | basis algorithm on a large scale.
        
         | jjtheblunt wrote:
         | Why?
         | 
         | (I know Grobner bases and their history for 25 or so years, for
         | context)
        
           | hgsgm wrote:
            | You might be the first person to understand and comment
            | on the paper Adam has been shopping around HN for months!
        
             | [deleted]
        
           | adamnemecek wrote:
            | Hopf algebras subsume like all currently fashionable
            | architectures (transformers, convnets, SSMs, diffusion
            | models) and then some. Hopf convolution is a surprisingly
            | general concept, and Groebner bases are the thing used to
            | calculate the antipode.
            | 
            | A Hopf algebra is a tensor bialgebra, i.e. a tensor with
            | a built-in feedback (which subsumes autodiff).
           | 
            | I have written a paper on this recently:
            | https://arxiv.org/abs/2302.01834
            | 
            | This paper is also good: https://arxiv.org/abs/1206.3620
            | 
            | I have a Discord channel: https://discord.cofunctional.ai
        
             | cscheid wrote:
             | Interesting paper.
             | 
             | I have a question about the claim in 6.2 that attention
             | matrices are SPD, if you don't mind my asking.
             | 
             | It seems to me that accepting the empirical result that the
             | eigenvalues are positive isn't enough to get a Fourier
             | Transform interpretation. Specifically, I don't understand
             | the assumption that all attention matrices are symmetric.
             | (I'm sure you know that positive eigenvalues are not enough
             | by themselves, but for other folks reading, [[1 1/2] [1/3
             | 1]] is a simple concrete example.)
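              | 
              | A quick numeric check of that example (plain NumPy, for
              | illustration):
              | 
              |     import numpy as np
              | 
              |     A = np.array([[1, 1/2], [1/3, 1]])
              |     print(np.linalg.eigvals(A))  # ~[1.41, 0.59]: positive
              |     print(np.allclose(A, A.T))   # False: not symmetric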
             | 
             | Consider Fig. 17 here:
             | https://lilianweng.github.io/posts/2018-06-24-attention/
              | (this is Fig. 1 in "Attention Is All You Need"). I understand
             | that you get symmetric attention matrices for the self-
             | attention matrix in the input stream, as well as the masked
             | attention matrix in the output stream (the first block).
             | But I don't understand how you claim symmetry for the final
             | attention mechanism that combines input and output.
             | 
             | And if you don't get symmetry, you don't get the Fourier
             | Transform interpretation and all the nice algebra that
             | follows.
        
               | adamnemecek wrote:
                | I don't mention the Fourier transform.
        
               | cscheid wrote:
               | Sorry. You do mention a linear systems response, and
               | that's what I meant.
               | 
                | In that setting, the eigenvectors work as a
                | generalized forward and inverse Fourier transform,
                | and the eigenvalues form the transfer function you
                | allude to in the bold sentence:
               | 
               | "The attention mechanism's role is the same as that of a
               | transfer function in a linear time-invariant system,
               | namely it calculates the frequency response of the
               | transformer model,"
               | 
                | Specifically, it seems to me that this requires a
                | _symmetric_ attention matrix, which you get from the
                | self-attention mechanisms (two of the three places
                | where they're used in transformers), but not from all
                | of them, notably not the one that combines the output
                | of the first two attention mechanisms (one input, and
                | one output).
        
               | adamnemecek wrote:
                | I think that the magic comes from the antipode, which
                | makes things symmetric. The Ising model is somewhat
                | similar.
               | 
               | Rereading the paper, quite a bit has changed in my
               | understanding of all this. My conclusions still stand,
               | but some of the reasoning needs to be explained better.
        
       | Aardwolf wrote:
       | Very well written and informative!
       | 
        | One random thing I wonder: in the lexicographic ordering, is
        | there any reason why the convention is that x_2 < x_1, while
        | x^2 > x^1 (where _ denotes a subscript and ^ an exponent)? In
        | other words, variables are ordered opposite to alphabetical
        | order (x > y), while exponents are ordered in the usual
        | numeric direction. I don't think the naming of the variables
        | matters for the mathematical result (you can use any
        | subscript or letter for any variable as long as the
        | corresponding ones remain the same between the equations), so
        | the convention x_1 < x_2 would work just as well, and would
        | be less confusing I think.
        
         | ransac9000 wrote:
          | This is a good point. I think it's just due to convention
          | --- people tend to put terms involving x only at the front,
          | and the textbook by Cox et al. that I referenced uses this
          | convention as well.
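          | 
          | For anyone who wants to see the convention in action, a
          | small SymPy sketch (my own example): with generators listed
          | as (x1, x2), lex order sorts terms by the exponent of x1
          | first.
          | 
          |     from sympy import symbols, Poly
          | 
          |     x1, x2 = symbols('x1 x2')
          |     p = Poly(x2**3 + x1*x2 + x1**2, x1, x2)
          |     print(p.terms(order='lex'))
          |     # [((2, 0), 1), ((1, 1), 1), ((0, 3), 1)]
          |     # i.e. x1**2 > x1*x2 > x2**3 under lex with x1 > x2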
        
       | wenc wrote:
        | I tried using Groebner bases in grad school to solve implicit
        | polynomial equations (part of an optimization problem) but
        | found quickly that they didn't scale. I ran into trouble with
        | a system of about 300 polynomials, and that was with Maple,
        | which implemented state-of-the-art algorithms at the time.
        | 
        | The worst-case computational complexity of Groebner basis
        | computation is doubly exponential in the number of variables:
        | the degrees of the basis polynomials can grow like
        | d^(2^Theta(n)) for n variables and input degree d.
        
       | ransac9000 wrote:
        | Author here. Grobner bases are powerful tools for
        | understanding polynomials. I started on this article when I
        | encountered them during research, and decided to write down
        | some notes. After a few months, those notes turned into this
        | blog post. It's quite a long read, so let me know if you have
        | any questions.
        
         | vasvir wrote:
          | Thanks for writing this. I have read about 1/3 to 1/2 so
          | far, and it is very nicely written and easy to follow.
        
           | ransac9000 wrote:
           | No problem, and thank you for the kind words!
        
         | jjtheblunt wrote:
         | nicely done
        
           | ransac9000 wrote:
           | thanks!
        
         | abdullahkhalids wrote:
         | Can the method of Grobner bases also be used to solve systems
         | of polynomial inequalities?
        
           | ransac9000 wrote:
           | It seems that you can reformulate inequalities into
           | equalities and solve the system:
           | https://arxiv.org/pdf/1603.01183.pdf
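            | 
            | The standard trick over the reals (a sketch; I haven't
            | verified this is exactly the paper's approach) is a slack
            | variable: g(x) >= 0 holds iff g(x) = s**2 for some real
            | s.
            | 
            |     from sympy import symbols, groebner
            | 
            |     x, s = symbols('x s', real=True)
            |     # Example: x**2 - 2 = 0 subject to x >= 0, encoded
            |     # via the slack equation x = s**2.
            |     G = groebner([x**2 - 2, x - s**2], x, s, order='lex')
            |     print(G)
            |     # Basis: [x - s**2, s**4 - 2]; its real roots give
            |     # x = sqrt(2), the root satisfying x >= 0.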
        
         | red_trumpet wrote:
         | Looks nice!
         | 
          | A tip for your LaTeX: I would use \langle and \rangle to
          | surround the generators of an ideal, which looks better
          | than using < and >. And I think they are called angle
          | brackets, not square brackets (which would be [,]).
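          | 
          | For instance (illustrative snippet):
          | 
          |     % angle brackets around the generators of an ideal
          |     $I = \langle f_1, \ldots, f_s \rangle$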
        
           | ransac9000 wrote:
           | Good catch! I thought I changed all < and > into \langle and
           | \rangle.
        
       | hgsgm wrote:
        | Thorough introduction to computing with Grobner bases, using
        | Python. Bonus: goofy ASCII art for surds.
        | 
        | https://mattpap.github.io/masters-thesis/html/src/groebner.h...
        | 
        | It's part of Mateusz Paprocki's master's thesis about SymPy,
        | which he co-created.
        
       ___________________________________________________________________
       (page generated 2023-04-25 23:00 UTC)