[HN Gopher] Use Grobner bases to solve polynomial equations
___________________________________________________________________
Use Grobner bases to solve polynomial equations
Author : ransac9000
Score : 76 points
Date : 2023-04-25 17:26 UTC (5 hours ago)
(HTM) web link (jingnanshi.com)
(TXT) w3m dump (jingnanshi.com)
| umutisik wrote:
| Grobner basis methods do not scale well as the number of
| variables increases; the computational complexity becomes too
| high. Numerical Algebraic Geometry methods give the correct
| answer in many of the cases where there is a practical reason
| to solve the equations, but not a provably correct one.
| mathisfun123 wrote:
| Got a pointer? Paper or code either works.
| Fermat963 wrote:
| I did my master's thesis on the Gaussian Elimination part of
| Grobner bases, making multiple optimizations to the Faugere-
| Lachartre method (working with Faugere himself), and had the
| first public C (C++ish) implementation at the time. The thesis
| can be found here: https://hpac.imag.fr/gbla/data/bib/Mart12.pdf
|
| Many ideas from the thesis ended up in the GBLA library used for
| fast Gaussian Elimination over these large matrices:
| https://dl.acm.org/doi/10.1145/2930889.2930914
|
| This is a nice presentation from Christian Eder explaining how
| the algorithm works and the different optimization techniques
| implemented in GBLA https://www-user.rhrk.uni-
| kl.de/~ederc/download/gbla-aca-201...
| ur-whale wrote:
| Naive question, but gotta ask: can Grobner bases get over the
| "in general, a polynomial of degree >=5 can't be solved
| algebraically" hump?
|
| Given that the above is a theorem (from Galois theory, IIRC) I'd
| assume the answer is no.
|
| And given that any system of polynomial equations in multiple
| variables - even of low degree - eventually reduces to a
| univariate polynomial of very high degree, then ... what are
| Grobner bases actually useful for?
| ransac9000 wrote:
| If you can't solve the univariate polynomial algebraically, you
| can still recover some solutions to the system, as long as you
| can obtain roots of the univariate polynomial numerically.
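A minimal sketch of that workflow in SymPy (the system below is an illustrative choice, not one from the article): a lex-order Groebner basis triangularizes the system, its last element is univariate, and numerical roots of that element can be back-substituted.

```python
import sympy as sp

x, y = sp.symbols('x y')
# Illustrative system: a circle intersected with the hyperbola x*y = 1.
F = [x**2 + y**2 - 4, x*y - 1]

# With lex order (x > y) the Groebner basis is triangular: its last
# element, y**4 - 4*y**2 + 1, involves y alone.
G = sp.groebner(F, x, y, order='lex').exprs

# Solve the univariate part numerically, then back-substitute each root
# into the remaining generator (here linear in x) to recover x.
solutions = []
for yv in sp.Poly(G[-1], y).nroots():
    xv = sp.solve(G[0].subs(y, yv), x)[0]
    solutions.append((xv, yv))

# Every numeric pair should satisfy the original system to high accuracy.
residuals = [abs(sp.N(f.subs({x: xv, y: yv})))
             for xv, yv in solutions for f in F]
```

Even though y**4 - 4*y**2 + 1 has no nice radical form worth writing down, its numerical roots are cheap, and the rest of the system follows from them.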
| jjtheblunt wrote:
| You know how someone walking down the street might give you a
| collection of vectors over a field, and ask questions about the
| subspace the vectors generate? You might make an orthonormal
| basis set of vectors which generate the same subspace, as a
| means to more easily answer questions about the original
| vectors.
|
| Groebner bases play a role similar to a minimal orthonormal
| basis for a set of arbitrary vectors in a vector space, except
| the set of things to be linearly combined, and about which the
| linear combinations will be asked questions, are not tuples of
| floats (vectors) but instead are polynomials.
|
| https://a.co/d/iuciOR5
|
| is a famous book that's very nice (or was 25 years ago in
| earlier editions) discussing this topic and the various answers
| to your question.
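A concrete sketch of the analogy in SymPy (the ideal below is an illustrative choice): just as projecting onto an orthonormal basis tests whether a vector lies in a subspace, reducing a polynomial modulo a Groebner basis tests whether it lies in the ideal.

```python
import sympy as sp

x, y = sp.symbols('x y')
# Generators of an ideal, playing the role of a spanning set of vectors.
F = [x**2 - y, x*y - 1]
G = sp.groebner(F, x, y, order='lex')

# Membership test: a polynomial is in the ideal iff its remainder on
# division by the Groebner basis is zero -- the analogue of a zero
# residual after projecting onto an orthonormal basis.
p_in = x**3 - 1    # = x*(x**2 - y) + (x*y - 1), so it is a member
p_out = x + y      # not a member (it is nonzero at the solution x = y = 1)

in_ideal = G.contains(p_in)
out_ideal = G.contains(p_out)
```

Division by the raw generators F does not reliably answer this question; the Groebner basis is what makes the remainder well-defined, just as orthonormalizing makes projection coefficients well-defined.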
| _a_a_a_ wrote:
| > You know how someone walking down the street might give you
| a collection of vectors over a field, and ask questions about
| the subspace the vectors generate
|
| what on earth...? I can't make sense of the last 2/3rds, and
| even the first 1/3rd is just weird. What does any of it mean
| anyway?
| _dain_ wrote:
| _> You know how someone walking down the street might give
| you a collection of vectors over a field, and ask questions
| about the subspace the vectors generate?_
|
| they can be very insistent but legally you don't have to
| answer them.
| adamnemecek wrote:
| Groebner bases are insanely powerful
| https://arxiv.org/abs/math/9812097
|
| I have been looking into them lately in the context of ML and I
| have a hunch that in the future, all ML will be running a Grobner
| basis algorithm on a large scale.
| jjtheblunt wrote:
| Why?
|
| (I know Grobner bases and their history for 25 or so years, for
| context)
| hgsgm wrote:
| You might be the first person to understand and comment on the
| paper Adam has been shopping around HN for months!
| [deleted]
| adamnemecek wrote:
| Hopf algebra subsumes like all currently fashionable
| architectures (transformers, convnets, SSM, diffusion models)
| and then some. Hopf convolution is a surprisingly general
| concept, and Groebner bases are the thing used to calculate
| the antipode.
|
| Hopf algebra is a tensor bialgebra, i.e. a tensor algebra with
| a built-in feedback (which subsumes autodiff).
|
| I have written a paper on this recently
| https://arxiv.org/abs/2302.01834
|
| This paper is also good https://arxiv.org/abs/1206.3620
|
| I have a Discord channel: https://discord.cofunctional.ai
| cscheid wrote:
| Interesting paper.
|
| I have a question about the claim in 6.2 that attention
| matrices are SPD, if you don't mind my asking.
|
| It seems to me that accepting the empirical result that the
| eigenvalues are positive isn't enough to get a Fourier
| Transform interpretation. Specifically, I don't understand
| the assumption that all attention matrices are symmetric.
| (I'm sure you know that positive eigenvalues are not enough
| by themselves, but for other folks reading, [[1 1/2] [1/3
| 1]] is a simple concrete example.)
|
| Consider Fig. 17 here:
| https://lilianweng.github.io/posts/2018-06-24-attention/
| (this is Fig.1 in Attention is all you need). I understand
| that you get symmetric attention matrices for the self-
| attention matrix in the input stream, as well as the masked
| attention matrix in the output stream (the first block).
| But I don't understand how you claim symmetry for the final
| attention mechanism that combines input and output.
|
| And if you don't get symmetry, you don't get the Fourier
| Transform interpretation and all the nice algebra that
| follows.
| adamnemecek wrote:
| I don't mention Fourier transform.
| cscheid wrote:
| Sorry. You do mention a linear systems response, and
| that's what I meant.
|
| In that setting, the eigenvectors act as a generalized
| forward and inverse Fourier transform, and the
| eigenvalues form the transfer function you allude to in
| the bold sentence:
|
| "The attention mechanism's role is the same as that of a
| transfer function in a linear time-invariant system,
| namely it calculates the frequency response of the
| transformer model,"
|
| Specifically, it seems to me that this requires a
| _symmetric_ attention matrix. Which you get from the
| self-attention mechanisms (two of the three places where
| they're used in transformers), but not all of them,
| notably not the one that combines the output of the first
| two attention mechanisms (one input, and one output)
| adamnemecek wrote:
| I think that the magic comes from the antipode which
| makes things symmetric. The Ising model is somewhat
| similar.
|
| Rereading the paper, quite a bit has changed in my
| understanding of all this. My conclusions still stand,
| but some of the reasoning needs to be explained better.
| Aardwolf wrote:
| Very well written and informative!
|
| One random thing I wonder: in the lexicographic ordering, is
| there any reason why the convention is that x_2 < x_1, while
| x^2 > x^1 (where _ denotes a subscript and ^ an exponent)? In
| other words, x > y uses the opposite of alphabetic ordering,
| while exponents are compared in the usual numeric order. I
| don't think the naming of the variables matters for the
| mathematical result (you can use any subscript or letter for any
| variable as long as the corresponding ones remain the same
| between the equations), so the convention x_1 < x_2 would work
| just as well, and be less confusing I think.
| ransac9000 wrote:
| This is a good point. I think it's just due to convention ---
| people tend to put terms involving x at the front, and the
| textbook by Cox et al. that I referenced uses this convention
| as well.
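For what it's worth, SymPy simply takes the lex order from the sequence of generators you pass, so either convention works; a small sketch (the system is an illustrative choice):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
F = [x1**2 + x2**2 - 1, x1 - x2]

# The first generator listed is the largest in the lex order, and it is
# eliminated first: the last element of the resulting basis is
# univariate in whichever variable is listed last.
G_a = sp.groebner(F, x1, x2, order='lex').exprs   # convention x1 > x2
G_b = sp.groebner(F, x2, x1, order='lex').exprs   # convention x2 > x1

only_x2 = G_a[-1].free_symbols   # univariate part mentions x2 only
only_x1 = G_b[-1].free_symbols   # univariate part mentions x1 only
```

So swapping the convention just swaps which variable survives elimination; the mathematical content is the same either way.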
| wenc wrote:
| I tried using Groebner bases in grad school to solve implicit
| polynomial equations (part of an optimization problem) but found
| quickly that it didn't scale. I ran into trouble with a system of
| about 300 polynomials and this was with Maple which implemented
| state of the art algorithms at the time.
|
| The worst-case computational complexity of Groebner basis
| algorithms is doubly exponential in the number of variables.
| ransac9000 wrote:
| Author here. Grobner bases are powerful tools for understanding
| polynomials. I started on this article when I encountered them
| during research and decided to write down some notes. After a
| few months, those notes turned into this blog post. It's quite a
| long read, so let me know if you have any questions.
| vasvir wrote:
| thanks for writing this. I have read about 1/3 to 1/2 so far
| and it is very nicely written and easy to follow.
| ransac9000 wrote:
| No problem, and thank you for the kind words!
| jjtheblunt wrote:
| nicely done
| ransac9000 wrote:
| thanks!
| abdullahkhalids wrote:
| Can the method of Grobner bases also be used to solve systems
| of polynomial inequalities?
| ransac9000 wrote:
| It seems that you can reformulate inequalities into
| equalities and solve the system:
| https://arxiv.org/pdf/1603.01183.pdf
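A small sketch of the slack-variable idea behind that kind of reformulation (my own toy example, not one from the paper): the inequality x >= 2 is rewritten as the equality x - 2 = s**2, and only branches admitting a real slack value survive.

```python
import sympy as sp

x, s = sp.symbols('x s')
# x**2 = 9 together with the inequality x >= 2, the latter rewritten as
# the equality x - 2 - s**2 = 0 with a real slack variable s.
F = [x**2 - 9, x - 2 - s**2]

# Lex basis with x > s: [x - s**2 - 2, s**4 + 4*s**2 - 5]. The
# univariate part factors as (s**2 - 1)*(s**2 + 5); only s = +/-1 is
# real, which keeps x = 3 and discards the infeasible root x = -3.
G = sp.groebner(F, x, s, order='lex').exprs
real_s = [r for r in sp.Poly(G[-1], s).nroots() if abs(sp.im(r)) < 1e-9]

# Back-substitute the real slack values via x = s**2 + 2.
xs = sorted({round(float(sp.re(r)**2 + 2), 6) for r in real_s})
```

The paper develops this in full generality; this is only the one-inequality special case.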
| red_trumpet wrote:
| Looks nice!
|
| A tip for your latex: I would use \langle and \rangle to
| surround the generators of an ideal, which looks better than
| using < and >. And I think they are called angle brackets, not
| square brackets (which would be [,]).
| ransac9000 wrote:
| Good catch! I thought I changed all < and > into \langle and
| \rangle.
| hgsgm wrote:
| Thorough introduction to computing with Grobner bases, using
| Python. Bonus: goofy ASCII art for surds.
|
| https://mattpap.github.io/masters-thesis/html/src/groebner.h...
|
| It's part of Mateusz Paprocki's Master's thesis about SymPy,
| which he co-created.
___________________________________________________________________
(page generated 2023-04-25 23:00 UTC)