[HN Gopher] The FBHHRBNRSSSHK-Algorithm for Multiplication is st...
___________________________________________________________________
The FBHHRBNRSSSHK-Algorithm for Multiplication is still not the end
Author : zitterbewegung
Score : 96 points
Date : 2022-10-12 17:36 UTC (5 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| Sharlin wrote:
| The cryptic "5x52" in the title should be
| $\mathbb{Z}_2^{5\times5}$. Not sure how best to phrase that in
| plain text, but at least it shouldn't be "5x52".
| dang wrote:
| Let's try to avoid this by cutting that bit.
|
| Someone will soon point out why that doesn't work, but in the
| meantime we may gain a few minutes...
| raverbashing wrote:
| Z_2^(5x5) might be a better description (or, better, 5x5
| matrices of binaries)
| Sharlin wrote:
| Or "5x5 bit matrices".
| [deleted]
| mminer237 wrote:
    | Something like Z₂⁵x⁵?
| jwilk wrote:
      | The multiplication sign should be in superscript, but
      | unfortunately there's no suitable Unicode character... unless
      | we abuse IPA:
      |
      | Z₂⁵ˣ⁵
| cassonmars wrote:
| Might be abuse of IPA, but 100% stealing this trick for
| platforms that don't let me use LaTeX
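For anyone who wants to reproduce the trick, a minimal sketch in Python (the specific code points are my reading of the comment above):

```python
# Unicode has superscript digits but no superscript multiplication
# sign; the IPA modifier letter small x (U+02E3) is the closest
# stand-in.
#   U+2082  SUBSCRIPT TWO
#   U+2075  SUPERSCRIPT FIVE
#   U+02E3  MODIFIER LETTER SMALL X
label = "Z\u2082\u2075\u02e3\u2075"
print(label)  # Z₂⁵ˣ⁵
```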
| alexmolas wrote:
| I love how advances in modern approaches (deep learning) push for
| advances in classical approaches. Something similar happened in
  | physics, where advances in quantum algorithms led researchers to
  | find better classical algorithms.
|
| Science is not a competition!
| robbomacrae wrote:
| My admittedly less charitable take is that they may have been
    | sitting on this one-operation optimization for a while, noticed
    | it works on the new method, and only decided to go public now
    | that they have seen the positive reception and potential of
    | such collaborations.
| kcexn wrote:
| More likely, I think it's that the first principles approach
| they have used to find this optimisation has applications in
| domains far outside of optimising matrix multiplications.
|
| Only because the Google paper crossed their desk did they
| even think to try it on matrix multiplications.
|
| And then more or less by luck, it happened to turn out to be
| slightly better. If it had produced exactly the machine
| learning algorithm, it would have been less interesting.
|
| The speed of the response makes me feel that the authors are
| extremely good at their domain, and that finding this
| algorithm was a reasonably trivial problem for them. The
      | slowness of the follow-up paper, I believe, is because the
      | authors understand that their mathematical domain is wildly
      | far outside of what most people can easily understand, so
      | they're going to take their time to write it up well, instead
      | of dropping a whole bunch of maths definitions that nobody
      | else in the world understands.
| metahost wrote:
  | Aside: what's going on with the name of the algorithm? Does it
  | stand for something?
| jeffbee wrote:
| This is a bit like the way that Bannister broke a decade-old
| record with his 4-minute mile, then someone beat his record 6
| weeks later.
| harveywi wrote:
| Even more impressive considering that the universe is
| expanding. So everyone today has to run farther in even less
| time.
| tiborsaas wrote:
| Matter is not expanding with space, the forces between atoms
| are much stronger, so planets stay as-is over time. Instead
| think of space between galaxies expanding.
| ncmncm wrote:
| I can't even think of the space between galaxies as they
| are now.
| yarg wrote:
| I think he was joking.
| telotortium wrote:
| Abstract: "In response to a recent Nature article which announced
| an algorithm for multiplying 5x5-matrices over Z2 with only 96
| multiplications, two fewer than the previous record, we present
| an algorithm that does the job with only 95 multiplications."
|
| The "recent Nature article" is, of course, Deepmind's
| reinforcement learning-guided search for better matrix
| multiplication algorithms:
| https://news.ycombinator.com/item?id=33096580.
|
| They improved upon Deepmind's algorithms by applying a method of
| their own, which is not described in this paper, but which they
| will publish in an upcoming paper: "Manuel Kauers and Jakob
| Moosbauer. Flip graphs for matrix multiplication. in
| preparation."
| TremendousJudge wrote:
| So it's a teaser paper?
| 3j0hn wrote:
| I wouldn't exactly call it a teaser. The formulas they
      | published stand on their own, and putting them out there in
      | the wake of all the press around the AlphaTensor paper is
      | just a good idea.
|
| Of course, these formulas are not super interesting on their
| own, and so the forthcoming paper on HOW they developed these
| formulas is in some sense the "real" paper.
| mabbo wrote:
| I think the point is to publish before anyone else does. If
      | they wait until their full paper is ready, they might be
| scooped.
| Yajirobe wrote:
| Are we calling dibs on research now? Is this a
| kindergarten?
| sidlls wrote:
| Read some of the history around natural philosophy some
| few hundred years ago. It's a real hoot, how petty and
| ego-driven some of these people were. Leibniz's attempts
| at taking credit for Newton's hard work come to mind.
| (Tongue somewhat planted in cheek here.)
|
| Research has always been about getting credit for first
| discovering something, to some extent.
| girvo wrote:
| Science/academia has been for as long as I can remember.
| ReaLNero wrote:
| I'm not sure how to feel about this. If the purpose of
| science research is to build the understanding we have,
| teaser articles seem more like a way to put a name on a
| creation without actually contributing to our understanding
| of something.
|
| I guess this is an artifact of how cutthroat academia is
| and how vital it is to have credit for being first.
| TheRealPomax wrote:
| > If the purpose of science research is to build the
| understanding we have.
|
| It's not. It never was. It's always been "to build the
| understanding that I or this small group I am part of
| has, that gives us an edge for a while until we share it
| with the world because it looks like someone else might
| otherwise pretend to have discovered it".
|
| The idea of a noble science, diligently working towards
| the betterment of mankind, has always been a romantic
| fantasy at best.
| golem14 wrote:
| "The Double Helix" by James Watson is still a
| surprisingly accessible read that's very relevant here.
| Beldin wrote:
| > _I guess this is an artifact of how cutthroat academia
| is and how vital it is to have credit for being first._
|
          | In fairness, academia has improved much in this regard
| since the days of Newton. That man was such an arse...
| even the Wikipedia article about his feud with Leibniz
| glosses over... everything.
|
| Basically, not only was he a donkey about things, he used
| all sorts of underhanded tactics to win and reveled in
| devastating his rival's good standing.
| eynsham wrote:
| So long as it doesn't slow or otherwise impede the
| dissemination of the full paper I don't mind personally.
| sweettea wrote:
        | But it is a long-standing practice in academia to avoid
        | being scooped by putting out priority-proving work. Consider US
| patents; or, further back, I read here yesterday that
| Galileo published anagrams of his results and Kepler took
| great delight in reverse-engineering them, accidentally
        | finding true facts other than Galileo's intended discovery.
| zitterbewegung wrote:
          | If they were able to complete it in a week then it is a
          | big deal... especially given the wording about being
          | scooped.
| ckemere wrote:
| Deep-learning-hype-hater-quick-take: Based on that title (and a
| glance at the paper), my guess is that rather than brute force
| and megawatts of GPU power, they found a first-principles
| solution.
| inasio wrote:
    | On that Google Nature paper, I'm not sure whether the 5x5
    | matrix improvement is the more significant result, or the fact
    | that they figured out a way to optimize directly for SIMD/AVX
| instructions, resulting in faster execution even if not
| necessarily fewer operations
| ncmncm wrote:
| The result is not interesting for ordinary numbers, where
| the classical ("slower") method is faster. When each
| element of the 5x5 matrix is itself a matrix, it could be
| important.
|
| So AVX ops for the method are not interesting.
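To illustrate the parent's point, a sketch (naive helpers with hypothetical names): multiplying a 4x4 matrix via 2x2 blocks turns each "element multiplication" into a full 2x2 matrix product, which is exactly where a cheaper k x k scheme would be substituted.

```python
def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mul(A, B):
    """Naive square matrix product, used both on blocks and as reference."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(M, i, j, s):
    """The s x s block of M at block coordinates (i, j)."""
    return [row[j * s:(j + 1) * s] for row in M[i * s:(i + 1) * s]]

def block_mul_2x2blocks(A, B):
    """4x4 product via 2x2 blocks: 8 block products, each itself a
    2x2 matrix multiplication (where a fast scheme would plug in)."""
    s = 2
    C = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            Cij = add(mul(block(A, i, 0, s), block(B, 0, j, s)),
                      mul(block(A, i, 1, s), block(B, 1, j, s)))
            for r in range(s):
                for c in range(s):
                    C[i * s + r][j * s + c] = Cij[r][c]
    return C

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 3, 2, 1], [4, 1, 0, 0]]
assert block_mul_2x2blocks(A, B) == mul(A, B)
```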
| naillo wrote:
| Hope they get a nature article as well.
| alexmolas wrote:
  | Besides the reduction from 96 to 95 multiplications, it would be
  | interesting to know how they derived these results.
  |
  | I think knowing the methodology they followed is more interesting
  | and has more impact than just knowing the algorithm.
| ajkjk wrote:
| I'm really interested to know when we will be able to prove that
| we have the fastest algorithm for some matrix multiplication
| problem. Is there a theoretical lower bound? (well in this case,
| for 5x5, there's gonna be an actual lower bound. but for
| asymptotic cases..?) Is figuring out the theoretical lower bound
| some kind of undecidable/NP problem? what do we know about it?
| ithinkso wrote:
    | I think there is some educated belief that it can be done in
    | n^(2+epsilon) but not in n^2. No proof yet, of course.
| genpfault wrote:
| https://en.wikipedia.org/wiki/Computational_complexity_of_ma...
| imbusy111 wrote:
    | Obviously a lower bound is Ω(N*M), but it's not very useful.
    | I'm not aware of any tighter lower bound.
| klyrs wrote:
      | A lower bound of Ω(n^2 log n) has been known for quite some
      | time. Good luck going any tighter than that...
| https://eccc.weizmann.ac.il/report/2002/012/download/ [pdf]
| 3j0hn wrote:
| This is a little bit apples and oranges, however. An
| asymptotic bound on the complexity doesn't say a lot about
| the number of multiplications needed for a fixed value of n=5
| say. A lot of those complexity proofs say "for large enough
| n".
| klyrs wrote:
| Not so! The 5x5 multiplication can be applied recursively
| to NxN matrices -- at the top level the "5x5
| multiplication" carves the matrix into a 5x5 grid of
| "blocks" each of size approximately (N/5 x N/5). It's been
| a while since I've done this sort of complexity analysis,
| but I found some nice slides here explaining how to use the
          | so-called Master Theorem for recursive algorithms:
          | https://www.inf.ed.ac.uk/teaching/courses/ads/Lects/lecture3...
          | [pdf]
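Plugging the numbers into the Master theorem: a k x k scheme with m products recurses as T(n) = m·T(n/k) + O(n²), giving O(n^(log_k m)) when log_k m > 2. A quick sanity check of the exponents (the arithmetic is mine, not from the paper):

```python
import math

def exponent(k, m):
    """Asymptotic exponent n^(log_k m) of a k x k scheme with m products."""
    return math.log(m, k)

print(f"classical 5x5  (125 products): n^{exponent(5, 125):.4f}")
print(f"AlphaTensor     (96 products): n^{exponent(5, 96):.4f}")
print(f"this paper      (95 products): n^{exponent(5, 95):.4f}")
print(f"Strassen 2x2     (7 products): n^{exponent(2, 7):.4f}")
```

Note that even 95 products gives a worse exponent than Strassen's n^2.807, so the interest lies in the rank of the 5x5 problem itself rather than in a new asymptotic record.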
| henrydark wrote:
| that's for real and complex fields, and it's not exactly
            | that. I don't think there's anything better than Ω(n^2)
            | with no conditions.
| klyrs wrote:
| ah, excellent point
| [deleted]
| [deleted]
| [deleted]
| Traubenfuchs wrote:
  | How would one go about finding new ways to multiply matrices?
|
| Trial and error? Can the process be automated? Genius gets hit on
| the head by an apple moment?
| whymauri wrote:
| Virginia Williams teaches an excellent course:
|
| https://people.csail.mit.edu/virgi/6.890/
|
| Lecture notes are open.
| PartiallyTyped wrote:
| Thank you for sharing this! Do you know of similar gems?
| Jabbles wrote:
| See https://news.ycombinator.com/item?id=33096580
| klyrs wrote:
| For small n, you can reasonably brute-force all (n x n)
| multiplication formulae in characteristic 2 because there's a
| naive bound on the number of element multiplications and the
| coefficients are binary. As n gets large (say, n>3?), the cost
| of brute-force becomes prohibitive.
|
| Beyond brute-force, I imagine that this problem could be
| phrased as a max-sat problem, mixed-integer linear program or
| similar. There are generic solvers that can get solutions for
| problems of moderate size. But unless it's a fairly rote
| translation to the solver's native representation, it's often
| better to write a custom solver and port heuristics into the
| notation native to the problem. As far as I understand it,
    | that's the approach that both the Fawzi et al. paper and TFA
    | took.
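To give a sense of why brute force dies quickly: a bilinear scheme over Z2 with m products for n x n matrices is fixed by binary coefficients for each product's two linear forms plus the output combinations, so the raw search space can be counted directly. A back-of-envelope sketch (my own counting, ignoring the symmetries and pruning real solvers exploit):

```python
# Each of the m products multiplies two linear forms with n^2 binary
# coefficients each, and each of the n^2 output entries is a binary
# combination of the m products.
def raw_search_bits(n, m):
    return 2 * m * n * n + m * n * n  # = 3 * m * n^2 coefficient bits

for n, m in [(2, 7), (3, 23), (5, 95)]:
    print(f"n={n}, m={m} products: about 2^{raw_search_bits(n, m)} candidates")
```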
| js8 wrote:
      | I was wondering, why use DeepMind when you can just use a SAT
      | solver? I am not saying SAT solvers cannot be improved with
      | AI; maybe they can.
| fcholf wrote:
| Actually, this has been done:
| https://arxiv.org/abs/1903.11391
| klyrs wrote:
| Note my uncertainty about the suitability of a max-sat
| formulation. But also note who did the Fawzi et al result:
| the DeepMind team at Google. I think that dogfooding their
| own stuff is an understandable default.
___________________________________________________________________
(page generated 2022-10-12 23:00 UTC)