[HN Gopher] Deepmind Alphadev: Faster sorting algorithms discove...
       ___________________________________________________________________
        
       Deepmind Alphadev: Faster sorting algorithms discovered using deep
       RL
        
       Author : anjneymidha
       Score  : 539 points
       Date   : 2023-06-07 15:31 UTC (7 hours ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | jonbaer wrote:
       | "AlphaDev uncovered new sorting algorithms that led to
       | improvements in the LLVM libc++ sorting library that were up to
       | 70% faster for shorter sequences and about 1.7% faster for
       | sequences exceeding 250,000 elements."
        
         | dheera wrote:
         | > up to 70% faster
         | 
         | So O(0.3(N log N))? That's still O(N log N)
        
           | dekhn wrote:
           | In the real world, we care about runtime as much as, if not
           | more than, computational complexity.
        
             | boomanaiden154 wrote:
             | Definitely more. There are lots of things that might be
             | more optimal in terms of raw complexity but end up having
             | abysmal performance due to things like cache locality.
             | Anything with pointer chasing usually kills performance.
        
               | xdavidliu wrote:
               | for example, matrix multiplication has all sorts of
               | algorithms with exponent significantly less than 3, but
               | they are nearly all curiosities. In practice, the
               | straightforward N^3 algo is used nearly everywhere
        
               | dekhn wrote:
               | I had a long discussion with Rob Pike once that ended
               | with me saying "you know, I don't think people actually
               | use Strassen on supercomputers" (we were discussing how
               | somebody had managed to show an N*2.37 algorithm or
               | something like that and "it was a new world's record").
        
               | xdavidliu wrote:
               | nit: ^, not *. ** for Python, but never *
        
           | caturopath wrote:
           | They're not going to find a general sorting algorithm faster
           | than O(n log n), but that doesn't mean all O(n log n) sorts
           | are created equal.
        
             | iopq wrote:
             | That's not true because Thorup's algorithm is O(n log log
             | n)
        
               | xdavidliu wrote:
                | any sort that only uses comparisons is provably no
                | better than n log n. Hence whatever Thorup's algorithm
                | is, it doesn't work for arbitrary input and must assume
                | something further, e.g. "all elements are under 1000".
        
               | utopcell wrote:
               | I think the parent's argument is that, since they only
               | evaluated their algorithms on sorting arrays of 32-bit or
               | 64-bit integers, it is fair game to compare against
               | integer sorting algorithms for which we have better
                | complexities than O(n lg n).
        
               | reaperman wrote:
                | This seems like a very valid point in support of iopq's
                | clarification, in the context of what might still exist
                | to be discovered among the theoretically possible new
                | algorithms. It doesn't change the fact that dheera set
                | the bar much, much higher than most people would agree
                | with, though.
               | 
               | Thank you.
        
               | xdavidliu wrote:
               | but the number of elements in the array is also a 32 or
               | 64 bit integer, so you still cannot do better than O(n
               | log n) even with fancy radix sort or whatever
        
               | graycat wrote:
               | The Gleason bound is n log(n) and says that no sorting
               | algorithm based on comparing pairs of keys can be faster.
               | Heap sort meets the Gleason bound so is the fastest
               | possible in this context. Actually the usual versions of
               | quick sort are slower. If the keys are not too long,
               | radix sort is O(n) and faster. All this has been well
               | known for decades. I explained a little more in another
               | post in this thread.
        
               | guyomes wrote:
               | > If the keys are not too long, radix sort is O(n) and
               | faster.
               | 
               | More precisely, if the key length is w, then radix sort
               | is O(w n) operations. In particular, if the n elements
               | are distinct integers for example, w is greater than
               | log(n).
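The O(w n) behavior is easy to see in a least-significant-digit radix sort. A minimal sketch, assuming non-negative fixed-width integer keys (the function name and byte-sized digits are my choices):

```python
def radix_sort(xs, width=32):
    """LSD radix sort for non-negative integers of at most `width` bits:
    O(w * n) overall, with one stable bucket pass per byte of the key."""
    for p in range((width + 7) // 8):       # one pass per byte
        shift = 8 * p
        buckets = [[] for _ in range(256)]
        for x in xs:
            buckets[(x >> shift) & 0xFF].append(x)
        xs = [x for b in buckets for x in b]
    return xs
```

Each pass is O(n), and the number of passes grows with the key width w, which is why the "keys not too long" caveat matters.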
        
               | CyberDildonics wrote:
               | I think you are getting your information mixed up. Here
               | is a comparison that shows quicksort running in 1/10th
               | the time as heapsort.
               | 
               | https://johnysswlab.com/why-is-quicksort-faster-than-
               | heapsor...
        
               | xdavidliu wrote:
               | we're talking about asymptotics (i.e. the exponent).
               | Things like "1/10 the time" are utterly meaningless here.
        
             | ralfn wrote:
             | If the thing you have to sort is within a known domain you
             | can definitely beat o(n log(n)). Just expect crazy memory
             | usage.
             | 
              | And that's only untrue in theory: nobody owns a real
              | Turing machine with an infinite tape and truly infinite
              | numbers. It doesn't exist in reality.
             | 
             | You can always divide time by multiplying space with the
             | same factor.
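The simplest instance of that space-for-time trade is counting sort. A minimal sketch, assuming integer keys known in advance to lie in range(k):

```python
def counting_sort(xs, k):
    """Counting sort for integer keys drawn from a known domain range(k):
    O(n + k) time and O(k) extra space, beating the comparison bound by
    never comparing two elements at all."""
    counts = [0] * k
    for x in xs:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out
```

When k is much larger than n (say, sorting a handful of 64-bit integers), the O(k) memory is exactly the "crazy memory usage" mentioned above.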
        
               | xdavidliu wrote:
               | by "general sort" your parent comment means "comparison
               | sort", and the claim (which has been proved, see CLRS
               | intro to algorithms) is that you cannot do it in better
               | than O(n log n) comparisons.
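The bound is just a counting argument: a comparison sort must distinguish all n! input orderings, and each comparison yields at most one bit. The arithmetic as a quick check (function name is mine):

```python
import math

def min_worst_case_comparisons(n):
    """Information-theoretic lower bound for comparison sorting: at least
    ceil(log2(n!)) comparisons are needed in the worst case, since the
    algorithm's comparison outcomes must distinguish all n! orderings."""
    return math.ceil(math.log2(math.factorial(n)))
```

For n = 5 this gives ceil(log2(120)) = 7, and 7 comparisons is actually achievable for 5 elements (Ford-Johnson merge-insertion sort), so the bound is tight there.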
        
               | yunohn wrote:
               | Parent said: > not going to find a general sorting
               | algorithm
               | 
               | You said: > you have to sort is within a known domain you
               | can definitely beat
               | 
               | Not sure why you framed your response this way?
        
       | reocha wrote:
       | Pretty impressive, I'm looking forward to optimizations in other
       | commonly used algorithms.
        
       | mellosouls wrote:
       | Broken link. Ok it's there now, maybe a Nature snafu.
       | 
       | News Release:
       | 
       | https://www.deepmind.com/blog/alphadev-discovers-faster-sort...
        
       | [deleted]
        
       | lngnmn2 wrote:
       | Not general or universal. Only for pre-trained data and with
       | abysmal worst cases.
        
         | caturopath wrote:
         | I'm not positive what you mean here. Are you saying the
         | discovered algorithm isn't actually good? Didn't LLVM accept
         | it?
        
         | pjscott wrote:
         | Are you confusing this with another, unrelated paper? Because
         | the algorithms they present here do, in fact, sort any sequence
         | of 3, 4, or 5 integers in constant time with close to the
         | minimal number of instructions and (in the variable-size case)
         | with minimal branching. There is no abysmal worst-case time,
         | and the execution time for a fixed array length does not depend
         | on the data.
         | 
         | You can then plug these into other sorting algorithms (like
         | quicksort, mergesort, etc.) to handle sufficiently small
         | subsequences fast. Which is exactly how LLVM's libc++ is using
         | them right now.
        
       | EGreg wrote:
        | Almost all ML is basically searching through a (latent) space of
        | possible answers and indexing the results, so it outperforms
        | most known methods. Like when AlphaGo, using MCTS, beat the top
        | human Go players.
       | 
       | I would be interested to know if ML can be used to reverse
       | hashes, such as to more quickly solve the numbers that satisfy
       | the Bitcoin Hash challenge. Maybe it can even get P and NP closer
       | together!
       | 
       | Are there theoretical results on SHA-256 or something that
       | preclude finding an ML algorithm that, with a billion parameters,
       | can speed up the search for a Bitcoin Hash input?
        
       | jeffbee wrote:
       | The sorting seems cool but I am more interested in the varint
       | decoder. Does anyone know where the source code of that landed,
       | if it did?
        
       | graycat wrote:
       | "Faster sorting"?
       | 
       | Heap sort achieves the Gleason bound and, thus, is the fastest
       | possible sort by comparing pairs of keys.
       | 
       | For keys not too long, radix sort can be faster.
       | 
       | These facts have been very well known for decades.
       | 
       | There might be more to do in sorting with some different records,
       | keys, computer architecture, concerns about caches and _locality
       | of reference_ , but otherwise, thankfully, sorting is a well
       | solved problem.
        
         | CaptainNegative wrote:
         | Heap sort's worst- and average-case running time are both
         | provably within a constant factor of optimality (both in
         | running time and number of comparisons), but it's either likely
         | or straight up provably not actually optimal in any of the four
         | categories.
        
         | CyberDildonics wrote:
         | Heap sort gets beat by sorting algorithms with better locality.
         | You can't be the fastest while skipping through memory for
         | every operation.
        
       | VikingCoder wrote:
       | Dumb question -
       | 
       | Where are the machine learning de-compilers?
       | 
       | Given an executable program, give me human-readable code that
       | generates the executable exactly. With machine learning guessed
       | function, type, variable names...?
       | 
       | I get that it's a very hard problem. But... Does it just not
       | work? Or have people not done it yet? Or have I missed it?
        
         | xwdv wrote:
         | This is the killer app for me with AI. A way to get a non-
         | copyrighted code for any program you already have binary code
         | for.
        
           | ealexhudson wrote:
           | That sounds obviously transformative and on any non-trivial
           | scale I can't see why copyright would be avoided.
        
             | reaperman wrote:
              | I think you mean "non-transformative", although in this
              | context there's a bit of ambiguity in how people use the
              | word "transform": it means one thing w.r.t. copyright and
              | the "opposite" for code (really, orthogonal, but a boolean
              | evaluation would yield the opposite true/false).
             | 
             | In copyright, "transformative" refers to modifying or
             | adapting a copyrighted work in a way that creates a new and
             | original expression; resulting in a new work with a
             | different purpose or meaning.
             | 
             | In terms of code, you'd "just be transforming" the assembly
             | code to a systems language of your choice.
        
         | skeaker wrote:
         | Personally I think this would be very exciting. Currently there
         | are a ton of projects to decompile older games (see the Mario
         | 64 and Ocarina of Time de-compilations and subsequent native PC
         | ports as an example), but those projects are huge and take
         | whole teams years to finish. If you could simply point an AI at
         | a project like that and have usable source code in a reasonable
          | amount of time, it would be huge for those scenes. You would
          | have a sort of de facto open-source moment for just about all
          | software. It feels like the stuff of far-off sci-fi.
        
         | VMG wrote:
         | You can already do that to some extent with ChatGPT. Paste in
         | the assembly and it gives pretty good results
        
         | sl-1 wrote:
         | That is a good question!
         | 
         | Training datasets should be pretty trivial for this, all open
         | source software that is buildable on the internet could provide
         | training sets (source code & compiled code).
         | 
          | But I guess it would have to be trained specifically for every
          | architecture.
        
         | pitaj wrote:
         | This actually doesn't sound that difficult, given we can
         | produce a practically infinite training dataset using a
         | compiler with existing programs.
         | 
         | What's more interesting to me would be a model that can
         | automatically port a program between languages, such as
         | converting a C program to a fairly idiomatic Rust program.
        
       | shpx wrote:
       | I (don't) want to see an agent learn to rowhammer the memory
       | address storing its reward quantity.
        
         | orlp wrote:
         | You'll love this list of examples where the AI successfully
         | accomplished reward hacking:
         | https://docs.google.com/spreadsheets/d/e/2PACX-1vRPiprOaC3Hs...
        
       | ComputerGuru wrote:
       | RL: Reinforcement Learning
        
       | MagicMoonlight wrote:
       | Now do it for compression
        
       | cypherpunks01 wrote:
       | This was educational because I learned about Sorting Networks.
       | 
       | I didn't hear of them in CS undergrad algorithms, perhaps because
       | they can be thought of as a special case or optimization, rather
       | than fundamental and generic variable-length sort algos.
       | 
       | It's a simple concept, and forms the basis of all the sort
       | optimizations described here.
       | 
       | https://en.wikipedia.org/wiki/Sorting_network
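A sorting network is just a fixed sequence of compare-exchange operations that sorts every input of a given size. A minimal Python sketch for three inputs (the function names are mine; the comparator sequence is a standard 3-element network):

```python
def compare_exchange(a, i, j):
    """The only primitive a sorting network uses: order a[i] and a[j]."""
    if a[i] > a[j]:
        a[i], a[j] = a[j], a[i]

def sort3_network(a):
    """Three-input sorting network: the same fixed comparator sequence
    (0,1), (0,2), (1,2) sorts every possible input, so the sequence of
    comparisons never depends on the data."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        compare_exchange(a, i, j)
    return a
```

Because the comparator sequence is fixed, each compare-exchange can compile to a branchless min/max pair, which is the assembly-level territory AlphaDev was optimizing.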
        
       | Havoc wrote:
       | I like the high impact dynamic of this thinking.
       | 
        | If that can be done, say, OS-wide, I could see it having an
        | actual impact on global energy usage.
        
       | gfd wrote:
       | Can someone summarize how it achieves "provable correctness"?
        
       | finitestateuni wrote:
          | The most interesting part of this paper to me is that they let
          | the agent guess how efficient its own solutions were and only
          | had the model experimentally verify its guesses in 0.002% of
          | cases. This allowed the model to search much faster than
          | another program that didn't guess and had to run every
          | program.
        
         | dsign wrote:
         | I've been wondering about a similar approach for biomolecular
         | simulations, where exact computations are still a hard
         | bottleneck. I wonder if something like this could give us a few
         | orders of magnitude more speed.
        
         | Zamicol wrote:
         | That sounds like intuition.
        
           | Filligree wrote:
           | Intuition is the only thing we've figured out how to
           | automate. Reason turns out to be higher hanging fruit.
        
             | [deleted]
        
             | the_other wrote:
             | Like humans' "slow" and "fast" thinking, then?
        
               | Filligree wrote:
               | There's likely a connection. Either way, I like to
               | describe AIs like ChatGPT / diffusion models, etc. as
               | operating 100% on intuition. It gives people a better
               | intuition of their weaknesses...
               | 
               | For GPT you can kind of prompt it to do chain-of-thought
               | reasoning, but it doesn't work very well; not if you
               | compare it to what humans do.
               | 
               | Once again it seems like what we thought was hard, is
               | easy; what we thought was easy and computer-like turns
               | out to be hard.
        
               | munchler wrote:
               | The Google Bard folks have a blog post today making
               | exactly that connection:
               | https://blog.google/technology/ai/bard-improved-
               | reasoning-go...
        
               | fragmede wrote:
               | If you tell ChatGPT "show your work", you get better
               | answers
        
         | blackbear_ wrote:
          | This is actually quite common for optimizing things in several
          | disciplines. You essentially fit a surrogate model (a keyword
          | if you want to look up more) to whatever you want to optimize,
          | then use the model to guide the procedure, making sure that
          | the model is correct every once in a while.
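A toy illustration of the surrogate-model idea (all names and the nearest-neighbour surrogate are my choices; this is a sketch, not AlphaDev's method): rank many candidates with a cheap model and call the expensive objective on only a small fraction of them.

```python
import random

def expensive_objective(x):
    # Stand-in for a slow ground-truth measurement (e.g. a real benchmark).
    return -(x - 3.0) ** 2

def surrogate_search(lo, hi, n_candidates=1000, verify_fraction=0.01, seed=0):
    """Surrogate-guided search sketch: fit a cheap model on a few real
    evaluations, rank many candidates with it, then verify only the
    most promising ones against the expensive objective."""
    rng = random.Random(seed)
    # Fit the surrogate on a small sample evaluated for real.
    samples = [lo + (hi - lo) * rng.random() for _ in range(20)]
    truth = {x: expensive_objective(x) for x in samples}
    # Cheap surrogate: nearest verified neighbour's score (a stand-in
    # for a learned value head).
    def surrogate(x):
        return truth[min(truth, key=lambda s: abs(s - x))]
    # Rank candidates with the surrogate; verify only the top few.
    candidates = [lo + (hi - lo) * rng.random() for _ in range(n_candidates)]
    candidates.sort(key=surrogate, reverse=True)
    top = candidates[:max(1, int(n_candidates * verify_fraction))]
    return max(top, key=expensive_objective)
```

Here only ~1% of candidates ever touch the expensive function, mirroring the paper's tiny verification fraction.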
        
         | tux3 wrote:
         | This is the same sort of "more guesses faster beats smarter
         | guesses slower" that made afl-fuzz by far the best at exploring
         | large search spaces in program fuzzing
         | 
         | Fast search often beats accurate search. Sometimes adding
         | clever heuristics or more complex scoring "works" but slows
         | down the search enough that it's an overall loss. Another kind
         | of a bitter lesson, perhaps
        
           | lqr wrote:
           | But why isn't the proposed method an instance of smart
           | guessing? It reduces oracle complexity with heuristics. The
           | heuristic is "build a machine learning model of the objective
           | function and use it to fake oracle queries most of the time."
        
       | GreedClarifies wrote:
       | Deepmind shouldn't be part of a for profit entity. There I said
       | it.
       | 
       | They are clearly focused on moving technology forward and helping
       | humanity. That's great. However, pulling down 1M+ salaries (for
       | L6+ developers) and using hundreds of millions or billions in
       | borg resources while adding _nothing_ to the bottom line is not
       | in the interest of Google. Not to mention the negative effects on
       | productivity as other Googlers attempt to replicate the "publish
       | everything to support my brand" strategy of Deepmind.
       | 
       | I know Google is not a normal company in that Larry and Sergey
       | have complete control of the company, but sooner or later they
       | have to realize that Deepmind needs to be spun off, and further
       | that Demis is entirely the wrong person to run Google's AI. He
       | doesn't care a whit about Google, it is only a vessel to fund his
       | research.
       | 
       | Products. A company is about products and customers not research
       | papers. Academia or non-profits are about papers.
        
         | benlivengood wrote:
         | I think Google's biggest net positives in the world at this
         | point are Waymo and DeepMind. Way better than spending money on
         | new chat apps.
         | 
         | Google's history has been having a very lucrative bottom line
         | that allows for very beneficial research with absolutely no
         | hardships on anyone. Supremely better than lining shareholder
         | pockets; shareholders can't think enough quarters ahead for
         | long-term human benefit. Salaries are just the way to attract
         | and retain quality folks.
        
           | GreedClarifies wrote:
           | The billions who use their search everyday would likely
           | disagree. The billions who use YouTube per day would likely
           | disagree. The billions who use android would likely disagree.
           | The hundreds of millions who use gmail, or gsuite would
           | disagree.
           | 
           | Products.
        
             | casey2 wrote:
             | Products are at best optimization of a yesterday's
             | technology, in tech you realistically can't peek into the
             | future without spending billions on r&d. Having smart
             | people work on any of the things you listed is a (further)
             | waste of their time.
        
         | sidibe wrote:
         | Both of these contribute a lot to Google's infra by giving it
         | new challenges. It's not a one-way street, Google wouldn't know
         | what to build towards in their data center/hardware pipeline
         | without these really ambitious projects with huge requirements.
        
       | [deleted]
        
       | hexomancer wrote:
       | > The confidence intervals are represented as latency +- (lower,
       | upper), in which latency corresponds to the fifth percentile of
       | latency measurements across 100 different machines. Lower and
       | upper refer to the bounds of the 95% confidence interval for this
       | percentile.
       | 
          | Does anybody know why they chose the fifth percentile? I
          | thought we should always choose the fastest time when
          | measuring performance.
        
         | ajuc wrote:
         | Usually you discard extreme values to reduce noise, and in fact
         | they wrote that's why they did it:
         | 
         | > We then take the fifth percentile as our final measurement,
         | because we assume that most noise sources are one-sided (for
         | example, cache misses, pre-emptions and so on). During training
         | we process the measurements across ten machines for
         | computational efficiency.
         | 
          | > I thought we should always choose the fastest time when
          | measuring performance.
         | 
          | Depends. For games you usually do something similar to what
          | they did - exclude a small percentage of the worst results to
          | reduce the influence of noise, and then optimize the worst
          | scenario to make the game run consistently and smoothly.
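A toy version of that measurement scheme (the harness and names are mine; the low-percentile choice follows the paper's one-sided-noise assumption):

```python
import time

def low_percentile_ns(fn, runs=100, pct=0.05):
    """Timing sketch: repeat a measurement many times and report a low
    percentile, assuming noise (cache misses, pre-emption, interrupts)
    only ever makes a run slower, never faster."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - t0)
    samples.sort()
    return samples[int(len(samples) * pct)]
```

Taking the 5th percentile rather than the absolute minimum guards against the occasional spuriously fast sample (clock granularity, reordered reads) while still discarding almost all one-sided slowdown noise.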
        
         | gcr wrote:
         | Because they want to make sure that the sorting algorithm works
         | well for all possible workloads, not just the most preferable
         | ones.
         | 
         | If we measured sorting algorithms by the fastest measurement,
         | we might conclude that BubbleSort is the fastest possible sort
         | algorithm on some inputs. (Bubblesorting an already-sorted list
         | makes at most one comparison per list element)
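The bubble-sort best case is easy to demonstrate with an early-exit variant (the comparison counter is mine, added to make the best case visible):

```python
def bubble_sort(a):
    """Bubble sort with early exit: an already-sorted input finishes
    after a single pass (n - 1 comparisons, zero swaps), an O(n) best
    case that says nothing about the O(n^2) average."""
    a = list(a)
    comparisons = 0
    while True:
        swapped = False
        for i in range(len(a) - 1):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:          # a clean pass means the list is sorted
            return a, comparisons
```

Judging by the fastest run alone, this would look unbeatable on sorted input, which is exactly why a single best-case measurement is a poor benchmark.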
        
           | hexomancer wrote:
           | I don't think that's what they meant (or I have
           | misunderstood). Running the same algorithm on the same input
           | still has variations because of OS/CPU idiosyncrasies. When
           | measuring performance we usually run the algorithm on the
           | same input multiple times and report the fastest performance.
        
       | air7 wrote:
       | Can anyone explain how this worked? as per the paper (which I
       | TLDRed): "A single incorrect instruction in the AssemblyGame can
       | potentially invalidate the entire algorithm, making exploration
       | in this space of games incredibly challenging."
       | 
       | What did it do if it didn't have a useful partial score function?
       | How did it avoid brute force?
        
         | blackbear_ wrote:
         | This paragraph:
         | 
         | > To better estimate latency, we implemented a dual value
         | function setup, whereby AlphaDev has two value function heads:
         | one predicting algorithm correctness and the second predicting
         | algorithm latency. The latency head is used to directly predict
         | the latency of a given program by using the program's actual
         | computed latency as a Monte Carlo target for AlphaDev during
         | training. This dual-head approach achieved substantially better
         | results than the vanilla, single head value function setup when
         | optimizing for real latency.
         | 
         | Briefly, they use a neural network to predict whether a given
         | sequence of instructions is correct, and how fast it is. Then
         | they used this neural network to guide the program generation
         | via Monte Carlo tree search [1]. It is this procedure that
         | keeps track of the partial score functions at each node.
         | 
         | [1] https://en.wikipedia.org/wiki/Monte_Carlo_tree_search
        
       | usgroup wrote:
       | " The fixed sort solutions for sort 3, sort 4 and sort 5
       | discovered by AlphaDev have been integrated into the standard
       | sort function in the LLVM standard C++ library3."
       | 
       | Not exactly what the title would lead you to believe.
        
       | computing wrote:
       | *Google DeepMind
        
       | greenflag wrote:
       | Does anyone have high level guidance on when (deep) RL is worth
       | pursuing for optimization (e.g. optimizing algorithm design)
       | rather than other approaches (e.g genetic)?
        
         | jeffbee wrote:
         | Start with a planet-scale computer that makes the marginal cost
         | of RL be nearly zero, and at the same time spend a lot of money
         | on hashing and sorting so the micro-optimization pays off.
        
         | AndrewKemendo wrote:
          | It's less of a scale problem than a type problem, usually, in
          | my experience.
          | 
          | My rule of thumb: use RL when it's easy to specify a reward
          | function but there are infinite ways to traverse the action
          | space - versus a constrained state and action space (small-n
          | solution traversal pathways) with only a few possible paths to
          | traverse.
        
       | dontupvoteme wrote:
       | From "It can't Code" to publishing articles in Nature.
        
       | whywhywhydude wrote:
          | It is astounding how something as well studied as sorting
          | still has opportunities for further improvement!
        
         | zzbn00 wrote:
          | It is not sorting per se that was improved here, but sorting
          | (particularly of short sequences) on modern CPUs, where the
          | real complexity lies in predicting what will run quickly on
          | those CPUs.
         | 
         | Doing an empirical algorithm search to find which algorithms
         | fit well on modern CPUs/memory systems is pretty common, see
         | e.g. FFTW, ATLAS, https://halide-lang.org/
        
         | jackmott42 wrote:
         | Part of it is because hardware properties are always changing.
         | The instructions available, the relative speed of CPU to memory
         | and the various caches, how big and numerous and fast the
         | various caches are, etc etc.
        
           | GravityLabs wrote:
           | I am curious why things can't just get better on a base that
           | doesn't change, until the base changes because the
           | improvements with the new base are just that much better...
           | 
           | Or is that why hardware properties change so much?
        
             | zerodensity wrote:
             | For a time that is what happened. Every year you would get
             | a new CPU and that would be faster at pretty much
             | everything than the version the year before. But then we
             | hit the clockspeed wall and the only way of making faster
             | CPUs was to add complexity to the internals of the CPUs. So
             | branch prediction, micro code pipelining, larger caches,
             | simd instructions more CPU cores ect was the result.
             | 
             | So nowadays a new CPU might not be better at everything
             | then the previous version but it will most likely have more
             | cache and some internal improvements to
             | pipelining/concurrency.
             | 
             | Given this, for newer versions it can be useful to add
             | instructions to take advantage of extra pipelining or using
             | a different instruction that happen to be faster now.
        
         | dist-epoch wrote:
          | What's well studied is theoretical algorithmic sorting.
         | 
         | For practical sorting, for a particular CPU architecture, there
         | is still plenty of low hanging fruit:
         | 
         | > Today we're sharing open source code that can sort arrays of
         | numbers about ten times as fast as the C++ std::sort, and
         | outperforms state of the art architecture-specific algorithms,
         | while being portable across all modern CPU architectures
         | 
         | https://opensource.googleblog.com/2022/06/Vectorized%20and%2...
        
       | skellington wrote:
       | So clickbait to claim a faster algorithm when all it did was find
       | a faster machine optimization for a particular CPU. It's
       | fun/cool, but not the leap they seem to be representing.
        
       | mmaunder wrote:
       | In 2023 we saw the first glimpse of machines becoming better at
       | programming than humans when an AI created a new sorting
       | algorithm that was faster than anything humans had yet created.
        
       | hamilyon2 wrote:
       | I want this for sql compilation, plan choosing. So much effort is
       | wasted on suboptimal database performance and trying to improve
       | that. I think even sorting pales in comparison.
        
       | fwlr wrote:
       | Some very cool improvements found in already highly optimized
       | algorithms.
       | 
        | They found that in a sorting network handling 3 inputs, the AI
        | saved an instruction by reducing a "min(A, B, C)" operation to
        | just "min(A, B)", taking advantage of the fact that previous
        | operations guaranteed that B <= C. They also found that in a
        | sorting network handling 4 inputs, when it is part of a "sort 8"
        | network, there are similar guarantees (D >= min(A, C) in this
        | case) that can be exploited to remove an instruction as well.
        | The compiler may not be able to compete with hand-written
        | assembly, but it seems an AI can hand-write assembly code that's
        | even better in some cases.
       | 
        | Another improvement is also very cool. They discuss VarSort4,
        | which takes a list that may be 2, 3, or 4 elements long, and
        | sorts it. The existing algorithm is
        | 
        |     if (len = 4) {
        |         sort4
        |     } elif (len = 3) {
        |         sort3
        |     } elif (len = 2) {
        |         sort2
        |     }
        | 
        | The AI found an algorithm that looks totally different:
        | 
        |     if (len = 2) {
        |         sort2
        |     } else {
        |         sort3
        |         if (len = 4) {
        |             sortspecial
        |         }
        |     }
       | 
        | It's pretty wild! It immediately runs Sort3 on something that
        | may be either 3 or 4 elements long, and only afterwards does it
        | check how long it really is. If it's 3 elements long, we're
        | done; otherwise, because (having already run Sort3) you know the
        | first three elements are sorted, a special and much simpler
        | routine can simply put the 4th element in the right spot.
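The discovered shape is easy to sketch in Python (function names are mine, and the real thing is branch-minimized assembly, not list manipulation):

```python
def sort3(a):
    # Fixed 3-element sorting network: comparators (0,1), (0,2), (1,2).
    # Only touches indices 0..2, so it is safe on a 4-element list too.
    for i, j in ((0, 1), (0, 2), (1, 2)):
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]

def varsort4(a):
    """Sketch of the discovered VarSort4 shape: handle length 2 alone,
    run sort3 unconditionally for lengths 3 and 4, and only afterwards
    check whether a 4th element needs placing ("sortspecial")."""
    if len(a) == 2:
        if a[0] > a[1]:
            a[0], a[1] = a[1], a[0]
        return a
    sort3(a)                  # correct for both length 3 and length 4
    if len(a) == 4:
        x = a.pop()           # the first three elements are now sorted
        i = 0
        while i < 3 and a[i] < x:
            i += 1
        a.insert(i, x)        # "sortspecial": one insertion, no full sort
    return a
```

The payoff is that the common 3-element work is shared between the length-3 and length-4 paths, and the length-4 path finishes with a single insertion instead of a full sort4.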
       | 
       | Very cool. Improving on core LLVM sorting algorithms that have
       | already been heavily hand-optimized by the best in the world is
       | definitely a "it's 1997 and Deep Blue defeats the World Champion
       | in chess" kind of feeling.
        
         | ComputerGuru wrote:
         | Interesting that it took AI to pull this off. I think mutation
         | testing would have discovered the first (by virtue of checking
         | which code isn't necessary _globally_ ) though likely not the
         | second (which needs a new branch, unless that branch was
         | already there?).
        
           | fkarg wrote:
           | As I understand it, it kind of _did that_, just in an
           | extremely guided kind of way, which is why it produced
           | results in some reasonable amount of time (probably)
        
         | thomasahle wrote:
         | I'm surprised they only report results up to sort5. It seems at
         | that level, you could just iterate through all possible
         | programs. AI-generated code seems more interesting once you get
         | to 10+ values, where classical methods break down.
        
           | cypherpunks01 wrote:
           | I believe that's because above 5 elements, fixed sorting
           | networks are no longer used. Introsort takes over and
           | dispatches to insertion sort, quicksort, or heap sort as
           | appropriate.
           | 
           | Divide and conquer strategies are used for larger sorts, and
           | the smaller arrays could include the fixed lengths 3, 4, 5.
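           | 
           | A toy sketch of that dispatch (illustrative Python, not
           | the libc++ implementation; the cutoff and pivot choice
           | are simplified):

```python
import heapq

def introsort(xs, depth_limit=None):
    # sketch of the dispatch described above: tiny arrays use a fixed
    # small sort, excessive recursion depth falls back to heapsort,
    # everything else is quicksort-style partitioning
    if depth_limit is None:
        depth_limit = 2 * max(len(xs), 1).bit_length()
    if len(xs) <= 5:
        return sorted(xs)            # fixed sorting networks live here
    if depth_limit == 0:
        heap = list(xs)              # heapsort fallback
        heapq.heapify(heap)
        return [heapq.heappop(heap) for _ in range(len(heap))]
    pivot = xs[len(xs) // 2]
    lo = [x for x in xs if x < pivot]
    eq = [x for x in xs if x == pivot]
    hi = [x for x in xs if x > pivot]
    return introsort(lo, depth_limit - 1) + eq + introsort(hi, depth_limit - 1)
```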
        
           | bitshiftfaced wrote:
           | The paper notes that they brute forced all solutions for
           | sort3 to verify that it was optimal. It said that this wasn't
           | possible for 4+.
        
       | bgirard wrote:
       | The title implies it found an entirely new algorithmic approach
       | to sorting (like quick sort) which would have been a fantastic
       | discovery. But it feels a lot like micro-optimizing the control
       | flow and codegen.
        
         | ozr wrote:
         | I think this is still super interesting. It's something humans
         | are unable to do/there are few humans that can. I very much like
         | the pattern of writing basic logic myself and then using a
         | coding model to optimize it. It's effectively what we do with
         | compilers already, this just makes it better.
        
         | GravityLabs wrote:
         | Still a fantastic discovery, because now there are more
         | powerful automated algorithm improvement discovery pipelines. I
         | wonder what will happen when those improvements are added to
         | the processing that found the improvements. :D
        
           | bgirard wrote:
           | Yes, I'm not discounting the results. I think it's a very
           | interesting approach. I just think the language is important
           | here. If you ask a CS class turn in their own quick sort
           | implementation, you have n implementations of 1 algorithm,
           | not n new algorithms.
        
         | fwlr wrote:
         | I'm not sure there is a new algorithmic approach to sorting
         | like you're thinking of. From a high level you can divide
         | sorting algorithms into roughly two camps, "those that use
         | divide-and-conquer", and "those that do not". The divide-and-
         | conquer (split into smaller sub-lists, sort the sub-lists, and
         | merge them while preserving sort order) algorithms are better.
         | From this perspective, algorithms that we think of as quite
         | distinct like quicksort and mergesort are not that different -
         | quicksort just does the "divide" step of divide-and-conquer in
         | a smarter way.
         | 
         | In the end, no matter what divide-and-conquer sorting algorithm
         | you pick, you will be doing lots of "small sorts". And it is
         | those small sorts that DeepMind has optimized here.
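         | 
         | The skeleton of any such divide-and-conquer sort, with a
         | pluggable small sort at the leaves (illustrative Python;
         | every leaf call is one of the small sorts being optimized):

```python
def merge_sort(xs, cutoff=4):
    # divide-and-conquer skeleton: below the cutoff we hit a "small
    # sort", the code path AlphaDev optimized; sorted() stands in for it
    if len(xs) <= cutoff:
        return sorted(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid], cutoff), merge_sort(xs[mid:], cutoff)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge, preserving order
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```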
        
           | ozr wrote:
           | > I'm not sure there is a new algorithmic approach to sorting
           | like you're thinking of. From a high level you can divide
           | sorting algorithms into roughly two camps, "those that use
           | divide-and-conquer", and "those that do not". ...
           | 
           | I think the sci-fi-but-possibly-real hope is that for sorting
           | (among other things), we may have the perspective that there
           | isn't any new algorithmic approach available, but an AI finds
           | one for us.
        
           | bgirard wrote:
           | There's lot of accepted sorting algorithms [1]. I'm sure we
           | can come up with novel new algorithms, even if they're not
           | optimal. Like Wikipedia mentions, they all fall within some
           | small number of higher level categories (eg. Partitioning,
           | Merging, Selection, Insertion). I'm still not convinced that
           | the optimizations presented in the article amount to the
           | discovery of NEW sorting algorithms but merely optimizations
           | of existing ones.
           | 
           | [1] https://en.wikipedia.org/wiki/Sorting_algorithm
        
         | nurettin wrote:
         | My first intuition, knowing the limitations of generating
         | first-order logic sequences, was that they must have somehow
         | restricted the sorting sequence to a number of elements.
        
       | lumb63 wrote:
       | This is really cool. I'll be interested to see if the team can
       | produce useful and provably hard cryptographic hash functions
       | with this tech. The other interesting application that this
       | inspires is use of this tech to optimize the optimization
       | algorithms used by compilers. Perhaps we can all benefit from
       | optimized optimizers.
        
         | boomanaiden154 wrote:
         | There's already been quite a bit of work done on replacing
         | compiler heuristics with ML models. Google has productionized
         | it with MLGO and there have been quite a few papers/experiments
         | on the topic.
        
       | danlark wrote:
       | You can see hashing optimizations as well
       | https://www.deepmind.com/blog/alphadev-discovers-faster-sort...,
       | https://github.com/abseil/abseil-cpp/commit/74eee2aff683cc7d...
       | 
       | I was one of the members who reviewed expertly what has been done
       | both in sorting and hashing. Overall it's more about assembly,
       | finding missed compiler optimizations and balancing between
       | correctness and distribution (in hashing in particular).
       | 
       | It was not revolutionary in the sense that it didn't find
       | completely new approaches, but it converged to something
       | incomprehensible for humans yet relatively good for performance,
       | which proves the point that optimal programs are very inhuman.
       | 
       | Note that in sorting, removing instructions does not always
       | lead to better performance: for example, instructions can run
       | in parallel, so the effect can be less profound. Benchmarks
       | can lie, and the compiler could do something differently when
       | recompiling the changed sort3 function.
       | 
       | For hashing it was even funnier: very small strings, up to 64
       | bits, already used 3 instructions: add some constant -> multiply
       | 64x64 -> xor upper/lower. For bigger ones the question becomes
       | more complicated; that's why 9-16 bytes was a better spot, and
       | there it simplified from 2 multiplications to just one plus a
       | rotation.
       | Distribution on real workloads was good, it almost passed
       | smhasher and we decided it was good enough to try out in prod. We
       | did not rollback as you can see from abseil :)
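       | 
       | For flavor, the shape of that small-string path in Python (the
       | constant below is an arbitrary golden-ratio-style multiplier
       | for illustration, not the one abseil actually uses):

```python
MASK64 = (1 << 64) - 1

def mix(x, constant=0x9E3779B97F4A7C15):
    # add a constant -> 64x64 multiply -> xor upper/lower halves of
    # the 128-bit product; the constant is illustrative, not abseil's
    product = ((x + constant) & MASK64) * constant
    return ((product >> 64) ^ product) & MASK64
```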
       | 
       | But even given all that, it was fascinating to watch how this
       | system was searching and was able to find that particular
       | programs can be further simplified. Kudos to everyone
       | involved, it's a great
       | incremental change that can bring more results in the future.
        
         | gumby wrote:
         | Your description makes the approach sound like applying ca
         | 1980s simulated annealing, also a form of gradient descent. Am
         | I missing something?
        
           | ska wrote:
           | This doesn't really sound like SA, which is a principled
           | approach to avoid the local min problem in descent algorithms
           | and end up with a true global optimization. The general
           | result is clean and impractical, the approximations hard to
           | analyze.
        
         | CalChris wrote:
         | This sounds like a more intelligent version of
         | superoptimization. The original Massalin superoptimizer, for
         | its time, also created surprising results, which is similar to
         | AlphaDev's _incomprehensible for humans_. You see the same
         | thing in computer chess which Agadmator calls _disgusting
         | engine lines_.
         | 
         | https://courses.cs.washington.edu/courses/cse501/15sp/papers...
        
         | thesz wrote:
         | So it works as a superoptimizer of sorts.
        
         | sytelus wrote:
         | If more compute is spent than before but in parallel to reduce
         | latency then wouldn't this increase power? Shouldn't latency
         | and power both be objectives?
        
         | 13of40 wrote:
         | > something incomprehensible for humans
         | 
         | This might be buggy whip talk, but I wonder if you could take
         | the same system and apply it to smaller problems (e.g.
         | computing an 8-bit hash) so the novel techniques could be
         | identified and used by humans.
        
         | thomasahle wrote:
         | I'm disappointed that the hashing is just based on training on
         | microbenchmarks and SMHasher, rather than designing a fast
         | _provably_ universal hash. Suites like SMHasher are never
         | complete. They are just trying to catch the most common
         | weaknesses. If you train on the test cases you'll only get an
         | algorithm that passes the tests, but people can always find a
         | set of values on which you will do badly.
        
           | eklitzke wrote:
           | I think you're confusing proving that the hash function is
           | collision resistant with the other goal which is hashing
           | speed. If you really need a collision resistant hash you need
           | to use a cryptographic hash function, but outside of
           | cryptographic applications that is rarely the requirement.
           | And (huge caveat, this isn't my domain expertise) I'm not
           | sure what security properties are really "proven" about
           | existing cryptographic hash functions, AFAIK existing
           | cryptographic hash functions are considered secure because we
           | don't know how to break them, not because of some fundamental
           | mathematical property about them.
           | 
           | For the other 99.999% of hashing applications there is a
           | balance between collision resistance and hashing latency. For
           | example, in a hash table (probably the most common use for a
           | non-cryptographic hash function) there is a cost incurred by
           | hash collisions because lookups on keys with collisions may
           | have to do extra probing. On the other hand, _every_ hash
           | table lookup requires doing at least one hash operation,
           | regardless of whether or not it collides. So it may make
           | sense to have a slightly worse hash function (in the sense
           | that it is more likely to have collisions with pathological
           | inputs) if it has slightly lower latency. The only way to
           | really know what is faster for a real world application is to
           | have some kind of benchmark to train against as a loss
           | function.
        
             | Someone wrote:
             | > I'm not sure what security properties are really "proven"
             | about existing cryptographic hash functions
             | 
             | AFAIK, we don't even know whether trapdoor functions
             | _exist_.
             | 
             | https://en.wikipedia.org/wiki/Trapdoor_function:
             | 
             |  _"As of 2004, the best known trapdoor function (family)
             | candidates are the RSA and Rabin families of functions"_
             | 
             | Also note that the 'examples' section starts with:
             | 
             |  _"In the following two examples, we always assume it is
             | difficult to factorize a large composite number (see
             | Integer factorization)."_
        
             | tga_d wrote:
             | >If you really need a collision resistant hash you need to
             | use a cryptographic hash function, but outside of
             | cryptographic applications that is rarely the requirement.
             | 
             | There are reasons to use (strongly) collision resistant
             | hashes outside of cryptographic settings. E.g., the default
             | Rust hash function, used in hash maps and sets, has strong
             | collision resistance, because otherwise you could open up
             | applications to DoS attacks (the attacker uses lots of
             | inserts with collisions to kill performance of accesses and
             | further inserts at those buckets).[0]
             | 
             | >I'm not sure what security properties are really "proven"
             | about existing cryptographic hash functions, AFAIK existing
             | cryptographic hash functions are considered secure because
             | we don't know how to break them, not because of some
             | fundamental mathematical property about them.
             | 
             | There are provably secure hash functions[1] (typically
             | using the same sort of primitives as public key crypto),
             | but they're generally only used when certain properties
             | need to be composed, and are often less secure than the
             | non-provable ones in practice anyway. This is pretty
             | similar to the state of symmetric vs. asymmetric
             | cryptography in general: primitives like RSA, DH, etc. have
             | much stronger proofs than AES, but algorithms built using
             | AES for security are generally viewed as a lot less likely
             | to be broken any time soon than algorithms built using
             | typical asymmetric primitives for security, even ignoring
             | things like quantum advantage.
             | 
             | [0] https://doc.rust-
             | lang.org/std/collections/struct.HashMap.htm...
             | 
             | [1] https://en.wikipedia.org/wiki/Security_of_cryptographic
             | _hash...
        
             | thomasahle wrote:
             | > I think you're confusing proving that the hash function
             | is collision resistant with the other goal which is hashing
             | speed. If you really need a collision resistant hash you
             | need to use a cryptographic hash function.
             | 
             | I wish this misconception would die. There is a great
             | theory of algorithmic probabilistic hash functions,
             | completely distinct from cryptographic hash functions. If
             | you are designing a hash table, or a different algorithm
             | using a hash function, you nearly always want the former
             | kind.
             | 
             | The idea is that `Pr[h(x) = h(y)]` is small _no matter the
             | inputs x and y_. Here the probability is over the random
             | seed of h. Lots of good hash functions, like UMASH
             | (https://engineering.backtrace.io/2020-08-24-umash-fast-
             | enoug...) has this guarantee. Other fast hash functions,
             | like MURMUR don't.
             | 
             | When a function doesn't have this guarantee, it means I can
             | find sets of values x1, x2, ... that will likely collide
             | under _any_ or most seeds! Sure, if your inputs are
             | basically random, this probably won't happen, but people
             | can still use this to DDoS your hash table, or whatever you
             | are coding.
             | 
             | Notice again, this has nothing to do with cryptography. It
             | is all about probabilistic guarantees. You can't just test
             | the hash function on a fixed number of inputs and say it's
             | good, since you may just have moved the "bad set" to
             | somewhere else.
             | 
             | In this day and age there are super fast algorithmic hash
             | functions with guaranteed low expected collisions. It's
             | just silly to use one that you can break so easily.
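             | 
             | A minimal example of such a seed-dependent guarantee is
             | the multiply-shift family: pick a random odd 64-bit
             | multiplier, and any two distinct keys collide with
             | probability at most 2/2^l over that choice (a sketch,
             | not production code):

```python
import random

def make_multiply_shift(l=32, rng=random):
    # multiply-shift hashing: h(x) = (a*x mod 2^64) >> (64 - l) with
    # a random odd 64-bit a; Pr[h(x) = h(y)] <= 2/2^l for any fixed
    # x != y, the probability taken over the choice of a
    a = rng.getrandbits(64) | 1
    def h(x):
        return ((a * x) & ((1 << 64) - 1)) >> (64 - l)
    return h
```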
        
           | jacquesm wrote:
           | Indeed, and this has been the case for quite a while now. You
           | can _always_ improve on some general algorithm by taking
           | advantage of knowledge of the data but that never generalizes
           | and usually leads to either worse performance on other data
           | and /or new pathological cases that result in results that
           | are unusable.
           | 
           | It's an instance of overfitting.
        
             | bee_rider wrote:
             | Ship the optimization framework in with the application,
             | sample from the user data, and optimize for that? It isn't
             | overfitting if you overfit on the data you care about,
             | right?
        
               | [deleted]
        
               | scns wrote:
               | Sounds like the JVM's recompilation of hot paths to me.
        
               | jacquesm wrote:
               | Data tends to change over time, and once a hash function
               | is in use you can't really replace it easily without a
               | lot of overhead, possibly quite a bit more overhead than
               | what you saved in the first place. There are some
               | examples of this in the sorting arena too, such as
               | 'Timsort', personally I haven't found any that gave a
               | substantial boost, but probably there are some cases
               | where they do. Unless sorting or hashing (and lookup) are
               | the main bottleneck for an application I would spend my
               | time on other aspects of it.
        
         | cabalamat wrote:
         | > proves the point that optimal programs are very inhuman
         | 
         | Maybe there should be an AI that produces optimally-
         | readable/understandable programs? That's what I would want if I
         | was adding the output to a codebase.
        
           | whatever1 wrote:
           | Optimal routing of delivery vehicles for UPS/FedEx etc. is
           | also incomprehensible to the drivers, so the planners often
           | intentionally generate suboptimal solutions.
           | 
           | A suboptimal solution that gets implemented is better than
           | an optimal one that doesn't.
        
           | make3 wrote:
           | that's what LLMs can do with rl from human (or ai) readability
           | feedback & instruction tuning + prompting. we will 100% see
           | this if gpt-4 doesn't already do this.
        
             | ska wrote:
             | I wouldn't classify any of the output I've seen so far as
             | "optimally readable/understandable".
             | 
             | Some if it looks pretty ok, especially where it overlaps
             | with well established approaches.
        
               | chaxor wrote:
               | It can do well with optimization and readability *if you
               | ask it specifically for those things*. Especially if you
               | have a particular paradigm and algorithm in mind (you
               | obviously already should anyway).
               | 
               | This is why these systems are helpful in programming:
               | they allow developers to think more about the design
               | paradigms, and algorithmic solutions, rather than the
               | fine grained code syntax and typing.
               | 
               | My hope (not prediction unfortunately, but *hope*) is
               | that these systems will make people *better* programmers.
               | This _could_ happen by alleviating the requirement of
               | typing out the code in a particular way, and allowing
               | more time to really try out or think carefully about the
               | correct solution for how to make their programs (i.e.
               | multiprocessed, producer-consumers, distribute data with
               | ipfs, faster algorithms, etc)
        
           | m3kw9 wrote:
           | It's not that it's written obfuscated; the routine/algo is
           | just hard to understand even if they commented every line.
           | Likely some recursive trick is involved; those are always
           | hard to follow.
        
         | hajile wrote:
         | To what extent is this simply working around the weirdness of
         | x86? Do these improvements apply to something like MIPS, ARM64,
         | or RISC-V that have inherently simpler ISAs?
        
           | danlark wrote:
            | In this particular case they were universal, but in the paper
            | it's said the optimizations were done on x86. One of the ideas was
           | to use LLVM IR but intuition for optimizer over optimizer was
           | unlikely to work properly.
        
             | ndesaulniers wrote:
             | > but intuition for optimizer over optimizer was unlikely
             | to work properly.
             | 
             | Wut?
        
       | orlp wrote:
       | > AlphaDev uncovered new sorting algorithms that led to
       | improvements in the LLVM libc++ sorting library that were up to
       | 70% faster for shorter sequences and about 1.7% faster for
       | sequences exceeding 250,000 elements.
       | 
       | As someone that knows a thing or two about sorting... bullshit.
       | No new algorithms were uncovered, and the work here did not
       | _lead_ to the claimed improvements.
       | 
       | They found a sequence of assembly that saves... one MOV. That's
       | it. And it's not even novel, it's simply an unrolled insertion
       | sort on three elements. That their patch for libc++ is 70% faster
       | for small inputs is only due to the library not having an
       | efficient implementation with a *branchless* sorting network
       | beforehand. Those are not novel either, they already exist, made
       | by humans.
       | 
       | > By open sourcing our new sorting algorithms in the main C++
       | library, millions of developers and companies around the world
       | now use it on AI applications across industries from cloud
       | computing and online shopping to supply chain management. This is
       | the first change to this part of the sorting library in over a
       | decade and the first time an algorithm designed through
       | reinforcement learning has been added to this library. We see
       | this as an important stepping stone for using AI to optimise the
       | world's code, one algorithm at a time.
       | 
       | I'm happy for the researchers that the reinforcement learning
       | approach worked, and that it gave good code. But the paper and
       | surrounding press release is self-aggrandizing in both its
       | results and impact. That this is the first change to 'this part'
       | of the sorting routine in a decade is also just completely
       | cherry-picked. For example, I would say that my 2014 report and
       | (ignored patch of) the fact that the libc++ sorting routine was
       | QUADRATIC (https://bugs.llvm.org/show_bug.cgi?id=20837) finally
       | being fixed late 2021 https://reviews.llvm.org/D113413 is quite
       | the notable change. If anything it shows that there wasn't a
       | particularly active development schedule on the libc++ sorting
       | routine the past decade.
        
         | throwaway106382 wrote:
         | [flagged]
        
         | dundarious wrote:
         | Yes, I believe what they're doing already exists in the
         | literature as "supercompilation", though good to see its
         | application under any name.
        
         | choppaface wrote:
         | A decade ago Google had people with breadth and context who
         | could have adjusted the framing of the result of this work (and
         | maybe even re-targeted it to something useful). Today however
         | Google is a mix of hyper-narrow expert ICs and leaders who lack
         | domain expertise. Paper in Nature? Sure let's take it!
        
         | danlark wrote:
         | I agree on the sorting front. Removing one cmov is not likely
         | to improve much.
        
           | orlp wrote:
           | It's not even a cmov. Look at Figure 3 in the paper:
           | https://www.nature.com/articles/s41586-023-06004-9
           | 
           | They eliminated a register-register mov.
        
         | scotty79 wrote:
         | Shouldn't libc++ be kinda good? Since it's the standard and
         | all? Why isn't/wasn't it?
        
           | petschge wrote:
           | The LLVM implementation of libc++ is very young compared to
           | other implementations (it was started 5 years ago or so), so
            | there are still a lot of things missing and a lot of things
           | that can be improved.
        
             | bigcheesegs wrote:
             | libc++ was open sourced May 11th 2010. 13 years ago.
             | https://blog.llvm.org/2010/05/new-libc-c-standard-
             | library.ht...
        
         | TaupeRanger wrote:
         | This is DeepMind's modus operandi. Every press release is just
         | utterly hyperbolic nonsense that doesn't withstand the
         | slightest scrutiny. AlphaGo, AlphaFold, AlphaDev... they've
         | done literally nothing to improve the human condition, and may
         | have actually made things worse. Go players have DECREASED
         | their Elo after playing AlphaGo (or just quit the game
         | altogether). I would be embarrassed to be associated with
         | DeepMind.
        
           | vosper wrote:
           | > Go players have DECREASED their ELO after playing AlphaGo
           | (or just quit the game altogether)
           | 
           | Can you explain this for someone unfamiliar with the game?
        
             | schaefer wrote:
             | Lee Sedol retired in 2019 following his 2016 defeat by
             | Alpha Go in a 5 round match. At the start of the match most
             | people were confident an AI could never defeat a top human
             | player at Go. By the end of the match, watching (arguably)
             | the world champ Sedol suffer lost game after lost game, the
             | story had changed dramatically. Sedol fans were championing
             | his single win against the unstoppable AI.
             | 
             | We (hacker news) discussed Lee Sedol's retirement here: [1]
             | 
             | To active go players at the time, Alpha Go and Alpha Zero
             | really were as shocking as the debut of ChatGPT was
             | recently.
             | 
             | [1]: https://news.ycombinator.com/item?id=21649495
        
             | dfan wrote:
             | I am familiar with the game and I cannot explain it. Go is
             | not typically rated with the Elo system, and the quality of
             | top-level human play has increased since 2016.
        
               | 29athrowaway wrote:
               | That is correct but in game servers, ranks (e.g.: 1 kyu,
               | 1 dan) are computed via Elo-like means.
               | 
               | https://forums.online-go.com/t/how-does-the-rating-
               | system-wo...
        
               | nahstra wrote:
               | I also play some, and yeah that's incorrect. Also, there
               | was recently a bunch of hype about an adversarial
               | strategy that could beat AIs w/o deep readout (i.e. only
               | using the 'policy network' nonlinear ML stuff and not
               | enough actual calculation). Here's a vid of an amateur
               | doing it.
               | 
               | https://www.youtube.com/watch?v=H4DvCj4ySKM
               | 
               | EDIT:
               | 
               | Also, if you want to get into it, Michael Redmond's Go TV
               | on youtube has an amazing beginner playlist, watch some
               | of that then maybe blacktoplay.com and if you likey play
               | :)
        
             | scotty79 wrote:
             | Maybe they just lost, so their ELO went down? Or do losses
             | against AI not count?
        
               | rcxdude wrote:
               | Either way, in ELO it depends on the ELO of your
               | opponent, and so a loss against an AI with an accurate
               | (very high) ELO is not going to lose you much unless
               | you're one of the best players in the world (heck, in
               | chess, Magnus Carlsen's ELO would still go down by
               | literally nothing from losing against stockfish).
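                | 
                | Concretely, with the standard Elo update rule
                | (generic K-factor; the ratings below are
                | illustrative):

```python
def elo_update(r_player, r_opponent, score, k=10):
    # expected score E from the rating gap, then r += K * (S - E);
    # against a vastly stronger opponent E is near 0, so a loss
    # (S = 0) costs almost nothing
    expected = 1 / (1 + 10 ** ((r_opponent - r_player) / 400))
    return r_player + k * (score - expected)
```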
        
           | programmer_dude wrote:
           | BTW Elo is not an abbreviation it is a persons name:
           | https://en.wikipedia.org/wiki/Arpad_Elo
        
           | blazespin wrote:
           | It's a bit surprising how poorly DeepMind has lived up to
           | their hype. But they're an OK lab, maybe a bit overly vain.
        
             | TaupeRanger wrote:
             | It's probably very fun to be _at_ DeepMind, I just don 't
             | think I'd want to be a part of the cringey hype machine.
        
               | blazespin wrote:
               | I bet it really sucks, tbh. They did all this over
               | promising and now the only way they can deliver is by
               | grifting like this. That sounds really stressful to me.
        
               | lurknot wrote:
               | ya bet it sucks making $500-800k/year comp getting access
               | to the best hardware and google datasets
        
               | blazespin wrote:
               | I've seen people who get promoted above their band, they
               | are not happy campers.
        
               | jimsimmons wrote:
               | Why not. It's not like they have deadlines
        
               | 29athrowaway wrote:
               | They still have to publish something in major journals
               | and have presence in major conferences.
        
               | jprd wrote:
               | a.k.a. - they still care enough to have some semblance of
               | shame. There's the rub.
        
               | 29athrowaway wrote:
               | At least they've produced tangible value unlike black
               | holes of money like the Human Brain project which has
               | delivered close to nothing in multiple decades despite
               | billions of dollars in investment.
        
           | chaxor wrote:
           | This is unbelievably wrong. DeepMind has probably the best
           | academic group in RL. The difference between DeepMind and
           | OpenAI is that DeepMind favors academic endeavours and
           | novelty much more, while largely ignoring commercialization
           | and products, whereas OpenAI is the polar opposite: it
           | focuses on products first, and its academic endeavours are
           | typically slight modifications of other works scaled up.
           | 
           | Don't get me wrong, Sutskever (et al.) has done incredibly
           | good work previously, but when it comes to the products,
           | they're much more engineering and marketing polish than
           | scientific endeavours.
           | 
           | Botvinick's work on meta-RL, for example, is an interesting
           | direction DeepMind has pursued that few companies interested
           | only in engineering would venture towards.
        
           | rcxdude wrote:
           | I think this is a bit far in the other direction. DeepMind's
           | stuff is often deeply impressive; they just have a tendency
           | to exaggerate on top of that.
        
             | jprd wrote:
             | +1
             | 
             | There must be a "demonstrate (1) DeepMind #Win per
             | <interval>" requirement somewhere that gets the once-over
             | from the marketing dept. to meet some MBOs.
        
           | gdy wrote:
           | "may have actually made things worse. Go players..."
           | 
           | Go players are using AI to get better at Go.
        
             | TaupeRanger wrote:
             | That claim is often made, and never substantiated.
        
               | gdy wrote:
               | Just ask any professional Go player.
        
               | digging wrote:
               | It's made here in response to a claim that Go players are
               | made worse by practicing with AlphaGo, which is also
               | unsubstantiated.
        
               | nahstra wrote:
               | https://www.amazon.com/How-Play-Way-Explained-
               | illustrative/d...
        
           | 29athrowaway wrote:
           | Around the time of the AlphaGo challenge and afterwards...
           | 
           | 1) you could see increased activity in go clubs and online go
           | servers
           | 
           | 2) the analysis of the games published by DeepMind has
           | resulted in interesting "discoveries" (or rediscoveries) and
           | changes to what is considered joseki.
           | 
           | 3) many people started analyzing their kifus using AI, to
           | find fluctuations in estimated win rate across moves.
           | 
           | So I disagree entirely.
        
           | malux85 wrote:
           | I thought the same thing - it smacks of desperation at the
           | moment; any tiny win is exaggerated.
           | 
           | It's not hard to see why, with the emergence (ha) of OpenAI,
           | Midjourney and all of this generative modelling, what has
           | DeepMind done? I imagine the execs at Google are asking them
           | some very probing questions on their mediocre performance
           | over the last 5 years.
        
             | chaxor wrote:
             | DeepMind has done quite an enormous amount actually, but
             | it's been in _academia_, not in the commercial product
             | sphere. Just because something is not on a little web page
             | available to average Joes does not mean there isn't value
             | in it. For example, DeepMind's work towards estimating
             | quantum properties of materials via density functional
             | theory may not be the best toy for your grandma to play
             | around with, but it certainly does move academia way
             | further ahead of where it once was.
        
               | malux85 wrote:
               | I run atomictessellator.com and have been working on
               | many different implementations of density functional
               | theory for the last 10 years, as well as working closely
               | with professors at Stanford University and Oxford
               | University for multiple years on using advanced,
               | non-static geometry data structures for density
               | functional theory. This is a subject I know a LOT about,
               | so I'm glad you brought it up.
               | 
               | DeepMind's work on density functional theory was
               | complete rubbish, and everyone in computational
               | chemistry knows it. They simply modelled static geometry
               | and overfit their data. We wanted this methodology to
               | work - computing DFT is expensive - and we did multiple
               | months of rigorous work. The reality of the situation is
               | that a bunch of machine learning engineers with a
               | glancing amount of chemistry knowledge made
               | approximations that were way too naive, and announced it
               | as a huge success in their typical fashion.
               | 
               | What they then count on is people not having enough
               | knowledge of DFT / quantum property prediction to
               | question their work, who then make claims like "it
               | certainly does move academia way further ahead" - which
               | is total rubbish. In what way? Why aren't these models
               | being used in ab initio simulators now? The answer is
               | simple: they are not revolutionary; in fact, they are
               | not even useful.
        
               | dr_dshiv wrote:
               | How so? Legitimately interested.
        
         | blazespin wrote:
         | Yeah, this type of grifting is very pro forma for researchers.
         | I used to stress about it, but realized it just sort of goes
         | with the territory.
         | 
         | It's also worth noting that the paper is blindingly obvious and
         | everyone started doing this a long time ago but didn't want to
         | tip their cards.
         | 
         | And that's the real contribution here - Google is tipping their
         | cards. We now have a rough baseline to compare our results
         | against.
        
           | ryao wrote:
           | It is likely meant to help secure further research funds.
        
       | SethTro wrote:
       | LLVM merge: https://reviews.llvm.org/D118029 Benchmark:
       | https://bit.ly/3AtesYf
       | 
       | Benchmark seems to be in the range of a 1-5% improvement for 80%
       | of sizes.
        
         | andreamichi wrote:
         | And for the hashing patch this is the commit to Abseil:
         | https://github.com/abseil/abseil-cpp/commit/74eee2aff683cc7d...
        
       ___________________________________________________________________
       (page generated 2023-06-07 23:00 UTC)