[HN Gopher] Learning to Superoptimize real-world programs
___________________________________________________________________
Learning to Superoptimize real-world programs
Author : pramodbiligiri
Score : 54 points
Date : 2021-09-30 17:07 UTC (5 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| fulafel wrote:
| This sounds really promising. Could this be extended to other
| kinds of optimizations, e.g. data layout for memory-traffic
| reduction?
| mhh__ wrote:
| Almost certainly. The only rub is that the compiler isn't
| allowed to do anything hugely funky with memory layout in
| C/C++-esque languages.
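|
| To make that concrete (a minimal C sketch, not from the paper;
| sizes assume a typical x86-64 LP64 ABI): the C standard pins
| struct members to declaration order, so reordering fields to
| cut padding is left to the programmer, not the compiler:
|
|     #include <stdio.h>
|
|     /* Declaration order is fixed, so the compiler must keep
|        whatever padding this layout implies. */
|     struct padded    { char a; double b; char c; }; /* 24 bytes */
|     struct reordered { char a; char c; double b; }; /* 16 bytes */
|
|     int main(void) {
|         printf("%zu vs %zu\n", sizeof(struct padded),
|                sizeof(struct reordered));
|         return 0;
|     }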
| fulafel wrote:
| Yes, those are lost causes if they can't ride the as-if
| optimization rule. But if the assembly-language modules just
| have to return the same outputs for the same inputs...
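|
| A hedged illustration of what that would unlock (my sketch, not
| the paper's): an AoS-to-SoA layout change is off-limits to a C
| compiler because the layout is observable, but a tool that only
| has to match outputs to inputs could make it to cut memory
| traffic:
|
|     #include <stddef.h>
|
|     /* Array-of-structs: summing x also streams the unused y
|        and z fields through the cache. */
|     struct p { double x, y, z; };
|     double sum_x_aos(const struct p *a, size_t n) {
|         double s = 0.0;
|         for (size_t i = 0; i < n; i++) s += a[i].x;
|         return s;
|     }
|
|     /* Struct-of-arrays: same outputs for the same inputs, with
|        roughly a third of the memory traffic. */
|     double sum_x_soa(const double *x, size_t n) {
|         double s = 0.0;
|         for (size_t i = 0; i < n; i++) s += x[i];
|         return s;
|     }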
| mach1ne wrote:
| I don't understand the wording: "Our method, SILO, superoptimizes
| programs an expected 6.2% of our test set when compared with the
| gcc version 10.3 compiler's aggressive optimization level -O3."
|
| Does that mean that the runtime is only 6.2% of the runtime after
| gcc optimization, or that there was a 6.2% reduction in runtime
| compared to gcc optimization?
| O_nlogn wrote:
| Neither, they found that 6.2% of the programs that SILO
| generated performed better on their benchmarks (based on a sum
| of the expected and actual execution latency of the assembly on
| some test data) than the gcc -O3 baseline (i.e. 93.8% of the
| programs SILO generated did not perform better, or were not
| correct). AFAIK they don't state _how much_ better the
| superoptimized programs were.
| WithinReason wrote:
| I'm convinced that neural networks in compilers are the Next Big
| Thing in AI. For example, GPUs are very difficult to program for
| mere mortals.
| mhh__ wrote:
| Even things like instruction selection could be pretty
| interesting. People think compilers are basically sentient, but
| they really aren't: even ignoring the tuning parameters, they
| really do need help when you get to the edges of what the ISA
| designers incorporate.
|
| Similarly, try to work out how a compiler schedules and lays
| out code for a modern superscalar processor. Both things _do_
| matter (potentially a lot), but it's not like the old days,
| where you had a very fixed (or at least simple) model of the
| pipeline to evaluate your schedule with.
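|
| A small C example of why the schedule matters on a superscalar
| core (a sketch, assuming a typical out-of-order x86-64 part):
| splitting one dependency chain into four independent ones can
| run severalfold faster, and a compiler won't do it for you at
| default flags because FP addition isn't associative:
|
|     #include <stddef.h>
|
|     /* One accumulator: every add waits on the previous one, so
|        the loop is bound by floating-point add latency. */
|     double sum1(const double *a, size_t n) {
|         double s = 0.0;
|         for (size_t i = 0; i < n; i++) s += a[i];
|         return s;
|     }
|
|     /* Four accumulators: four independent chains keep the add
|        units busy; compilers only reassociate like this under
|        flags such as -ffast-math or -fassociative-math. */
|     double sum4(const double *a, size_t n) {
|         double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
|         size_t i = 0;
|         for (; i + 4 <= n; i += 4) {
|             s0 += a[i];     s1 += a[i + 1];
|             s2 += a[i + 2]; s3 += a[i + 3];
|         }
|         for (; i < n; i++) s0 += a[i];
|         return (s0 + s1) + (s2 + s3);
|     }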
| hinkley wrote:
| Escape analysis leads to Rust's memory safety model. I think we
| will see similar cross-domain learning in optimization, and I
| don't think NNs are it. Monte Carlo simulation seems more
| likely to me, and that can steal ideas from game AI, from
| property-based testing, or from both.
|
| More boringly, treating the compilation metadata more as a
| database may also be waiting to be exploited. Right now we have
| feedback-directed optimization, which basically takes a 1:many
| or 1:1 approach: one data set gives you one binary. On the next
| compile you start with new input and run the process again,
| whereas retaining a history might allow for deeper tree
| searches.
|
| It's always annoyed me a little that every single time my code
| loads or gets compiled, the runtime has to start from ground
| state and build up. If I run the same compile fifty times in a
| row, it's always the same output and it always takes the same
| amount of time. If a big improvement is just beyond the search
| budget, it will forever remain out of sight. Especially in a
| CI/CD world my ratio of changes to binaries is very, very low,
| and so the waste is much more pronounced.
|
| If the compiler instead stores a map of decisions and
| constraints, can I test the constraints, flush all of the
| decisions whose constraints are violated, and begin my search
| tree from there, going a little deeper every time in stable
| areas of the code?
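|
| A minimal C sketch of that cache (entirely hypothetical: the
| names and the constraint_hash scheme are illustrative, not
| anything an existing compiler exposes):
|
|     #include <stdint.h>
|     #include <stddef.h>
|
|     /* Each cached decision records a fingerprint of the facts
|        it assumed about its code region. */
|     typedef struct {
|         uint64_t constraint_hash; /* hash of the assumed facts */
|         uint32_t decision;        /* e.g. chosen unroll factor */
|         uint32_t search_depth;    /* how deep the search got */
|     } cache_entry;
|
|     /* Keep entries whose constraints still hold and compact
|        out the rest; surviving regions can resume their search
|        one level deeper instead of starting from ground state.
|        Returns the number of surviving entries. */
|     size_t flush_violated(cache_entry *c, size_t n,
|                           uint64_t (*hash_now)(size_t region)) {
|         size_t kept = 0;
|         for (size_t i = 0; i < n; i++)
|             if (c[i].constraint_hash == hash_now(i))
|                 c[kept++] = c[i];
|         return kept;
|     }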
| isaacimagine wrote:
| You mention searching the tree of potential programs that
| satisfy a set of constraints--but what's the fastest way to
| search that tree, both in making edits that are contextually
| sensible, and in using prior programs to factor out common
| patterns?
|
| If you want to go off the deep end of the possibilities of
| NNs in compilers, I suggest you look up wake-sleep program
| synthesis at the least:
| https://dl.acm.org/doi/10.1145/3453483.3454080
| makapuf wrote:
| Well, we don't quite know, since AFAICT GPU instruction sets
| (the real instruction sets, not IR) are completely closed, so
| the first reason they're difficult to program is secrecy, not
| anything technical.
| mhh__ wrote:
| AMD is open, right?
| kimixa wrote:
| Indeed, you can get them all here:
| https://gpuopen.com/documentation/amd-isa-documentation/
|
| And that's the ISA that actually executes on the GPU,
| including instruction encoding and whatnot, not just some IR.
| fulafel wrote:
| Intel's can't be too well-kept a secret either, since they
| develop their drivers as open source. Then there are the
| reverse-engineered mobile GPUs.
| mhh__ wrote:
| Openness is also a no-brainer play to get market share,
| especially as a business card: e.g. if we buy a GPU, it only
| needs to clear a certain performance bar; stock and driver
| support are all that matter for us in today's market (obviously
| this is not the case for GPGPU work).
___________________________________________________________________
(page generated 2021-09-30 23:01 UTC)