[HN Gopher] Learning to Superoptimize real-world programs
___________________________________________________________________
Learning to Superoptimize real-world programs
Author : pramodbiligiri
Score : 54 points
Date : 2021-09-30 17:07 UTC (5 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| fulafel wrote:
| This sounds really promising. Could this be extended to other
| kinds of optimizations, e.g. data layout for memory-traffic
| reduction?
| mhh__ wrote:
| Almost certainly. The only rub is that the compiler isn't
| allowed to do anything hugely funky with memory layout in
| C/C++-esque languages.
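|
| To make that concrete (a minimal C sketch, not from the paper;
| sizes assume a typical x86-64 LP64 ABI): the C standard pins
| struct members to declaration order, so reordering fields to
| cut padding is left to the programmer, not the compiler:
|
|     #include <stdio.h>
|
|     /* Declaration order is fixed, so the compiler must keep
|        whatever padding this layout implies. */
|     struct padded    { char a; double b; char c; }; /* 24 bytes */
|     struct reordered { char a; char c; double b; }; /* 16 bytes */
|
|     int main(void) {
|         printf("%zu vs %zu\n", sizeof(struct padded),
|                sizeof(struct reordered));
|         return 0;
|     }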
| fulafel wrote:
| Yes, those are lost causes if they can't ride the as-if
| optimization rule. But if the assembly-language modules just
| have to return the same outputs for the same inputs...
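|
| A hedged illustration of what that would unlock (my sketch, not
| the paper's): an AoS-to-SoA layout change is off-limits to a C
| compiler because the layout is observable, but a tool that only
| has to match outputs to inputs could make it to cut memory
| traffic:
|
|     #include <stddef.h>
|
|     /* Array-of-structs: summing x also streams the unused y
|        and z fields through the cache. */
|     struct p { double x, y, z; };
|     double sum_x_aos(const struct p *a, size_t n) {
|         double s = 0.0;
|         for (size_t i = 0; i < n; i++) s += a[i].x;
|         return s;
|     }
|
|     /* Struct-of-arrays: same outputs for the same inputs, with
|        roughly a third of the memory traffic. */
|     double sum_x_soa(const double *x, size_t n) {
|         double s = 0.0;
|         for (size_t i = 0; i < n; i++) s += x[i];
|         return s;
|     }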
| mach1ne wrote:
| I don't understand the wording: "Our method, SILO, superoptimizes
| programs an expected 6.2% of our test set when compared with the
| gcc version 10.3 compiler's aggressive optimization level -O3."
|
| Does that mean that the runtime is only 6.2% of the runtime after
| gcc optimization, or that there was a 6.2% reduction in runtime
| compared to gcc optimization?
| O_nlogn wrote:
| Neither, they found that 6.2% of the programs that SILO
| generated performed better on their benchmarks (based on a sum
| of the expected and actual execution latency of the assembly on
| some test data) than the gcc -O3 baseline (i.e. 93.8% of the
| programs SILO generated did not perform better, or were not
| correct). AFAIK they don't state _how much_ better the
| superoptimized programs were.
| WithinReason wrote:
| I'm convinced that neural networks in compilers are the Next Big
| Thing in AI. For example, GPUs are very difficult to program for
| mere mortals.
| mhh__ wrote:
| Even things like instruction selection could be pretty
| interesting. People think compilers are basically sentient, but
| they really aren't: even ignoring the tuning parameters, they
| really do need help when you get to the edges of what the ISA
| designers incorporate.
|
| Similarly, try to work out how a compiler schedules and lays
| out code for a modern superscalar processor. Both things _do_
| matter (potentially a lot), but it's not like the old days,
| where you had a very fixed (or at least simple) model of the
| pipeline to evaluate your schedule with.
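|
| A small C example of why the schedule matters on a superscalar
| core (a sketch, assuming a typical out-of-order x86-64 part):
| splitting one dependency chain into four independent ones can
| run severalfold faster, and a compiler won't do it for you at
| default flags because FP addition isn't associative:
|
|     #include <stddef.h>
|
|     /* One accumulator: every add waits on the previous one, so
|        the loop is bound by floating-point add latency. */
|     double sum1(const double *a, size_t n) {
|         double s = 0.0;
|         for (size_t i = 0; i < n; i++) s += a[i];
|         return s;
|     }
|
|     /* Four accumulators: four independent chains keep the add
|        units busy; compilers only reassociate like this under
|        flags such as -ffast-math or -fassociative-math. */
|     double sum4(const double *a, size_t n) {
|         double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
|         size_t i = 0;
|         for (; i + 4 <= n; i += 4) {
|             s0 += a[i];     s1 += a[i + 1];
|             s2 += a[i + 2]; s3 += a[i + 3];
|         }
|         for (; i < n; i++) s0 += a[i];
|         return (s0 + s1) + (s2 + s3);
|     }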
| hinkley wrote:
| Escape analysis leads to Rust's memory safety model. I think we
| will see similar cross-domain learning in optimization, and I
| don't think NNs are it. Monte Carlo simulation seems more
| likely to me, and that can steal ideas from game AI, from
| property-based testing, or from both.
|
| More boringly, treating the compilation metadata more as a
| database may also be waiting to be exploited. Right now we have
| feedback-directed optimization, which basically takes a 1:many
| or 1:1 approach: one data set gives you one binary. On the next
| compile you start with new input and run the process again,
| whereas retaining a history might allow for deeper tree
| searches.
|
| It's always annoyed me a little that every single time my code
| loads or gets compiled, the runtime has to start from ground
| state and build up. If I run the same compile fifty times in a
| row, it's always the same output and it always takes the same
| amount of time. If a big improvement is just beyond the search
| budget, it will forever remain out of sight. Especially in a
| CI/CD world my ratio of changes to binaries is very, very low,
| and so the waste is much more pronounced.
|
| If the compiler instead stores a map of decisions and
| constraints, can I test the constraints, flush all of the
| decisions whose constraints are violated, and begin my search
| tree from there, going a little deeper every time in stable
| areas of the code?
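|
| A minimal C sketch of that cache (entirely hypothetical: the
| names and the constraint_hash scheme are illustrative, not
| anything an existing compiler exposes):
|
|     #include <stdint.h>
|     #include <stddef.h>
|
|     /* Each cached decision records a fingerprint of the facts
|        it assumed about its code region. */
|     typedef struct {
|         uint64_t constraint_hash; /* hash of the assumed facts */
|         uint32_t decision;        /* e.g. chosen unroll factor */
|         uint32_t search_depth;    /* how deep the search got */
|     } cache_entry;
|
|     /* Keep entries whose constraints still hold and compact
|        out the rest; surviving regions can resume their search
|        one level deeper instead of starting from ground state.
|        Returns the number of surviving entries. */
|     size_t flush_violated(cache_entry *c, size_t n,
|                           uint64_t (*hash_now)(size_t region)) {
|         size_t kept = 0;
|         for (size_t i = 0; i < n; i++)
|             if (c[i].constraint_hash == hash_now(i))
|                 c[kept++] = c[i];
|         return kept;
|     }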
| isaacimagine wrote:
| You mention searching the tree of potential programs that
| satisfy a set of constraints--but what's the fastest way to
| search that tree, both in making edits that are contextually
| sensible, and in using prior programs to factor out common
| patterns?
|
| If you want to go off the deep end of the possibilities of
| NNs in compilers, I suggest you look up wake-sleep program
| synthesis at the least:
| https://dl.acm.org/doi/10.1145/3453483.3454080
| makapuf wrote:
| Well, we don't quite know, since AFAICT GPU instruction sets
| (the real instruction sets, not IR) are completely closed, so
| the first reason they're difficult to program is secrecy, not
| anything technical.
| mhh__ wrote:
| AMD is open, right?
| kimixa wrote:
| Indeed, you can get them all here:
| https://gpuopen.com/documentation/amd-isa-documentation/
|
| And that's the ISA that actually executes on the GPU,
| including instruction encoding and whatnot, not just some IR.
| fulafel wrote:
| Intel's can't be too well-kept a secret either, since they
| develop their drivers as open source. Then there are the
| reverse-engineered mobile GPUs.
| mhh__ wrote:
| Openness is also a no-brainer play to get market share,
| especially as a business card: e.g. if we buy a GPU, it only
| needs to clear a certain performance bar; stock and driver
| support are all that matter for us in today's market (obviously
| this is not the case for GPGPU work).
___________________________________________________________________
(page generated 2021-09-30 23:01 UTC)