[HN Gopher] CompilerGym: A toolkit for reinforcement learning fo...
       ___________________________________________________________________
        
       CompilerGym: A toolkit for reinforcement learning for compiler
       optimization
        
       Author : azhenley
       Score  : 106 points
       Date   : 2021-02-02 14:43 UTC (8 hours ago)
        
 (HTM) web link (facebookresearch.github.io)
 (TXT) w3m dump (facebookresearch.github.io)
        
       | smhx wrote:
       | I've been helping on the project, it's led by Chris Cummins and
       | Hugh Leather.
       | 
       | Just a heads-up for folks, we haven't fully cleaned up and gotten
       | ready for public attention yet, we are 90% there.
       | 
       | Once we are golden, we're going to write a note with a way to
       | submit the results of your own agents, compare with baselines
       | (random, actor-critic), etc.
       | 
       | And as others noticed, in its current form we're focusing on
       | code size and phase ordering, but we will be expanding over time
       | to other optimization problems like runtime.
        
         | matoro wrote:
         | This is really cool - I understand how the reinforcement loop
         | works for improving performance, but how does it verify that
         | the optimizations applied don't change the
         | semantics/correctness of the code?
        
           | brundolf wrote:
           | Regular old tests, I imagine
        
       | algo_trader wrote:
       | Where do they benchmark code candidates? This is hard, noisy and
       | expensive (I see example reward for code size).
       | 
       | I am not an expert, but Bayesian learning may be more appropriate
       | for such an expensive-sampling environment?
        
         | sstangl wrote:
         | They are optimizing only for code size, so they don't need to
         | run the code: they only count bytes emitted.
         | 
         | This works by changing around the order / interleaving of
         | various LLVM optimization phases, so the learning process does
         | not require knowledge of program timing or correctness.
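
To make the phase-ordering idea concrete, here is a toy sketch in Python. It does not use CompilerGym or LLVM: the two "passes", the instruction names, and the helpers `run_episode` and `random_search` are all hypothetical, invented for illustration. It only shows that the same passes applied in a different order can yield different code sizes, and that a random-search baseline can be scored purely on size reduction without running the program.

```python
import random

# Toy "passes" (hypothetical, not real LLVM passes): each maps a list of
# instruction names to a new list. They carry no real semantics.
def remove_nops(prog):
    """Drop every 'nop' instruction."""
    return [ins for ins in prog if ins != "nop"]

def merge_adjacent(prog):
    """Collapse runs of identical adjacent instructions into one."""
    out = []
    for ins in prog:
        if not out or out[-1] != ins:
            out.append(ins)
    return out

PASSES = {"remove_nops": remove_nops, "merge_adjacent": merge_adjacent}

def run_episode(prog, pass_names):
    """Apply passes in order; each step's reward is the size reduction."""
    total_reward = 0
    for name in pass_names:
        new_prog = PASSES[name](prog)
        total_reward += len(prog) - len(new_prog)
        prog = new_prog
    return prog, total_reward

def random_search(prog, episode_len=2, trials=200, seed=0):
    """Random baseline: sample pass orderings, keep the best-scoring one."""
    rng = random.Random(seed)
    names = list(PASSES)
    best_order, best_reward = None, float("-inf")
    for _ in range(trials):
        order = [rng.choice(names) for _ in range(episode_len)]
        _, reward = run_episode(prog, order)
        if reward > best_reward:
            best_order, best_reward = order, reward
    return best_order, best_reward

PROGRAM = ["load", "nop", "load", "add", "nop", "add", "store"]

if __name__ == "__main__":
    print(run_episode(PROGRAM, ["merge_adjacent", "remove_nops"]))
    print(run_episode(PROGRAM, ["remove_nops", "merge_adjacent"]))
    print(random_search(PROGRAM))
```

In this toy, running `merge_adjacent` first finds nothing to merge because the nops separate the duplicates, so the ordering that removes nops first ends up with the smaller program. The reward is computed from list lengths alone, which mirrors the point above that a size objective needs no timing measurements.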
        
           | liuliu wrote:
            | I heard that when they started doing this, LLVM folks freaked
            | out because they worried that changing the pass order might
            | affect correctness. But it seems to work so far :)
        
       | wyldfire wrote:
       | This looks pretty cool. FB does crank out some cool toolchain
       | stuff.
        
       | gavinray wrote:
       | Holy smokes, this is so fucking cool.
       | 
       | ML is neat conceptually, but as far as practical applications
       | this really excites me.
        
       | yters wrote:
       | ML cannot solve the halting problem.
        
         | hinkley wrote:
         | Compression can't solve the pigeonhole problem either. And yet
         | we're using it right now.
         | 
         | Most of the interesting stuff I want to accomplish is avoiding
         | intractable problems, if for no other reason than that our
         | peers will accuse us of tilting at windmills or boiling the
         | ocean.
         | 
         | Finding the problem you _can_ solve is effectively relaxation.
        
         | BenoitP wrote:
         | But it can optimize heuristics to death.
        
         | aparsons wrote:
         | How does pass reordering relate to the halting problem?
        
       ___________________________________________________________________
       (page generated 2021-02-02 23:01 UTC)