[HN Gopher] Enzyme - High-performance automatic differentiation ...
___________________________________________________________________
Enzyme - High-performance automatic differentiation of LLVM
Author : albertzeyer
Score : 64 points
Date   : 2021-02-03 10:15 UTC (1 day ago)
(HTM) web link (enzyme.mit.edu)
(TXT) w3m dump (enzyme.mit.edu)
| KenoFischer wrote:
| Also see the Julia package, which makes it accessible with a
| high-level interface and is probably one of the easier ways to
| play with it: https://github.com/wsmoses/Enzyme.jl.
| ksr wrote:
| > The Enzyme project is a tool for performing reverse-mode
| automatic differentiation (AD) of statically-analyzable LLVM IR.
| This allows developers to use Enzyme to automatically create
| gradients of their source code without much additional work.
|
| Can someone please explain applications of creating gradients of
| my source code?
| 6d65 wrote:
| AFAIK, it's mainly used for implementing gradient descent,
| which is used for training neural networks.
|
| Frameworks like PyTorch and TensorFlow use backpropagation to
| calculate the gradient of a multidimensional function, but it
| involves tracing, and storing the network state during the
| forward pass.
|
| Static automatic differentiation should be faster, and it looks
| a lot more like how differentiation is done mathematically than
| like numerical approximation.
|
| Of course, there are more applications of AD in scientific
| computing.
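|
| As a concrete sketch (roughly following the C examples on
| enzyme.mit.edu; the loss function, step count and learning rate
| here are just illustrative), gradient descent with an
| Enzyme-generated gradient looks something like this:
|
|   // grad_descent.c -- build with the Enzyme clang plugin enabled;
|   // the exact flags depend on your Enzyme/LLVM installation.
|   #include <stdio.h>
|
|   // Toy scalar "loss": f(x) = (x - 3)^2, minimized at x = 3.
|   double loss(double x) {
|       double d = x - 3.0;
|       return d * d;
|   }
|
|   // Enzyme synthesizes the body of this call at compile time;
|   // for a scalar-in/scalar-out f it returns df/dx.
|   double __enzyme_autodiff(void *, double);
|
|   int main(void) {
|       double x = 0.0;    // initial guess
|       double lr = 0.1;   // learning rate
|       for (int i = 0; i < 50; i++) {
|           // reverse-mode dloss/dx at the current x
|           double g = __enzyme_autodiff((void *)loss, x);
|           x -= lr * g;   // gradient-descent step
|       }
|       printf("x = %f, loss = %f\n", x, loss(x));  // x ends up near 3
|       return 0;
|   }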
| oscargrouch wrote:
| I think Swift is also going in this direction of baking AD
| directly into the compiler and providing it as a higher-level
| language construct.
|
| https://github.com/apple/swift/blob/main/docs/Differentiable.
| ..
|
| This leads to "Swift for TensorFlow", which, unlike the Java, Go
| or Python offerings, is not just a set of bindings to the C++
| TensorFlow library.
| ant6n wrote:
| Optimization. Then again, one could probably calculate
| gradients numerically.
| cameronperot wrote:
| One could, but automatic differentiation is much more efficient
| than numerical differentiation, so for high-performance
| applications it is preferable to use automatic differentiation.
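|
| To see why, here's a minimal sketch of the numerical approach
| (the helper name is made up): a central-difference gradient costs
| 2n evaluations of f for an n-dimensional input and its accuracy
| depends on the step size h, while reverse-mode AD produces the
| whole gradient for roughly a constant multiple of one evaluation
| of f, with no truncation error.
|
|   #include <stddef.h>
|
|   // Numerical gradient of f at x (length n) by central differences,
|   // written into grad. Cost: 2n calls to f.
|   void numerical_gradient(double (*f)(const double *, size_t),
|                           double *x, double *grad,
|                           size_t n, double h) {
|       for (size_t i = 0; i < n; i++) {
|           double saved = x[i];
|           x[i] = saved + h;
|           double fp = f(x, n);             // f(x + h*e_i)
|           x[i] = saved - h;
|           double fm = f(x, n);             // f(x - h*e_i)
|           grad[i] = (fp - fm) / (2.0 * h);
|           x[i] = saved;                    // restore the coordinate
|       }
|   }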
| PartiallyTyped wrote:
| It constructs an analytical gradient from the code, so you can
| compute the gradient directly. This enables optimizations such
| as not caching big matrices, because you don't need to keep
| track of state or trace the graph, and you can compute the 2nd,
| 3rd, 4th, ... derivatives because you have an analytical
| gradient.
|
| For example, in an affine layer the gradient of the
| bias/intercept is just the gradient of the loss wrt the layer's
| (pre-activation) output, and for the weights it's the product of
| that same gradient and the input to the layer.
|
| With automatic graph construction, e.g. eager
| TensorFlow/PyTorch, the layer needs to cache its input so that
| it can compute the gradient of the weights. If the layer
| receives inputs multiple times within the computation graph, you
| end up caching them multiple times.
|
| With analytical gradients you may be able to save memory by
| exploiting the structure of the expressions, e.g. above you can
| sum the inputs first, since (dL/dz)*input1 + (dL/dz)*input2 =
| (dL/dz)*(input1 + input2).
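|
| Concretely, for a single affine layer z = W*x + b the whole
| backward pass is just these closed-form expressions; a small
| hand-written sketch (not any particular framework's API):
|
|   #include <stddef.h>
|
|   // Backward pass of z = W*x + b, with W an m x n row-major matrix.
|   // Given dz = dL/dz (length m) and the layer input x (length n),
|   // accumulate:
|   //   dL/db[i]    = dz[i]
|   //   dL/dW[i][j] = dz[i] * x[j]
|   //   dL/dx[j]    = sum_i W[i][j] * dz[i]
|   // db, dW and dx are assumed zero-initialized; using += means a
|   // layer used at several points in the graph simply sums its
|   // contributions.
|   void affine_backward(const double *dz, const double *x,
|                        const double *W, double *db, double *dW,
|                        double *dx, size_t m, size_t n) {
|       for (size_t i = 0; i < m; i++) {
|           db[i] += dz[i];
|           for (size_t j = 0; j < n; j++) {
|               dW[i * n + j] += dz[i] * x[j];
|               dx[j] += W[i * n + j] * dz[i];
|           }
|       }
|   }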
| [deleted]
| bdauvergne wrote:
| Funny, I worked on Tapenade (one of the automatic
| differentiation tools Enzyme is compared against). I'm happy
| that it still reaches 60% of the performance of something
| written directly inside an optimizing compiler.
___________________________________________________________________
(page generated 2021-02-04 23:00 UTC)