[HN Gopher] Calculus on Computational Graphs: Backpropagation (2015)
       ___________________________________________________________________
        
       Calculus on Computational Graphs: Backpropagation (2015)
        
       Author : throwup238
       Score  : 97 points
       Date   : 2024-01-19 16:42 UTC (6 hours ago)
        
 (HTM) web link (colah.github.io)
 (TXT) w3m dump (colah.github.io)
        
       | dawnofdusk wrote:
       | Nice blog. I'll be provocative/pedantic for no good reason and
       | say that what's described isn't "calculus" per se, because you
       | can't do calculus on discrete objects like a graph. However, you
        | can define the derivative purely algebraically (as a linear
        | operator that satisfies the Leibniz product rule), which is
        | more accurately what is being described.
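        | 
        | For instance, dual numbers make that algebraic definition
        | concrete: extend numbers to a + b*eps with eps^2 = 0, and the
        | Leibniz product rule falls out of multiplication alone, no
        | limits required. A minimal sketch in Python (the Dual class is
        | just illustrative, not from the post):
        | 
        |     class Dual:
        |         """A number a + b*eps, where eps**2 == 0."""
        |         def __init__(self, a, b=0.0):
        |             self.a, self.b = a, b  # value, derivative
        | 
        |         def __add__(self, other):
        |             return Dual(self.a + other.a, self.b + other.b)
        | 
        |         def __mul__(self, other):
        |             # Leibniz rule: (fg)' = f'g + fg'
        |             return Dual(self.a * other.a,
        |                         self.b * other.a + self.a * other.b)
        | 
        |     # d/dx of x*x + x at x = 3: seed the eps part with 1
        |     x = Dual(3.0, 1.0)
        |     y = x * x + x
        |     print(y.a, y.b)  # 12.0 7.0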
        
         | philipfweiss wrote:
          | You're not doing calculus on a graph; you're using a graph
          | algorithm to automate the derivative-taking process.
         | 
         | Essentially, you transform your function into a "circuit" or
         | just a graph with edge labels according to the relationship
         | between parts of the expression. The circuit has the nice
         | property that there is an algorithm you can run on it, with
         | very simple rules, which gets you the derivative of the
         | function used to create that circuit.
         | 
         | So taking the derivative becomes:
         | 
          | 1. Transform function F into circuit C.
          | 
          | 2. Run compute_gradient(C) to get the gradient of F (a
          | sketch of such a function follows below).
         | 
         | Lots of useful examples here:
         | https://cs231n.github.io/optimization-2/
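          | 
          | To make "very simple rules" concrete, here is a toy version
          | of such a circuit with a backward pass, in the spirit of the
          | cs231n notes (a sketch, not their code; compute_gradient is
          | the hypothetical function from step 2 above):
          | 
          |     class Value:
          |         """Node in the circuit: data plus a gradient slot."""
          |         def __init__(self, data, children=()):
          |             self.data, self.grad = data, 0.0
          |             self._children = children
          |             self._backward = lambda: None
          | 
          |         def __add__(self, other):
          |             out = Value(self.data + other.data, (self, other))
          |             def backward():
          |                 self.grad += out.grad   # d(a+b)/da = 1
          |                 other.grad += out.grad  # d(a+b)/db = 1
          |             out._backward = backward
          |             return out
          | 
          |         def __mul__(self, other):
          |             out = Value(self.data * other.data, (self, other))
          |             def backward():
          |                 self.grad += other.data * out.grad  # chain rule
          |                 other.grad += self.data * out.grad
          |             out._backward = backward
          |             return out
          | 
          |     def compute_gradient(out):
          |         # Visit the circuit in reverse topological order,
          |         # applying each node's local rule.
          |         order, seen = [], set()
          |         def visit(v):
          |             if v not in seen:
          |                 seen.add(v)
          |                 for c in v._children:
          |                     visit(c)
          |                 order.append(v)
          |         visit(out)
          |         out.grad = 1.0
          |         for v in reversed(order):
          |             v._backward()
          | 
          |     x, y = Value(2.0), Value(3.0)
          |     f = x * y + x          # F(x, y) = x*y + x
          |     compute_gradient(f)
          |     print(x.grad, y.grad)  # 4.0 2.0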
        
         | antonvs wrote:
         | If we're being pedantic, then there's also a more general
         | definition of calculus, which is the first definition in
         | Merriam-Webster: "a method of computation or calculation in a
         | special notation (as of logic or symbolic logic)." One example
         | of this is the lambda calculus. Differential and integral
         | calculus are just special cases of this general definition.
        
           | edgyquant wrote:
            | Right, but this is about differential calculus (the chain
            | rule).
        
         | krackers wrote:
         | >because you can't do calculus on discrete objects like a graph
         | 
          | Of course you can; what do you think shift operators and
         | recurrence relations are?
         | https://en.wikipedia.org/wiki/Finite_difference?#Calculus_of...
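          | 
          | That discrete calculus is concrete enough to compute with:
          | the forward difference Df(n) = f(n+1) - f(n) plays the role
          | of d/dx, and falling powers n*(n-1)*...*(n-k+1) play the
          | role of x^k. A quick sketch (function names are just
          | illustrative):
          | 
          |     def delta(f):
          |         """Forward difference: (Df)(n) = f(n+1) - f(n)."""
          |         return lambda n: f(n + 1) - f(n)
          | 
          |     def falling(k):
          |         """Falling power n*(n-1)*...*(n-k+1), the discrete x**k."""
          |         def power(n):
          |             out = 1
          |             for i in range(k):
          |                 out *= n - i
          |             return out
          |         return power
          | 
          |     # D of the falling cube is 3 times the falling square,
          |     # mirroring d/dx x^3 = 3x^2.
          |     df = delta(falling(3))
          |     for n in range(10):
          |         assert df(n) == 3 * falling(2)(n)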
        
       | coolThingsFirst wrote:
        | I still don't understand the process of learning ML. Like,
        | sure, we build micrograd, but is it only a didactic exercise,
        | or can we use it to train something serious on our own
        | hardware?
        
         | nextos wrote:
         | This is like building a tiny C compiler. At some program scale,
         | optimizations become important.
        
         | pmelendez wrote:
         | >do something serious
         | 
         | I guess it depends on what you mean by serious. Pre-training a
         | competitive LLM with current methods and consumer hardware is
         | prohibitive for sure. Solving a classification problem could be
         | totally doable depending on the domain.
        
         | edgyquant wrote:
        | I don't understand this comment. For one, we're
        | engineers/hackers and should be curious about how this stuff
        | works. It's exciting. Practically speaking, this is like
        | asking why learn to write a simple forum or blog when we
        | can't host Facebook on our own hardware: it's going to be
        | hard to work on the latest models if you don't first
        | understand the basics.
        
           | coolThingsFirst wrote:
            | If you learn what a DB is, you can use a DB for your own
            | custom task.
            | 
            | What do I do with micrograd? Hang the code on my wall?
        
         | sva_ wrote:
          | You'll probably need some hardware acceleration. There's a
          | good course that builds something like micrograd at the
          | start and then extends it: https://dlsyscourse.org/lectures/
        
           | coolThingsFirst wrote:
           | I have a 3060 Ti, that enough?
        
         | genman wrote:
          | You can totally do some visual classification problems (like
          | object detection) on current consumer hardware. Even more:
          | you can also take some smaller existing language models and
          | fine-tune them for some special task - also completely
          | feasible.
        
       | dang wrote:
       | Discussed at the time:
       | 
       |  _Calculus on Computational Graphs: Backpropagation_ -
       | https://news.ycombinator.com/item?id=10148064 - Aug 2015 (9
       | comments)
        
       | sva_ wrote:
       | If you want more on this, check out this paper by Terence Parr
       | and Jeremy Howard: https://arxiv.org/abs/1802.01528
       | 
       | It is rather accessible.
        
       ___________________________________________________________________
       (page generated 2024-01-19 23:00 UTC)