[HN Gopher] Calculus on Computational Graphs: Backpropagation (2...
___________________________________________________________________
Calculus on Computational Graphs: Backpropagation (2015)
Author : throwup238
Score : 97 points
Date : 2024-01-19 16:42 UTC (6 hours ago)
(HTM) web link (colah.github.io)
(TXT) w3m dump (colah.github.io)
| dawnofdusk wrote:
| Nice blog. I'll be provocative/pedantic for no good reason and
| say that what's described isn't "calculus" per se, because you
| can't do calculus on discrete objects like a graph. However, you
| can define the derivative purely algebraically (as a linear
| operation which satisfies the Leibniz chain/product rule), which
| is more accurately what is being described.
| philipfweiss wrote:
| You're not doing calculus on a graph; you're using a graph
| algorithm to automate the derivative-taking process.
|
| Essentially, you transform your function into a "circuit" or
| just a graph with edge labels according to the relationship
| between parts of the expression. The circuit has the nice
| property that there is an algorithm you can run on it, with
| very simple rules, which gets you the derivative of the
| function used to create that circuit.
|
| So taking the derivative becomes:
|
| 1. Transform function F into circuit C.
| 2. Run compute_gradient(C) to get the gradient of F.
|
| Lots of useful examples here:
| https://cs231n.github.io/optimization-2/
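The two steps above can be sketched in a few lines of Python. This
is only a minimal illustration, not the code from the linked post or
from any library: the names Node, add, mul and compute_gradient are
hypothetical, and only addition and multiplication are supported. It
builds the circuit for e = (a + b) * (b + 1), the running example in
the linked article, and then walks it backwards with the chain rule.

```python
class Node:
    """One vertex of the circuit (hypothetical helper class)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent_node, d(self)/d(parent)): the edge label.
        self.parents = parents
        self.grad = 0.0


def add(x, y):
    # d(x + y)/dx = 1 and d(x + y)/dy = 1
    return Node(x.value + y.value, ((x, 1.0), (y, 1.0)))


def mul(x, y):
    # d(x * y)/dx = y and d(x * y)/dy = x
    return Node(x.value * y.value, ((x, y.value), (y, x.value)))


def compute_gradient(output):
    """Fill in node.grad = d(output)/d(node) for every node."""
    order, seen = [], set()

    def visit(node):
        # Depth-first search that appends a node after all its inputs,
        # which yields a topological order of the graph.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse topological order: a node's gradient is complete before
    # it is pushed along its incoming edges (the chain rule).
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


# Step 1: transform the function into a circuit.
a, b, one = Node(2.0), Node(1.0), Node(1.0)
c = add(a, b)
d = add(b, one)
e = mul(c, d)

# Step 2: run the graph algorithm.
compute_gradient(e)
print(e.value, a.grad, b.grad)  # 6.0 2.0 5.0
```

Note that b feeds into both c and d, so its gradient is the sum over
both paths (2 + 3 = 5). That summation is why the sketch accumulates
with += rather than assigning.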
| antonvs wrote:
| If we're being pedantic, then there's also a more general
| definition of calculus, which is the first definition in
| Merriam-Webster: "a method of computation or calculation in a
| special notation (as of logic or symbolic logic)." One example
| of this is the lambda calculus. Differential and integral
| calculus are just special cases of this general definition.
| edgyquant wrote:
| Right, but this is about differential calculus (the chain
| rule).
| krackers wrote:
| >because you can't do calculus on discrete objects like a graph
|
| Of course you can, what do you think shift operators and
| recurrence relations are?
| https://en.wikipedia.org/wiki/Finite_difference?#Calculus_of...
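One way to read this comment, sketched under my own assumptions
rather than taken from the thread: on integer-indexed functions the
forward difference (shift minus identity) plays the role of the
derivative. The helper names shift and delta below are hypothetical.
The sketch checks a discrete product rule and the telescoping sum
that stands in for the fundamental theorem of calculus.

```python
def shift(f):
    """Shift operator E: (E f)(n) = f(n + 1)."""
    return lambda n: f(n + 1)


def delta(f):
    """Forward difference: (delta f)(n) = f(n + 1) - f(n) = (E - 1) f."""
    return lambda n: shift(f)(n) - f(n)


def square(n):
    return n * n


def cube(n):
    return n * n * n


# The difference of n^2 is 2n + 1, the discrete analogue of d/dx x^2 = 2x.
assert all(delta(square)(n) == 2 * n + 1 for n in range(-5, 6))


def product(n):
    return square(n) * cube(n)


# Discrete product rule:
#   delta(f g)(n) = f(n + 1) * delta(g)(n) + g(n) * delta(f)(n)
for n in range(-5, 6):
    lhs = delta(product)(n)
    rhs = shift(square)(n) * delta(cube)(n) + cube(n) * delta(square)(n)
    assert lhs == rhs

# Telescoping sum: summing delta(f) over [lo, hi) gives f(hi) - f(lo).
lo, hi = 3, 10
total = sum(delta(cube)(n) for n in range(lo, hi))
assert total == cube(hi) - cube(lo)
```

The product rule picks up a shifted factor, f(n + 1), where the
continuous rule has f(x); that extra shift is the main visible
difference from ordinary calculus.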
| coolThingsFirst wrote:
| I still don't understand the process of learning ML. Sure, we
| build micrograd, but is it only a didactic exercise, or can we
| use it to train something serious on our own hardware?
| nextos wrote:
| This is like building a tiny C compiler. At some program scale,
| optimizations become important.
| pmelendez wrote:
| >do something serious
|
| I guess it depends on what you mean by serious. Pre-training a
| competitive LLM with current methods and consumer hardware is
| prohibitive for sure. Solving a classification problem could be
| totally doable depending on the domain.
| edgyquant wrote:
| I don't understand this comment. For one, we're
| engineers/hackers and should be curious how this stuff works.
| It's exciting. Practically speaking, this is like asking why
| learn how to write a simple forum or blog when we can't host
| Facebook on our own hardware: it's going to be hard to work on
| the latest models if you don't first understand the basics.
| coolThingsFirst wrote:
| Once you learn what a DB is, you can use one for your own
| custom task.
|
| What do I do with micrograd? Hang the code on my wall?
| sva_ wrote:
| You'll probably need some hardware acceleration. There's a good
| course that builds something like micrograd in the beginning
| and then extends it: https://dlsyscourse.org/lectures/
| coolThingsFirst wrote:
| I have a 3060 Ti, that enough?
| genman wrote:
| You can totally do some visual classification problems (like
| object detection) on current consumer hardware. Beyond that,
| you can also take some smaller existing language models and
| fine-tune them for a special task, which is also completely
| feasible.
| dang wrote:
| Discussed at the time:
|
| _Calculus on Computational Graphs: Backpropagation_ -
| https://news.ycombinator.com/item?id=10148064 - Aug 2015 (9
| comments)
| sva_ wrote:
| If you want more on this, check out this paper by Terence Parr
| and Jeremy Howard: https://arxiv.org/abs/1802.01528
|
| It is rather accessible.
___________________________________________________________________
(page generated 2024-01-19 23:00 UTC)