[HN Gopher] The matrix calculus you need for deep learning (2018)
___________________________________________________________________
The matrix calculus you need for deep learning (2018)
Author : cpp_frog
Score : 184 points
Date : 2023-07-30 17:18 UTC (1 day ago)
(HTM) web link (explained.ai)
(TXT) w3m dump (explained.ai)
| trolan wrote:
| I finished Vector Calculus last year and have no experience in
| machine learning, but this seems exceptionally thorough. It would
| have made my life easier to have a practical explanation over a
| mathematical one, but woe is the life of the engineering student,
| I guess.
| parrt wrote:
| Glad to be of assistance! Yeah, it really annoyed me that this
| critical information wasn't collected in any one particular spot.
| cs702 wrote:
| Please change the link to the original source:
|
| https://arxiv.org/abs/1802.01528
|
| ---
|
| EDIT: It turns out explained.ai is the personal website of one of
| the authors, so there's no need to change the link. See comment
| below.
| parrt wrote:
| :) Yeah, I use my own internal markdown to generate really nice
| html (with fast latex-derived images for equations) and then
| full-on latex. (tool is https://github.com/parrt/bookish)
|
| I prefer reading on the web unless I'm offline. The latex is
| super handy for printing a nice document.
| cs702 wrote:
| Even though it's shockingly common, I never cease to be
| surprised and delighted when authors who are on HN take the
| time to reply to comments about their work.
|
| Thank you for doing this with Jeremy and sharing it with the
| world!
| parrt wrote:
| Sure thing! Very enjoyable to have people use our work.
| liorben-david wrote:
| Explained.ai seems to be Terrence Parr's personal site
| cs702 wrote:
| Thank you for pointing it out. I edited my comment.
| godelski wrote:
| There's a common belief that you don't need math for ML or that
| you need a lot of math for ML. So let me clarify:
|
| You don't need math to make a model perform well, but you do need
| math to know why your model is wrong.
| rdedev wrote:
| I had followed this when I was learning DL through Andrew Ng's
| course. In one of the lessons, he had the formula for calculating
| the loss as well as its derivatives.
|
| I tried deriving these formulas from scratch using what I learned
| from OP's post, but it felt like something was missing. I think
| it boils down to me not knowing how to aggregate those element-
| wise derivatives into matrix form. It was the Matrix Cookbook and
| certain notes from Stanford's CS231n that helped me grok it
| fully.
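| A minimal NumPy sketch of that aggregation step (toy data, not
| from Ng's course): for a linear model with MSE loss, the element-
| wise partials dL/dw_j collect into the single matrix expression
| (2/n) X^T (Xw - y), which a finite-difference check confirms.

```python
import numpy as np

# Toy linear model: predictions X @ w, mean-squared-error loss.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
w = rng.normal(size=d)

def loss(w):
    r = X @ w - y
    return (r @ r) / n

# The element-wise partials dL/dw_j aggregate into one matrix form.
grad = (2 / n) * X.T @ (X @ w - y)

# Check against a central finite-difference approximation.
eps = 1e-6
num_grad = np.array([
    (loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
    for e in np.eye(d)
])
assert np.allclose(grad, num_grad, atol=1e-6)
```

The same pattern (collect per-element partials, then recognize them
as a matrix product) is what the Matrix Cookbook tabulates.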
| jayro wrote:
| We just released a comprehensive online course on Multivariable
| Calculus (https://mathacademy.com/courses/multivariable-
| calculus), and we also have a course on Mathematics for Machine
| Learning (https://mathacademy.com/courses/mathematics-for-
| machine-lear...) that covers just the matrix calculus you need in
| addition to just the linear algebra and statistics you need, etc.
| I'm a founder and would be happy to answer any questions you
| might have.
| barrenko wrote:
| Whom do you think Mathematics for Machine Learning benefits? In
| my personal opinion the only audience for a plethora of courses
| and articles available in that regard is useful mostly to the
| people that recently went through college level Linear Algebra.
|
| I'd like more resources geared for people that are done with
| Khan Academy and want something as well made for more advanced
| topics.
| jayro wrote:
| The Mathematics for Machine Learning course doesn't assume
| knowledge of Linear Algebra; it covers the basics of Linear
| Algebra you'll need along with the basics of Multivariable
| Calculus, Statistics, Probability, etc. It does, however,
| assume knowledge of high-school math and Single Variable
| Calculus. If you've been out of school for a while, our
| adaptive diagnostic exam will identify your knowledge gaps
| and create a custom course for you that includes the
| necessary remediation.
|
| If you're REALLY rusty (say, you've been out of school for
| 5+ years), or maybe you just never learned the material
| that well in the first place, then you might want to start
| with one of our Mathematical Foundations courses that will
| scaffold you up to the level where you can handle the content
| in Mathematics for Machine Learning. More info can be found
| here: https://mathacademy.com/courses
|
| The Mathematics for Machine Learning course would be ideal
| for anyone who majored in a STEM subject like CS (or at least
| has a solid mathematical foundation) and is interested in
| doing work in machine learning.
| thewataccount wrote:
| I understand you don't have a free trial, is there any chance
| you have a demo somewhere of what it actually looks like
| though? Like a tiny sample lesson or something along those
| lines? It looks interesting, but I'm just uncertain as to what
| it actually "feels" like in practice vs., let's say, Brilliant,
| etc.
|
| I only see pictures; I'm curious about the extent of the
| interaction in the linear algebra / matrix calculus specifically.
| quanto wrote:
| The article/webpage is a nice walk-through for the uninitiated.
| Half the challenge of doing matrix calculus is remembering the
| dimension of the object you are dealing with (scalar, vector,
| matrix, higher-dim tensor).
|
| Ultimately, the point of using matrix calculus (or matrices in
| general) is not just concision of notation but also understanding
| that matrices are operators acting on members of some spaces,
| i.e. vectors. It is this higher level abstraction that makes
| matrices powerful.
|
| For people who are familiar with the concepts but need a concise
| refresher, the Wikipedia page serves well:
|
| https://en.wikipedia.org/wiki/Matrix_calculus
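| A small NumPy illustration of the operator view (a standard
| textbook example, assumed here): differentiation of cubic
| polynomials is a linear map on coefficient vectors, so it has a
| matrix, and composing operators is just matrix multiplication.

```python
import numpy as np

# D maps coefficients [a0, a1, a2, a3] of a0 + a1*x + a2*x^2 + a3*x^3
# to the coefficients of the derivative a1 + 2*a2*x + 3*a3*x^2.
D = np.array([
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
])

p = np.array([5, 3, 2, 1])          # 5 + 3x + 2x^2 + x^3
dp = D @ p                          # derivative: 3 + 4x + 3x^2
assert np.array_equal(dp, [3, 4, 3, 0])

# Composition of operators = matrix product: D @ D is the second
# derivative, which annihilates any linear polynomial.
assert np.array_equal(D @ D @ np.array([7, 2, 0, 0]), [0, 0, 0, 0])
```

The matrix is just a coordinate representation of the operator;
the abstraction (linear maps on a space) is what carries over to
higher-dimensional settings.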
| PartiallyTyped wrote:
| Adding to this: these operators are also "polymorphic"; for
| matrix multiplication the only operations you need are (non-
| commutative) multiplication and addition; thus you can use
| elements of any non-commutative ring, i.e. a set of elements
| with those two operations :D
|
| Matrices themselves form non-commutative rings too; and based
| on this, you can think of a 4N x 4N matrix as a 4x4 matrix
| whose elements are NxN matrices [1] :D
|
| [1] https://youtu.be/FX4C-JpTFgY?list=PL49CF3715CB9EF31D&t=1107
|
| You already know whose lecture it is :D
|
| I love math.. I should have become a mathematician ...
| mrfox321 wrote:
| Re [1]: it's fairly concrete to simply say that matrix
| multiplication can be performed block-wise.
| PartiallyTyped wrote:
| I don't disagree; but that is just an example of MM. The
| gist is not that you can do block multiplication; but that
| you can define matrices over any non commutative ring,
| which includes other matrices - ie blocks.
| mrfox321 wrote:
| Yeah, matrices are more abstract. I guess I'm just
| pointing out that your concrete example of non-
| commutative rings (matrices of matrices) still needs a
| proof that 4N x 4N (scalar) matrices and 4 x 4 matrices
| of (N x N scalar) blocks multiply the same way.
|
| Block MM demonstrates the equivalence.
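| A quick NumPy check of the block-wise claim (random matrices,
| purely illustrative): multiplying a 4N x 4N matrix as a 4x4 grid
| of NxN blocks, with @ in place of scalar multiplication, does
| reproduce the ordinary product.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2
# 4N x 4N matrices, viewed as 4x4 grids of N x N blocks.
A = rng.normal(size=(4 * N, 4 * N))
B = rng.normal(size=(4 * N, 4 * N))

def block(M, i, j):
    return M[i * N:(i + 1) * N, j * N:(j + 1) * N]

# Multiply "as a 4x4 matrix whose elements are NxN matrices":
# C[i][j] = sum_k A[i][k] @ B[k][j], with @ replacing scalar *.
C_blocks = np.block([
    [sum(block(A, i, k) @ block(B, k, j) for k in range(4))
     for j in range(4)]
    for i in range(4)
])
assert np.allclose(C_blocks, A @ B)
```

This is the equivalence mrfox321 mentions: block multiplication
is exactly matrix multiplication over the ring of N x N blocks.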
| _the_inflator wrote:
| I just had a quick look at it. A good summary.
|
| It seems that these topics are covered by the first one or two
| semesters of a Math degree. Of course university is a bit more
| advanced.
| thatsadude wrote:
| vec(ABC)=kron(C.T,A)vec(C) is all your need for matrix calculus!
| esafak wrote:
| Can anyone provide an intuitive explanation?
| hayasaki wrote:
| They have an error in their formula, but the vectorized form
| (stacking the columns of a matrix to form a vector) of the
| triple matrix product (A times B times C) can be rewritten as
| a Kronecker product applied to another vectorized matrix.
|
| I wouldn't say that is everything, but it is a useful trick.
| esafak wrote:
| That is just reading out the equation in English. My
| question is, why is it so?
| hayasaki wrote:
| The correct version you can find here:
| https://en.wikipedia.org/wiki/Kronecker_product#Matrix_equat...
|
| The answer for why it is so is pretty trivial (just do the
| indexing for each element) if you know the definition of the
| Kronecker product and what the 'vec' operation is.
|
| For an intuitive explanation, try thinking of how the
| matrix multiplication would work and consider how the
| kronecker product pattern would apply to the vector.
|
| This honestly isn't a super interesting result, and I would
| say the original commenter was overstating its importance in
| matrix calculus. It's really more useful for solving certain
| matrix problems, or for speeding up some tensor product
| calculations when things have a certain structure. For
| example, with a discretization of a PDE, the operator in the
| discrete space may (depending on the representation) be a
| sum of Kronecker products, so applying it can be fast: use a
| matrix multiply and never store the Kroneckered matrices.
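| A quick NumPy check of the (corrected) identity from the linked
| Wikipedia section, vec(ABC) = kron(C^T, A) vec(B), where vec
| stacks columns; the shapes here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 5))

def vec(M):
    # Stack columns (column-major / Fortran order).
    return M.reshape(-1, order="F")

# The identity: vec(A B C) = (C^T kron A) vec(B).
lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)
assert np.allclose(lhs, rhs)
```

Read right-to-left, this also shows the speed trick mentioned
above: applying the big Kronecker-structured operator to vec(B) is
just computing A @ B @ C, with no 10x12 matrix ever formed.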
| scrubs wrote:
| Darn good post!
| bluerooibos wrote:
| Oh nice, I did most of this in school, and during my non-CS
| engineering degree. Thanks for sharing!
|
| Always wanted to dip my toes into ML, but I've never been
| convinced of its usefulness to the average solo developer, in
| terms of things you can build with this new knowledge. Likely I
| don't know enough about it to make that call though.
| williamcotton wrote:
| Here's an ML project I've been working on as a solo dev:
|
| https://github.com/williamcotton/chordviz
|
| Labeling software in React, CNN in PyTorch, prediction on an app
| in SwiftUI. 12,000 (and counting) hand-labeled images of my hand
| on a guitar fretboard!
| nsajko wrote:
| Another matrix math reference:
| https://github.com/r-barnes/MatrixForensics
| dang wrote:
| Related:
|
| _The matrix calculus you need for deep learning (2018)_ -
| https://news.ycombinator.com/item?id=26676729 - April 2021 (40
| comments)
|
| _Matrix calculus for deep learning part 2_ -
| https://news.ycombinator.com/item?id=23358761 - May 2020 (6
| comments)
|
| _Matrix Calculus for Deep Learning_ -
| https://news.ycombinator.com/item?id=21661545 - Nov 2019 (47
| comments)
|
| _The Matrix Calculus You Need for Deep Learning_ -
| https://news.ycombinator.com/item?id=17422770 - June 2018 (77
| comments)
|
| _Matrix Calculus for Deep Learning_ -
| https://news.ycombinator.com/item?id=16267178 - Jan 2018 (81
| comments)
| SnooSux wrote:
| This is the resource I wish I had in 2018. Every grad school
| course had a Linear Algebra review lecture but never got into the
| Matrix Calculus I actually needed.
| unpaddedantacid wrote:
| [dead]
| dpflan wrote:
| True, this was a designated resource during my studies
| (2020/2022), but they were post-2018.
| ayhanfuat wrote:
| That was my struggle, too. Imperial College London has a small
| online course which covers similar topics
| (https://www.coursera.org/learn/multivariate-calculus-
| machine...). It helped a lot.
___________________________________________________________________
(page generated 2023-07-31 23:01 UTC)