[HN Gopher] Matrix Multiplication
___________________________________________________________________
Matrix Multiplication
Author : jonbaer
Score : 105 points
Date : 2022-01-17 20:40 UTC (2 hours ago)
(HTM) web link (matrixmultiplication.xyz)
(TXT) w3m dump (matrixmultiplication.xyz)
| d--b wrote:
| Visually, the rotation part makes it much less intuitive than
| writing the result matrix next to the first operand and the
| second operand above the result matrix
| drhodes wrote:
| Denis Auroux explains that way of seeing it,
| https://youtu.be/bHdzkFrgRcA?t=1404
| neosat wrote:
| so useful - thanks!
| tinalumfoil wrote:
| I agree. Matrix multiplication can pretty succinctly be defined
| in terms of the dot product between the row and column of the
| inputs at the corresponding spot in the output. When I first
| learned I already knew a little about the properties of dot
| products so it made the "magical" properties of matrix
| multiplication more understandable. Maybe not the way a
| mathematician would define it -- they might look at dot
| products as a special case of matrix multiplication thus teach
| the latter first -- but it was at least intuitive.
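|
| A minimal sketch of that definition in plain Python (illustrative
| only; names are made up):
|
|     def dot(u, v):
|         # Dot product of two equal-length sequences.
|         return sum(a * b for a, b in zip(u, v))
|
|     def matmul(A, B):
|         # C[i][j] = dot(row i of A, column j of B).
|         assert len(A[0]) == len(B), "inner dimensions must agree"
|         return [[dot(row, col) for col in zip(*B)] for row in A]
|
|     A = [[1, 2], [3, 4]]
|     B = [[5, 6], [7, 8]]
|     print(matmul(A, B))  # [[19, 22], [43, 50]]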
| axpy906 wrote:
| I like the visual look of the sliding window.
| phkahler wrote:
| I did until I realized it's computing more than one dot product
| at a time. Then it was suddenly not great. It falls somewhere
| between standard matrix multiplication and what might
| eventually be a good way to understand a systolic array
| multiplier.
| mrtksn wrote:
| The process itself is visualised well here, but can someone
| explain why anyone would want to multiply matrices? Yes, I know
| where it is used and I did my fair share back in college, but it
| never really clicked for me in an intuitive manner.
|
| Why not simply go procedural and skip the matrix representation
| altogether? Using matrices in physics calculations, for example,
| feels like dissociating the intuitiveness of the physics for
| the sake of the matrix calculation tools. When I look at a matrix
| of values I don't see anything.
| kzrdude wrote:
| The benefit is reusing the matrix theory - matrix knowledge(!)
| - in different contexts.
|
| In some cases, as you suggest, a matrix in physics is a quite
| sterile representation: just plugging numbers into an otherwise
| generally stated problem. The matrix values depend on the choice
| of coordinate system and everything, which is quite inelegant.
|
| Why practically use matrices on the computer? It's efficient to
| compute in blocks instead of one by one.
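|
| A rough sketch of that blocked idea in Python with NumPy (the
| block size and matrices here are made up; real BLAS libraries
| tune blocking to the cache hierarchy):
|
|     import numpy as np
|
|     def blocked_matmul(A, B, bs=2):
|         # Accumulate C block by block instead of entry by entry.
|         n = A.shape[0]
|         C = np.zeros((n, n))
|         for i in range(0, n, bs):
|             for j in range(0, n, bs):
|                 for k in range(0, n, bs):
|                     C[i:i+bs, j:j+bs] += (A[i:i+bs, k:k+bs]
|                                           @ B[k:k+bs, j:j+bs])
|         return C
|
|     A = np.arange(16.0).reshape(4, 4)
|     B = np.eye(4)
|     assert np.allclose(blocked_matmul(A, B), A @ B)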
| gizmo686 wrote:
| Most of the time, the objects you are actually interested in
| are not NxM grids of numbers, but linear functions. For
| simplicity, I will assume we are working with real numbers.
|
| Suppose you have two linear functions:
|
|     f :: R3 -> R3
|     h :: R3 -> R2
|
| and a point:
|
|     x :: R3
|
| The first problem you have if you want to work with these
| functions is how to represent them. As it turns out, assuming
| you have a set of basis vectors, any linear function Rn -> Rm
| can be represented by exactly n * m real numbers. Further, if
| we arrange these numbers in a grid it is easy to fill in the
| grid if you know how the function acts on the basis vectors,
| and it is easy to read off how the function acts on the basis
| vectors from the grid.
|
| Similarly, a point in Rn can be represented by exactly n real
| numbers. Again, this representation depends on a choice of
| basis vectors.
|
| For simplicity, for any function or point u, let [u] denote its
| matrix representation.
|
| The second problem you want to tackle is how to compute
| function application y = f(x). Given the representations
| defined above, y is the point in R3 that is represented by the
| result of the matrix multiplication [f][x].
|
| The third problem you want to solve is how to compute the
| function composition g = h∘f. Again, g is the function given by
| the matrix representation [g] = [h][f].
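|
| A concrete sketch of the above in Python with NumPy, using
| made-up matrices and the standard basis:
|
|     import numpy as np
|
|     f = np.array([[1., 0., 2.],   # [f]: R3 -> R3, a 3x3 grid
|                   [0., 1., 0.],
|                   [3., 0., 1.]])
|     h = np.array([[1., 2., 3.],   # [h]: R3 -> R2, a 2x3 grid
|                   [4., 5., 6.]])
|     x = np.array([1., 1., 1.])    # [x]: a point in R3
|
|     y = f @ x   # function application: [f][x] = [f(x)]
|     g = h @ f   # function composition: [h][f] = [h∘f]
|     assert np.allclose(g @ x, h @ (f @ x))  # (h∘f)(x) = h(f(x))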
| paulpauper wrote:
| A major application is Markov chains.
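|
| For instance (a made-up two-state chain): one step of the chain
| is a vector-matrix product, and n steps is a matrix power, which
| is where matrix multiplication earns its keep.
|
|     import numpy as np
|
|     # P[i][j] = probability of moving from state i to state j.
|     P = np.array([[0.9, 0.1],    # sunny -> sunny / rainy
|                   [0.5, 0.5]])   # rainy -> sunny / rainy
|     p = np.array([1.0, 0.0])     # start out sunny
|
|     p_next = p @ P                             # one step
|     p_ten = p @ np.linalg.matrix_power(P, 10)  # ten steps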
| mrtksn wrote:
| How do you think about it in an intuitive way?
| max_ wrote:
| This is what you should show students before sitting them through
| a 1 hour class lecture.
| clircle wrote:
| Is the implication that this is the most intuitive way to think
| about or visualize matrix multiplication? If so, I reject.
| windsignaling wrote:
| Agreed.
|
| Posts like this seem to get upvoted on HN just because "oooh, a
| pretty animation" and/or because people struggled with college
| math: "See? This is how they should've taught it! Pictures are
| better than lectures!"
| Twisol wrote:
| I'd guess that the "sliding window" metaphor means you could
| model matrix multiplication as a convolution. Does this give any
| intuition for matrix multiplication via the Fast Fourier
| Transform?
|
| https://en.m.wikipedia.org/wiki/Convolution_theorem
|
| > the convolution theorem states that under suitable conditions
| the Fourier transform of a convolution of two functions (or
| signals) is the pointwise product of their Fourier transforms.
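|
| The theorem is easy to check numerically; a small sketch in
| Python with NumPy (circular convolution, made-up inputs):
|
|     import numpy as np
|
|     a = np.array([1., 2., 3., 4.])
|     b = np.array([5., 6., 7., 8.])
|
|     # Transform, multiply pointwise, transform back.
|     via_fft = np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
|
|     # Direct circular convolution for comparison.
|     n = len(a)
|     direct = np.array([sum(a[k] * b[(i - k) % n] for k in range(n))
|                        for i in range(n)])
|     assert np.allclose(via_fft, direct)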
| amelius wrote:
| The fastest known matrix multiplication algorithm is the
| Coppersmith-Winograd algorithm with a complexity of
| O(n^2.3737). FFT is O(n log n). So probably not.
| Twisol wrote:
| Apologies; apparently I was thinking of _integer_
| multiplication using FFT:
|
| https://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strasse...
|
| But my question (which wasn't about asymptotic complexity)
| stands: can we think about matrix multiplication as a
| convolution? If so, we can do pointwise multiplication
| sandwiched between Fourier transforms -- I don't expect it to
| be _fast_ , I just expect it to be possible.
| ogogmad wrote:
| My advice:
|
| - Learn how to multiply a matrix M by a column vector v.
|
| - Generalise that by noticing that M[u|v|...] = [Mu|Mv|...].
|
| That's matrix multiplication.
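|
| A small sketch of that generalisation in Python with NumPy
| (function names are made up):
|
|     import numpy as np
|
|     def matmul_by_columns(M, U):
|         # M[u|v|...] = [Mu|Mv|...]: apply M to each column of U.
|         return np.column_stack([M @ col for col in U.T])
|
|     M = np.array([[1., 2.], [3., 4.]])
|     U = np.array([[5., 6.], [7., 8.]])
|     assert np.allclose(matmul_by_columns(M, U), M @ U)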
| madars wrote:
| Yep! Wikipedia has a very useful visual
| https://en.wikipedia.org/wiki/Matrix_multiplication#/media/F...
| qsort wrote:
| After all it's called _linear_ algebra for a reason :)
| contravariant wrote:
| It looks so innocuous when you write it as
|
| (A B)_ij = A_ik B_kj.
| ogogmad wrote:
| Using the Einstein convention*. Otherwise that can be read
| wrong.
|
| * https://en.wikipedia.org/wiki/Einstein_notation
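|
| NumPy's einsum lets you write exactly that index expression, if
| you want to see the convention executed (a minimal sketch):
|
|     import numpy as np
|
|     A = np.random.rand(3, 4)
|     B = np.random.rand(4, 5)
|
|     # (AB)_ij = A_ik B_kj, summing over the repeated index k.
|     C = np.einsum('ik,kj->ij', A, B)
|     assert np.allclose(C, A @ B)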
| carlosgg wrote:
| Discussion from 5 years ago:
| https://news.ycombinator.com/item?id=13036386
| dang wrote:
| Thanks! Macroexpanded:
|
| _Matrix Multiplication_ -
| https://news.ycombinator.com/item?id=24141472 - Aug 2020 (1
| comment)
|
| _Matrix Multiplication_ -
| https://news.ycombinator.com/item?id=13036386 - Nov 2016 (131
| comments)
|
| _Matrix Multiplication_ -
| https://news.ycombinator.com/item?id=13033725 - Nov 2016 (2
| comments)
| jagrsw wrote:
| I understand the rotation part and some uses, but what's the
| intuitive explanation for why it's done this way? Convention?
|
| IOW, what does matrix multiplication try to achieve by doing
| this manipulation, which visually looks like a rotation? Why not
| do it without the rotation instead (with the precondition that
| the number of columns in the two matrices be equal, instead of
| columns vs rows)?
| ogogmad wrote:
| A different convention would make matrix-matrix multiplication
| no longer express composition of linear maps, so the geometric
| (or deeper) meaning would be lost.
|
| The operation you're describing is nevertheless equal to A^T B.
| In other words, it can be expressed using a combination of
| matrix multiplication and matrix transpose. I don't see what it
| could be used for, though.
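|
| One way to see the A^T B claim numerically (a sketch with
| made-up shapes; the proposal pairs columns with columns):
|
|     import numpy as np
|
|     A = np.random.rand(3, 2)
|     B = np.random.rand(3, 4)  # same number of rows as A
|
|     # Entry (i, j): dot product of column i of A with column j
|     # of B. No rotation, just column-with-column pairing.
|     C = np.array([[A[:, i] @ B[:, j] for j in range(B.shape[1])]
|                   for i in range(A.shape[1])])
|     assert np.allclose(C, A.T @ B)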
| xyzzyz wrote:
| When mathematicians think about matrix multiplication (and
| matrices in general), they don't really think about "rotating"
| matrices like in the animation above, but rather about
| operators and their composition. Matrix multiplication is what
| it is because that's how function composition works.
|
| Look: consider two functions, f(x, y) = (x + 2y, 3x + 4y), and
| g(x, y) = (-x + 3y, 4x - y). What's f(g(x, y))? Well, let's
| work it out, it's simple algebra:
|
| f(g(x, y)) = f(-x+3y, 4x-y) = ((-x+3y)+2(4x-y), 3(-x+3y) +
| 4(4x-y)) = (-x + 3y + 8x - 2y, -3x + 9y + 16x - 4y) = (7x + y,
| 13x + 5y).
|
| Whew, that was some hassle to keep track of everything. Now,
| here's what mathematicians typically do instead: they introduce
| matrices to make it much easier to keep track of the
| operations:
|
| Let e_0 = (1, 0), and e_1 = (0, 1). Then f(e_0) = f(1, 0) = (1,
| 3) = e_0 + 3 e_1, and f(e_1) = f(0, 1) = (2, 4) = 2 e_0 + 4
| e_1. Thus, mathematicians would write that f in basis e_0, e_1
| is represented by the matrix
|
|     [1 2]
|     [3 4]
|
| so that when you multiply it by the (column) vector [x, y], you
| get:
|
|     [1 2] [x]   [ x + 2y]
|     [3 4] [y] = [3x + 4y]
|
| Similarly, g(e_0) = (-1, 4) = -e_0 + 4e_1, g(e_1) = (3, -1) =
| 3e_0 - e_1, so it's represented by the matrix:
|
|     [-1  3]
|     [ 4 -1]
|
| Now, let's multiply the matrix of f by the matrix of g:
|
|     [1 2] [-1  3]   [1*(-1)+2*4  1*3+2*(-1)]   [ 7 1]
|     [3 4] [ 4 -1] = [3*(-1)+4*4  3*3+4*(-1)] = [13 5]
|
| and when we multiply the resulting matrix by the column vector
| [x, y]:
|
|     [ 7 1] [x]   [ 7x +  y]
|     [13 5] [y] = [13x + 5y]
|
| So what we got was in fact our original calculation of
| f(g(x, y)) = (7x + y, 13x + 5y).
|
| The conclusion here is that _matrix multiplication is what the
| function composition forces it to be_.
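|
| The whole worked example is easy to check numerically in Python
| with NumPy (the test point is arbitrary):
|
|     import numpy as np
|
|     def f(v):
|         x, y = v
|         return np.array([x + 2*y, 3*x + 4*y])
|
|     def g(v):
|         x, y = v
|         return np.array([-x + 3*y, 4*x - y])
|
|     F = np.array([[1, 2], [3, 4]])    # matrix of f
|     G = np.array([[-1, 3], [4, -1]])  # matrix of g
|
|     v = np.array([2.0, 5.0])
|     assert np.allclose(f(g(v)), (F @ G) @ v)
|     print(F @ G)  # [[ 7  1], [13  5]]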
| gizmo686 wrote:
| The short answer is that it leads to a more consistent
| notation.
|
| The longer answer is that matrix multiplication is essentially
| two different operations. Consider the linear functions:
|
|     f :: R3 -> R3
|     h :: R3 -> R3
|
| as well as the point:
|
|     x :: R3
|
| If we fix a set of basis vectors, we can represent f and h as
| 3x3 matrices, and x as a 3x1 matrix.
|
| The product [f][x] = [f(x)] then represents the result of a
| function application, while [f][h] = [f∘h] represents
| function composition.
|
| For the function application portion, there is no problem with
| your proposal. We simply represent the point as a 1x3 vector
| instead of a 3x1 vector (or, equivalently, transpose the
| convention for representing a function as a matrix).
|
| The problem comes with the function composition use-case. For
| your proposal to work, we would need to transpose only one of
| the two matrices, which means that the matrix representation of
| a function is determined both by the basis vectors and by a
| transposition parity bit, with multiplication as composition
| only making sense when the transposition parities don't match.
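|
| A quick numerical illustration of the parity issue (made-up
| matrices; NumPy):
|
|     import numpy as np
|
|     F = np.array([[1., 2.], [3., 4.]])
|     H = np.array([[0., 1.], [1., 0.]])
|     x = np.array([5., 6.])
|
|     # Row convention: transpose the map and apply on the right.
|     # Function application survives the change of convention...
|     assert np.allclose(F @ x, x @ F.T)
|     # ...but composition reverses, since (FH)^T = H^T F^T.
|     assert np.allclose((F @ H).T, H.T @ F.T)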
| jmrm wrote:
| I needed this a bit more than 10 years ago, but thanks anyway :-)
| Rickthedick wrote:
| Have you looked into Strassen multiplication
| (https://en.wikipedia.org/wiki/Strassen_algorithm) and
| Karatsuba's algorithm for integer multiplication
| (https://en.wikipedia.org/wiki/Karatsuba_algorithm)? Apparently
| the latter is the one Python uses. I just recently learned it
| and I find it very interesting.
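|
| A minimal sketch of Karatsuba for non-negative integers (base 10
| for readability; CPython's real implementation works on big-int
| digits, not decimal strings):
|
|     def karatsuba(x, y):
|         # Split each number in half and do three recursive
|         # multiplies instead of four.
|         if x < 10 or y < 10:
|             return x * y
|         m = max(len(str(x)), len(str(y))) // 2
|         hx, lx = divmod(x, 10**m)
|         hy, ly = divmod(y, 10**m)
|         z0 = karatsuba(lx, ly)
|         z2 = karatsuba(hx, hy)
|         z1 = karatsuba(lx + hx, ly + hy) - z2 - z0
|         return z2 * 10**(2*m) + z1 * 10**m + z0
|
|     assert karatsuba(1234, 5678) == 1234 * 5678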
| civilized wrote:
| A good example of using technology to make the simple
| complicated.
___________________________________________________________________
(page generated 2022-01-17 23:00 UTC)