[HN Gopher] Linear Algebra Done Wrong (2004)
       ___________________________________________________________________
        
       Linear Algebra Done Wrong (2004)
        
       Author : sebg
       Score  : 72 points
       Date   : 2024-08-03 09:34 UTC (5 days ago)
        
 (HTM) web link (www.math.brown.edu)
 (TXT) w3m dump (www.math.brown.edu)
        
       | dang wrote:
       | Related:
       | 
       |  _Linear Algebra Done Wrong [pdf]_ -
       | https://news.ycombinator.com/item?id=999494 - Dec 2009 (12
       | comments)
        
       | optbuild wrote:
        | After taking my first Linear Algebra course, I came across the
        | online course Linear Dynamical Systems by Prof. Stephen Boyd of
        | Convex Optimisation fame.
       | 
       | Every lecture was so eye opening. I couldn't believe that linear
       | algebra could be taught in such a context with such a variety of
       | application domains.
       | 
        | The lectures are all available online with the assignments:
       | https://ee263.stanford.edu/archive/
        
         | nextos wrote:
         | Boyd has published a related book which is a bit more
         | elementary but still great:
         | https://web.stanford.edu/~boyd/vmls/
        
         | abhgh wrote:
         | I hadn't known about this, so thank you! Steven Brunton also
         | has a book that covers dynamical systems (among other things)
         | which is linear algebra adjacent, and the expositions are great
         | [1].
         | 
         | [1] https://www.amazon.com/Data-Driven-Science-Engineering-
         | Learn...
        
       | behnamoh wrote:
       | We seriously need better terminology, notation, and pedagogy when
        | it comes to linear algebra. In 2024, such old-style textbooks
       | just don't cut it anymore.
        
         | AnimalMuppet wrote:
         | What, specifically, do you think should be done better? What
         | style of pedagogy do you think would be more appropriate? What
         | notation? What terminology?
         | 
         | Apart from "argument by year number", what's actually wrong
         | with the book?
        
         | Jtsummers wrote:
         | 1. What's wrong with this specific book other than that it
         | hasn't been updated in 7 years?
         | 
         | 2. What are examples of better terminology, notation, or
          | pedagogy that are improvements over this book? (The improvement
          | doesn't need to be total; it could retain the same terminology
          | and notation but just have better pedagogy, for example.)
        
           | dunefox wrote:
           | I think interactive linear algebra could be superior to a
           | pure textbook. So, a textbook with interactive visualisations
           | that show a concept and allow manipulation. Something like
           | this maybe: https://textbooks.math.gatech.edu/ila/
        
             | Jtsummers wrote:
             | That does offer slightly better presentation for students
             | with its interactive elements, but otherwise looks pretty
             | standard for an undergrad linear algebra textbook in
             | overall presentation. It has many sections that could use
              | an interactive portion but have none at all.
             | 
             | However, I'll give you that interactive texts (or texts
             | with supplemental interactive material) are often better
             | for some things (and some parts of linear algebra) than a
             | plain dead tree format.
        
               | dunefox wrote:
               | Yeah, I didn't mean this book specifically, rather the
               | presentation as an example.
        
         | andrewla wrote:
         | As far as I know, this notation and terminology is still very
         | standard.
         | 
          | In some ways I like the idea of replacing traditional indexing
          | with something more like Einstein summation notation, moving
          | away from the arbitrary feeling you get around how matrix
          | multiplication is structured, and providing an immediate route
          | into tensor theory.
         | 
         | But so many things that are very important in linear algebra,
         | like matrix inverses, are awkward to express in this notation.
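          | 
          | For a concrete taste of that, here is a minimal sketch
          | (assuming numpy, which the thread doesn't mention) writing
          | matrix multiplication as an explicit sum over a repeated
          | index, einsum-style:
          | 
          |     import numpy as np
          | 
          |     A = np.random.rand(3, 4)
          |     B = np.random.rand(4, 2)
          | 
          |     # C_ik = sum_j A_ij * B_jk, with the repeated index j
          |     # written out explicitly, Einstein-summation style
          |     C = np.einsum('ij,jk->ik', A, B)
          | 
          |     assert np.allclose(C, A @ B)  # same as the matrix product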
        
       | paulpauper wrote:
        | Why does linear algebra elicit so much confusion despite being an
        | elementary topic? I think part of the problem is that the
        | connection between matrices, matrix operations, and their
        | applications is not explained well enough. Linear algebra is
        | typically one's first foray into abstract mathematics, and, if
        | taught poorly, it sets one up for failure down the road.
        
         | rguzman wrote:
          | one issue that comes to mind is the nature of the determinant.
          | when one considers the determinant via the recursive
          | definition, it seems like a highly contrived object that is
          | difficult to work with (as it indeed is, from that
          | definition!). avoiding that confusion requires that a lot more
          | scaffolding be built (a la Axler in the "Done Right" book).
          | either way you have some work: either untangle the meaning of
          | the weird determinant, or get to the place where you can
          | understand the determinant as the product of the eigenvalues.
        
           | cooljoseph wrote:
           | Do you know why introductory textbooks don't define the
           | determinant in terms of the exterior product? This is how
           | some "real" mathematicians I've talked to define it. It also
           | is more intuitive (in my opinion) to define determinants as
           | "signed volumes" than some sum of products multiplied by
           | signs of cycles.
           | 
           | The product of eigenvalues definition is also somewhat
           | intuitive to me ("How much does the matrix scale vectors in
           | each direction? Now multiply those numbers together."), but
            | it's harder to motivate the fact that adding a multiple of
            | one row to another doesn't change the determinant, which is
            | kind of important for actually computing the determinant.
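            | 
            | A rough numerical illustration of the two views (a sketch,
            | assuming numpy): the determinant is the signed volume
            | scaling factor, and it also equals the product of the
            | eigenvalues.
            | 
            |     import numpy as np
            | 
            |     rng = np.random.default_rng(0)
            |     A = rng.random((3, 3))
            | 
            |     det = np.linalg.det(A)       # signed volume scaling
            |     eigs = np.linalg.eigvals(A)  # may be complex
            | 
            |     # product of eigenvalues agrees with the determinant
            |     assert np.isclose(det, np.prod(eigs))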
        
         | QuercusMax wrote:
         | I took LinAlg the same semester I took Computer Graphics, back
         | in 2001. The first half of LinAlg was all about solving
         | equations, transformation matrices and that sort of stuff, and
          | the first half of CG was all 2D drawing (Bresenham and such).
         | Second half of CG was all about OpenGL, and I was able to apply
         | all the stuff I'd just learned in LinAlg.
         | 
         | The second half of LinAlg was a bunch of stuff about Eigenthis
         | and Eigenthat, with no particular explanation as to what you'd
          | use it for or why you would care. I passed the class with an A
          | and was even recommended by the instructor to work as a tutor
          | for other students, but 23 years later I couldn't tell
         | you the use of any of that stuff.
         | 
         | T-matrices, though - I've used that stuff my whole career,
         | working in medical imaging and related fields. I still couldn't
         | tell you what an Eigenvalue or Eigenvector is useful for, as I
         | never had any reason to apply that stuff.
         | 
         | It's not particularly abstract if you can explain _why_ and
          | _how_ this stuff is used - and I _know_ it's heavily used in
         | many fields - just not those that I've worked in.
        
           | will-burner wrote:
           | Yeah, eigenvectors and eigenvalues are abstract and best
           | understood in terms of the linear mapping between vector
           | spaces and change of basis, but change of basis for a vector
           | space and writing a matrix in a new basis gets really
           | complicated. Many people get lost there.
        
           | FabHK wrote:
           | Eigenvectors characterise a linear transformation by
           | answering a simple question: What lines map to themselves
           | after the transformation?
           | 
           | Say for example you rotate something in 3D. The rotation axis
           | remains unchanged. That's an eigenvector. Or say you mirror
           | something in 3D, then all the lines lying in the mirror plane
           | remain unchanged (all eigenvectors with eigenvalue 1), and
           | the line orthogonal to the mirror plane remains unchanged -
           | or rather, flipped, so it's an eigenvector with eigenvalue
           | -1.
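            | 
            | The rotation example is easy to check numerically (a
            | sketch, assuming numpy): a rotation about the z-axis has
            | the z-axis as an eigenvector with eigenvalue 1.
            | 
            |     import numpy as np
            | 
            |     t = 0.7
            |     Rz = np.array([[np.cos(t), -np.sin(t), 0],
            |                    [np.sin(t),  np.cos(t), 0],
            |                    [0,          0,         1]])
            | 
            |     vals, vecs = np.linalg.eig(Rz)
            |     i = np.argmin(np.abs(vals - 1))  # eigenvalue nearest 1
            |     axis = np.real(vecs[:, i])
            |     print(axis)  # the z-axis, up to sign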
        
         | will-burner wrote:
         | It's hard to make the connection between matrices and a linear
         | mapping between vector spaces. A lot of the time students in a
          | linear algebra class do not have a great grasp of what a
         | function is, so when talking about vector spaces and linear
         | maps between them they're lost. Then connecting that to a
         | matrix is hopeless.
        
         | hintymad wrote:
         | > Why does linear algebra elicit so much confusion despite
         | being an elementary topic
         | 
          | I'm not sure that linear algebra is an elementary topic, if
          | "elementary" means elementary-school math. That said, I don't
          | think there's much confusion about linear algebra either. It's
          | more likely that linear algebra is substantially more abstract
          | than high-school algebra, so many students have a hard time
          | deeply understanding it, just as many students already bail
          | out on high-school math. When I was a TA in college, I also
          | observed that many students were not prepared for the fact
          | that college-level math is fundamentally different from
          | high-school math. College-level maths leaves far less time for
          | students to grasp the concepts through sheer brute-force
          | practice. College-level maths requires students to focus on
          | intuitive understanding of the key concepts so the students
          | won't get bogged down by hundreds and hundreds of concepts,
          | corollaries, and theorems. Of course, high-school maths
          | requires intuitive understanding too, but high-school maths is
          | so simple that many students get the intuition naturally, so
          | they are not aware of how important such intuitive
          | understanding is.
         | 
         | This book used to help my students build intuitions:
         | https://www.amazon.com/Algebra-Through-Geometry-
         | Undergraduat.... It starts with 2D geometry to teach linear
         | algebra, and then moves to 3D, and then 4D. The author also
         | chooses to use calculations and constructions to prove most of
          | the theorems instead of a more abstract or algebraic approach.
        
           | jwells89 wrote:
            | Thinking back to high school and college, the biggest issue
            | I had with math in both is how frequently it's taught in the
            | absence of practical examples. Speaking personally (though I
           | believe there are many who feel similarly), there's a need to
           | see real world examples to develop a true grasp on new
           | concepts in any reasonable amount of time, for reasons
           | related to both motivation and intuition.
        
             | andrewla wrote:
             | I agree with this in theory, but I don't necessarily need
             | the "real world"-ness.
             | 
              | A lot of the stuff you learn initially, about systems of
              | equations and how they relate to the matrix inverse, is
              | very interesting because it's clear how it can be applied.
             | 
             | But as you move forward into vector bases and change of
             | coordinates there's a very long dry spell that you sort of
             | have to slog through until much later when you start to see
             | how it is actually useful. I'm not sure how to fix this --
             | maybe take a step back from the symbolic and do some
             | numeric computations because that's where they start
             | becoming useful again.
        
               | nwallin wrote:
               | Some of us _do_ need the real world-ness though.
               | 
               | IMHO there should be two versions of linear algebra. One
               | for computer science majors and one for mathematicians. I
               | regularly run into stuff at work where I say to myself,
               | "self, this is a linear algebra problem" and I have next
                | to no idea how to transform the problem I have into
                | matrices or whatever.
               | 
               | But I can write a really stinkin' fast matrix
               | multiplication algorithm. So there's that I guess.
               | 
               | Modern CPUs with big ass SIMD registers are incredibly
               | fast at slogging through a linear algebra problem. Lots
               | of incredibly intelligent people (ie, not me) spend an
               | incredible amount of effort optimizing every last FLOP
               | out of their BLAS of choice. For several years the only
               | question Intel asked itself when designing the next CPU
               | was, "How much faster can we make the SPECfp benchmark?"
               | and it shows. Any time you can convert a problem using
               | whatever ad-hoc algorithm you came up with into a linear
               | algebra problem, you can get absurd speedups. But most
               | programmers don't know how to do that, because most of
               | their linear algebra class was spent proving that the
               | only invertible idempotent nxn matrix is the identity
               | matrix or whatever.
               | 
               | Discrete math has the same problem. When I took discrete
               | math in college, the blurb in the course catalog promised
               | applications to computer science. It turns out the course
               | was just mucking through literally dozens of trivial
               | proofs of trivial statements in basic number and set
               | theory, and then they taught us how to add two binary
               | numbers together. The chapters on graphs, trees,
               | recursion, big-O notation and algorithm analysis, finite
               | automata? Skipped 'em.
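                | 
                | A tiny, hedged example of that conversion (assuming
                | numpy): computing every dot product between two sets of
                | vectors with loops, versus handing the same work to
                | BLAS as a single matrix product.
                | 
                |     import numpy as np
                | 
                |     rng = np.random.default_rng(1)
                |     X = rng.random((1000, 64))  # 1000 stored vectors
                |     Y = rng.random((500, 64))   # 500 query vectors
                | 
                |     # ad-hoc version: one dot product at a time
                |     slow = np.empty((1000, 500))
                |     for i in range(1000):
                |         for j in range(500):
                |             slow[i, j] = X[i] @ Y[j]
                | 
                |     # the same computation as one matrix product
                |     fast = X @ Y.T
                | 
                |     assert np.allclose(slow, fast)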
        
             | hintymad wrote:
             | > how it's frequently taught in absence of practical
             | examples
             | 
             | Unfortunately this has been the debate for thousands of
             | years. One of Euclid's students asked him what practical
             | use geometry had. In response, Euclid instructed a servant
             | to give the student a coin, saying, "He must make gain out
             | of what he learns."
             | 
              | Legend aside, I'm actually surprised to see, many times,
              | that people on HN criticize the "absence of practical
              | examples". If we compare the textbooks written in
             | the US and those in China or Europe, there's a sharp
             | contrast. The textbooks from the US are thick, full of
             | discussion of motivations and practical examples across
             | multiple domains (Thomas' Calculus, for instance). In
             | contrast, Chinese and European textbooks are terse and much
             | thinner. They focus on proofs and derivations and have only
             | sparse examples.
             | 
              | Personally, I find maths itself practical enough. I'd even
              | venture to say that those who can naturally progress to
              | college-level maths should be able to appreciate
              | high-school maths for its own sake.
        
             | AnotherGoodName wrote:
              | Yes, I'm currently dealing with a text that has the line
              | "you will end up with ~2^32 equations and from there it's
              | just a trivial linear algebra problem" without further
              | guidance (this is from the general number field sieve).
              | 
              | I get that 2^32 simultaneous equations may be a
              | straightforward linear algebra problem, but I am now going
              | deep to understand the exact mechanism for solving it.
        
         | mamonster wrote:
          | Because it's taught in the 1st year of most math courses and
          | most people don't take it seriously and think that it's "easy
          | 1st year math" when it's actually the basis for a whole lot of
          | maths down the line.
          | 
          | I've actually stopped asking questions about Cayley-Hamilton
          | and Jordan form (both covered in year 1 of any decent BS math
          | course in Europe) when I used to do interviews for
          | trading/quant positions, because so many people failed.
        
         | xanderlewis wrote:
         | You can probably reduce it down to a single definition. As soon
         | as matrix multiplication is introduced without careful
         | motivation, almost everyone is lost. And those who aren't
         | should be.
         | 
         | It's mostly because, as Axler explicitly tries to address,
         | linear algebra is a subject that can be viewed through at least
         | two different lenses: the intuitive, (possibly higher-
         | dimensional) geometric lens, or the algorithmic and numerical
         | lens. Of course they are equivalent, but almost every course in
         | linear algebra teaches the second and barely even touches on
         | the first. It's like learning Euclid's algorithm before you've
         | learnt to count. No wonder everyone's so confused.
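          | 
          | One way to supply that motivation (a sketch, assuming numpy):
          | matrix multiplication is defined exactly so that the product
          | matrix represents the composition of the underlying linear
          | maps.
          | 
          |     import numpy as np
          | 
          |     rng = np.random.default_rng(2)
          |     A = rng.random((3, 4))  # a map from R^4 to R^3
          |     B = rng.random((4, 5))  # a map from R^5 to R^4
          |     x = rng.random(5)
          | 
          |     # applying B, then A, is the same as applying A @ B
          |     assert np.allclose(A @ (B @ x), (A @ B) @ x)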
        
       | generationP wrote:
       | FYI, there is a newer version at
       | https://sites.google.com/a/brown.edu/sergei-treil-homepage/l... .
       | Google seems to never have learnt about this site, which is weird
       | as it is on a Google server.
        
         | godelski wrote:
          | Something seems to be seriously wrong with Google. By
          | "something" I mean "a lot".
          | 
          | I know there are Google engineers here, possibly in search. Do
          | you guys not also feel it? I mean, you dogfood, right? I know
          | it's a hard problem, but boy has it become harder to use,
          | especially in the last 6 months.
        
           | akira2501 wrote:
           | Like.. the very fact I typed "javascript" or "css" or "html"
           | means you probably don't need to apply your "did you mean"
           | logic to my searches. If I'm searching for something highly
           | technical, I absolutely DO NOT NEED your help, Google.
        
             | p1esk wrote:
             | You're not the type of users they care about.
        
               | akira2501 wrote:
               | Well, I write software for the web, and at this point I
               | consider Google to be an entirely negative company.
               | 
               | So now, to the maximum extent possible, I work to
               | undermine their interests and to not participate with
               | their products or addons or browsers.
               | 
               | Great strategy on their part.
        
               | godelski wrote:
                | You've oversimplified and created a major error. It's a
               | common one though, so allow me to explain.
               | 
               | There is no single entity Google cares about. There is no
               | "average user" (nor median). Trying to make the product
               | best for an "average" user makes it bad for everyone. You
               | have made the assumption that this solution space is
               | smooth and relatively uniformly distributed.
               | 
                | Consider this: on a uniform distribution, if you
                | interpolate between two points, those interpolations are
                | representative of the distribution. But if you do so on
                | a Gaussian distribution, this is not true. This is
                | because the density is not evenly spread (that is what
                | "uniform" in "uniform distribution" means). For more
                | complicated distributions you need different
                | interpolations. Many real-world distributions are an
                | agglomeration of varying distributions (you probably
                | have clusters and pockets).
               | 
               | The distribution of searches (or customers) is neither
               | normal nor uniform.
               | 
                | To understand some further complexity, it is important
                | to remember that the burden of success is the
                | requirement for further nuance. Pareto. It may take few
                | resources to get to "80%"/"good enough" but significant
                | resources to increase that another 10%. Think about a
                | Taylor series (or a Fourier approximation, or your
                | preferred example): the first-order approximation will
                | only take you so far. It may be good enough to start,
                | but higher-order (and thus more complex) approximations
                | become increasingly necessary for precision
                | (non-linearly). So simplify where you can, but don't
                | forget what you've done. If you do, you'll be stuck. Or
                | worse, you will compound your error (also very common).
        
             | Suppafly wrote:
             | >probably don't need to apply your "did you mean" logic to
             | my searches
             | 
             | I wish it had a yes/no prompt that you could click and when
             | you click no it would refresh with the results you wanted,
             | and possibly use those answers to update whatever algorithm
             | does the suggestions.
        
             | godelski wrote:
             | I wish I could get it to respect advanced filters. I can
             | understand how a problem I type in might be very similar to
             | a simple one, but there needs to be a way to signal
             | differentiation. I know they track me, and you're telling
              | me that clicking on the first 5 links only to revisit the
             | search in under a minute and try a new link is not a signal
             | that I'm not getting what I want? Is visiting page 2 not
             | further signaling?
             | 
             | I don't need a different variation of the same result, I
              | need a different result. While the former can certainly be
             | useful, it is absurd to think this is all there is.
             | 
             | If you ever search anything with any bit of nuance, you'll
             | have experienced this. I'll give an example: try searching
             | "out of distribution" and try to find a result that is not
             | about ML based OOD detection. They do exist and you'll find
             | some better queries, but that result still dominates (and
             | frankly, they are poor results).
             | 
              | Our example query `statistics out of distribution
              | -"machine learning"` has a top result of mathworks on
              | neural networks and all results on the front page work.
              | "statistics" is ignored before variations of "machine
              | learning." The top few results for `statistics out of
              | distribution -"machine learning" -"deep learning"
              | -"artificial intelligence" -"neural network"` return to me
              | an ICLR paper whose title ends in "Deep Neural Networks",
              | and many results are still ML based...
             | 
             | How do I tell Google (or any engine) that I want to exclude
             | results!?
             | 
             | And before someone says GPT/Claude or some other LLM, the
             | problem persists. Yes, I've paid. Yes, for the most
             | advanced models. Derailing these tools from the wrong track
             | is difficult. Even harder the longer the conversation goes
             | (i.e. if you don't recognize you're headed in the wrong
             | direction). The nuance matters.
        
         | junar wrote:
         | "Linear Algebra Done Wrong site:sites.google.com" finds the
         | page you linked.
         | 
         | https://www.math.brown.edu/streil/index.html redirects to
         | https://sites.google.com/a/brown.edu/sergei-treil-homepage/h...
         | 
         | But OP's link,
         | https://www.math.brown.edu/streil/papers/LADW/LADW.html does
         | not redirect. So it seems like a problem that the site owner
         | can fix.
        
       | aquafox wrote:
       | For a really intuitive introduction to linear algebra, I highly
       | recommend Gilbert Strang's "Linear Algebra and Learning from
       | Data": https://math.mit.edu/~gs/learningfromdata/
        
       | munchler wrote:
       | > Why should anyone read this book if it presents the subject in
       | a wrong way? What is particularly done "wrong" in the book?
       | 
       | Does he ever answer this question? It's posed at the start of the
       | book, but then immediately ignored.
        
         | Syzygies wrote:
         | The title is a jab at and a response to "Linear Algebra Done
         | Right" (https://linear.axler.net) by Sheldon Axler.
         | 
         | Sheldon's book is more suitable for a second course in linear
          | algebra. It's an excellent book that my more advanced students
         | have enjoyed.
         | 
         | Sheldon, however, approaches the subject with some rigid dogma:
         | Determinants are evil. His view is they're not practical to
         | compute at scale. It can be good to learn theory with one's
         | determinant hand tied behind one's back, just as it can be good
         | to learn "constructive" mathematics, but this should not be
         | one's only approach.
         | 
         | I used to attend an academic sponsor's day at the math
          | institute MSRI (now SLMath). The afternoons had activities to
          | keep us busy. One year Sheldon's wife demonstrated some
          | academic teaching software, and my group was tasked to solve
          | a linear algebra problem that Sheldon had provided. Unaware
          | that everyone was watching us on the big screen, I solved it
         | instantly, using determinants. I still don't understand
         | Sheldon's longer solution.
        
           | munchler wrote:
           | This is very informative. Thank you for the explanation.
        
           | btilly wrote:
           | His view is not just that they are not practical to compute
           | at scale. (They actually are practical to compute. Just not
           | with the formula. Use row reduction instead.)
           | 
           | His view is that they are presented as an arbitrary piece of
           | magic that people really don't understand. And, exactly
           | because they don't really understand the determinant, they
           | wind up not properly understanding fundamental concepts in
           | linear algebra. Such as Jordan normal form.
           | 
           | If your goal is actually to properly understand linear
           | algebra, he's absolutely right. If your goal is to use linear
           | algebra for a wide variety of practical purposes, though,
           | determinants are often essential.
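            | 
            | A minimal sketch of the row-reduction route (assuming
            | numpy; my own toy code, not anything from the books):
            | eliminate below each pivot, track row swaps, and read the
            | determinant off the pivots in O(n^3) operations.
            | 
            |     import numpy as np
            | 
            |     def det_by_row_reduction(M):
            |         A = np.array(M, dtype=float)
            |         n = len(A)
            |         sign = 1.0
            |         for k in range(n):
            |             # partial pivoting in column k
            |             p = k + np.argmax(np.abs(A[k:, k]))
            |             if A[p, k] == 0:
            |                 return 0.0  # singular matrix
            |             if p != k:
            |                 A[[k, p]] = A[[p, k]]  # swap flips the sign
            |                 sign = -sign
            |             factor = A[k+1:, k] / A[k, k]
            |             A[k+1:, k:] -= np.outer(factor, A[k, k:])
            |         # determinant = sign * product of the pivots
            |         return sign * np.prod(np.diag(A))
            | 
            |     M = np.random.default_rng(3).random((5, 5))
            |     assert np.isclose(det_by_row_reduction(M),
            |                       np.linalg.det(M))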
        
             | xanderlewis wrote:
             | Since a significant amount of my knowledge/intuition about
             | linear algebra comes from his book, I now feel not only
             | that I don't like determinants but that I'm allergic even
             | to matrices. Everything seems so much nicer and friendlier
             | if you think of linear maps between finite-dim vector
             | spaces instead.
             | 
             | When I hear, for example, 'upper triangular matrix' I have
             | to translate it in my head into something like 'matrix
             | representing a linear operator that preserves the standard
             | flag' in order to actually feel like I understand it.
             | 
             | Of course, I'm not a programmer or an engineer of any kind,
             | so I have the luxury of not needing the computational
             | efficiency.
        
               | staunton wrote:
               | What's a "standard flag"? Couldn't find anything about
               | it...
               | 
               | I would think "upper triangular" would be a weird/bad
               | notion to talk about in a basis-independent setting
               | (because it depends on the basis).
        
               | xanderlewis wrote:
               | A 'flag' is just a funny name for an increasing (and
               | therefore increasing in dimension) sequence of subspaces.
               | The 'standard flag' is the sequence spanned by the
               | standard basis vectors: start with {0}, then things of
               | the form (x,0,...), then (x,y,0,...), and so on. *
               | 
               | What I'm saying is that an upper triangular matrix
               | preserves each of these subspaces because it sends each
                | basis vector e_i into the span of e_0, ..., e_i.
               | 
               | You're absolutely right that upper triangularity is
               | basis-dependent and so is somewhat 'weird'/'evil' (in
               | fact, not even well-defined) as a purported property of
               | maps rather than matrices. What I meant to say was
               | 'triangularisable' by analogy with 'diagonalisable' --
               | matrices which represent such a map in _some_ basis.
                | Given a linear operator, its matrix representation is
                | triangular when expressed in a basis B if and only if
                | the operator preserves the flag spanned by the vectors
                | of B.
               | 
               | * https://en.wikipedia.org/wiki/Flag_(linear_algebra)
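                | 
                | A quick numerical check of the flag picture (a sketch,
                | assuming numpy): an upper triangular matrix keeps each
                | subspace of the standard flag inside itself.
                | 
                |     import numpy as np
                | 
                |     rng = np.random.default_rng(4)
                |     U = np.triu(rng.random((5, 5)))  # upper triangular
                | 
                |     k = 3
                |     v = np.zeros(5)
                |     v[:k] = rng.random(k)  # zeros after position k
                | 
                |     w = U @ v
                |     assert np.allclose(w[k:], 0)  # stays in the flag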
        
               | Tainnor wrote:
               | > I now feel not only that I don't like determinants but
               | that I'm allergic even to matrices
               | 
                | Determinants are totally basis-invariant; in particular,
                | similar matrices have the same determinant.
        
               | xanderlewis wrote:
               | Yeah, and I think that's enough motivation for
               | determinants being important beyond as a computational
               | trick. I was being somewhat hyperbolic.
               | 
               | Seeing the determinant as the unique solution to the
               | problem of assigning a scalar to each linear map/matrix
               | in such a way that several natural axioms are satisfied
               | is a very nice way to motivate it. Along with the fact
               | that it, along with the trace, can be calculated from the
                | eigenvalues (from which the basis-invariance is clear).
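                | 
                | A small sanity check of that (a sketch, assuming
                | numpy): a similarity transform leaves the determinant
                | and trace unchanged, and both are recovered from the
                | eigenvalues.
                | 
                |     import numpy as np
                | 
                |     rng = np.random.default_rng(5)
                |     A = rng.random((4, 4))
                |     P = rng.random((4, 4))        # change of basis
                |     B = np.linalg.inv(P) @ A @ P  # same map, new basis
                | 
                |     eigs = np.linalg.eigvals(A)
                |     dA, dB = np.linalg.det(A), np.linalg.det(B)
                |     assert np.isclose(dA, dB)
                |     assert np.isclose(np.trace(A), np.trace(B))
                |     assert np.isclose(dA, np.prod(eigs))
                |     assert np.isclose(np.trace(A), np.sum(eigs))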
        
             | Tainnor wrote:
             | > Use row reduction instead
             | 
             | Or, if you're using exact arithmetic (e.g. cryptography),
             | you should use something like the Bareiss algorithm -
             | Gaussian elimination is only polynomial if you use floats:
             | https://en.m.wikipedia.org/wiki/Bareiss_algorithm
             | 
             | The Leibniz formula is totally impractical for calculating
             | determinants, but it is sometimes useful in proofs due to
             | its symmetric form.
             | 
             | I guess if you just throw a definition of the determinant
             | at people and leave it at that, I can understand that it
             | doesn't make sense. But that's not how I was introduced to
             | the concept (I learned about abstract vector spaces and
             | transformations and how they are represented by matrices -
             | and only then about determinants and their properties and
             | different ways of computing them). That's why Axler's
             | critique rings somewhat hollow to me - maybe it's different
             | if you learn about linear algebra mostly in an engineering
             | context as opposed to pure maths.
             | 
             | For theoretical purposes, determinants are important
             | because they constitute a homomorphism between the general
             | linear group and the multiplicative group of the underlying
             | field, and the kernel of this homomorphism is the special
             | linear group. This leads to some very short and elegant
             | proofs.
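              | 
              | For reference, a minimal sketch of Bareiss-style
              | fraction-free elimination in plain Python integers (my
              | own toy version, not a vetted implementation): every
              | division is exact, so no fractions or floats appear.
              | 
              |     def bareiss_det(M):
              |         A = [list(row) for row in M]
              |         n = len(A)
              |         sign, prev = 1, 1
              |         for k in range(n - 1):
              |             if A[k][k] == 0:  # need a nonzero pivot
              |                 for i in range(k + 1, n):
              |                     if A[i][k] != 0:
              |                         A[k], A[i] = A[i], A[k]
              |                         sign = -sign
              |                         break
              |                 else:
              |                     return 0  # no pivot: det is zero
              |             for i in range(k + 1, n):
              |                 for j in range(k + 1, n):
              |                     A[i][j] = (A[i][j] * A[k][k] -
              |                                A[i][k] * A[k][j]) // prev
              |             prev = A[k][k]  # division above is exact
              |         return sign * A[-1][-1]
              | 
              |     print(bareiss_det([[2, 3, 1],
              |                        [4, 1, -3],
              |                        [5, 2, 0]]))  # prints -30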
        
             | woopsn wrote:
             | Was V.I. Arnold wrong when he said everything a student
             | needs to know about determinants is derived from basic
              | geometric considerations? It seems true to me that,
              | outside of advanced study of linear or abstract algebra,
              | this is the one perspective which can at least dispel the
              | magic.
        
       | beryilma wrote:
       | ... and another beauty of a math book written in LaTeX is that
       | you can fit a 286-page book into just 1.3 MB. Good luck doing
       | that in MS Word.
        
       ___________________________________________________________________
       (page generated 2024-08-08 23:01 UTC)