[HN Gopher] What Is a Schur Decomposition? (2022)
       ___________________________________________________________________
        
       What Is a Schur Decomposition? (2022)
        
       Author : vector_spaces
       Score  : 80 points
       Date   : 2024-03-04 10:36 UTC (12 hours ago)
        
 (HTM) web link (nhigham.com)
 (TXT) w3m dump (nhigham.com)
        
       | jjgreen wrote:
        | It is a terrible shame that this excellent "What is ..." series
        | has come to an end: Higham died earlier this year, aged 62.
        
         | stabbles wrote:
         | It's chilling to see the blog archive end at "January 2024"
        
         | an_d_rew wrote:
         | Oh no! I had no idea! I met Higham at a conference many years
         | ago, and the lecture he gave was inspiring!
         | 
          | From https://ilasic.org/ilas-net-archive-2024/
          | 
          |     2024 January 22
          |     ILAS-NET Message no. 2515
          |     CONTRIBUTED ANNOUNCEMENT FROM: Francoise Tisseur and
          |     Des Higham
          |     SUBJECT: Nick Higham (1961--2024)
         | 
         | With great sadness we report that Nick Higham, Royal Society
         | Research Professor and Richardson Professor of Applied
         | Mathematics at the University of Manchester, passed away on
         | January 20, 2024, at the age of 62 after an 18 month struggle
         | with a form of blood cancer. An obituary describing Nick's
         | research and leadership contributions will appear in SIAM News
         | in due course.
         | 
         | Francoise Tisseur and Des Higham
        
         | krosaen wrote:
          | Wow, apparently he published his last post, "What Is the
          | Spectral Radius of a Matrix?", just 8 days before his
          | passing. Legend.
         | 
         | https://nhigham.com/2024/01/
        
       | stabbles wrote:
       | For large and sparse matrices you can use the (restarted) Arnoldi
       | method to compute a partial Schur decomposition AQ=QR where Q is
        | tall and skinny and R is a small upper triangular matrix with
        | a few dominant eigenvalues on the diagonal (i.e. eigenvalues
        | on the boundary of the convex hull of the spectrum).
       | 
       | MATLAB uses ARPACK's implementation of this when you call `eigs`
       | 
        | I wrote my own implementation, ArnoldiMethod.jl [1], in Julia.
        | Unlike MATLAB/ARPACK it supports arbitrary number types, should
        | be more stable in general, and is about equally fast.
       | 
       | [1] https://github.com/JuliaLinearAlgebra/ArnoldiMethod.jl
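        | 
        | As a minimal sketch (not from the comment above; it assumes
        | NumPy/SciPy and a made-up sparse test matrix), the same ARPACK
        | implementation can be reached from Python through
        | `scipy.sparse.linalg.eigs`, which, like MATLAB's `eigs`, asks
        | for a few dominant eigenpairs:
        | 
        |     import numpy as np
        |     import scipy.sparse as sp
        |     from scipy.sparse.linalg import eigs
        | 
        |     # Hypothetical sparse non-symmetric matrix, purely for
        |     # illustration.
        |     n = 2000
        |     A = sp.random(n, n, density=5e-3, random_state=0, format="csr")
        | 
        |     # ARPACK (restarted Arnoldi) returns the k eigenvalues of
        |     # largest magnitude and their eigenvectors.
        |     vals, vecs = eigs(A, k=5, which="LM")
        |     print(vals)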
        
         | sfpotter wrote:
         | Why should it be more stable?
        
           | stabbles wrote:
            | All Arnoldi methods basically repeatedly grow and shrink a
            | search space until it contains good approximations to the
            | wanted Schur vectors. Shrinking (= restarting) is necessary
            | to keep memory requirements constant, as well as for
            | performance.
           | 
            | Restarting is somewhat non-trivial because it has to
            | preserve the Arnoldi relation. ARPACK uses implicit
            | restarting, which is a very beautiful trick based on
            | "perfect shifts": a single iteration of the shifted QR
            | algorithm, with the shift taken as an exact eigenvalue,
            | reorders the basis vectors (moving the vectors you want to
            | remove to the back, so you can truncate them) while
            | preserving the Arnoldi relation. However, perfect shifts in
            | the QR algorithm are good on paper but can lead to
            | arbitrary loss of precision in finite-precision arithmetic,
            | especially with nearly repeated eigenvalues.
           | 
           | I think newer implementations (like SLEPc) are often based on
           | Stewart's Krylov-Schur paper. It's not as pretty but
           | certainly more stable.
           | 
           | In short: give ARPACK a matrix with a cluster of almost
           | repeated eigenvalues at the boundary of the convex hull, and
           | it tends to fail.
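            | 
            | A hypothetical way to set up that kind of test case from
            | Python (illustrative only, not from this comment; the
            | matrix, cluster spacing and sizes are made up, and whether
            | ARPACK actually stalls depends on them):
            | 
            |     import numpy as np
            |     import scipy.sparse as sp
            |     from scipy.sparse.linalg import eigs, ArpackNoConvergence
            | 
            |     # Diagonal matrix whose three dominant eigenvalues are
            |     # nearly identical -- the clustered case described above.
            |     d = np.linspace(0.1, 0.9, 500)
            |     d[:3] = [1.0, 1.0 + 1e-12, 1.0 + 2e-12]
            |     A = sp.diags(d)
            | 
            |     try:
            |         vals, _ = eigs(A, k=3, which="LM")
            |         print(vals)
            |     except ArpackNoConvergence as e:
            |         # ARPACK reports the eigenpairs that did converge.
            |         print("converged:", e.eigenvalues)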
        
             | sfpotter wrote:
             | Thanks, very helpful. I've used ARPACK extensively and have
             | had no complaints with it, but I haven't tried this
             | particularly pathological case.
        
         | ilayn wrote:
          | I guess it is a rite of passage to rewrite it. I'm doing it
          | for SciPy too, together with PROPACK, in [1]. Somebody
          | already mentioned your repo. Thank you for your efforts.
         | 
         | [1]: https://github.com/scipy/scipy/issues/18566
        
           | stabbles wrote:
           | That is an ambitious goal in that issue. What's the reason to
           | get rid of Fortran?
        
             | ilayn wrote:
              | Basically lack of expertise among maintainers, the F77
              | codebase, and never-ending packaging/distribution
              | problems.
        
       | platz wrote:
        | Explain matrix functions in more detail, and how computing a
        | matrix function on an upper triangular matrix is an advantage.
        | 
        | How is this a fundamental tenet of numerical linear algebra?
        | 
        | Also, if I just want the eigendecomposition, why do I need the
        | Schur decomposition?
       | 
       | The motivation needs more explanation.
        
         | epgui wrote:
         | Well, either it needs more explanation, or the audience is
         | people with certain prerequisite knowledge.
        
         | cat_man wrote:
         | Matrix functions (at least ones I learned about way back when)
         | are Taylor series representations of a function with the matrix
         | plugged in. For example:
         | 
         | exp(A) = I + A + A^2/2 + ... + A^n/n!
         | 
         | For matrices, you can show these end up as finite series (i.e.,
         | it can be written with the highest n being the dimension of the
         | matrix)
         | 
          | The A^n terms can be computed with fewer flops than for a
          | dense matrix, I believe, because only powers of the upper
          | triangular T are needed. The Q's don't involve much work
          | since they're unitary and satisfy Q*Q = I, so that for
          | example
          | 
          | A^2 = (QTQ*)^2 = (QTQ*)(QTQ*) = QT^2Q*
         | 
         | That's where the f(A)= Qf(T)Q* part comes from.
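          | 
          | A minimal numerical check of that identity (not part of the
          | original comment; it assumes NumPy/SciPy and a made-up test
          | matrix), using SciPy's Schur decomposition and matrix
          | exponential:
          | 
          |     import numpy as np
          |     from scipy.linalg import schur, expm
          | 
          |     A = np.random.default_rng(0).standard_normal((5, 5))
          | 
          |     # Complex Schur form: A = Q T Q*, with T upper triangular.
          |     T, Q = schur(A, output="complex")
          | 
          |     # exp(A) computed directly vs. Q exp(T) Q*.
          |     lhs = expm(A)
          |     rhs = Q @ expm(T) @ Q.conj().T
          |     print(np.allclose(lhs, rhs))   # True, up to rounding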
         | 
          | It sounds to me like he's quoting what someone described as a
          | "fundamental tenet of numerical linear algebra": "anything the
          | Jordan decomposition can do, the Schur decomposition can do
          | better". He doesn't seem to be defending "this is a
          | fundamental tenet", but is saying matrix functions illustrate
          | the "Schur can do better" concept. Both the Jordan and Schur
          | decompositions are similarity transformations A = XTX^-1
          | (with X unitary in the Schur case), so I think he's
          | justifying "Schur can do better" with his comments on the
          | poorer stability of the Jordan form (i.e., the matrix
          | function property f(A) = Xf(T)X^-1 holds in both cases, but
          | Schur has better numerical properties).
         | 
          | I don't think the post said you need the Schur decomposition
          | if you want an eigendecomposition (which I'm taking to mean
          | the full eigenvalue + eigenvector pairs). It just pointed out
          | that the diagonal entries of T are the eigenvalues, that the
          | first column of Q is the eigenvector for the first
          | eigenvalue, and that you can obtain the other eigenvectors
          | from the remaining information if you want them. I'm not sure
          | what the typical approach is for doing eigendecompositions;
          | it might not directly involve Schur decompositions at all.
         | 
          | So I partially agree that you could add a little motivation
          | to make it more self-contained, but I also agree it's
          | probably aimed at a less general audience, like students
          | who'd have more context for the parts where the details are
          | light.
        
           | petters wrote:
           | > For matrices, you can show these end up as finite series
           | (i.e., it can be written with the highest n being the
           | dimension of the matrix)
           | 
            | Can you clarify what you mean here? Don't we want the
            | function to, e.g., agree with the real-valued one for 1x1
            | matrices?
        
           | mturmon wrote:
           | > ...he's quoting what someone described as a "fundamental
           | tenet of numerical linear algebra" as "anything Jordan
           | decomposition can do, Schur can do better"...
           | 
           | I suspect a reason to keep this little slogan in mind is that
           | the Jordan decomposition is taught as a tool in various
           | classroom settings. I first learned it in studying linear
            | feedback systems of the form:
            | 
            |     x(t+1) = A x(t) + u(t)
           | 
           | where u(t) is a system input. These are used in control
           | systems, audio filters, etc.
           | 
            | In this case, the Jordan decomposition of the n-by-n matrix
            | A has many properties that explain intuitively what the
            | system is doing.
           | 
           | For instance, if there are two sub-blocks in the Jordan
           | decomposition, these can be interpreted as two smaller non-
            | interacting feedback systems. So in that case the original
            | n-by-n system is equivalent to two smaller systems.
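            | 
            | (As a worked illustration, with notation not in the
            | original comment: writing A = X diag(J1, J2) X^-1 and
            | changing coordinates to z(t) = X^-1 x(t), v(t) = X^-1 u(t),
            | the update splits into z1(t+1) = J1 z1(t) + v1(t) and
            | z2(t+1) = J2 z2(t) + v2(t), which evolve independently.)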
           | 
           | Anyway, because the Jordan decomposition is remembered as a
           | relevant toolbox item, sometimes people try to use it for
           | numerical problems as well -- e.g., when the "A" above is
           | estimated from data.
           | 
           | And it doesn't work well for that purpose because the
           | decomposition is very unstable and therefore the above
           | interpretations are similarly shaky.
        
         | stabbles wrote:
         | "why Schur vectors": stability + they're always computed when
         | you compute eigenvectors.
         | 
          | For non-symmetric matrices, the standard approach to
          | computing an eigendecomposition is the QR algorithm, an
          | iterative method that converges to a Schur decomposition
          | AQ = QR. So it's always cheaper to compute only the Schur
          | vectors. Eigenvectors are then obtained by computing the
          | eigendecomposition RY = YL (easy since R is upper
          | triangular): A(QY) = (QY)L is the eigendecomposition of A.
         | 
          | The QR algorithm works well because it uses stable operations
          | throughout: Householder reflections and Givens rotations.
          | Those preserve norms, so they don't amplify rounding errors.
         | 
          | The Schur decomposition itself is useful for the same reason:
          | the Schur vectors form an orthonormal basis for the space
          | spanned by the eigenvectors. If you need to project something
          | onto that space, it's more stable to work with Schur vectors
          | than with the eigenvectors themselves.
         | 
         | The last step RY = YL is done implicitly, with a backwards
         | solve involving R. If that matrix is badly conditioned, the
         | eigenvectors have large errors.
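          | 
          | A minimal sketch of that two-step path (not from the comment
          | above; it assumes NumPy/SciPy and a made-up test matrix, and
          | uses SciPy's general `eig` on the triangular factor for
          | simplicity, where LAPACK has cheaper specialized routines):
          | 
          |     import numpy as np
          |     from scipy.linalg import schur, eig
          | 
          |     A = np.random.default_rng(0).standard_normal((6, 6))
          | 
          |     # Schur decomposition: A Q = Q R, with R upper triangular.
          |     R, Q = schur(A, output="complex")
          | 
          |     # Eigendecomposition of the triangular factor: R Y = Y diag(L).
          |     L, Y = eig(R)
          | 
          |     # Map back: A (QY) = (QY) diag(L).
          |     V = Q @ Y
          |     print(np.allclose(A @ V, V * L))   # True, up to rounding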
        
       | gmfawcett wrote:
       | Remember: if you're not certain that your matrix meets the
       | prerequisites for applying the Schur decomposition, you can
       | always apply the Unschur decomposition instead.
        
         | esafak wrote:
         | Found the dad :)
        
         | jjgreen wrote:
         | Bravo
        
       | lxe wrote:
       | This needs a "visual explanation" treatment. The concepts are
       | graspable by the average HN reader, but this presentation is just
       | difficult to approach.
       | 
       | I tried using ChatGPT to decompose a lot of the terms, which
       | actually helps, but it's hard to vouch for its accuracy:
       | https://chat.openai.com/share/709f61b3-b4cb-48df-b97f-fa6877...
        
       ___________________________________________________________________
       (page generated 2024-03-04 23:01 UTC)