[HN Gopher] Principal Component Analysis Explained Visually
___________________________________________________________________
Principal Component Analysis Explained Visually
Author : spking
Score : 93 points
Date : 2022-10-29 18:02 UTC (4 hours ago)
(HTM) web link (setosa.io)
(TXT) w3m dump (setosa.io)
| aquafox wrote:
| Here is a much better explanation of PCA:
| https://stats.stackexchange.com/questions/2691/making-sense-...
|
| The key insight that many are missing is that PCA solves a series
| of optimization problems: reconstructing the data from the first
| k PCs gives the best k-dimensional linear approximation in terms
| of squared error. Moreover, this is equivalent to assuming that
| the data lie in a k-dimensional subspace and only become truly
| high-dimensional because of normally distributed noise that
| spills into every direction (dimension).
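The optimality claim above can be checked numerically. What follows is a minimal sketch, not anything taken from the linked answer: the synthetic data, the helper name `reconstruct`, and the choice of k = 2 are all assumptions made for illustration. It reconstructs the data from the first k principal directions and compares the squared error with that of random k-dimensional subspaces.

```python
import numpy as np

def reconstruct(X, W):
    """Project centered X onto the orthonormal columns of W, then map back."""
    mean = X.mean(axis=0)
    Xc = X - mean
    return Xc @ W @ W.T + mean

rng = np.random.default_rng(0)

# Hypothetical data: a 2-D signal embedded in 5-D, plus isotropic Gaussian noise.
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 5))
X = latent @ basis + 0.1 * rng.normal(size=(200, 5))

k = 2
# Rows of Vt are the principal directions, sorted by decreasing singular value.
_, s, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
pca_err = np.sum((X - reconstruct(X, Vt[:k].T)) ** 2)

# Compare against random orthonormal k-dimensional subspaces.
random_errs = []
for _trial in range(100):
    Q, _r = np.linalg.qr(rng.normal(size=(5, k)))
    random_errs.append(np.sum((X - reconstruct(X, Q)) ** 2))

print("PCA error:", pca_err)
print("best random subspace error:", min(random_errs))
print("sum of discarded squared singular values:", np.sum(s[k:] ** 2))
```

The PCA error should match the sum of the discarded squared singular values, and no random subspace should beat it.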
| larrydag wrote:
| I really like the way Harrell uses PCA to build regression
| analysis in Regression Modeling Strategies
|
| https://link.springer.com/book/10.1007/978-3-319-19425-7
| swyx wrote:
| Principal components are a wonderful concept, together with the
| sister concepts of eigenvalues/eigenvectors and orthogonality. I
| wish I could force everyone I talk to to internalize these ideas
| so that I could have more useful discussions with them.
|
| That said, yeah, not everything is linearly separable.
| blt wrote:
| In the UK eating example, it would be better to examine the
| feature-space singular vector associated with the first singular
| value instead of instructing the reader to "go back and look at
| the data in the table". PCA has already done that work, no
| additional (error-prone, subjective) interpretation needed.
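As a rough illustration of the suggestion above, here is a sketch of reading the first feature-space (right) singular vector directly. The table is made up and is not the article's UK data, and the feature names are placeholders. The entries of that vector are the loadings of the first principal component, so ranking them by magnitude shows which features drive it.

```python
import numpy as np

# Hypothetical table: rows are samples, columns are features.
features = ["feature_a", "feature_b", "feature_c", "feature_d"]
X = np.array([
    [10.0, 200.0, 5.0, 1.0],
    [11.0, 210.0, 5.5, 1.1],
    [ 9.0, 190.0, 4.5, 0.9],
    [10.5, 400.0, 5.2, 1.0],
])

# Center the columns, then take the SVD; rows of Vt live in feature space.
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
first_pc = Vt[0]  # right singular vector for the largest singular value

# Rank features by the magnitude of their loading on the first component.
ranked = sorted(zip(features, first_pc), key=lambda p: abs(p[1]), reverse=True)
for name, loading in ranked:
    print(f"{name}: {loading:+.3f}")
```

The sign of a singular vector is arbitrary, so only the relative signs and magnitudes of the loadings carry meaning.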
| lxe wrote:
| Also see
|
| - Markov Chains (https://setosa.io/ev/markov-chains/)
|
| - Image Kernels (https://setosa.io/ev/image-kernels/)
|
| - Bus Bunching (https://setosa.io/bus/)
|
| Wish these guys kept producing more visualizations!
| wjnc wrote:
| Best thing I've ever read on PCA is Madeleine Udell's PhD thesis
| [1]. It extends PCA in many directions and shows that well-known
| techniques fit into the developed framework. (I was also
| impressed that a 138-page math thesis is so readable. Quite the
| achievement.)
|
| [1] https://people.orie.cornell.edu/mru8/doc/udell15_thesis.pdf
| Bukhmanizer wrote:
| It's kind of crazy that so many people have read this thesis,
| but it's really good. I came across it independently a few
| years ago when I was trying to understand some stuff, but ended
| up saving it as a reference because I liked it so much.
| isoprophlex wrote:
| This is some hot stuff! Thanks for sharing. Very lucid writing;
| clearly she has a deep understanding of the subject matter to be
| able to write it down so eloquently.
| flashfaffe2 wrote:
| Indeed, this seems worth a deep read, especially as it addresses
| the main shortcomings of PCA (heterogeneous data, non-numerical
| data, etc.). Thanks mate, I've definitely found a way to keep
| myself busy this weekend.
| nerdponx wrote:
| I'm not sure this is an explanation as much as an introductory
| demo. Nice visualizations though.
___________________________________________________________________
(page generated 2022-10-29 23:00 UTC)