[HN Gopher] Make Patterns Pop Out of Heatmaps with Seriation (2018)
___________________________________________________________________
Make Patterns Pop Out of Heatmaps with Seriation (2018)
Author : ingve
Score : 113 points
Date : 2021-07-03 07:48 UTC (15 hours ago)
(HTM) web link (nicolas.kruchten.com)
(TXT) w3m dump (nicolas.kruchten.com)
| edgeform wrote:
| https://news.ycombinator.com/item?id=27713441
|
| Today's hot topic, TSP solvers.
| unearth3d wrote:
| Thanks for posting, some great info on Kruchten's Twitter too.
| rocqua wrote:
| If you truly have unordered table data, and column and row swaps
| give you a meaningful heatmap, I suspect the new row and
| column order probably tells you A LOT. Because before, you just
| had random labels. Now all of a sudden, your random labels admit
| a meaningful order. That is a huge deal and suggests underlying
| structure in your labels!
| noduerme wrote:
| One good query is worth its weight in gold.
| kzrdude wrote:
| Very interesting, but would love to see an example on realistic
| data.
| lstamour wrote:
| See http://www.brendangregg.com/HeatMaps/subsecondoffset.html
|
| Also compare to DSMs which have the same problem where column
| and row order can matter a lot to making the diagram clearer:
| https://web.mit.edu/eppinger/www/SDE-MIT/DSM_Book.html
| datastoat wrote:
| Here's an example: https://youtu.be/eHwy-neG_W8?t=791
|
| It's from an introductory course on Algorithms; the dataset is
| the students' programming assignments, rated for similarity by
| an off-the-shelf similarity scorer.
|
| This heatmap example is presented in the class on Kruskal's
| algorithm. We normally think of Kruskal's algorithm as a method
| for finding a minimum spanning tree, but it can also be thought
| of as building a classification tree -- which means we can use
| it for seriation. It's not the best method for seriation by any
| means, but it's nice just to see it used in this unconventional
| way.
|
| It's always fun showing this in class, on the coursework that
| the students submitted just a few weeks ago!
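A minimal sketch of the idea in the comment above, not the code from the linked lecture. It runs Kruskal's algorithm over all pairwise dissimilarities and, each time an edge joins two clusters, concatenates their leaf orders, which is the same merge sequence as single-linkage clustering. The names `kruskal_seriation` and `_join`, and the rule of flipping clusters so that their closest ends meet, are assumptions added for illustration.

```python
from itertools import combinations


def _join(a, b, dist):
    """Concatenate two leaf orders, flipping either so the junction is tightest."""
    candidates = [a + b, a + b[::-1], a[::-1] + b, a[::-1] + b[::-1]]
    # The junction sits between positions len(a) - 1 and len(a).
    return min(candidates, key=lambda c: dist[c[len(a) - 1]][c[len(a)]])


def kruskal_seriation(dist):
    """Order items by merging clusters in Kruskal (shortest edge first) order.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    """
    n = len(dist)
    parent = list(range(n))
    order = {i: [i] for i in range(n)}  # leaf order per cluster, keyed by root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(combinations(range(n), 2), key=lambda e: dist[e[0]][e[1]])
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # edge would close a cycle, so Kruskal skips it
        parent[rb] = ra
        order[ra] = _join(order[ra], order.pop(rb), dist)
    return order[find(0)]


if __name__ == "__main__":
    values = [5, 1, 4, 2, 3]  # toy 1-D data standing in for similarity scores
    dist = [[abs(p - q) for q in values] for p in values]
    print(kruskal_seriation(dist))
```

The returned list is a permutation of the row indices; applying it to both rows and columns of a similarity matrix gives the reordered heatmap. As the comment says, this is not the best seriation method, since each merge only looks at the junction between two clusters.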
| kzrdude wrote:
| Very nice video!
| thechao wrote:
| I was a grad student at a major State university. We had a
| "defensive" grading rubric which caused cheating to be self-
| defeating. We'd do analyses like this to visualize the
| cheaters, and try to predict the average GPA of the class
| (section). Our goal was to speed up or slow down the material
| to make sure the largest number of students actually learned
| the core topics of the course _and_ didn't flunk. (Our worst
| case scenario was the class of student who'd "get it" at the
| end of the semester -- but too late to pass.)
| laurent92 wrote:
| Is this used in image compression? JPEG, because of the Fourier
| transform, tends to have waves around edges of flat-colored
| surfaces. We might have much better compression if we encode
| images with disordered rows, then leave it up to the decoder to
| swap them.
| alanbernstein wrote:
| No... the ripples you're thinking of are compression artifacts.
| They aren't present inside large flat-colored areas. Breaking
| up those flat areas increases detail/entropy/information, from
| the perspective of a perceptual encoder. So there would be MORE
| ripples.
| cousin_it wrote:
| Interesting! I wonder how it would work on other kinds of data.
| For example, take a checkerboard image, rotate it (by, say, 15
| degrees) and then seriate the result. What will it look like?
| nicolaskruchten wrote:
| Should be easy to try! My expectation is that the TSP approach
| would recover the original image unless you added a lot of
| noise to it. Adjacent rows will still end up extremely similar
| to each other.
| cousin_it wrote:
| My expectation is that it wouldn't, because in a tilted
| checkerboard image (or any other periodic image) there are
| many pairs of rows that are more similar than neighbors.
| nicolaskruchten wrote:
| Try it and see :)
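One way to try the experiment discussed above, as a rough sketch under stated assumptions. It samples a checkerboard tilted by 15 degrees, shuffles the rows, and reorders them with a greedy nearest-neighbour path. That heuristic is only a stand-in for a real TSP solver, so it may not settle the disagreement either way. All function names, the image size, the square size, and the Hamming distance between rows are illustrative choices, not taken from the article.

```python
import math
import random


def tilted_checkerboard(size=64, square=8, degrees=15.0):
    """Sample a checkerboard rotated by `degrees` on a size x size pixel grid."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            u = c * x + s * y   # pixel position in the board's own frame
            v = -s * x + c * y
            row.append((math.floor(u / square) + math.floor(v / square)) % 2)
        img.append(row)
    return img


def hamming(a, b):
    """Number of pixels at which two rows differ."""
    return sum(p != q for p, q in zip(a, b))


def greedy_seriate(rows, start=0):
    """Nearest-neighbour heuristic for an open TSP path over the rows."""
    remaining = set(range(len(rows)))
    remaining.remove(start)
    path = [start]
    while remaining:
        last = rows[path[-1]]
        # Ties are broken by index so the result is deterministic.
        nxt = min(remaining, key=lambda i: (hamming(last, rows[i]), i))
        remaining.remove(nxt)
        path.append(nxt)
    return path


def adjacent_pairs_recovered(original_ids):
    """Count neighbours in the new order that were also neighbours originally."""
    return sum(abs(a - b) == 1 for a, b in zip(original_ids, original_ids[1:]))


if __name__ == "__main__":
    img = tilted_checkerboard()
    ids = list(range(len(img)))
    random.Random(0).shuffle(ids)            # ids[k] = original index of row k
    shuffled = [img[i] for i in ids]
    path = greedy_seriate(shuffled)
    recovered = [ids[k] for k in path]       # original indices in seriated order
    print(adjacent_pairs_recovered(recovered), "of", len(img) - 1, "adjacencies")
```

Changing `degrees`, `square`, or swapping in an exact TSP solver would show how sensitive the outcome is to the periodicity that cousin_it points out.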
___________________________________________________________________
(page generated 2021-07-03 23:01 UTC)