[HN Gopher] Make Patterns Pop Out of Heatmaps with Seriation (2018)
       ___________________________________________________________________
        
       Make Patterns Pop Out of Heatmaps with Seriation (2018)
        
       Author : ingve
       Score  : 113 points
       Date   : 2021-07-03 07:48 UTC (15 hours ago)
        
 (HTM) web link (nicolas.kruchten.com)
 (TXT) w3m dump (nicolas.kruchten.com)
        
       | edgeform wrote:
       | https://news.ycombinator.com/item?id=27713441
       | 
       | Today's hot topic, TSP solvers.
        
       | unearth3d wrote:
       | Thanks for posting, some great info on Kruchten's Twitter too.
        
       | rocqua wrote:
       | If you truly have unordered table data, and row and column swaps
       | give you a meaningful heatmap, I suspect the new row and column
       | order tells you A LOT. Before, you just had random labels. Now,
       | all of a sudden, your random labels admit a meaningful order.
       | That is a huge deal and suggests underlying structure in your
       | labels!
        
         | noduerme wrote:
         | One good query is worth its weight in gold.
        
       | kzrdude wrote:
       | Very interesting, but I would love to see an example on realistic
       | data.
        
         | lstamour wrote:
         | See http://www.brendangregg.com/HeatMaps/subsecondoffset.html
         | 
         | Also compare with DSMs (design structure matrices), which have
         | the same problem, where row and column order can matter a lot
         | for making the diagram clear:
         | https://web.mit.edu/eppinger/www/SDE-MIT/DSM_Book.html
        
         | datastoat wrote:
         | Here's an example: https://youtu.be/eHwy-neG_W8?t=791
         | 
         | It's from an introductory course on Algorithms; the dataset is
         | the students' programming assignments, rated for similarity by
         | an off-the-shelf similarity scorer.
         | 
         | This heatmap example is presented in the class on Kruskal's
         | algorithm. We normally think of Kruskal's algorithm as a method
         | for finding a minimum spanning tree, but it can also be thought
         | of as building a classification tree -- which means we can use
         | it for seriation. It's not the best method for seriation by any
         | means, but it's nice just to see it used in this unconventional
         | way.
         | 
         | It's always fun showing this in class, on the coursework that
         | the students submitted just a few weeks ago!
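
         A minimal sketch of the idea in the comment above, not the code
         from the linked lecture: run Kruskal-style merges over pairwise
         distances, keep each cluster as an ordered list, and read the
         final list as the seriation. The function name
         `kruskal_seriation`, the toy data, and the rule of joining two
         clusters at their closest pair of endpoints are assumptions
         made for illustration.

```python
from itertools import combinations


def kruskal_seriation(dist):
    """Return an item order from Kruskal-style merges on a distance matrix.

    `dist` is a square list of lists (hypothetical input format).
    """
    n = len(dist)
    cluster_of = list(range(n))            # item -> id of its cluster
    members = {i: [i] for i in range(n)}   # cluster id -> ordered items

    # Kruskal: visit every pair, most similar (smallest distance) first.
    edges = sorted(combinations(range(n), 2), key=lambda e: dist[e[0]][e[1]])
    for i, j in edges:
        a, b = cluster_of[i], cluster_of[j]
        if a == b:
            continue  # both items already share a cluster; skip the edge
        left, right = members[a], members.pop(b)
        # Assumption: of the four ways to concatenate the two ordered
        # lists, keep the one whose touching endpoints are closest.
        best = min(
            ((l, r) for l in (left, left[::-1]) for r in (right, right[::-1])),
            key=lambda lr: dist[lr[0][-1]][lr[1][0]],
        )
        members[a] = best[0] + best[1]
        for item in right:
            cluster_of[item] = a

    (order,) = members.values()  # exactly one cluster remains
    return order


# Toy data: items with a hidden one-dimensional order, listed shuffled.
values = [5, 2, 7, 0, 3, 6, 1, 4]
dist = [[abs(a - b) for b in values] for a in values]
order = kruskal_seriation(dist)
reordered = [values[i] for i in order]
print(reordered)
```

         For a heatmap, `dist` would hold distances between rows (and,
         separately, between columns), and the returned order would be
         used to permute them before plotting.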
        
           | kzrdude wrote:
           | Very nice video!
        
           | thechao wrote:
           | I was a grad student at a major state university. We had a
           | "defensive" grading rubric that made cheating self-defeating.
           | We'd do analyses like this to visualize the cheaters and to
           | try to predict the average GPA of the class (section). Our
           | goal was to speed up or slow down the material so that the
           | largest number of students actually learned the core topics
           | of the course _and_ didn't flunk. (Our worst-case scenario
           | was the class of students who'd "get it" at the end of the
           | semester, but too late to pass.)
        
       | laurent92 wrote:
       | Is this used in image compression? JPEG, because of the Fourier
       | transform, tends to have waves around edges of flat-colored
       | surfaces. We might have much better compression if we encode
       | images with disordered rows, then leave it up to the decoder to
       | swap them.
        
         | alanbernstein wrote:
         | No... the ripples you're thinking of are compression artifacts.
         | They aren't present inside large flat-colored areas. Breaking
         | up those flat areas increases detail/entropy/information, from
         | the perspective of a perceptual encoder. So there would be MORE
         | ripples.
        
       | cousin_it wrote:
       | Interesting! I wonder how it would work on other kinds of data.
       | For example, take a checkerboard image, rotate it (by, say, 15
       | degrees) and then seriate the result. What will it look like?
        
         | nicolaskruchten wrote:
         | Should be easy to try! My expectation is that the TSP approach
         | would recover the original image unless you added a lot of
         | noise to it. Adjacent rows will still end up extremely similar
         | to each other.
        
           | cousin_it wrote:
           | My expectation is that it wouldn't, because in a tilted
           | checkerboard image (or any other periodic image) there are
           | many pairs of non-adjacent rows that are more similar to
           | each other than neighboring rows are.
        
             | nicolaskruchten wrote:
             | Try it and see :)
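
       The experiment discussed above can be sketched as follows. This
       is a rough setup under stated assumptions, not the article's
       method: the checkerboard is generated directly in rotated
       coordinates rather than by rotating a bitmap, rows are compared
       by Hamming distance, and a greedy nearest-neighbour walk stands
       in for a real TSP solver. All names and parameters are
       hypothetical.

```python
import math
import random


def tilted_checkerboard(size=48, square=6.0, degrees=15.0):
    """Binary checkerboard sampled on a grid rotated by `degrees`."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [
        [
            (math.floor((x * c + y * s) / square)
             + math.floor((-x * s + y * c) / square)) % 2
            for x in range(size)
        ]
        for y in range(size)
    ]


def row_distance(a, b):
    """Hamming distance: number of positions where two rows differ."""
    return sum(p != q for p, q in zip(a, b))


def greedy_order(rows):
    """Nearest-neighbour walk starting at row 0 (stand-in for a TSP solver)."""
    remaining = set(range(1, len(rows)))
    order = [0]
    while remaining:
        last = rows[order[-1]]
        # Break distance ties by index so the result is deterministic.
        nxt = min(remaining, key=lambda i: (row_distance(last, rows[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def path_cost(rows):
    """Sum of distances between consecutive rows."""
    return sum(row_distance(a, b) for a, b in zip(rows, rows[1:]))


random.seed(0)
image = tilted_checkerboard()
shuffled = image[:]
random.shuffle(shuffled)
order = greedy_order(shuffled)
seriated = [shuffled[i] for i in order]

# Compare the original, shuffled, and seriated row orders.
print(path_cost(image), path_cost(shuffled), path_cost(seriated))
```

       Comparing the three printed path costs shows whether the
       heuristic finds an ordering at least as cheap as the original
       one, which is the crux of the disagreement; a proper TSP solver
       may behave differently.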
        
       ___________________________________________________________________
       (page generated 2021-07-03 23:01 UTC)