[HN Gopher] The design philosophy of Great Tables
       ___________________________________________________________________
        
       The design philosophy of Great Tables
        
       Author : randyzwitch
       Score  : 238 points
       Date   : 2024-04-04 18:00 UTC (4 hours ago)
        
 (HTM) web link (posit-dev.github.io)
 (TXT) w3m dump (posit-dev.github.io)
        
       | ttymck wrote:
       | Wow. This looks incredible, thanks for sharing.
       | 
       | It makes me wonder how we've gone this long with increasingly
       | _poor_ data table presentations (the mid-century modern tables
       | are astutely pointed to as shining examples).
       | 
       | This makes me excited to get back into data analysis with python.
       | Moreover, I see some possible API improvements and extensions I'd
       | like to make.
        
       | sebastiansm wrote:
       | It's great that the RStudio team is working on Python libraries.
       | 
       | Hope to see dplyr and ggplot someday on Python.
        
         | hadley wrote:
         | You should check out https://siuba.org and https://plotnine.org
         | :)
        
         | randyzwitch wrote:
         | plotnine is a ggplot also funded by Posit (though, externally
         | developed)
         | 
         | https://plotnine.org/
        
           | kyllo wrote:
           | I use plotnine whenever I need to make (static) plots in
           | Python. It's really quite well done, a close match to R's
           | ggplot2, and more feature complete than any of the other
           | Python grammar of graphics packages I've tried.
        
       | acrophiliac wrote:
       | While I'm waiting for the packages to download, can you explain
       | how I get tabular output when I run a python application using
       | your package at the command line? Does it produce HTML output?
       | PDF? Your "getting started" docs don't explain.
        
         | cscheid wrote:
         | You use it indeed to generate HTML output. For Python+Jupyter
         | folk, that's most directly applicable to Jupyter Lab or Jupyter
         | Notebook settings. You can use it with Jupyter book, nbconvert,
         | or any other tools that convert .ipynb to HTML output...
         | 
         | (Disclosure: Quarto dev here) ..., like Quarto. You can use
         | `great_tables` in code cells in Quarto to get great tables in
         | your RevealJS presentation or website,
         | https://quarto.org/docs/output-formats/html-code.html.
        
       | throwaway81523 wrote:
       | This article is mostly blither, whether or not it is AI
       | generated. It is about a Python library for generating nicely
       | formatted HTML tables, though they don't tell you much about it
       | until near the end. The library seems to use an OOP approach.
       | An alternative approach might be more declarative. The product
       | name "Great Tables" appears in boldface over and over (no idea if
       | the font helps SEO) and the name itself is awfully pretentious
       | imho. Overall, the library itself sounds ok, but the blog post
       | is the annoying market-speak that frequently makes me cringe here
       | on HN.
       | 
       | It would be nice to add some interactivity features to the
       | tables, like ActiveAdmin in Rails.
        
         | paddy_m wrote:
         | It's not AI generated. I have tracked the PR as they have
         | worked on it. From what I have seen, the library has
         | declarative elements that are similar to the grammar of
         | graphics.
        
       | paddy_m wrote:
       | Great tables has done some really nice work on python/jupyter
       | tables. It looks like they are almost building a "grammar of
       | tables" similar to a grammar of graphics. More projects should
       | write about their philosophy and aims like this.
       | 
       | I have built a different table library for jupyter called
       | buckaroo. My approach has been different. Buckaroo aims to allow
       | you to interactively cycle through different formats and post-
       | processing functions to quickly glean important insights from a
       | table while working interactively. I took the view that I type
       | the same commands over and over to perform rudimentary
       | exploratory data analysis, those commands and insights should be
       | built into a table.
       | 
       | Great tables seems built so that you can manually format a table
       | for presentation.
       | 
       | https://github.com/paddymul/buckaroo
       | 
       | https://youtu.be/GPl6_9n31NE
        
       | xnx wrote:
       | Tables are underutilized for how concise and descriptive they can
       | be when making comparisons. It's a shame most text editors start
       | with a blank table instead of inserting one pre-configured with
       | some good design choices.
        
       | tomcam wrote:
       | Fantastic article, duly bookmarked. However.
       | 
       | "The democratization of computational tables arguably began with
       | VisiCalc in 1979... I mean, try it out and you'll see that this
       | is quite limited in more than a few ways."
       | 
       | Them's fightin' words. IMHO VisiCalc's ability to generate models
       | quickly changed civilization. It freed people to try out ideas at
       | no cost and to view or manipulate data in ways no one could hope
       | to do before.
        
       | narush wrote:
       | This is an excellent blog post - I'd never heard of Great Tables
       | before, and I'm a newly minted fan!
       | 
       | > confronted with an all-too-familiar dilemma: copy your data
       | into a tool like Excel to make the table, or, display an
       | otherwise unpolished table.
       | 
       | One add-on (coming from the past 4 years of working on a tabular-
       | data from Pythons startup [1]) is that users aren't just copying
       | data into Excel because of its good formatting capability: very
       | often, there are organizational constraints that mean that Excel
       | _needs_ to be where this data ends up.
       | 
       | The most common reasons I've seen for data ending up in Excel: 1.
       | Other parts of the report rely on Excel features - you want to
       | build pivot tables or graphs in Excel (often, these are much
       | easier to build in Excel than in Python for anyone who isn't a
       | real Pythonista) 2. The report you're sending out for display is
       | _expected_ in an Excel format. The two main reasons for this are
       | just organizational momentum, or that you want to let the
       | receiver conduct additional ad-hoc analysis (Excel is best for
       | this in almost every org).
       | 
       | The way we've sliced this problem space is by improving the
       | interfaces that users can use to export formatting to Excel. You
       | can see some of our (open-core) code here [2]. TL;DR: Mito gives
       | you an interface in Jupyter that looks like a spreadsheet, where
       | you can apply formatting like Excel (number formatting,
       | conditional formatting, color formatting) - and then Mito
       | automatically generates code that exports this formatting to an
       | Excel file. This is one of our more compelling enterprise features,
       | for decision makers that work with non-expert Python programmers
       | - getting formatting into Excel is a big hassle.
       | 
       | Of course, for folks who can ditch Excel entirely, this is
       | entirely unnecessary. Great Tables seems excellent in this case
       | (and anyone writing blog posts this good is probably writing good
       | code too... :) )
       | 
       | [1] https://trymito.io
       | 
       | [2] https://github.com/mito-
       | ds/mito/blob/dev/mitosheet/mitosheet...
        
       | semireg wrote:
       | Does anyone know any similar projects that can render to an HTML
       | canvas?
        
         | paddy_m wrote:
         | Why do you want to render to canvas?
         | 
         | Perspective seems to be the most performant html table. It is
         | more focused on extremely fast updates than styling, although
         | it looks good.
         | 
         | Glide is a newcomer that also renders to canvas.
         | 
         | https://github.com/finos/perspective
         | 
         | https://github.com/glideapps/glide-data-grid
        
       | antidnan wrote:
       | There's also a book on the subject:
       | https://en.wikipedia.org/wiki/The_History_of_Mathematical_Ta...
       | 
       | Interesting aside: AI models trained on spreadsheets need "good
       | tables" such as column names, headers, etc. to understand
       | context. Like Fortap: https://arxiv.org/abs/2109.07323
        
         | richmeister wrote:
         | Thanks for sharing the book info! I really need to find a copy
         | of that somewhere :)
        
       | jimhefferon wrote:
       | I'm interested in the midcentury modern ones because they have
       | lots of vertical rules. I'm active on the subreddit for LaTeX and
       | there is a religion common there that even one vertical rule is
       | an unforgivable abomination.
        
         | tadfisher wrote:
         | Great summary of the problem on TeX Stack Exchange:
         | https://tex.stackexchange.com/a/40555
        
           | jimhefferon wrote:
           | Thanks. I've not seen that particular post.
           | 
           | I just made a table this morning for Calc II notes. The first
           | column says something like $f'(x)$ in the first row and
           | $f'(0)$ on the second. The table body lists values for
           | different functions, one per column. I put in a column rule
           | separator because the leftmost column seems separate from the
           | others.
           | 
           | In any event, I'm suspicious of rules (pun noted).
        
       | two_handfuls wrote:
       | Summary:
       | 
       | This article is about a Python library called "Great Tables" that
       | is focused on the display of tables for publication and
       | presentation (not for interactive browsing).
       | 
       | The article does not specify which output format it supports.
       | 
       | Also you get some bonus historical context on tables.
        
         | frodowtf wrote:
         | ... the obligatory "historical context" nobody asked for.
        
       | countrymile wrote:
       | I love this package and have been using it for a few years in R.
       | It's great [for making] tables in html but the pdf and docx
       | output is a little less polished. I do worry that the recent
       | shift to bringing the python version up to speed with the R
       | version has slowed down the R development. Though it's well worth
       | checking out whatever your language.
        
       | jszymborski wrote:
       | The example they show of a Great Table is, to my taste, way too
       | busy. Here is my unsolicited opinion:
       | 
       | The top and bottom horizontal rules on the Title appear to be
       | superfluous, and I dislike how it is aligned with the first
       | column (row labels) rather than the second. I feel like a little
       | space to breathe at the bottom, along with a bold font would add
       | visual hierarchy w/o the clutter.
       | 
       | The row label backgrounds are far too dark and the font weight
       | makes it hard to read. I'd prefer a very light blue here instead.
       | I don't like the row group label ("Name") being italicized.
       | 
       | The spanner labels floating in the centre make the table hard to
       | scan. Would be much nicer aligned left.
       | 
       | Finally, I really dislike the font (maybe this is just my
       | browser, though).
       | 
       | I mocked-up some of the changes here, I think this is a much
       | easier to read table:
       | 
       | https://i.imgur.com/iMMf5vo.png
        
         | sixQuarks wrote:
         | I totally agree with you. You should start a new library called
         | Even Greater Tables.
        
         | pimlottc wrote:
         | I don't understand why there aren't any horizontal rules or
         | stripes etc to reinforce the idea that each row is its own
         | record.
        
         | hooloovoo_zoo wrote:
         | Keep going IMO: shorten the title to remote correspondents
         | since the rest is redundant with the column names. The blue
         | highlight is now redundant with the title so ditch all of it.
         | Personal characteristics vs location don't meaningfully improve
         | the organization so ditch those as well.
        
         | zem wrote:
         | the white text on a dark background really was a glaring
         | misfeature in the original example, to the extent that i wonder
         | if the colours looked different on the author's monitor
        
         | mikehollinger wrote:
         | You might want to read Edward Tufte's Beautiful Evidence.[1] He
         | discusses stuff like what you brought up about readability and
         | distracting from the message / point of the data.
         | 
         | If you've seen sparklines, [2] Tufte coined the term.
         | 
         | Whenever I do a UI review I end up paging through it just to
         | see if there's something we're not thinking about, and it's an
         | interesting book to just open to a random page and read.
         | 
         | Plus he has an entire treatise on why PowerPoint is terrible.
         | 
         | [1] https://www.edwardtufte.com/tufte/books_be
         | 
         | [2] https://en.wikipedia.org/wiki/Sparkline
        
           | airstrike wrote:
           | _> Plus he has an entire treatise on why PowerPoint is
           | terrible._
           | 
           | As someone trying to build a PowerPoint competitor, this is
           | awesome. I'm going to start here and work my way through his
           | whole corpus
        
             | esafak wrote:
             | See also https://norvig.com/Gettysburg/
        
       | simonbarker87 wrote:
       | This looks great. I so wish that the HTML table element would get
       | some progress - it's so limited.
       | 
       | I don't want to have to use some JS library component just to
       | show tabular data especially given how badly they perform once big
       | - but a server side rendered HTML table can be enormous and
       | render fine. But again, so limited.
        
         | paddy_m wrote:
         | Past a certain table size, the JS libraries will use less
         | memory. DOM elements take a lot of memory. Libraries like ag-
         | grid only render a small portion of the total table at a time.
         | 
         | The next performance gain for web tables comes from using a binary
         | encoding instead of JSON, particularly arrow. Perspective uses
         | arrow (in addition to rendering to canvas).
         | 
         | IME building buckaroo on top of ag-grid, I can render the table
         | with up to about 300k elements very performantly with just
         | JSON. Rendering speed is a non factor because only 50 rows are
         | rendered at a time. Moving to arrow-js should be about 3 times
         | faster for the entire system (python serialize, js deserialize,
         | js render). Beyond 900k elements, you really want to lazily
         | load from the server as the user scrolls. The memory usage for
         | just the data in the browser tends to slow things down. (I am
         | working on a library and benchmark for different serialization
         | techniques).
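A rough sketch of the windowing idea in the comment above: only the rows that fall inside the viewport are materialized, so the render cost stays flat no matter how long the table is. The function name, the 20px row height, and the 1000px viewport are illustrative assumptions, not ag-grid's actual API.

```python
def visible_window(scroll_top, row_height, viewport_height, total_rows, overscan=0):
    """Return (first, last) row indices to render; last is exclusive.

    All pixel values are integers. `overscan` adds extra rows above and
    below the viewport to hide flicker while scrolling.
    """
    first_visible = scroll_top // row_height
    # Ceiling division so a partially visible bottom row is included.
    last_visible = (scroll_top + viewport_height + row_height - 1) // row_height
    first = max(0, first_visible - overscan)
    last = min(total_rows, last_visible + overscan)
    return first, last


# A 300,000-row table with 20px rows in a 1000px viewport renders 50 rows.
first, last = visible_window(scroll_top=0, row_height=20,
                             viewport_height=1000, total_rows=300_000)
print(first, last, last - first)
```

The trade-off benibela mentions in the reply follows from this design: rows outside the window do not exist in the DOM, so native find-in-page and scrolling behave differently.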
        
           | benibela wrote:
           | >Libraries like ag-grid only render a small portion of the
           | total table at a time
           | 
           | such libraries often mess the scrolling and searching up
        
       | closed wrote:
       | Hey one of the co-maintainers of Great Tables, along with Rich
       | Iannone, here!
       | 
       | I just wanted to say that Rich is the only software developer I
       | know, who when asked to lay out the philosophy of his package,
       | would give you 5,000 years of history on the display of tables.
       | :)
        
       | boringg wrote:
       | I was really looking forward to a discussion about beautiful wood
       | tables. I should have known better
        
       | flobosg wrote:
       | Regarding "nanoplots": they are essentially sparklines, aren't
       | they?
        
         | icarusz wrote:
         | Yes. They are sparklines. I actually asked Rich the author if
         | they should just be called that in great_tables but he had some
         | reasonable thoughts on why a distinct name made sense.
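For readers who have not met sparklines, a minimal text-only sketch of the idea: each value is mapped to one of eight block characters by its position between the minimum and maximum. This is an illustration only, not how great_tables draws its nanoplots; the function name is made up for the example.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to a block character scaled between min and max."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for constant series
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


print(sparkline([3, 5, 9, 4, 1, 7, 8]))
```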
        
       | tonymet wrote:
       | Imagine the web if every site was exclusively tabular. No UIs
       | just a table of figures and a CRUD for modifying it. Something
       | like hypercard meets excel
        
         | flobosg wrote:
         | Ah... the good old spacer.gif days...
        
           | tonymet wrote:
           | ha! i mean actual tabular data not abusing <table> for layout
        
             | dkh wrote:
             | Your proposal is the most extreme opposite of this
             | practice. Still got PTSD from early 2000s webdev? ;)
        
             | samatman wrote:
             | If you've never done a "view source" on Hacker News, now
             | might be a good time...
        
           | akira2501 wrote:
           | tabularasa.webp
        
       | mcswell wrote:
       | Not mentioned yet are DocBook tables, of which there are several
       | types. The kind we used starts here:
       | https://tdg.docbook.org/tdg/5.1/cals.table. You have to drill
       | down to get inside the tables. They have some--but I think not
       | all--of the structure of GT.
       | 
       | There's also of course LaTeX (mentioned in a couple other
       | comments here), which has "ordinary" tables and long tables
       | (tables that span more than one page).
        
       | seanwilson wrote:
       | How does this compare to
       | https://github.com/jieter/django-tables2? That one makes it
       | really easy to display database models as HTML tables with column
       | sorting and pagination, and search/filtering can be added on top
       | with django-filter.
        
       | jiggawatts wrote:
       | Something that always annoyed me about numeric data like dollar
       | amounts in tables is that _visually_ the comparison between
       | quantities is logarithmic instead of linear.
       | 
       | E.g.:
       | 
       |     Cost
       |     $1500
       |      $130
       |      $110
       |      $210
       | 
       | The text in the last three rows looks 4/5ths the size of the text
       | in the first row. However, even if summed, the last three costs
       | add up to only 1/3rd of the top row! People visually see the
       | number of digits, which is roughly the same as log10.
       | 
       | I've so often had this issue that I started putting in-cell bar
       | charts into every finance-related spreadsheet.
       | 
       | Otherwise meetings will get derailed debating the cost of
       | something trivial that is totally irrelevant compared to the
       | biggest absolute costs.
       | 
       | As a real example, I had many meetings spent debating a $15
       | monthly cost for server log collection in the cloud for a VM
       | running a database engine that costs $15K monthly for the license
       | alone.
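A small sketch of the point above, using the four example costs from the comment: the printed width of a number grows with log10 of the amount, while a bar scaled to the largest value shows the linear difference. The 30-character bar width and the `#` glyph are arbitrary choices for the illustration.

```python
import math

costs = [1500, 130, 110, 210]


def bar(value, largest, width=30):
    """Text bar whose length is proportional to value (linear scale)."""
    return "#" * round(width * value / largest)


largest = max(costs)
for c in costs:
    # Digit count is floor(log10(c)) + 1 for these positive integers.
    digits = math.floor(math.log10(c)) + 1
    print(f"${c:>5}  digits={digits}  {bar(c, largest)}")

# The three small costs together, relative to the largest one.
small_total = sum(costs[1:])
share = small_total / costs[0]
print(f"small costs total ${small_total}, {share:.0%} of the top row")
```

The digit counts differ by one (4 versus 3), while the bars differ by roughly a factor of ten, which is the comparison the in-cell bar charts are meant to surface.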
        
       ___________________________________________________________________
       (page generated 2024-04-04 23:00 UTC)