[HN Gopher] The design philosophy of Great Tables
___________________________________________________________________
The design philosophy of Great Tables
Author : randyzwitch
Score : 238 points
Date : 2024-04-04 18:00 UTC (4 hours ago)
(HTM) web link (posit-dev.github.io)
(TXT) w3m dump (posit-dev.github.io)
| ttymck wrote:
| Wow. This looks incredible, thanks for sharing.
|
| It makes me wonder how we've gone this long with increasingly
| _poor_ data table presentations (the mid-century modern tables
| are astutely pointed to as shining examples).
|
| This makes me excited to get back into data analysis with python.
| Moreover, I see some possible API improvements and extensions I'd
| like to make.
| sebastiansm wrote:
| It's great that the RStudio team is working on Python libraries.
|
| Hope to see dplyr and ggplot someday on Python.
| hadley wrote:
| You should check out https://siuba.org and https://plotnine.org
| :)
| randyzwitch wrote:
| plotnine is a ggplot also funded by Posit (though, externally
| developed)
|
| https://plotnine.org/
| kyllo wrote:
| I use plotnine whenever I need to make (static) plots in
| Python. It's really quite well done, a close match to R's
| ggplot2, and more feature complete than any of the other
| Python grammar of graphics packages I've tried.
| acrophiliac wrote:
| While I'm waiting for the packages to download, can you explain
| how I get tabular output when I run a python application using
| your package at the command line? Does it produce HTML output?
| PDF? Your "getting started" docs don't explain.
| cscheid wrote:
| You use it indeed to generate HTML output. For Python+Jupyter
| folk, that's most directly applicable to Jupyter Lab or Jupyter
| Notebook settings. You can use it with Jupyter book, nbconvert,
| or any other tools that convert .ipynb to HTML output...
|
| (Disclosure: Quarto dev here) ..., like Quarto. You can use
| `great_tables` in code cells in Quarto to get great tables in
| your RevealJS presentation or website,
| https://quarto.org/docs/output-formats/html-code.html.
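For readers wondering how a table object turns into HTML output in a notebook, here is a minimal sketch using only the standard library. It relies on the Jupyter convention that an object exposing a `_repr_html_` method is displayed as HTML. `TinyTable` and everything in it are hypothetical names made up for illustration; this is not how `great_tables` is implemented, only an illustration of the general mechanism.

```python
from html import escape


class TinyTable:
    """Hypothetical stand-in for a table object; NOT part of great_tables."""

    def __init__(self, columns, rows, title=None):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]
        self.title = title

    def to_html(self):
        """Build a plain HTML <table> string, escaping every cell."""
        parts = ["<table>"]
        if self.title is not None:
            parts.append(f"<caption>{escape(self.title)}</caption>")
        header = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        parts.append(f"<thead><tr>{header}</tr></thead>")
        parts.append("<tbody>")
        for row in self.rows:
            cells = "".join(f"<td>{escape(str(v))}</td>" for v in row)
            parts.append(f"<tr>{cells}</tr>")
        parts.append("</tbody></table>")
        return "\n".join(parts)

    def _repr_html_(self):
        # Jupyter calls this method, if present, to display the object as HTML.
        return self.to_html()


# Made-up example data.
table = TinyTable(["name", "cost"], [["server", 1500], ["logs", 15]], title="Costs")
html_text = table.to_html()
```

Outside a notebook nothing is displayed automatically, so a command-line script would have to write `html_text` to a file and open it in a browser.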
| throwaway81523 wrote:
| This article is mostly blither, whether or not it is AI
| generated. It is about a Python library for generating nicely
| formatted HTML tables, though they don't tell you much about it
| until near the end. The library seems to use an OOP approach.
| An alternative approach might be more declarative. The product
| name "Great Tables" appears in boldface over and over (no idea if
| the font helps SEO) and the name itself is awfully pretentious
| imho. Overall, the library itself sounds ok, but the blog post
| is the annoying market-speak that frequently makes me cringe here
| on HN.
|
| It would be nice to add some interactivity features to the
| tables, like ActiveAdmin in Rails.
| paddy_m wrote:
| It's not AI generated. I have tracked the PR as they have
| worked on it. From what I have seen, the library has
| declarative elements that are similar to the grammar of
| graphics.
| paddy_m wrote:
| Great tables has done some really nice work on python/jupyter
| tables. It looks like they are almost building a "grammar of
| tables" similar to a grammar of graphics. More projects should
| write about their philosophy and aims like this.
|
| I have built a different table library for jupyter called
| buckaroo. My approach has been different. Buckaroo aims to allow
| you to interactively cycle through different formats and post-
| processing functions to quickly glean important insights from a
| table while working interactively. I took the view that I type
| the same commands over and over to perform rudimentary
| exploratory data analysis, those commands and insights should be
| built into a table.
|
| Great tables seems built so that you can manually format a table
| for presentation.
|
| https://github.com/paddymul/buckaroo
|
| https://youtu.be/GPl6_9n31NE
| xnx wrote:
| Tables are underutilized for how concise and descriptive they can
| be when making comparisons. It's a shame most text editors start
| with a blank table instead of inserting one pre-configured with
| some good design choices.
| tomcam wrote:
| Fantastic article, duly bookmarked. However.
|
| "The democratization of computational tables arguably began with
| VisiCalc in 1979... I mean, try it out and you'll see that this
| is quite limited in more than a few ways."
|
| Them's fightin' words. IMHO VisiCalc's ability to generate models
| quickly changed civilization. It freed people to try out ideas at
| no cost and to view or manipulate data in ways no one could hope
| to do before.
| narush wrote:
| This is an excellent blog post - I'd never heard of Great Tables
| before, and I'm a newly minted fan!
|
| > confronted with an all-too-familiar dilemma: copy your data
| into a tool like Excel to make the table, or, display an
| otherwise unpolished table.
|
| One add-on (coming from the past 4 years of working on a
| tabular-data-from-Python startup [1]) is that users aren't just
| copying data into Excel because of its good formatting capability: very
| often, there are organizational constraints that mean that Excel
| _needs_ to be where this data ends up.
|
| The most common reasons I've seen for data ending up in Excel:
|
| 1. Other parts of the report rely on Excel features - you want to
|    build pivot tables or graphs in Excel (often, these are much
|    easier to build in Excel than in Python for anyone who isn't a
|    real Pythonista).
| 2. The report you're sending out for display is _expected_ in an
|    Excel format. The two main reasons for this are just
|    organizational momentum, or that you want to let the receiver
|    conduct additional ad-hoc analysis (Excel is best for this in
|    almost every org).
|
| The way we've sliced this problem space is by improving the
| interfaces that users can use to export formatting to Excel. You
| can see some of our (open-core) code here [2]. TL;DR: Mito gives
| you an interface in Jupyter that looks like a spreadsheet, where
| you can apply formatting like Excel (number formatting,
| conditional formatting, color formatting) - and then Mito
| automatically generates code that exports this formatting to an
| Excel file. This is one of our more compelling enterprise features,
| for decision makers that work with non-expert Python programmers
| - getting formatting into Excel is a big hassle.
|
| Of course, for folks who can ditch Excel entirely, this is
| entirely unnecessary. Great Tables seems excellent in this case
| (and anyone writing blog posts this good is probably writing good
| code too... :) )
|
| [1] https://trymito.io
|
| [2] https://github.com/mito-ds/mito/blob/dev/mitosheet/mitosheet...
| semireg wrote:
| Does anyone know any similar projects that can render to an HTML
| canvas?
| paddy_m wrote:
| Why do you want to render to canvas?
|
| Perspective seems to be the most performant html table. It is
| more focused on extremely fast updates than styling, although
| it looks good.
|
| Glide is a newcomer that also renders to canvas.
|
| https://github.com/finos/perspective
|
| https://github.com/glideapps/glide-data-grid
| antidnan wrote:
| There's also a book on the subject:
| https://en.wikipedia.org/wiki/The_History_of_Mathematical_Ta...
|
| Interesting aside: AI models trained on spreadsheets need "good
| tables" such as column names, headers, etc. to understand
| context. Like Fortap: https://arxiv.org/abs/2109.07323
| richmeister wrote:
| Thanks for sharing the book info! I really need to find a copy
| of that somewhere :)
| jimhefferon wrote:
| I'm interested in the midcentury modern ones because they have
| lots of vertical rules. I'm active on the subreddit for LaTeX and
| there is a religion common there that even one vertical rule is
| an unforgivable abomination.
| tadfisher wrote:
| Great summary of the problem on StackOverflow:
| https://tex.stackexchange.com/a/40555
| jimhefferon wrote:
| Thanks. I've not seen that particular post.
|
| I just made a table this morning for Calc II notes. The first
| column says something like $f'(x)$ in the first row and
| $f'(0)$ on the second. The table body lists values for
| different functions, one per column. I put in a column rule
| separator because the leftmost column seems separate from the
| others.
|
| In any event, I'm suspicious of rules (pun noted).
| two_handfuls wrote:
| Summary:
|
| This article is about a Python library called "Great Tables" that
| is focused on the display of tables for publication and
| presentation (not for interactive browsing).
|
| The article does not specify which output format it supports.
|
| Also you get some bonus historical context on tables.
| frodowtf wrote:
| ... the obligatory "historical context" nobody asked for.
| countrymile wrote:
| I love this package and have been using it for a few years in R.
| It's great [for making] tables in html but the pdf and docx
| output is a little less polished. I do worry that the recent
| shift to bringing the python version up to speed with the R
| version has slowed down the R development. Though it's well worth
| checking out whatever your language.
| jszymborski wrote:
| The example they show of a Great Table is, to my taste, way too
| busy. Here is my unsolicited opinion:
|
| The top and bottom horizontal rules on the Title appear to be
| superfluous, and I dislike how it is aligned with the first
| column (row labels) rather than the second. I feel like a little
| space to breathe at the bottom, along with a bold font, would add
| visual hierarchy w/o the clutter.
|
| The row label backgrounds are far too dark and the font weight
| makes it hard to read. I'd prefer a very light blue here instead.
| I don't like the row group label ("Name") being italicized.
|
| The spanner labels floating in the centre make the table hard to
| scan. Would be much nicer aligned left.
|
| Finally, I really dislike the font (maybe this is just my
| browser, though).
|
| I mocked up some of the changes here; I think this is a much
| easier-to-read table:
|
| https://i.imgur.com/iMMf5vo.png
| sixQuarks wrote:
| I totally agree with you. You should start a new library called
| Even Greater Tables.
| pimlottc wrote:
| I don't understand why there aren't any horizontal rules or
| stripes etc to reinforce the idea that each row is its own
| record.
| hooloovoo_zoo wrote:
| Keep going IMO: shorten the title to remote correspondents
| since the rest is redundant with the column names. The blue
| highlight is now redundant with the title so ditch all of it.
| Personal characteristics vs location don't meaningfully improve
| the organization so ditch those as well.
| zem wrote:
| the white text on a dark background really was a glaring
| misfeature in the original example, to the extent that i wonder
| if the colours looked different on the author's monitor
| mikehollinger wrote:
| You might want to read Edward Tufte's Beautiful Evidence.[1] He
| discusses stuff like what you brought up about readability and
| distracting from the message / point of the data.
|
| If you've seen sparklines, [2] Tufte coined the term.
|
| Whenever I do a UI review I end up paging through it just to
| see if there's something we're not thinking about, and it's an
| interesting book to just open to a random page and read.
|
| Plus he has an entire treatise on why PowerPoint is terrible.
|
| [1] https://www.edwardtufte.com/tufte/books_be
|
| [2] https://en.wikipedia.org/wiki/Sparkline
| airstrike wrote:
| _> Plus he has an entire treatise on why PowerPoint is
| terrible._
|
| As someone trying to build a PowerPoint competitor, this is
| awesome. I'm going to start here and work my way through his
| whole corpus
| esafak wrote:
| See also https://norvig.com/Gettysburg/
| simonbarker87 wrote:
| This looks great. I so wish that the HTML table element would get
| some progress - it's so limited.
|
| I don't want to have to use some JS library component just to
| show tabular data, especially given how badly they perform once
| big - but a server side rendered HTML table can be enormous and
| render fine. But again, so limited.
| paddy_m wrote:
| Past a certain table size, the JS libraries will use less
| memory. DOM elements take a lot of memory. Libraries like ag-
| grid only render a small portion of the total table at a time.
|
| The next performance gain for web tables comes from using a binary
| encoding instead of JSON, particularly arrow. Perspective uses
| arrow (in addition to rendering to canvas).
|
| IME building buckaroo on top of ag-grid, I can render the table
| with up to about 300k elements very performantly with just
| JSON. Rendering speed is a non factor because only 50 rows are
| rendered at a time. Moving to arrow-js should be about 3 times
| faster for the entire system (python serialize, js deserialize,
| js render). Beyond 900k elements, you really want to lazily
| load from the server as the user scrolls. The memory usage for
| just the data in the browser tends to slow things down. (I am
| working on a library and benchmark for different serialization
| techniques).
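The "only a small portion of the table is rendered at a time" idea can be sketched as a window calculation: from the scroll position, work out which slice of rows needs DOM elements at all. This is a minimal illustration under assumed parameters (fixed row height, a hypothetical `overscan` margin, made-up pixel sizes); it is not ag-grid's or buckaroo's actual algorithm.

```python
def visible_window(scroll_top_px, row_height_px, viewport_height_px,
                   total_rows, overscan=5):
    """Return (first, last) row indices to render; `last` is exclusive.

    Assumes every row has the same height. `overscan` is a hypothetical
    margin of extra rows above and below the viewport to reduce flicker.
    """
    first_visible = scroll_top_px // row_height_px
    # Ceiling division: a partially visible row still has to be rendered.
    rows_in_view = -(-viewport_height_px // row_height_px)
    first = max(0, first_visible - overscan)
    last = min(total_rows, first_visible + rows_in_view + overscan)
    return first, last


# Illustrative numbers only: 300,000 rows, 20 px rows, 800 px viewport.
top = visible_window(0, 20, 800, 300_000)
middle = visible_window(100_000, 20, 800, 300_000)
bottom = visible_window(300_000 * 20 - 800, 20, 800, 300_000)
```

With these numbers the window never exceeds 50 rows regardless of the total row count, which is why rendering speed stops being the bottleneck and data transfer and memory take over.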
| benibela wrote:
| >Libraries like ag-grid only render a small portion of the
| total table at a time
|
| such libraries often mess the scrolling and searching up
| closed wrote:
| Hey one of the co-maintainers of Great Tables, along with Rich
| Iannone, here!
|
| I just wanted to say that Rich is the only software developer I
| know, who when asked to lay out the philosophy of his package,
| would give you 5,000 years of history on the display of tables.
| :)
| boringg wrote:
| I was really looking forward to a discussion about beautiful wood
| tables. I should have known better
| flobosg wrote:
| Regarding "nanoplots": they are essentially sparklines, aren't
| they?
| icarusz wrote:
| Yes. They are sparklines. I actually asked Rich the author if
| they should just be called that in great_tables but he had some
| reasonable thoughts on why a distinct name made sense.
| tonymet wrote:
| Imagine the web if every site was exclusively tabular. No UIs
| just a table of figures and a CRUD for modifying it. Something
| like hypercard meets excel
| flobosg wrote:
| Ah... the good old spacer.gif days...
| tonymet wrote:
| ha! i mean actual tabular data not abusing <table> for layout
| dkh wrote:
| Your proposal is the most extreme opposite of this
| practice. Still got PTSD from early 2000s webdev? ;)
| samatman wrote:
| If you've never done a "view source" on Hacker News, now
| might be a good time...
| akira2501 wrote:
| tabularasa.webp
| mcswell wrote:
| Not mentioned yet are DocBook tables, of which there are several
| types. The kind we used starts here:
| https://tdg.docbook.org/tdg/5.1/cals.table. You have to drill
| down to get inside the tables. They have some--but I think not
| all--of the structure of GT.
|
| There's also of course LaTeX (mentioned in a couple other
| comments here), which has "ordinary" tables and long tables
| (tables that span more than one page).
| seanwilson wrote:
| How does this compare to
| https://github.com/jieter/django-tables2? That one makes it
| really easy to display database models as HTML tables with
| column sorting and pagination, and search/filtering can be
| added on top with django-filter.
| jiggawatts wrote:
| Something that always annoyed me about numeric data like dollar
| amounts in tables is that _visually_ the comparison between
| quantities is logarithmic instead of linear.
|
| E.g.:
|
|     Cost
|     $1500
|     $130
|     $110
|     $210
|
| The text in the last three rows looks 4/5ths the size of the text
| in the first row. However, even if summed, the last three costs
| add up to only 1/3rd of the top row! People visually see the
| number digits, which is roughly the same as Log 10.
|
| I've so often had this issue that I started putting in-cell bar
| charts into every finance-related spreadsheet.
|
| Otherwise meetings will get derailed debating the cost of
| something trivial that is totally irrelevant compared to the
| biggest absolute costs.
|
| As a real example, I had many meetings spent debating a $15
| monthly cost for server log collection in the cloud for a VM
| running a database engine that costs $15K monthly for the license
| alone.
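The in-cell bar chart idea can be sketched in plain text: draw a bar next to each amount whose length is linear in the value, so the eye compares magnitudes rather than digit counts. The costs are the ones from the comment's example; the `bar` helper and the 30-character width are assumptions made for illustration.

```python
costs = [1500, 130, 110, 210]  # the four amounts from the example above


def bar(value, largest, width=30):
    """Return a text bar whose length is linear in `value` (hypothetical helper)."""
    return "#" * round(width * value / largest)


largest = max(costs)
lines = [f"{'$' + str(c):>6} {bar(c, largest)}" for c in costs]
for line in lines:
    print(line)

# The three smaller costs together, relative to the largest one.
small_total = sum(costs[1:])
share = small_total / costs[0]
```

The three smaller costs sum to $450, which is 30% of the $1500 row, even though each of them is printed with four of its five characters.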
___________________________________________________________________
(page generated 2024-04-04 23:00 UTC)