[HN Gopher] Data Organization in Spreadsheets (2017)
___________________________________________________________________
Data Organization in Spreadsheets (2017)
Author : Hagelin
Score : 41 points
Date : 2021-04-24 16:42 UTC (6 hours ago)
(HTM) web link (www.tandfonline.com)
(TXT) w3m dump (www.tandfonline.com)
| jpcooper wrote:
| Maybe not the best place to ask, but how do people organise their
| Pandas research? Are there any useful methods?
| closed wrote:
| Is there a specific aspect of pandas research you're interested
| in? There are a lot of useful guides around table-based
| workflows that might be helpful :).
|
| I would start w/ different strategies on how to model data in
| tables. One problem that I often see in pandas data analyses is
| people treating the data like it's a web app database (many
| small, normalized tables), rather than joining the data into a
| few big, denormalized tables. The latter makes it easier for
| people to answer their own questions, instead of relying on a
| bunch of tiny custom functions someone wrote!
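|
| A minimal sketch of that idea, assuming three made-up normalized
| tables (customers, products, orders; all names and values here are
| hypothetical, not from any real dataset). It joins them once into
| a single wide table and then answers a question with a plain
| groupby, no custom helper needed:
|
```python
import pandas as pd

# Hypothetical normalized tables, shaped like a web app database.
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "customer_name": ["Ada", "Grace"],
})
products = pd.DataFrame({
    "product_id": [10, 11],
    "product_name": ["widget", "gadget"],
    "unit_price": [2.5, 4.0],
})
orders = pd.DataFrame({
    "order_id": [100, 101, 102],
    "customer_id": [1, 2, 1],
    "product_id": [10, 11, 11],
    "quantity": [4, 1, 2],
})

# Join once into a single denormalized table. validate= makes pandas
# raise if a lookup table has duplicate keys, which would otherwise
# silently multiply rows.
wide = (
    orders
    .merge(customers, on="customer_id", how="left", validate="many_to_one")
    .merge(products, on="product_id", how="left", validate="many_to_one")
)
wide["revenue"] = wide["quantity"] * wide["unit_price"]

# Any follow-up question is now an ordinary groupby on one table.
revenue_by_customer = wide.groupby("customer_name")["revenue"].sum()
```
|
| Whether one wide table is practical depends on data size and how
| often the lookups change, so treat this as an illustration rather
| than a rule.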
|
| * Hadley's tidy data paper:
| https://vita.had.co.nz/papers/tidy-data.pdf
|
| * Normalizing data:
| https://en.wikipedia.org/wiki/Database_normalization
|
| * Denormalized data:
| https://en.wikipedia.org/wiki/Denormalization
|
| * Emily Riederer, column names as contracts:
| https://emilyriederer.netlify.app/post/column-name-contracts...
| dang wrote:
| One past thread:
|
| _Data Organization in Spreadsheets_ -
| https://news.ycombinator.com/item?id=17790545 - Aug 2018 (27
| comments)
___________________________________________________________________
(page generated 2021-04-24 23:01 UTC)