[HN Gopher] Data Organization in Spreadsheets (2017)
       ___________________________________________________________________
        
       Data Organization in Spreadsheets (2017)
        
       Author : Hagelin
       Score  : 41 points
       Date   : 2021-04-24 16:42 UTC (6 hours ago)
        
 (HTM) web link (www.tandfonline.com)
 (TXT) w3m dump (www.tandfonline.com)
        
       | jpcooper wrote:
       | Maybe not the best place to ask, but how do people organise their
       | Pandas research? Are there any useful methods?
        
         | closed wrote:
         | Is there a specific aspect of pandas research you're interested
         | in? There are a lot of useful guides around table-based
         | workflows that might be helpful :).
         | 
         | I would start w/ different strategies on how to model data in
         | tables. One problem that I often see in pandas data analyses is
         | people treating the data like it's a web app database (many
         | small, normalized tables), rather than joining the data into a
         | few big, denormalized tables. The latter makes it easier for
         | people to answer their own questions, instead of relying on a
         | bunch of tiny custom functions someone wrote!
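         | 
         | A minimal sketch of that idea, assuming made-up tables (the names
         | users, products, and orders are hypothetical, not from a real
         | dataset): join the small normalized tables once, then ask
         | questions of the single wide table.

```python
import pandas as pd

# Hypothetical normalized tables, the way a web app database might store them.
users = pd.DataFrame({"user_id": [1, 2], "country": ["SE", "US"]})
products = pd.DataFrame({"product_id": [10, 11], "category": ["book", "toy"]})
orders = pd.DataFrame({
    "order_id": [100, 101, 102],
    "user_id": [1, 2, 1],
    "product_id": [10, 10, 11],
    "amount": [12.0, 12.0, 30.0],
})

# Join once into a single wide, denormalized table.
# validate="many_to_one" raises if a lookup table has duplicate keys,
# which would otherwise silently duplicate order rows.
wide = (
    orders
    .merge(users, on="user_id", how="left", validate="many_to_one")
    .merge(products, on="product_id", how="left", validate="many_to_one")
)

# Questions now become plain group-bys on one table.
revenue_by_country = wide.groupby("country")["amount"].sum()
```

         | With the wide table in hand, a new question (say, revenue per
         | category) is another one-line groupby rather than a new helper.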
         | 
         | * Hadley's tidy data paper:
         | https://vita.had.co.nz/papers/tidy-data.pdf
         | 
         | * Normalizing data:
         | https://en.wikipedia.org/wiki/Database_normalization
         | 
         | * Denormalized data:
         | https://en.wikipedia.org/wiki/Denormalization
         | 
         | * Emily Riederer, column names as contracts:
         | https://emilyriederer.netlify.app/post/column-name-contracts...
        
       | dang wrote:
       | One past thread:
       | 
       |  _Data Organization in Spreadsheets_ -
       | https://news.ycombinator.com/item?id=17790545 - Aug 2018 (27
       | comments)
        
       ___________________________________________________________________
       (page generated 2021-04-24 23:01 UTC)