[HN Gopher] Schema on write is better to live by
       ___________________________________________________________________
        
       Schema on write is better to live by
        
       Author : hrishi
       Score  : 27 points
       Date   : 2021-08-21 20:31 UTC (2 hours ago)
        
 (HTM) web link (hrishioa.github.io)
 (TXT) w3m dump (hrishioa.github.io)
        
       | Timothycquinn wrote:
       | To OP - You sound kinda like me. I went over to Ecco Pro when I
       | started doing programming as the amount of information coming
       | into my head was way too much and I needed something that had low
       | overhead, seamless moving and organization of data. I found a
       | single-panel outliner (like Ecco Pro, Omni outliner...) works
       | great for my brain.
       | 
       | Every time I download a notetaking app to try a new system, I
       | pray it has a single-panel outlining feature, but nope. I can't
       | count the number of times I considered writing my own.
       | 
       | Now that you have mentioned Notion, I checked and it has an
       | outline mode built right in. Woo Woo!! The only thing left to sell
       | me is Linux support, since Linux is my primary desktop environment.
       | 
       | Thanks for the post!
        
       | InGoldAndGreen wrote:
       | This is super interesting! I've always had a strong preference
       | for schema on write, both in databases and in life - prefer to
       | organize the cupboards when I first set up house, rather than
       | dumping everything in and hoping for the best. So I'm definitely
       | very inclined to immediately accept your basic premise: schema on
       | write is substantially better.
       | 
       | But I've ended up in several heated discussions on this, in the
       | context of SQL vs NoSQL. The one argument I've been slightly
       | persuaded by is: schema-on-read is significantly more flexible
       | than schema-on-write. In most of my actual programming
       | applications I still use SQL, because in coding I think it's
       | better to prioritize planning and structure over flexibility.
       | 
       | If there's one area where flexibility is necessary, it's real
       | life. When I first start researching something new, I don't
       | usually have enough knowledge to actually structure my schema
       | effectively, and the usefulness degrades. Of course, you can
       | update the schema. This isn't always great. If your schema needs
       | to change constantly, it wastes a lot of time. Depending on
       | complexity, it can also just be a massive cost. Eg: when I have
       | the kitchen set up all nicely, but then we get a new blender with
       | five different attachments and now I need to find an empty shelf
       | for them. Took about an hour to reshuffle everything
       | satisfactorily.
       | 
       | All that said, I'd still say that schema-on-write is better than
       | schema-on-read. Some structure is usually better than no
       | structure.
       | 
       | However, I've recently been reading a book that I think gives an
       | interesting, different insight into this problem - Designing Data-
       | Intensive Applications, by Martin Kleppmann. I've always
       | considered the main categorization of databases to be schema-on-
       | write vs schema-on-read, but this gives a completely different
       | method: databases are either document-based, relational, or
       | graph. Relational databases we're all generally familiar with,
       | while document-based is similar to today's NoSQL.
       | 
       | Graph databases have fallen out of favour, but I actually think
       | that they might be the best at representing the human information
       | gathering process. They have a structure that's provided on
       | write, but isn't always consistent across entries - because it's
       | flexible, and can be added to very easily. This lets us expand
       | the schema as we gather more information and our view of the
       | world changes, without needing to rearrange our past knowledge. I
       | also feel like a graphlike structure better represents how we
       | think.
       | 
       | Honestly, the main useful point I got from the book is that
       | schema-on-write doesn't need to mean lack of flexibility. That is
       | the case with most of the RDBMSs we use today, so it's what I've
       | come to expect. But that shows a lack of imagination on my part,
       | rather than any inherent restriction.
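
A minimal Python sketch of the idea in the comment above: structure is
supplied on write, but each entry may carry different properties, and new
node or edge kinds can be added later without rearranging what is already
stored. The PropertyGraph class, its methods, and the kitchen example data
are hypothetical illustrations, not the API of any real graph database.

```python
class PropertyGraph:
    """Tiny in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}  # node id -> dict of arbitrary properties
        self.edges = {}  # source id -> list of (label, target id)

    def add_node(self, node_id, **props):
        # Properties are given at write time, but no fixed set of
        # columns is enforced across nodes.
        self.nodes.setdefault(node_id, {}).update(props)

    def add_edge(self, source, label, target):
        # Edges may only connect nodes that already exist.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges.setdefault(source, []).append((label, target))

    def neighbors(self, source, label=None):
        # Follow outgoing edges, optionally filtered by edge label.
        return [
            target
            for edge_label, target in self.edges.get(source, [])
            if label is None or edge_label == label
        ]


def build_example():
    g = PropertyGraph()
    g.add_node("kitchen", kind="room")
    g.add_node("blender", kind="appliance")
    g.add_edge("blender", "stored_in", "kitchen")

    # Later, new knowledge arrives: a new property and a new edge label.
    # Existing nodes and edges are left untouched.
    g.add_node("whisk", kind="attachment", dishwasher_safe=True)
    g.add_edge("whisk", "attaches_to", "blender")
    return g
```

Adding the "whisk" node with an extra property and a new edge label needs
no migration of the earlier entries, which is the flexibility the comment
attributes to graph-like models.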
        
         | Timothycquinn wrote:
         | I agree completely about graph DBs. I used a third-party graph
         | database that was schema-heavy on nodes and edges and it was
         | amazing for mapping business processes, as the customers could
         | actually understand the data model and we did not need any
         | DBAs. I wrote several custom enterprise systems based on this
         | graph backend that were very successful in their deployment and
         | upkeep.
         | 
         | The company that has this DB never sold it on its own and it's
         | only available with their $$$$ CAD / CAM / 3D / PDM / PLM
         | software packages used by the big guys (e.g. Boeing, Honeywell
         | ...)
        
       | simonw wrote:
       | I've come around to almost the opposite approach.
       | 
       | I pull all of the data I can get my hands on (from Twitter,
       | GitHub, Swarm, Apple Health, Pocket, Apple Photos and more) into
       | SQLite database tables that match the schema of the system that
       | they are imported from.
       | 
       | For my own personal Dogsheep
       | (https://simonwillison.net/2020/Nov/14/personal-data-warehous...)
       | that's 119 tables right now.
       | 
       | Then I use SQL queries against those tables to extract and
       | combine data in ways that are useful to me.
       | 
       | If the schema of the systems I am importing from changes, I can
       | update my queries to compensate for the change.
       | 
       | This protects me from having to solve for a standard schema up
       | front - I take whatever those systems give me. But it lets me
       | combine and search across all of the data from disparate systems
       | essentially at runtime.
       | 
       | I even have a search engine for this, which is populated by SQL
       | queries against the different source tables. You can see an
       | example of how that works at
       | https://github.com/simonw/datasette.io/blob/main/templates/d... -
       | which powers the search interface at https://datasette.io/-/beta
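
A minimal Python sketch of the approach described in the comment above:
each source keeps its own schema in SQLite, and a common shape is imposed
only at query time. The table names, column names, and sample rows are
hypothetical and are not taken from Dogsheep or Datasette.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create two tables that mirror their (hypothetical) source schemas."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tweets (
            id INTEGER PRIMARY KEY, full_text TEXT, created_at TEXT
        );
        CREATE TABLE commits (
            sha TEXT PRIMARY KEY, message TEXT, committed_date TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO tweets VALUES (?, ?, ?)",
        [
            (1, "Shipped a new release", "2021-08-20T10:00:00"),
            (2, "Reading about graph databases", "2021-08-21T09:30:00"),
        ],
    )
    conn.executemany(
        "INSERT INTO commits VALUES (?, ?, ?)",
        [("abc123", "Fix import script", "2021-08-20T18:45:00")],
    )
    return conn


# The unified "schema" lives in the query, not in the stored tables.
TIMELINE_SQL = """
    SELECT 'tweet' AS type, CAST(id AS TEXT) AS key,
           full_text AS text, created_at AS happened_at
    FROM tweets
    UNION ALL
    SELECT 'commit', sha, message, committed_date
    FROM commits
    ORDER BY happened_at
"""


def timeline(conn: sqlite3.Connection):
    """Return (type, key, text, happened_at) rows across both sources."""
    return conn.execute(TIMELINE_SQL).fetchall()
```

If a source system renames a column, only TIMELINE_SQL needs to change in
this sketch; the imported tables keep whatever shape the source provides.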
        
       | tjoff wrote:
       | This hits home in lots of ways and I'm going through a similar
       | struggle and realizations.
       | 
       | Thing is, I used to have an excellent memory. I could recall a
       | four-year-old blog post I read once and instinctively know
       | whether it would help me with my current task (or enough of it to
       | solve it right away).
       | 
       | But now my memory is failing me (overload? stress? who knows). I
       | still instinctively know I've read something related to something
       | but I can't remember enough for it to help me, nor to find it
       | again. I can't remember the punch-line, or if there even was one.
       | It makes for the most boring and cringe anecdotes you've ever
       | heard. "Oh, yeah! I read about that, there was a thing and then
       | there was a conclusion. I'm not sure on the thing and the
       | conclusion could go either way." Worthless. Yet I still go for
       | it, because I'm used to remembering enough of it for it to be
       | helpful/relevant. Now I can barely trust what I do think I
       | remember.
       | 
       | I've come to the same conclusion. Schema on write. But as is
       | noted, "_It is a lot of work._" And I'm struggling. Because
       | previously the _very_ useful and rewarding "hoarding" (wasn't
       | set out to hoard information, was just a side effect driven by
       | curiosity) was dirt cheap. But now it takes a lot of work to
       | condense, and it isn't at all clear that the effort is worth it.
       | For sure it depends. But even the act of looking up condensed
       | information tilts the scale a bit, you ~need to know that the
       | thing you are looking for exists in the first place or it might
       | just become another distraction. But it doesn't matter, I'm too
       | tired for it anyway and my mind refuses to adapt. I want more,
       | but I can't cope and get nothing done.
       | 
       | More than a schema and order, I need balance, perhaps by
       | (forcefully?) constraining myself. Someone on HN tried
       | text-mode-only Linux (I recommend reading or skimming
       | https://dev.to/jackdoe/tty-only-1ijn). Maybe. I am lost. (Hence
       | the ramble.)
        
       ___________________________________________________________________
       (page generated 2021-08-21 23:00 UTC)