[HN Gopher] B-Trees: More Than I Thought I'd Want to Know
       ___________________________________________________________________
        
       B-Trees: More Than I Thought I'd Want to Know
        
       Author : hochmartinez
       Score  : 270 points
       Date   : 2025-01-03 09:04 UTC (1 days ago)
        
 (HTM) web link (benjamincongdon.me)
 (TXT) w3m dump (benjamincongdon.me)
        
       | prpl wrote:
       | Missing b-link trees/concurrency/locking, but maybe that really
       | is more than you want to know
        
         | brcmthrowaway wrote:
         | Resource for that?
        
         | iambvk wrote:
         | I am working on implement on-disk B+Tree for the last few
         | months. Man, keeping the on-disk state consistent with proper
         | locking is a real challenge -- specially when we want to avoid
         | IO-holding-mutex. And forward/backward iterators make me doubt
         | the correctness.
         | 
         | All this with just fixed size keys/values. I am yet to start on
         | variable sized keys and values, but I already want to give up
         | on my initial performance targets.
        
       | qianli_cs wrote:
       | Great article -- it clearly explains "The devil is in the
       | details" :) Would love to see another one for LSM-Tree, and the
       | comparison between B-Trees and LSM-Trees.
        
       | kuharich wrote:
       | Past comments: https://news.ycombinator.com/item?id=28221612
        
       | shayansm1 wrote:
       | a good source for implementing b-tree using golang:
       | https://build-your-own.org/database/04_btree_code_1
        
       | samlightfoot wrote:
       | A big fan of this article. I was in a similar position to the
       | author in having a 'vague understanding'. Excellent for anyone
       | wanting to solidify their mental model.
        
       | sgloutnikov wrote:
       | Great article! Here's one more, including animations on B-trees:
       | https://planetscale.com/blog/btrees-and-database-indexes
        
         | GamerAlias wrote:
         | Personally this was more useful than the original article.
         | Original article mostly aimed at someone who is slightly more
         | familiar with BTrees which is fine
        
           | dxdm wrote:
           | I don't know, this article is immediately confusing, because
           | the rules seem wrong, or inconsistent at least. That makes it
           | harder to understand what's going on, unless I suppose one
           | knows already.
           | 
           | The first rule does not allow trees containing less than 2
           | elements.
           | 
           | The second rule makes me wonder how "N" is being used in the
           | rules. The first rule treats it as a per-node variable, per
           | the second rule it has to be a variable external to the node.
           | Also, what if N is not divisible by 2?
           | 
           | Now that I don't trust the rules anymore, I cannot tell what
           | the third rule is trying to do or if it's even correct,
           | because so far it has not been stated what purpose these
           | rules are trying to accomplish.
           | 
           | Rule 4 contradicts rule 1.
           | 
           | Now, the ordering rules (and the demo) do not allow multiple
           | elements with the same key. Is this on purpose? Database
           | indexes support this, so it would be nice to get one sentence
           | about, so I don't have to wonder why this general
           | introduction does not seem to deal with it.
           | 
           | I'm only this far in, and it already threw multiple wrenches
           | into my attempts at thinking along. Now I can try to fill the
           | gaps in the explanation myself, but I'm wondering how much I
           | can trust the interactive elements to test my own
           | understanding.
           | 
           | This article has a lot of potential, but it could really use
           | an editing pass or two.
        
       | hiimkeks wrote:
       | > Slotted pages are composed of three components: a header
       | (containing metadata about the page), cells (variable-sized
       | "slots" for data to be stored in), and offset pointers (an array
       | of pointers to those cells). The benefit of this layout is that
       | you can store variable sized data, since the cells can be of
       | variable size, and you don't need to move that data to logically
       | reorder it. Reordering the positions of the pointers in the
       | pointer array is sufficient. This is inexpensive since the
       | pointers are small, and in a well-known position at the beginning
       | of the page. In other words, as long as the offset pointers are
       | ordered in key-sorted order, it doesn't matter where in the
       | actual page the keys are stored.
       | 
       | Does that really make a difference if we have to re-write the
       | entire tree node anyway, even if it's just for reordering the
       | pointers (because node == page)?
        
         | tomnipotent wrote:
         | > Does that really make a difference if we have to re-write the
         | entire tree node
         | 
         | The I/O write is a fixed cost, but sorting the data to match
         | the pointers would require additional CPU cycles for an outcome
         | that wouldn't really make a difference for subsequent
         | operations.
        
       | QuadrupleA wrote:
       | I made a SQLite disk-page explorer a little while ago, helped me
       | understand B-trees a lot better:
       | 
       | https://github.com/QuadrupleA/sqlite-page-explorer
        
       ___________________________________________________________________
       (page generated 2025-01-04 23:00 UTC)