[HN Gopher] Grokking AVL and RAVL Trees
       ___________________________________________________________________
        
       Grokking AVL and RAVL Trees
        
       Author : photon_lines
       Score  : 45 points
       Date   : 2023-08-06 12:15 UTC (10 hours ago)
        
 (HTM) web link (photonlines.substack.com)
 (TXT) w3m dump (photonlines.substack.com)
        
       | tgv wrote:
       | This feels written by ChatGPT, certainly the first and last
       | paragraph. And it omits a few important characteristics. Balanced
       | trees were known, but balancing was expensive. AVL trees have an
       | O(log n) cost for search, insertion and deletion, and achieves
       | this not by balancing but by almost balancing.
        
         | photon_lines wrote:
         | It was mostly written by me, although I do admit to using
         | ChatGPT to help out in generating some of my material. Most of
         | the recent blog posts are Feynman style lecture notes I wrote
         | for myself to learn new concepts and this piece I actually
         | wrote well over 10 years ago and I figured to share it in case
         | anyone else wanted a quick and visual overview. Apologies for
         | missing out any important concepts. At the time I wrote down
         | material that I found important so there are definitely things
         | I may have not explained correctly.
        
           | rrobukef wrote:
           | All examples rotate a node with a single child, what about
           | the case where you have two children?
        
             | superdisk wrote:
             | AVL trees are binary trees so they can only have 2
             | children.
        
               | tgv wrote:
               | There can be nodes with a single child. Otherwise an AVL
               | tree would always have 2^n - 1 or 2^n - 2^(n-2) - 1
               | nodes. Imagine adding a node to a balanced tree with 3
               | nodes.
        
       | Dowwie wrote:
       | What kind of data persistence strategies make sense for AVL
       | trees?
        
         | tgv wrote:
         | It's only used in memory, AFAIK. I doubt they could outperform
         | B-trees in a db context.
        
           | yxhuvud wrote:
           | They won't outperform b-trees in memory either. Cache lanes
           | and cache hierarchies of modern computers (as in built after
           | the 90s) mean that neither AVL trees or red-black trees have
           | any place anymore.
        
             | eatonphil wrote:
             | See also:
             | 
             | > C++ standard libraries commonly implement the map and set
             | data structures using Red-Black trees, which store one
             | element per node, requiring three pointers plus one bit of
             | information per element to keep the tree balanced. By
             | comparison, B-tree containers store a number of elements
             | per node, thereby reducing pointer overhead and saving a
             | significant amount of memory. For small data types, B-tree
             | containers typically reduce memory use by 50 to 80%
             | compared with Red-Black tree containers.
             | 
             | https://opensource.googleblog.com/2013/01/c-containers-
             | that-...
             | 
             | And also:
             | 
             | > However, there are many reasons why B-trees are often a
             | good choice for in-memory collections.The first reason is
             | cache locality. When searching for a key in a binary tree,
             | the algorithm would visit up to logN elements that are very
             | likely dispersed in memory. On a B-tree, this search will
             | consist of two phases -- intra-node search and descending
             | the tree -- executed one after another. And while
             | descending the tree doesn't differ much from the binary
             | tree in the aforementioned sense, intra-node search will
             | access keys that are located next to each other, thus
             | making much better use of CPU caches.
             | 
             | https://www.scylladb.com/2021/11/23/the-taming-of-the-b-
             | tree...
             | 
             | And also, also:
             | 
             | > In this paper, we conduct a comprehensive performance
             | study on five state-of-the-art indices, Masstree, BwTree,
             | FAST, ART and PSL, with respect to throughput, scalability,
             | latency, memory consumption and cache miss rate. According
             | to our results, PSL achieves a similar query performance in
             | terms of get throughput to a read-only index, FAST, and
             | meanwhile performs up to 5x, 0.3x and 2.5x faster than
             | Masstree, ART and BwTree respectively. Another interesting
             | point observed in the results is that the performance of
             | BwTree drops significantly when the workload is write-only,
             | probably due to the contenction caused by the frequent
             | fails on the hot blocks. FAST and ART win the competition
             | on the cost of memory space, leading other indices with a
             | saving factor between 20%-100%. Our experiments also reveal
             | that for the dense dataset, ART and BwTree, which belong to
             | indices built based on tries, may be the suitable
             | candidates for a better overall performance.
             | 
             | https://www.comp.nus.edu.sg/~dbsystem/download/xie-
             | icde18-pa...
        
               | blt wrote:
               | a B-tree in the C++ STL would be a welcome addition!
        
             | o11c wrote:
             | The extra invalidations are nasty, though. There's a reason
             | the standard chose what it did.
        
             | jasonwatkinspdx wrote:
             | Yup, b-trees should be the default these days. Even in
             | memory latencies are so high the high fanout and shallow
             | tree dominates other concerns. Red Black trees can be
             | straight up terrible on modern hardware.
             | 
             | There's radix tree variants that still make sense vs
             | b-trees though. Typically they take up more space than
             | b-trees but have operations bounded by the key length not
             | dataset size. They also form the same structure regardless
             | of insertion order making them useful in cryptographic
             | contexts.
        
             | trealira wrote:
             | As with linked lists, pretty much the only reasons for
             | which you might want them are:
             | 
             | a) If your keys are _really_ big, copying them around may
             | be slower than using pointer manipulations; you 'd have to
             | benchmark this to be sure for particular cases.
             | 
             | b) You can use use them as intrusive data structures, with
             | one structure in multiple trees or lists at a time, and you
             | can insert one without dynamic allocation.
        
       ___________________________________________________________________
       (page generated 2023-08-06 23:01 UTC)