[HN Gopher] Grokking AVL and RAVL Trees
___________________________________________________________________
Grokking AVL and RAVL Trees
Author : photon_lines
Score : 45 points
Date : 2023-08-06 12:15 UTC (10 hours ago)
(HTM) web link (photonlines.substack.com)
(TXT) w3m dump (photonlines.substack.com)
| tgv wrote:
| This feels written by ChatGPT, certainly the first and last
| paragraph. And it omits a few important characteristics. Balanced
| trees were known, but balancing was expensive. AVL trees have an
| O(log n) cost for search, insertion and deletion, and they achieve
| this not by keeping the tree perfectly balanced but by keeping it
| almost balanced: at every node, the two subtree heights differ by
| at most one.
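|
| As a rough sketch (mine, not from the article), assuming each
| node caches the height of its own subtree, the "almost
| balanced" condition is just a balance factor of -1, 0 or +1 at
| every node:
|
|     struct Node {
|         int key;
|         int height = 1;   // height of the subtree rooted here
|         Node *left = nullptr, *right = nullptr;
|     };
|
|     int height(const Node *n) { return n ? n->height : 0; }
|
|     // AVL invariant: the two subtree heights differ by at
|     // most one, at every node.
|     bool avlBalanced(const Node *n) {
|         if (!n) return true;
|         int bf = height(n->left) - height(n->right);
|         return bf >= -1 && bf <= 1
|             && avlBalanced(n->left) && avlBalanced(n->right);
|     }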
| photon_lines wrote:
| It was mostly written by me, although I do admit to using
| ChatGPT to help out in generating some of my material. Most of
| the recent blog posts are Feynman style lecture notes I wrote
| for myself to learn new concepts; this piece I actually
| wrote well over 10 years ago and figured I'd share it in case
| anyone else wanted a quick and visual overview. Apologies for
| leaving out any important concepts. At the time I wrote down
| the material that I found important, so there are definitely
| things I may not have explained correctly.
| rrobukef wrote:
| All the examples rotate a node with a single child; what about
| the case where you have two children?
| superdisk wrote:
| AVL trees are binary trees, so each node can only have 2
| children.
| tgv wrote:
| There can be nodes with a single child. Otherwise an AVL
| tree would always have 2^n - 1 or 2^n - 2^(n-2) - 1
| nodes. Imagine adding a node to a balanced tree with 3
| nodes.
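|
| As for the original question: when the node being rotated has
| two children, one subtree just gets re-parented. A rough sketch
| (mine, not from the article) of a right rotation, assuming the
| Node struct and height() helper from the sketch above, plus
| <algorithm> for std::max:
|
|     //        y                 x
|     //       / \               / \
|     //      x   T3    ==>     T1  y
|     //     / \                   / \
|     //    T1  T2                T2  T3
|     //
|     // If x has two children, its right subtree T2 is simply
|     // re-attached as y's new left subtree; nothing is lost.
|     Node *rotateRight(Node *y) {
|         Node *x = y->left;
|         y->left = x->right;   // T2 moves over to y
|         x->right = y;
|         y->height = 1 + std::max(height(y->left), height(y->right));
|         x->height = 1 + std::max(height(x->left), height(x->right));
|         return x;             // x is the new subtree root
|     }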
| Dowwie wrote:
| What kind of data persistence strategies make sense for AVL
| trees?
| tgv wrote:
| They're only used in memory, AFAIK. I doubt they could
| outperform B-trees in a db context.
| yxhuvud wrote:
| They won't outperform b-trees in memory either. Cache lines
| and cache hierarchies of modern computers (as in, built after
| the 90s) mean that neither AVL trees nor red-black trees have
| any place anymore.
| eatonphil wrote:
| See also:
|
| > C++ standard libraries commonly implement the map and set
| data structures using Red-Black trees, which store one
| element per node, requiring three pointers plus one bit of
| information per element to keep the tree balanced. By
| comparison, B-tree containers store a number of elements
| per node, thereby reducing pointer overhead and saving a
| significant amount of memory. For small data types, B-tree
| containers typically reduce memory use by 50 to 80%
| compared with Red-Black tree containers.
|
| https://opensource.googleblog.com/2013/01/c-containers-that-...
|
| And also:
|
| > However, there are many reasons why B-trees are often a
| good choice for in-memory collections. The first reason is
| cache locality. When searching for a key in a binary tree,
| the algorithm would visit up to logN elements that are very
| likely dispersed in memory. On a B-tree, this search will
| consist of two phases -- intra-node search and descending
| the tree -- executed one after another. And while
| descending the tree doesn't differ much from the binary
| tree in the aforementioned sense, intra-node search will
| access keys that are located next to each other, thus
| making much better use of CPU caches.
|
| https://www.scylladb.com/2021/11/23/the-taming-of-the-b-tree...
|
| And also, also:
|
| > In this paper, we conduct a comprehensive performance
| study on five state-of-the-art indices, Masstree, BwTree,
| FAST, ART and PSL, with respect to throughput, scalability,
| latency, memory consumption and cache miss rate. According
| to our results, PSL achieves a similar query performance in
| terms of get throughput to a read-only index, FAST, and
| meanwhile performs up to 5x, 0.3x and 2.5x faster than
| Masstree, ART and BwTree respectively. Another interesting
| point observed in the results is that the performance of
| BwTree drops significantly when the workload is write-only,
| probably due to the contention caused by the frequent
| fails on the hot blocks. FAST and ART win the competition
| on the cost of memory space, leading other indices with a
| saving factor between 20%-100%. Our experiments also reveal
| that for the dense dataset, ART and BwTree, which belong to
| indices built based on tries, may be the suitable
| candidates for a better overall performance.
|
| https://www.comp.nus.edu.sg/~dbsystem/download/xie-icde18-pa...
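|
| To make the cache-locality point concrete, here is a rough
| sketch (mine, not from any of the linked posts) of a
| B-tree-style lookup: each node keeps its keys sorted in one
| contiguous array, so the intra-node search only touches
| adjacent memory before descending:
|
|     #include <algorithm>
|     #include <array>
|
|     struct BTreeNode {
|         static constexpr int kMaxKeys = 32;   // illustrative fan-out
|         int numKeys = 0;
|         std::array<int, kMaxKeys> keys{};     // sorted, contiguous
|         std::array<BTreeNode*, kMaxKeys + 1> child{};  // null in leaves
|     };
|
|     const BTreeNode *find(const BTreeNode *n, int key) {
|         while (n) {
|             // Phase 1: intra-node search over adjacent keys.
|             const int *first = n->keys.data();
|             const int *last = first + n->numKeys;
|             const int *pos = std::lower_bound(first, last, key);
|             if (pos != last && *pos == key) return n;
|             // Phase 2: descend into the child between the keys.
|             n = n->child[pos - first];
|         }
|         return nullptr;   // not present
|     }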
| blt wrote:
| a B-tree in the C++ STL would be a welcome addition!
| o11c wrote:
| The extra iterator and reference invalidations are nasty,
| though. There's a reason the standard chose what it did.
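|
| A small illustration (mine, not from the thread) of that
| trade-off: std::map is node-based, so inserting more elements
| never invalidates references to existing ones, something a
| B-tree-style container that shifts elements around inside its
| nodes cannot promise:
|
|     #include <cassert>
|     #include <map>
|
|     int main() {
|         std::map<int, int> m{{1, 10}};
|         int *p = &m[1];                  // pointer into the element
|         for (int i = 2; i < 1000; ++i)   // many further inserts
|             m[i] = i;
|         assert(p == &m[1]);   // still valid: nodes never move
|     }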
| jasonwatkinspdx wrote:
| Yup, b-trees should be the default these days. Even in
| memory, latencies are high enough that the high fanout and
| shallow tree dominate other concerns. Red-black trees can be
| straight up terrible on modern hardware.
|
| There are radix tree variants that still make sense vs
| b-trees, though. Typically they take up more space than
| b-trees, but their operations are bounded by the key length
| rather than the dataset size. They also form the same
| structure regardless of insertion order, making them useful
| in cryptographic contexts.
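|
| A rough sketch (mine, and ignoring the path compression that
| real radix trees such as ART use) of a bitwise trie over 32-bit
| keys, just to show both properties: the work is bounded by the
| key length, and the final shape depends only on which keys are
| present, not on insertion order:
|
|     #include <cstdint>
|     #include <memory>
|
|     struct TrieNode {
|         bool present = false;                 // a key ends here
|         std::unique_ptr<TrieNode> child[2];   // one child per bit
|     };
|
|     void insert(TrieNode *root, std::uint32_t key) {
|         TrieNode *n = root;
|         for (int bit = 31; bit >= 0; --bit) {   // <= 32 steps
|             int b = (key >> bit) & 1;
|             if (!n->child[b])
|                 n->child[b] = std::make_unique<TrieNode>();
|             n = n->child[b].get();
|         }
|         n->present = true;
|     }
|
|     bool contains(const TrieNode *n, std::uint32_t key) {
|         for (int bit = 31; bit >= 0 && n; --bit)
|             n = n->child[(key >> bit) & 1].get();
|         return n && n->present;
|     }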
| trealira wrote:
| As with linked lists, pretty much the only reasons for
| which you might want them are:
|
| a) If your keys are _really_ big, copying them around may
| be slower than using pointer manipulations; you'd have to
| benchmark this to be sure for particular cases.
|
| b) You can use them as intrusive data structures, with
| one structure in multiple trees or lists at a time, and you
| can insert one without dynamic allocation.
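|
| For (b), a minimal sketch (mine, with made-up names) of what
| "intrusive" means: the hooks live inside the element, so one
| object can sit on several lists at once and linking it in
| allocates nothing:
|
|     #include <cstdio>
|
|     struct ListHook { ListHook *prev = nullptr, *next = nullptr; };
|
|     // Minimal intrusive circular doubly-linked list.
|     struct List {
|         ListHook head;                     // sentinel node
|         List() { head.prev = head.next = &head; }
|         void push_front(ListHook *h) {     // relink only, no allocation
|             h->next = head.next;
|             h->prev = &head;
|             head.next->prev = h;
|             head.next = h;
|         }
|     };
|
|     // Two hooks -> the same object can be on two lists at once.
|     struct Connection {
|         int fd = -1;
|         ListHook idle_hook;       // linked into an "idle" list
|         ListHook pending_hook;    // linked into a "pending" list
|     };
|
|     int main() {
|         Connection c;
|         c.fd = 7;
|         List idle, pending;
|         idle.push_front(&c.idle_hook);
|         pending.push_front(&c.pending_hook);
|         std::printf("fd %d is on both lists\n", c.fd);
|     }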
___________________________________________________________________
(page generated 2023-08-06 23:01 UTC)