[HN Gopher] B-Trees Require Fewer Comparisons Than Balanced Binary Search Trees
       ___________________________________________________________________
        
       B-Trees Require Fewer Comparisons Than Balanced Binary Search Trees
        
       Author : greghn
       Score  : 155 points
       Date   : 2024-06-23 16:06 UTC (1 day ago)
        
 (HTM) web link (databasearchitects.blogspot.com)
 (TXT) w3m dump (databasearchitects.blogspot.com)
        
       | shagie wrote:
       | Long ago, I was taking the sophomore level data structures and
       | algorithms class. It was being transitioned from C to C++ and
       | being taught by Professor DeWitt. I have vague memories of linked
       | lists and then binary trees. ... And then we got to the spot
       | where the syllabus from previous years had AVL trees and red
       | black trees... and he scoffed at teaching it.
       | 
       | And that I do remember. I suspect he had been doing lecture
       | planning over the weekend (this was in the "we're redoing the
       | class, teach it as it needs to be taught" type mandate) and he
        | had us skip that section of the text. The first lecture he was
       | going off of overhead transparencies that he was writing as he
       | lectured. The second lecture he gave us photocopies of hand
       | written notes and diagrams.
       | 
       | Instead of AVL and red black trees, we instead learned about 2-3
       | trees and B trees.
       | 
       | For what it's worth, I still know how to do 2-3 trees from
       | scratch.
       | 
       | Later I looked at AVL trees... and they _still_ give me a
       | headache.
        
         | abhgh wrote:
          | Red-black trees used to make me anxious. I think the first time
         | I grasped them was after watching Erik Demaine's lecture on the
          | topic. For anyone interested it is this [1]; it's still a very
         | good reference on the topic!
         | 
         | [1] https://videolectures.net/mit6046jf05_demaine_lec10/
        
           | cassepipe wrote:
            | RB trees still make me anxious, which is why I use AA trees
           | instead
           | 
           | https://en.wikipedia.org/wiki/AA_tree
        
         | teruakohatu wrote:
         | I feel there must be better ways of learning the principles of
         | algorithms than balancing trees. I suffered through RB trees.
        
           | 082349872349872 wrote:
           | There are two fundamental things to teach in an algorithms
           | course:
           | 
           | 1) that there are many programs that implement the same
           | algorithm, and many algorithms that implement the same
           | function.
           | 
           | 2) that there are a few basic tradeoffs we can make when
           | implementing a function/customising an algorithm: space vs
            | time, read vs write, deterministic vs nondeterministic, etc.
           | 
           | I'd guess trees have traditionally been used, because there's
           | a whole zoo of them, which clearly presents (1) and allows
           | (2) to be learned (in "cookbook" style) even if they're never
           | explicitly presented (in "textbook" style).
        
           | ajuc wrote:
           | Suffering is the point. It's like exercises in math or push-
           | ups - if you find a way to do them effortlessly you won't
           | develop the skills you are training for.
           | 
           | I agree they suck tho.
        
         | slaymaker1907 wrote:
         | B-trees are definitely more useful, but I don't think they're
         | really a good idea to start out with as your first balanced
         | tree.
        
         | saghm wrote:
         | > Instead of AVL and red black trees, we instead learned about
         | 2-3 trees and B trees
         | 
         | When we first were taught about B trees in my databases course
          | in college, the professor quipped something like "these are the
          | reason you'll never actually use a red-black tree in practice".
          | I've actually found that in practice I rarely ever end up actually
         | needing to use any sort of data structure that maintains sort
         | order; for whatever reason, I've run into cases far more often
         | where I need to maintain insertion order than sort order, but
         | I'm not confident about whether this is due more to the type of
         | stuff I work on or if it's just a more common requirement in
         | general.
        
           | globular-toast wrote:
           | They're not about maintaining sort order, though? They're
           | about arbitrary lookup. Or do you mean "if I had to pick one
           | nice side effect"?
        
             | masklinn wrote:
             | If your concern is _just_ arbitrary lookups then your
             | baseline is a hashmap though.
        
               | globular-toast wrote:
               | Not really. Hashmaps don't have guaranteed performance
               | characteristics and require the "correct" hash function
               | to work properly. In particular, their performance
               | degrades as they become more "full" requiring a re-
               | allocation to a table of generally n-times the size.
                | Balanced trees could be regarded as a more generic solution
               | to the mapping problem and are more suitable in many
               | applications.
        
               | fanf2 wrote:
               | Trees require a lot more memory indirections than
               | hashmaps, which makes them slower to traverse. They
               | require a lot more interior pointers, which makes them
               | bigger. It's easy for a hashmap to beat a tree.
        
               | globular-toast wrote:
               | Not in the worst case.
        
               | sqeaky wrote:
                | Knowing when that is the case will be the developer's
                | responsibility, and it is rarely the default or normal
                | case.
        
               | masklinn wrote:
               | > Not really.
               | 
               | Yes really, that's why the vast majority of languages
               | default to a hashmap (if they even have tree-based maps
               | at all). Turns out the average application is a _lot_
               | more interested in a low-constant-factor O(1) best case
               | than in a high-constant-factor O(log n) worst case.
               | 
               | > Hashmaps don't have guaranteed performance
               | characteristics
               | 
               | Of course they do.
               | 
               | > require the "correct" hash function to work properly.
               | 
               | And tree maps require correct comparators to work
               | correctly.
               | 
               | > In particular, their performance degrades as they
               | become more "full" requiring a re-allocation to a table
               | of generally n-times the size.
               | 
               | Yes? Also "n-times" seems like pretty fuddy fud. And it
               | seems weird to be more worried by the reallocation of a
               | hashmap when a tree requires an allocation per node, so
               | your average balanced tree requires a heap allocation
               | _per entry_. And two pointers, for up to 16x overhead not
               | including the allocator's own metadata.
               | 
                | > Balanced trees could be regarded as a more generic
               | solution to the mapping problem and are more suitable in
               | many applications.
               | 
               | And the amphibious cycle could be regarded as the more
               | generic solution to the transport problem.
        
               | lrschaeffer wrote:
               | >> Hashmaps don't have guaranteed performance
               | characteristics
               | 
               | >Of course they do
               | 
               | No, they do not, at least not worst case O(1). If you
               | write a _very_ poor (but functional!) hash function or an
               | adversary is choosing your items, then congratulations,
               | your hashmap is now a linked list with extra steps.
               | 
               | Sure, you _could_ use a robust cryptographic hash
                | function, but you didn't because (a) it's not the
                | default in your language and (b) it's slow and you are "a
                | _lot_ more interested in a low-constant-factor O(1) best
                | case".
               | 
               | Otherwise, you're more or less correct.
        
               | Maxatar wrote:
               | I mean it's kind of pedantic but yes hash maps do have
               | guaranteed performance characteristics. They are not
               | guaranteed to be O(1), but they will never be something
               | like O(2^N) either. You can give guarantees about their
               | performance if you can make guarantees about the
               | distribution of the inputs and the properties of the hash
               | used.
               | 
               | In my own experience, most uses of hash maps are
                | typically O(log n). A good rule of thumb is hash maps are
               | preferable unless you need predictable worst case
               | performance or your hash maps are directly exposed to the
               | public.
        
               | masklinn wrote:
               | > A good rule of thumb is hash maps are preferable unless
               | you need predictable worst case performance or your hash
               | maps are directly exposed to the public.
               | 
                | Also, obviously, features only one of them provides, e.g.
               | if you need range queries then a hash table is quite
               | useless. Meanwhile if you need insertion ordering, that's
               | reasonably easy to do with a hashmap.
        
               | globular-toast wrote:
               | > Yes? Also "n-times" seems like pretty fuddy fud. And it
               | seems weird to be more worried by the reallocation of a
               | hashmap when a tree requires an allocation per node, so
               | your average balanced tree requires a heap allocation per
               | entry.
               | 
               | Yes, it requires allocation for every entry whereas a
               | hash map requires it for some entries. You don't know
               | which ones. So real time constraints are out the window.
               | But no it's not something your average web programmer
               | (including me) needs to worry about. Some people do,
               | though.
        
               | saghm wrote:
               | Yep, this is the thought process I had; a red-black tree
               | or B-tree is something I'd consider for a map that needed
               | sort order, and a hashmap would be what I'd use by
               | default if I didn't care about order.
               | 
               | For additional context, I work almost entirely in Rust
               | nowadays, and BTreeMap and HashMap are the two map types
               | in the standard library (https://doc.rust-
               | lang.org/std/collections/index.html). I've ended up
               | needing insertion-ordered maps quite often in my work
               | though, which is not something I've seen taught very
               | often or in standard libraries, although there are plenty
               | of third-party implementations out there, and in most
               | languages it's not too difficult to hand-roll an
               | implementation using an unordered map and a linked list
               | by keeping both the value and a reference to the node
               | containing the element in the map, which provides
               | constant time insertion, lookup and deletion. (Rust is
               | somewhat notorious for _not_ being a language where
               | implementing something like this by hand is super easy
               | though, since tracking mutable references to individual
               | nodes of a linked list across separate hashmap values is
               | not something the borrow checker is enthusiastic about)
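                | 
                | As a minimal sketch of that recipe in Rust (illustrative
                | names, not a real crate), sidestepping the borrow checker
                | by storing indices into a Vec instead of linked-list node
                | references:
                | 
                |     use std::collections::HashMap;
                |     use std::hash::Hash;
                | 
                |     // Insertion-ordered map sketch: a Vec keeps entries in
                |     // insertion order, a HashMap maps each key to its slot.
                |     // Removed slots become None (tombstones), so insert,
                |     // lookup, and removal all stay O(1) amortized.
                |     struct InsertionOrderedMap<K, V> {
                |         slots: Vec<Option<(K, V)>>,
                |         index: HashMap<K, usize>,
                |     }
                | 
                |     impl<K: Eq + Hash + Clone, V> InsertionOrderedMap<K, V> {
                |         fn new() -> Self {
                |             Self { slots: Vec::new(), index: HashMap::new() }
                |         }
                | 
                |         fn insert(&mut self, key: K, value: V) {
                |             match self.index.get(&key).copied() {
                |                 // Updating keeps the original position.
                |                 Some(i) => self.slots[i] = Some((key, value)),
                |                 None => {
                |                     self.index.insert(key.clone(), self.slots.len());
                |                     self.slots.push(Some((key, value)));
                |                 }
                |             }
                |         }
                | 
                |         fn get(&self, key: &K) -> Option<&V> {
                |             let &i = self.index.get(key)?;
                |             self.slots[i].as_ref().map(|(_, v)| v)
                |         }
                | 
                |         fn remove(&mut self, key: &K) -> Option<V> {
                |             let i = self.index.remove(key)?;
                |             self.slots[i].take().map(|(_, v)| v)
                |         }
                | 
                |         // Live entries, in insertion order.
                |         fn iter(&self) -> impl Iterator<Item = (&K, &V)> {
                |             self.slots
                |                 .iter()
                |                 .filter_map(|s| s.as_ref().map(|(k, v)| (k, v)))
                |         }
                |     }
                | 
                | (The tombstones waste space until a compaction pass; crates
                | like indexmap make different trade-offs here.)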
        
           | jeffparsons wrote:
           | > I've run into cases far more often where I need to maintain
           | insertion order than sort order, but I'm not confident about
           | whether this is due more to the type of stuff I work on or if
           | it's just a more common requirement in general.
           | 
           | I've had the exact opposite experience, so I'm guessing it
           | has a lot to do with the type of stuff you've (and I've)
           | worked on.
        
           | kristopolous wrote:
           | Honestly for me it's either linear search, dictionary, or
           | external store like database and that's actually it.
           | 
           | I think I've had to implement my own traversable data
           | structure literally only in interviews
        
             | saghm wrote:
             | To clarify, I definitely don't mean I've been implementing
             | these structures by hand! Even when using data structures
             | from the standard library or other dependencies though, I
             | just don't end up needing sort-ordered collections very
             | often in my work, and I thought that was interesting
             | (although sibling comments might indicate that this isn't
             | necessarily typical).
        
           | pcwalton wrote:
            | In many cases just using a hash table and sorting the keys is
            | faster than using a B-tree anyway. I tried switching from a
            | hash table with a separate sort to a B-tree in the Bevy game
            | engine for binning of transparent objects recently and it was
            | a big regression.
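            | 
            | A sketch of that pattern (the key and value types here are
            | illustrative, not Bevy's actual ones): bucket with a HashMap
            | at O(1) per insert, then sort only the bin keys once at the
            | end.
            | 
            |     use std::collections::HashMap;
            | 
            |     // Group items into bins with a hash map, then sort just
            |     // the bin keys once, instead of paying rebalancing costs
            |     // on every insert into a sorted tree.
            |     fn bin_then_sort<'a>(
            |         items: &[(u32, &'a str)],
            |     ) -> Vec<(u32, Vec<&'a str>)> {
            |         let mut bins: HashMap<u32, Vec<&'a str>> = HashMap::new();
            |         for &(key, item) in items {
            |             bins.entry(key).or_default().push(item); // O(1) amortized
            |         }
            |         let mut out: Vec<_> = bins.into_iter().collect();
            |         out.sort_unstable_by_key(|e| e.0); // one O(b log b) sort over b bins
            |         out
            |     }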
        
           | shagie wrote:
           | > When we first were taught about B trees in my databases
           | course in college
           | 
           | I'll note that the instructor for this was Professor David
           | DeWitt - https://en.wikipedia.org/wiki/David_DeWitt
           | 
           | He normally taught databases and one of my great regrets was
           | dropping his 500 level database class one semester when
           | things were rough. (Another regret was not taking networking
           | from Professor Landweber
           | https://en.wikipedia.org/wiki/Lawrence_Landweber .)
        
             | prerok wrote:
             | May I ask why you regret it?
             | 
             | It could be that my professors were simply not that good,
              | so that I don't really understand this. I mostly grokked the
             | subjects by working through the books and exercises in
             | them. The professors' lectures almost always went over some
             | detail too quickly, so in a 3 hour lecture, I usually ended
             | up following only the first hour or so.
             | 
             | My question is maybe better put as: what quality or
             | approach makes them great lecturers?
        
               | shagie wrote:
                | Having a class in databases from the person the DeWitt
                | Clause in Oracle licenses is named after.
               | 
               | I had his class before (the aforementioned sophomore data
               | structures and algorithms) and I knew he was a good
               | lecturer. That wasn't a question. It was a "too much
               | going on that semester and I wasn't able to do the
               | assignments in a timely manner".
               | 
               | Professor Landweber's class had a waiting list that I
               | scoffed at as a college student and looked for something
               | that I could get into and fill the requirement without
                | realizing _who_ he was. It wasn't until later while
               | reading the UNIX and Linux System Administration Handbook
               | and seeing an image in there that was attributed to him
               | that the name clicked but he wasn't teaching that next
               | semester and I had filled that requirement elseclass.
               | 
               | Another regret was dropping numerical methods with
               | Professor de Boor (
               | https://en.wikipedia.org/wiki/Carl_R._de_Boor ) when I
               | couldn't get my head around splines rather than going to
               | his office hours (he was also the undergraduate advisor
               | and _very_ helpful).
               | 
               | Those classes could have changed the trajectory of my
               | career. I was looking at being a sysadmin instead and
               | writing perl and keeping some big iron running. My career
               | pivoted from sysadmin to webmaster to web developer
               | (perl) to web developer (Java). I have no regrets from
               | that path - I do have regret for not learning what I
               | could from the experts in their field when I was in a
               | place and time to learn from them.
        
           | Twirrim wrote:
           | Slightly random anecdote. Had an intern one summer at a large
           | tech company I worked for, who was determined to inject a
        | red-black tree into the project they were working on for us,
           | using hand-rolled code.
           | 
           | I'm not sure what was going through his head at the time,
           | whether he'd just learned how to write them at college, or if
           | he just thought they were the pinnacle of engineering.
           | 
            | None of the cases they were trying to shoehorn them into
           | made sense. Places where you might debate that even a
           | straight array would be sufficient, due to the limited number
           | of items. The worst case of searching to the end of the array
           | wouldn't matter. Doubly so given they weren't working on
           | anything anywhere near a hot path, or a time sensitive
           | operation. Standard library components would have been
           | perfectly fine, and not carried the maintenance burden.
           | 
            | They did, on at least a couple of occasions, smugly explain
            | what a red-black tree was to their mentor, who had likely
            | implemented their first red-black tree at college before that
            | intern had ever even touched a computer for the first time.
           | He showed _remarkable_ patience.
        
             | rdtsc wrote:
             | > I'm not sure what was going through his head at the time,
             | whether he'd just learned how to write them at college, or
             | if he just thought they were the pinnacle of engineering.
             | 
             | That's understandable and rather common with new fresh
             | grads. They have the energy and motivation but may not
             | quite have the experience. We were all there once,
             | including me.
             | 
              | My history professor from 8th grade used to say: "There is a
             | difference between knowing what you're doing and doing what
             | you know".
        
             | saghm wrote:
             | > I'm not sure what was going through his head at the time,
             | whether he'd just learned how to write them at college, or
             | if he just thought they were the pinnacle of engineering.
             | 
             | I think it's easy to forget that a lot of interns might
             | never have actually worked on a "real" codebase before; I
             | know I certainly hadn't before my first internship! For
             | someone who had only ever worked on homework code in
             | classes before and maybe some side projects for fun, it's
             | very easy to imagine that they honestly didn't understand
             | that implementing the hardest data structure they knew by
             | hand wouldn't actually be an impressive display of
             | engineering to the company they were hoping to get a job
             | offer from.
        
               | mywittyname wrote:
               | I remember part of it being that they beat efficiency
               | into your brain throughout most of early CS. In college,
               | using the naive approach is treated as being ignorant.
               | 
               | Then you work on real software and realize that computers
               | are pretty fast and basic hashmaps & arrays will likely
               | suffice for the size of the datasets you're working with,
               | and beyond that, performance is more about tool/database
               | choice than rolling your own algorithms.
        
               | Twirrim wrote:
                | First code reviews are always interesting with interns; I
               | love reading through them to learn stuff (I'm an
               | operations focussed / sysadmin type, coding is more
               | something I've picked up as I've gone along and wanted to
               | automate stuff)
               | 
               | This particular intern was dead set on this red-black
               | tree, despite it being rejected constructively,
               | repeatedly, every time they tried to introduce it, and
               | despite mentoring on the subject from the experienced
               | developer who was his mentor, and guidance from other
               | engineers in the group.
               | 
                | He was the only intern I ever had to deal with who didn't
                | seem to recognise that production code is different from
                | academic code, where the balance of clever vs
                | maintainable is vastly different :)
        
         | 29athrowaway wrote:
          | Left-leaning red-black trees are simpler to implement than
          | regular red-black trees.
        
           | skribanto wrote:
           | Left leaning red black trees enforce extra constraints, which
           | remove symmetries. This actually adds an edge case when
           | instead you can just treat left/right rotations and
           | comparisons as symmetric operations.
           | 
            | Left-Leaning Red-Black Trees Considered Harmful:
           | https://www.read.seas.harvard.edu/~kohler/notes/llrb.html
        
         | tgv wrote:
         | AVL isn't complicated. Insert the node as you would in a binary
         | tree. Do that recursively. Then, when you return from
          | inserting, you check at each level if the heights of the trees
          | under the left and right children differ by at most 1. If not, you
         | perform a little shuffle of the nodes, update the levels, and
         | return to the previous level. That's all. The proof that that
         | works is a bit complicated, though.
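          | 
          | A minimal sketch of that procedure in Rust (set semantics,
          | insert only; `rebalance` is the little shuffle):
          | 
          |     type Link = Option<Box<Node>>;
          | 
          |     struct Node {
          |         key: i64,
          |         height: i32, // height of the subtree rooted here
          |         left: Link,
          |         right: Link,
          |     }
          | 
          |     fn height(link: &Link) -> i32 {
          |         link.as_ref().map_or(0, |n| n.height)
          |     }
          | 
          |     fn update(node: &mut Box<Node>) {
          |         node.height = 1 + height(&node.left).max(height(&node.right));
          |     }
          | 
          |     // Standard single rotations; each re-derives the heights it
          |     // changes.
          |     fn rotate_left(mut node: Box<Node>) -> Box<Node> {
          |         let mut r = node.right.take().expect("needs a right child");
          |         node.right = r.left.take();
          |         update(&mut node);
          |         r.left = Some(node);
          |         update(&mut r);
          |         r
          |     }
          | 
          |     fn rotate_right(mut node: Box<Node>) -> Box<Node> {
          |         let mut l = node.left.take().expect("needs a left child");
          |         node.left = l.right.take();
          |         update(&mut node);
          |         l.right = Some(node);
          |         update(&mut l);
          |         l
          |     }
          | 
          |     // The "little shuffle": restore
          |     // |height(left) - height(right)| <= 1 at this node.
          |     fn rebalance(mut node: Box<Node>) -> Box<Node> {
          |         update(&mut node);
          |         let balance = height(&node.left) - height(&node.right);
          |         if balance > 1 {
          |             // Left-heavy; turn a left-right case into left-left.
          |             let l = node.left.as_ref().unwrap();
          |             if height(&l.right) > height(&l.left) {
          |                 node.left = Some(rotate_left(node.left.take().unwrap()));
          |             }
          |             rotate_right(node)
          |         } else if balance < -1 {
          |             let r = node.right.as_ref().unwrap();
          |             if height(&r.left) > height(&r.right) {
          |                 node.right = Some(rotate_right(node.right.take().unwrap()));
          |             }
          |             rotate_left(node)
          |         } else {
          |             node // already balanced at this level
          |         }
          |     }
          | 
          |     // Insert as in a plain BST, then rebalance on the way back
          |     // up. Usage: root = Some(insert(root.take(), key));
          |     fn insert(link: Link, key: i64) -> Box<Node> {
          |         match link {
          |             None => Box::new(Node { key, height: 1, left: None, right: None }),
          |             Some(mut node) => {
          |                 if key < node.key {
          |                     node.left = Some(insert(node.left.take(), key));
          |                 } else if key > node.key {
          |                     node.right = Some(insert(node.right.take(), key));
          |                 }
          |                 rebalance(node)
          |             }
          |         }
          |     }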
        
         | e12e wrote:
         | I have a soft spot for red-black trees ever since reading about
         | them in Dr. Dobbs - it's the first time I remember reading
         | about a data structure (and it went mostly over my head at the
          | time - but the fond memory remains):
         | 
         | I believe it was the 1992 article by Bruce Schneier:
         | 
         | https://jacobfilipp.com/DrDobbs/articles/DDJ/1992/9204/9204c...
        
         | DanielHB wrote:
         | > Later I looked at AVL trees... and they still give me a
         | headache.
         | 
         | Why? From what I remember AVL trees were much easier to
         | implement than B-trees. Especially if using recursion.
        
           | whizzter wrote:
            | Insertion in AVL trees is fairly simple and doesn't really
            | cascade, so it is quite predictable; deletions, on the other
            | hand, cascade upwards with a bunch of different imbalances
            | and thus rotations (and post-rotation rebalances) to be
            | accounted for.
           | 
           | Recently implemented them for an advanced hobby project and
           | having a solid fuzzing tester in place really alleviated a
           | lot of headaches in the long term since it was able to
           | produce quite an exhaustive set of conditions.
        
             | nayuki wrote:
              | B-tree deletions are quite complicated too, requiring
              | stealing keys from a sibling on either side or merging
              | with such a sibling.
             | 
             | In practice, I find that a B-tree implementation is 2 to 3x
             | the length of an AVL tree impl.
             | 
             | I also fuzz-tested my implementations thoroughly.
             | https://www.nayuki.io/page/avl-tree-list ,
             | https://www.nayuki.io/page/btree-set , etc.
        
               | DanielHB wrote:
               | Yeah, this is what I was thinking too. The rebalancing on
               | AVL is a bit tricky but once you grok it, it is not that
               | much code...
        
         | nayuki wrote:
         | I would not scoff at AVL trees. They are easy to prove
         | correctness for, and admit a short implementation at ~100
          | lines. They're as short as AA trees, but AA trees are harder to
          | prove correct, with more cases. https://www.nayuki.io/res/aa-tree-
         | set/BasicAvlTreeSet.java , https://www.nayuki.io/page/aa-tree-
         | set
         | 
         | B-trees require a much longer implementation than AVL trees.
         | https://www.nayuki.io/page/btree-set
        
         | harrison_clarke wrote:
         | red black trees can be thought of as 2-4 trees, and they're way
         | less headache-inducing when you do
         | 
         | if you inline the red nodes into the black nodes, you get a 2-4
         | tree node. except that the 3-nodes have chirality. but then,
         | you can just use a 2-4 tree
         | 
         | AVL trees seem not worth learning. they have slightly different
         | performance characteristics than red-black trees. but, if i
         | care about performance beyond big O, i'm probably using a hash
         | table or a cache-aware tree of some kind instead of a CS101
         | data structure
        
       | cperciva wrote:
       | In a sense, red-black and 2-3 trees are just a special case of
       | B-trees with small nodes. Smaller B-tree nodes have (by a
       | constant factor) more expensive searching than larger nodes; but
       | smaller nodes are more efficient for mutating operations. And
       | indeed, if you have an insert/delete heavy workload you can't
       | beat something like a red-black tree.
        
         | anonymoushn wrote:
         | This sounds surprising, unless you mean specifically a workload
         | where a bunch of inserts and deletes are made using a stored
         | cursor, so that they don't need to follow a bunch of pointers
         | from the root.
        
         | aidenn0 wrote:
         | AA trees are isomorphic to 2,3 trees and RB trees are
         | isomorphic to 2,4 trees. IIRC we learned 2,4 trees first in my
         | data-structures class, and then demonstrated that RB trees were
         | a different representation of the same idea.
        
       | masklinn wrote:
       | > Practical B-trees often use fairly large values of k (e.g.,
       | 100) and therefore offer tight bounds -- in addition to being
       | more cache-friendly than binary search trees.
       | 
       | That is true on-disk, where you would not use BSTs anyway. In-
       | memory, mid to high single digit values are practical. For
       | instance iirc Rust's btree (map and set) uses something like k=6.
        
         | leni536 wrote:
         | Interesting, Abseil's btree claims to use a larger value:
         | 
          |  _... for absl::btree_set<int32_t>, nodes currently have 62
         | children ..._
         | 
         | https://abseil.io/about/design/btree
        
           | vlovich123 wrote:
           | From the looks of it Rust [1] uses a constant branching
           | factor based on number of items whereas ABSEIL generally uses
           | a target of 256 bytes [3] for branching and fits however many
           | elements fit within that [2]. Rust's approach seems to be
           | more primitive as ABSEIL is optimizing for cache line usage
           | (not sure why it's several multiples of a cache line - maybe
           | to help the prefetcher or to minimize cache line bouncing?)
           | 
           | [1] https://github.com/rust-
           | lang/rust/blob/master/library/alloc/...
           | 
           | [2] https://github.com/abseil/abseil-
           | cpp/blob/74f8c1eae915f90724...
           | 
           | [3] https://github.com/abseil/abseil-
           | cpp/blob/74f8c1eae915f90724...
        
             | masklinn wrote:
             | > not sure why it's several multiples of a cache line -
             | maybe to help the prefetcher or to minimize cache line
             | bouncing?
             | 
             | The AMD64 cache line is only 64 bytes, that would make for
             | very low branching factors given interior nodes need a
             | pointer per child, plus key, plus record pointer if b-tree
             | (as opposed to b+tree).
        
               | vlovich123 wrote:
               | To be clear, I was examining only leaf nodes. I'm not
               | sure if interior nodes use the same logic. Still, ABSEIL
               | generally uses a larger branching factor than Rust for
               | some undocumented reason.
        
       | rtheunissen wrote:
       | That does not mean that b-trees are unequivocally better than
       | binary search trees. There are some applications, like
       | concurrency or persistence, where comparison count is not as
       | important.
       | 
       | For instance, fewer values per node means less information to
       | copy when copying the node. Locking is more granular with fewer
       | values per node because fewer values are locked at a time.
       | 
       | The binary structure also makes it possible to support randomized
       | balancing and self-adjusting strategies like splay trees.
       | 
       | I am disappointed to still see frequent mention of red-black
       | trees - I see no reason for red-black trees to ever be the best
        | choice in practice, or in theory, or in school. LBSTs
       | (logarithmic binary search trees) [1] are generally simpler, more
       | intuitive, and more efficient [2] than red-black trees. They also
       | support positional access inherently, so the balancing
       | information is useful (subtree size).
       | 
       | [1] https://www.semanticscholar.org/paper/A-New-Method-for-
       | Balan... [2] https://rtheunissen.github.io/bst
        
         | dragontamer wrote:
         | > For instance, fewer values per node means less information to
         | copy when copying the node. Locking is more granular with fewer
         | values per node because fewer values are locked at a time.
         | 
         | But this also assumes single-threaded computation.
         | Increasingly, high-performance computers are SIMD-compute (aka:
         | GPUs, and if not, at a minimum you're using AVX512 or ARM SVE).
         | 
         | If you have 1024-threads per thread group (common maximum in
         | GPUs), that will commonly be used to create 1024-element B-Tree
         | nodes, as sorting can be accomplished in a single pass network.
         | (Bitonic network: https://en.wikipedia.org/wiki/Bitonic_sorter)
         | 
         | A singular lock that covers the whole 1024-node would better
         | match these very wide GPUs that perform very wide SIMD-
         | execution in practice. Physically, GPUs are 32-wide for NVidia
         | and 32-wide or 64-wide for AMD. But various software tricks
         | increase the width by either ganging-up neighboring compute-
         | units in parallel and synchronizing them... or by running the
         | same code across sequential passes like in AMD GCN
         | architecture.
         | 
         | However, the programming interface of a 1024-wide (looking)
         | piece of hardware/equipment naturally maps to 1024-wide B-Tree
         | nodes, a 1024-wide Bitonic sort (or MergePath sort), and other
         | such parallel sorting networks.
         | 
         | ----------
         | 
          | B-Trees have a branching factor of "some number larger than
          | 2", and for modern high-performance SIMD-based GPUs, I think
          | the number absolutely should be much larger than 2.
         | 
          | What I'm wondering is whether the optimal algorithm is 32-wide
         | (aka: one NVidia wave, the smallest unit of computation on an
         | NVidia GPU), or if the optimal algorithm is 1024-wide (the
         | maximum gang of waves that can work as a team). Or if maybe
         | using the said maximum-gang across an even larger node (ex:
         | 4096-wide) has any benefits.
         | 
         | --------
         | 
         | We are reaching the point where compute is so cheap that even a
         | 1024-wide parallel sort is a trifling thing.
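          | 
          | For reference, the bitonic network in scalar form (a hedged CPU
          | sketch; on a GPU, every iteration of the inner loop over i runs
          | as its own thread, one pass per (k, j) step):
          | 
          |     // Bitonic sorting network for a power-of-two-length slice.
          |     // Each (k, j) pass is a set of independent compare-and-swaps,
          |     // which is what makes it map onto wide SIMD/GPU hardware.
          |     fn bitonic_sort(a: &mut [u32]) {
          |         let n = a.len();
          |         assert!(n.is_power_of_two());
          |         let mut k = 2;
          |         while k <= n {
          |             let mut j = k / 2;
          |             while j > 0 {
          |                 for i in 0..n {
          |                     let partner = i ^ j;
          |                     if partner > i {
          |                         // Direction alternates per k-sized block.
          |                         let ascending = (i & k) == 0;
          |                         if (a[i] > a[partner]) == ascending {
          |                             a.swap(i, partner);
          |                         }
          |                     }
          |                 }
          |                 j /= 2;
          |             }
          |             k *= 2;
          |         }
          |     }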
        
       | orlp wrote:
       | One thing to keep in mind is that keeping the keys sorted on an
       | insert requires O(k) element moves, so an insert is O(log2(n) +
       | k) operations. So if you make k very large your inserts get
       | slower.
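        | 
        | Concretely, within a node the position is found in O(log k)
        | comparisons, but the tail of the array still has to shift; a
        | sketch:
        | 
        |     // Insert into one node's sorted key array: binary_search is
        |     // O(log k) comparisons, but Vec::insert shifts keys[pos..]
        |     // right by one, which is O(k) element moves.
        |     fn node_insert(keys: &mut Vec<u64>, key: u64) {
        |         if let Err(pos) = keys.binary_search(&key) {
        |             keys.insert(pos, key);
        |         }
        |     }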
        
       | kazinator wrote:
       | How B-trees can win is by taking advantage of the node children
       | being sorted in an array. This allows a binary search to be used.
       | 
       | A B-tree of any branching factor (not only 2) is equivalent to a
       | binary tree.
       | 
        | We can take any B-tree node _n_ :
        | 
        |                    n
        |                    |
        |          ---------+-----------
        |          |   |    |   |   |  |
        |          a   b    c   d   e  f
        | 
        | and regard it as an unbalanced tree
        | 
        |              n
        |             / \
        |            a  /\
        |              b  /\
        |                c  /\
        |                  d  /\
        |                    e  /\
        |                      f  nil
       | 
       | When searching for an item, if we linearly search every node of
        | the B-tree, like from _a_ to _f_, it is as if we are traversing the
       | unbalanced tree (except for constant-time instruction and cache
       | level efficiencies).
       | 
       | What the unbalanced tree cannot do is binary search through _a_
        | to _f_ to find which one to descend into; it doesn't have those
       | nodes in a neat array for that.
       | 
        | That binary search makes the B-tree more or less equivalent to a
        | very balanced binary tree.
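        | 
        | A sketch of that descent step (a binary search over the node's
        | sorted separator array, which the flattened unbalanced tree has
        | no way to do):
        | 
        |     // One step of a B-tree search. keys is the node's sorted
        |     // separator array; children has keys.len() + 1 entries.
        |     // Ok(i) means the key was found in this node at slot i;
        |     // Err(i) means descend into children[i]. Either way it costs
        |     // O(log k) comparisons instead of O(k) for a linear scan.
        |     fn search_step(keys: &[u64], key: u64) -> Result<usize, usize> {
        |         keys.binary_search(&key)
        |     }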
        
         | jvanderbot wrote:
         | I don't think so?
         | 
          | I think the b-tree is using the start/end sequence knowledge at
          | each node to do _fewer_ node evaluations. That is, a b-tree is
         | "shorter" so of course it'll take fewer node evaluations to
         | reach the leaf.
         | 
         | "b-tree is equivalent to balanced binary tree", and "unbalanced
         | tree", don't make a ton of sense when ostensibly the article
         | compares b-trees to balanced BST.
        
         | bunderbunder wrote:
         | Though, just spitballing here - I'd guess that, in practice, a
         | linear search still tends to handily outperform a binary search
         | (and, by extension, a well-balanced binary tree) whenever the
         | keys are stored in a compact array. Due to better interaction
         | with the cache and speculative execution.
        
           | sam0x17 wrote:
           | yeah locality of reference is way better as you're traversing
           | the array than as you follow links in the tree
        
           | shpongled wrote:
           | Definitely depends on the branching factor! There are other
            | encodings like Eytzinger that can provide further improvements
           | over "standard" binary search too [1].
           | 
           | [1] https://en.algorithmica.org/hpc/data-structures/binary-
           | searc...
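            | 
            | The core of the Eytzinger idea in a sketch (following the
            | linked article): store the sorted values in BFS order so each
            | search step jumps to index 2i or 2i+1, a predictable pattern
            | that is friendlier to prefetching than halving a range.
            | 
            |     // Fill eyt (1-based, eyt[0] unused) from sorted values by
            |     // an in-order walk of the implicit tree whose children of
            |     // node i are 2i and 2i+1.
            |     fn build(sorted: &[u32], eyt: &mut [u32], i: usize, pos: &mut usize) {
            |         if i < eyt.len() {
            |             build(sorted, eyt, 2 * i, pos);
            |             eyt[i] = sorted[*pos];
            |             *pos += 1;
            |             build(sorted, eyt, 2 * i + 1, pos);
            |         }
            |     }
            | 
            |     // Branchless lower-bound search over the Eytzinger layout.
            |     // Returns the index in eyt of the first value >= key, or 0
            |     // if every value is smaller.
            |     fn eytzinger_lower_bound(eyt: &[u32], key: u32) -> usize {
            |         let mut i = 1;
            |         while i < eyt.len() {
            |             i = 2 * i + usize::from(eyt[i] < key);
            |         }
            |         // Cancel the descents taken past the answer.
            |         i >> ((!i).trailing_zeros() + 1)
            |     }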
        
           | kazinator wrote:
           | Suppose the key comparisons are much more expensive than the
           | array or pointer navigation. Like the article does: it says
           | that it is about theory, and it is looking at number of
           | comparisons as the performance measure.
           | 
           | The fact that we can search a B-tree node that has, say, 256
           | children in 8 steps means that the node effectively contains
           | the equivalent of a small balanced binary tree.
        
           | kevin_thibedeau wrote:
           | It does. The break even point varies by platform but linear
           | will win for sufficiently short arrays.
        
           | torginus wrote:
           | I remember reading a blogpost about this, they compared
           | binary search on integer keys with a highly-optimized
           | unrolled SIMD linear search.
           | 
            | The takeaway was that the break-even point was surprisingly
            | small - just tens of elements, which is crazy considering a
           | CPU can do like 2x 8-wide AVX2 integer comparisons in a
           | single cycle. Modern CPUs are just that good at executing
           | branchy code.
           | 
           | But with B-trees we can have the best of both worlds - trees
           | with a relatively high branch factor, like 16, that can still
           | be tested in one go using SIMD.
           | 
           | Another advantage is cache locality - which is one of the
           | reasons B-trees are favored when implementing filesystems -
           | much of the tree doesn't need to be fetched into memory while
           | the parts that do are contiguous in memory.
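            | 
            | The branch-free form of that linear search, for illustration
            | (counting keys less than the probe instead of breaking out
            | early is what lets the compiler unroll and vectorize it):
            | 
            |     // Rank of the probe within a small sorted node: no
            |     // data-dependent branches, so this autovectorizes well
            |     // (e.g. to AVX2 compares) for node sizes like 16.
            |     fn linear_rank(keys: &[u32], probe: u32) -> usize {
            |         keys.iter().map(|&k| usize::from(k < probe)).sum()
            |     }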
        
         | sqeaky wrote:
         | I think you might be missing the real performance
         | characteristics of the hardware. The amount of comparisons
         | might be the same, but comparisons are only a proxy for the
         | expensive parts of lookups, the latency to tell RAM to provide
         | data.
         | 
         | If the whole tree is in CPU cache then likely the difference is
         | noise, but if the tree is large enough to have some data in RAM
         | then the cost to copy that into cache will be larger based not
         | on the size, but on the amount of queries from the CPU to the
          | RAM (unless the size is way out of proportion). RAM has latency
          | that is generally an order of magnitude worse than CPU cache
          | and generally an order of magnitude better than fast
          | contemporary storage.
         | 
         | There are many places you can find "Latency numbers every
         | programmer needs to know"
         | https://fullstackengineer.pro/blog/latency-numbers-programme...
         | 
          | But this trend has held for about 30 years and seems fundamental
          | to the economics of building hardware. CPUs will be fastest,
          | and cache slower, and RAM slower still, and fast storage slower
          | still, etc.
         | 
         | From your example, if it takes 100ns to get the node containing
         | a~f then the whole of the B-Tree search for f might take 150ns
          | (1 lookup from RAM + a few CPU instructions), but with the
         | strictly binary tree it will need 6 lookups to RAM, and we can
         | see that we can ignore the CPU time because it is so small.
         | This could take 600ns (or 1300ns if each branch is a node with
         | only pointers and the data needs to be dereferenced too). A
         | smart cache prefetcher might guess that these were pointers and
         | precache a and b once n was referenced, and might even cache
         | things pointed to by loaded pointers but even in that highly
         | optimistic scenario it could still take 300ns to load stuff.
         | 
         | Consider filesystems on storage and how they have had this
          | problem but worse for a long time. Tree structures with many
          | children per node are faster than binary trees in proportion to
          | the latency between the thing processing the tree and the place
          | the tree is stored.
         | 
          | EDIT - And don't get me started on clever use of SIMD
          | instructions. Imagine if a node in a tree had a binary layout
          | matching an AVX register; then comparisons looking for the
         | searched item could check a whole node in a single instruction!
         | I am curious if there are any implementations that do this?
         | 
         | EDIT 2 - It does exist and I am late to the show, people have
         | been doing this for like 15 years:
         | https://stackoverflow.com/questions/20616605/using-simd-avx-...
        
           | kazinator wrote:
           | Like the article, I'm only concerned with comparison counts,
           | trying to explain why B-trees look like balanced BSTs, and
           | why that gets tighter and tighter with increasing numbers of
           | children per node.
           | 
           | It's simply because the nodes themselves contain the
           | equivalent of a balanced binary tree, in the form of a
           | binary-searched array.
           | 
           | In the ultimate case, we set the number of children to be
           | large enough to contain all the data, so then there is only
           | one B-tree node, which is binary searched to find the
            | element. Then we have the same number of comparisons as a
            | well-balanced binary search tree.
        
           | 9659 wrote:
            | Just because it was first discovered 15 years ago does not
            | mean it isn't still a good idea.
           | 
           | Keep on thinking. You never know what else you will come up
           | with.
        
       | shin_lao wrote:
        | B-Trees are a good fit if your values are small, but as the value
        | size grows, they lose their advantage.
        | 
        | A binary tree is a good default choice for a sorted structure.
        | 
        | Underappreciated structure: sorted arrays (but they come with big
        | caveats).
        
       | nayuki wrote:
       | Blogspot supports HTTPS. Please always use it when linking to the
       | site.
        
       | exabrial wrote:
        | I think either is a good choice when you know nothing about the
        | data set.
       | 
       | After n grows large enough, I think you'd want to benchmark your
       | -exact- dataset on your -exact- prod hardware. Some datasets may
       | cause certain trees to perform badly. Some hardware is much
       | faster at memory lookups and cache misses are not as painful.
       | 
       | For instance, developing locally on Intel and deploying to Arm
       | could yield pretty wild performance differences!
        
       ___________________________________________________________________
       (page generated 2024-06-24 23:01 UTC)