[HN Gopher] Static B-Trees: A data structure for faster binary s...
___________________________________________________________________
Static B-Trees: A data structure for faster binary search
Author : sereja
Score : 162 points
Date : 2022-02-17 17:42 UTC (5 hours ago)
(HTM) web link (en.algorithmica.org)
(TXT) w3m dump (en.algorithmica.org)
| ww520 wrote:
| For static data, would it be faster to use a trie or radix tree?
| sydthrowaway wrote:
| I need this for my interview skills.
| ignoramous wrote:
| An interesting resource.
|
| Just this past day, I prototyped a Skip List backing a sorted
| key/value store, and it was super simple to do (compared to an
| AVL or Red-Black tree).
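|
| (For flavor, a minimal sketch of the idea -- not the prototype
| described here -- using plain int keys instead of key/value
| pairs: search and insert are each a single descending loop over
| the levels, with no rebalancing, which is where the simplicity
| comes from.)
|
|   #include <climits>
|   #include <cstdlib>
|   #include <vector>
|
|   struct Node {
|       int key;
|       std::vector<Node*> next;  // next[i] = successor on level i
|       Node(int k, int h) : key(k), next(h, nullptr) {}
|   };
|
|   struct SkipList {
|       static const int MAX_LEVEL = 32;
|       Node head{INT_MIN, MAX_LEVEL};
|
|       static int random_height() {  // geometric, p = 1/2
|           int h = 1;
|           while (h < MAX_LEVEL && (rand() & 1)) h++;
|           return h;
|       }
|
|       bool contains(int key) const {
|           const Node* x = &head;
|           for (int i = MAX_LEVEL - 1; i >= 0; i--)
|               while (x->next[i] && x->next[i]->key < key)
|                   x = x->next[i];
|           return x->next[0] && x->next[0]->key == key;
|       }
|
|       void insert(int key) {
|           Node* update[MAX_LEVEL];
|           Node* x = &head;
|           for (int i = MAX_LEVEL - 1; i >= 0; i--) {
|               while (x->next[i] && x->next[i]->key < key)
|                   x = x->next[i];
|               update[i] = x;  // last node before `key` on level i
|           }
|           Node* n = new Node(key, random_height());
|           for (size_t i = 0; i < n->next.size(); i++) {
|               n->next[i] = update[i]->next[i];
|               update[i]->next[i] = n;
|           }
|       }
|   };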
|
| B-Trees are notorious for their implementation complexity,
| which is how SingleStore (formerly memSQL) justified moving
| their tables over to Skip Lists: https://archive.is/Qhpgn
|
| It would be interesting to see the author attempt faster Skip
| Lists, as their expertise in this topic seems deep enough.
| josephg wrote:
| I've been wondering something similar. I recently published a
| Rust rope library based on skip lists. (Ropes are "fat strings"
| optimized for efficient insertion and deletion of characters at
| arbitrary positions.)
|
| My rope is backed by a skip list, where each node in the skip
| list has a gap buffer. Even without the gap buffer my skip list
| is ~1.5x faster than the next fastest library on cargo (ropey -
| which is b-tree based). With gap buffers the resulting code is
| 4x faster. My library is also significantly smaller, both in
| lines of code and compiled size.
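|
| (For anyone unfamiliar with the gap-buffer half of that design,
| here is a rough sketch of the concept in C++ -- not jumprope's
| actual Rust code. Each leaf holds a small array with a movable
| gap at the edit point, so repeated inserts near the same spot
| only shuffle a handful of bytes.)
|
|   #include <cstring>
|   #include <string>
|
|   // One leaf's worth of text, with a gap at the edit position.
|   struct GapBuffer {
|       static const int CAP = 128;
|       char buf[CAP];
|       int gap_start = 0, gap_end = CAP;  // [gap_start, gap_end) empty
|
|       int len() const { return CAP - (gap_end - gap_start); }
|
|       void move_gap(int pos) {           // pos in [0, len()]
|           if (pos < gap_start) {         // slide text up past the gap
|               int n = gap_start - pos;
|               std::memmove(buf + gap_end - n, buf + pos, n);
|               gap_start = pos; gap_end -= n;
|           } else if (pos > gap_start) {  // slide text down into gap
|               int n = pos - gap_start;
|               std::memmove(buf + gap_start, buf + gap_end, n);
|               gap_start += n; gap_end += n;
|           }
|       }
|
|       bool insert(int pos, char c) {     // O(1) when pos is near gap
|           if (gap_start == gap_end) return false;  // node full: split
|           move_gap(pos);
|           buf[gap_start++] = c;
|           return true;
|       }
|
|       std::string text() const {
|           return std::string(buf, gap_start)
|                + std::string(buf + gap_end, CAP - gap_end);
|       }
|   };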
|
| I've got another b-tree library I've been using for CRDT
| editing in Diamond Types. Based on this result I'm considering
| rewriting my btree as a skip list to see how it compares. It'd
| be a direct apples-to-apples comparison - I've already tuned my
| btree as well as I can. I'm very curious how performance would
| change.
|
| https://crates.io/crates/jumprope
| CamperBob2 wrote:
| What's the advantage of a skip list over a simple hash map?
| Comevius wrote:
| Skip lists are so simple conceptually, I'm managing to make
| them concurrent, verifiable and even persistent.
|
| For reference, this is a concurrent skip list that a high school
| student could implement. It's beautiful.
|
| https://people.csail.mit.edu/shanir/publications/LazySkipLis...
| sadiq wrote:
| If anyone wants one based on Herlihy et al's "The Art of
| Multiprocessor Programming" there's a well reviewed, tested
| and heavily commented concurrent lock-free one in OCaml's
| runtime now: https://github.com/ocaml/ocaml/blob/trunk/runtim
| e/lf_skiplis...
|
| It's used to manage code fragments that need to be accessed
| in signal handlers.
| ot wrote:
| Note that the article is exclusively about _static_ data. The
| vast majority of the complexity of B-trees comes from
| supporting updates, but constructing a balanced B-tree from a
| sorted list is relatively straightforward.
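|
| For instance, with an implicit (pointer-free) layout in which
| node k's children are numbered k*(B+1)+1 ... k*(B+1)+B+1, the
| whole build is just an in-order traversal that hands out the
| sorted keys. A rough sketch of that idea (not the article's
| exact code; B and the INT_MAX padding are arbitrary choices
| here):
|
|   #include <climits>
|   #include <vector>
|
|   constexpr int B = 16;  // keys per node
|
|   struct StaticBTree {
|       std::vector<int> keys;  // nblocks * B keys, node by node
|       int nblocks;
|
|       static int child(int k, int i) { return k * (B + 1) + i + 1; }
|
|       explicit StaticBTree(const std::vector<int>& sorted) {
|           int n = (int) sorted.size();
|           nblocks = (n + B - 1) / B;
|           keys.assign((size_t) nblocks * B, INT_MAX);  // padding
|           int t = 0;  // next element of the sorted input to place
|           build(sorted, 0, t);
|       }
|
|       // An in-order walk of the implicit tree assigns the sorted
|       // keys in symmetric order, which is exactly the B-tree
|       // ordering invariant.
|       void build(const std::vector<int>& a, int k, int& t) {
|           if (k >= nblocks) return;
|           for (int i = 0; i < B; i++) {
|               build(a, child(k, i), t);
|               if (t < (int) a.size())
|                   keys[(size_t) k * B + i] = a[t++];
|           }
|           build(a, child(k, B), t);
|       }
|   };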
| marcosdumay wrote:
| Why does the article benchmark indexes on full scan operations?
| ot wrote:
| Very interesting post. The best results I knew of for static
| ordered search [1] were from the 2015 Khuong-Morin paper [2],
| which is not cited in this article, though the author cites it
| in the previous post; in their benchmarks, Eytzinger +
| prefetching performed best. The improvement on top of that is
| pretty impressive. At the larger sizes, the latencies are
| basically one cache miss (~50ns).
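|
| For reference, the core of that approach looks roughly like the
| sketch below (my paraphrase using GCC/Clang builtins, not the
| paper's code): the sorted keys are permuted into BFS order so
| the search just walks k -> 2k or 2k+1, and a prefetch a few
| levels ahead hides most of the memory latency. The array is
| assumed to be 1-indexed and padded so the prefetch never points
| past the end.
|
|   #include <cstdint>
|   #include <vector>
|
|   // t[1..] holds the keys in BFS (Eytzinger) order, t[0] unused.
|   size_t eytzinger_lower_bound(const std::vector<int32_t>& t,
|                                int32_t key) {
|       size_t k = 1;
|       while (k < t.size()) {
|           __builtin_prefetch(t.data() + k * 16);  // 4 levels ahead
|           k = 2 * k + (t[k] < key);
|       }
|       // strip the low bits to recover the last node where the
|       // search went left, i.e. the last key that was >= `key`
|       k >>= __builtin_ffsll((long long) ~k);
|       return k;  // 1-based index of first key >= `key`, 0 if none
|   }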
|
| However, I have one comment on the benchmark: the iterations
| have no loop dependency on the previous result, so the CPU is
| allowed to pipeline some loads. This is rarely representative of
| real workloads, where the key being searched comes from other
| computations, so this is more a measure of the highest possible
| reciprocal throughput than of actual latency. It would be
| interesting to see what happens if they make the key being
| searched a function of the result of the previous lookup. (EDIT:
| as gfd points out below, the article actually discusses this!)
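|
| Concretely, something like this hypothetical harness (sketched
| around std::lower_bound just for illustration) is the difference
| between the two numbers: folding a bit of the previous answer
| into the next key serializes the lookups, so the loop measures
| latency rather than reciprocal throughput.
|
|   #include <algorithm>
|   #include <chrono>
|   #include <cstdint>
|   #include <vector>
|
|   size_t lower_bound_idx(const std::vector<int32_t>& a,
|                          int32_t key) {
|       return std::lower_bound(a.begin(), a.end(), key) - a.begin();
|   }
|
|   // Average ns/lookup when each query depends on the last result.
|   double measure_latency(const std::vector<int32_t>& a,
|                          const std::vector<int32_t>& queries) {
|       auto t0 = std::chrono::steady_clock::now();
|       int32_t carry = 0;
|       for (int32_t q : queries) {
|           // the previous answer feeds the next key, so the CPU
|           // cannot overlap the cache misses of adjacent lookups
|           size_t r = lower_bound_idx(a, q ^ carry);
|           carry = (int32_t) (r & 1);
|       }
|       auto t1 = std::chrono::steady_clock::now();
|       return std::chrono::duration<double, std::nano>(t1 - t0)
|                  .count() / queries.size();
|   }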
|
| Also, some of the tricks only work for 32-bit keys, which is
| what allows the vectorized search within nodes. For larger keys,
| different layouts may still be preferable.
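|
| As a concrete illustration of why the key width matters, the
| in-node search boils down to something like this with AVX2 and
| 32-bit keys (a sketch in the spirit of the article, not its
| exact code); with 64-bit keys each instruction covers only half
| as many comparisons:
|
|   #include <cstdint>
|   #include <immintrin.h>  // AVX2, compile with -mavx2
|
|   // How many of the 16 sorted keys in `node` are <= `key`,
|   // i.e. which child to descend into (unused slots padded
|   // with INT32_MAX).
|   int node_rank(const int32_t* node, int32_t key) {
|       __m256i x  = _mm256_set1_epi32(key);
|       __m256i a  = _mm256_loadu_si256((const __m256i*) node);
|       __m256i b  = _mm256_loadu_si256((const __m256i*) (node + 8));
|       __m256i ga = _mm256_cmpgt_epi32(a, x);  // node[i] > key ?
|       __m256i gb = _mm256_cmpgt_epi32(b, x);
|       int lo = _mm256_movemask_ps(_mm256_castsi256_ps(ga));
|       int hi = _mm256_movemask_ps(_mm256_castsi256_ps(gb));
|       unsigned mask = (unsigned) lo | ((unsigned) hi << 8);
|       return __builtin_ctz(mask | (1u << 16));  // 16 if none greater
|   }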
|
| [1] Ordered search is when you need to look up keys that are not
| in the dataset, and find predecessor/successor. If you only do
| exact lookups, a hashtable will almost certainly perform better,
| though it would require a bit more space.
|
| [2] https://arxiv.org/pdf/1509.05053.pdf
| gfd wrote:
| If you search for "reciprocal throughput" in the article, the
| author did do another benchmark with dependencies to compare
| "real latency".
| ot wrote:
| Thanks! You're right, I missed it because it was in the
| section "Comparison with std::lower_bound" and I thought
| there would be just some gloating :)
|
| Most published data structure benchmarks have this pitfall and
| don't discuss it; then, when you try to measure the improvement
| after plugging the structure into a real workload, it turns out
| the microbenchmark was not predictive. When I see latencies that
| top out at 50ns even though the structure clearly has to make
| more than one causal load, I immediately get skeptical.
|
| This is one of the very few articles I've seen that clearly
| explains this, which makes it even more excellent. I only
| wish it didn't label the other graphs "Latency" too, or at
| least put an asterisk there.
| sereja wrote:
| By the time I realized it needed correcting, I was already like
| 15 graphs in and really didn't want to remake them.
| _0ffh wrote:
| Somewhat understandable. Since I sometimes stumble into
| comparable situations, I've made it a habit never to make
| graphs by hand: I always write a script that generates them.
| That makes it easy and quick to go back and regenerate them if
| you later change your mind about some detail.
| rurban wrote:
| Still does not beat static perfect hashes, the usual solution in
| these cases.
| SNosTrAnDbLe wrote:
| This is a very good resource. I am bookmarking it! It is very
| engineer-friendly.
| tabtab wrote:
| In general the "best" algorithm seems dependent on specific usage
| patterns. A different angle is to study or sample actual usage
| patterns and automatically adjust the structure/algorithm based
| on that. The indexes can be readjusted during off/slow hours. (If
| you don't have off hours, then such a system may not be the best
| fit.)
___________________________________________________________________
(page generated 2022-02-17 23:00 UTC)