[HN Gopher] The subtleties of proper B+Tree implementation
___________________________________________________________________
The subtleties of proper B+Tree implementation
Author : tim_sw
Score : 44 points
Date : 2023-12-10 19:28 UTC (3 hours ago)
(HTM) web link (ayende.com)
(TXT) w3m dump (ayende.com)
| mondobe wrote:
| I've got a Data Structures and Algorithms final tomorrow, likely
| including a few 2-4 Tree problems... I will be forever grateful
| that we at least don't have to implement them, just simulate
| them.
| airstrike wrote:
| just curious.. what language is it being taught in?
| widforss wrote:
| When I did it it wasn't in any language. We just drawed the
| evolution of the trees on paper. I got my degree in 2021.
| waynesonfire wrote:
| Uhh.... maybe dont roll your own data structure if such a basic
| issue is causing you havoc.
| travisjungroth wrote:
| This makes it sound like it's the author's fault or something.
| They're pointing out a pretty subtle gotcha.
| schneems wrote:
| Maybe link to a resource for people to download a good
| implementation of that data structure. Maybe link to an article
| describing hot so implement this data structure correctly in
| depth. Maybe try finding some other way to add to the
| conversation instead of disparaging the author for sharing
| something they learned. Maybe don't post anything at all if you
| don't have something productive to say. Maybe use nonviolent-
| communication (NVC) techniques to say exactly what you just
| said, but without condescension or attack.
| travisjungroth wrote:
| There's something to generalize here. If you run out of X, split
| by X. This bug happens when you run out of _space_ , but split by
| _count_. Author's comment said he fixed it by splitting by size
| (space taken).
| cperciva wrote:
| Note: In most cases you don't want to split pages prior to
| insertion as is being done here. Instead, you want to insert data
| first, and then restore the page size invariants at the end
| (prior to committing the operation / writing to disk / etc). If
| you're doing multiple operations at once (e.g. adding new keys
| and deleting others) it may turn out that you can avoid the
| expensive page-splitting operation by postponing it.
| hyperman1 wrote:
| When I implemented this, long ago, I had a expanded page
| structure containing the old page plus the new key. So it had
| more memory than 1 page. Then the expanded page, when writing it
| back, becomes 2 pages if a split is needed.
|
| I did it this way for a few reasons. One was I could release old
| key pages after new pages were written, avoiding data loss in IO
| errors. Second was key (prefix) compression became a lot more
| easy. I never realised it, but it avoids the trouble in both blog
| post too.
|
| It comes with some copy overhead, unfortunately.
| NooneAtAll3 wrote:
| what's the point of storing different sized keys together?
___________________________________________________________________
(page generated 2023-12-10 23:00 UTC)