[HN Gopher] Solving the ABA Problem in Rust with Tagged Pointers
       ___________________________________________________________________
        
       Solving the ABA Problem in Rust with Tagged Pointers
        
       Author : todsacerdoti
       Score  : 36 points
       Date   : 2025-02-11 12:26 UTC (3 days ago)
        
 (HTM) web link (minikin.me)
 (TXT) w3m dump (minikin.me)
        
       | Phelinofist wrote:
       | Can't the ABA problem circumvented quite easily by making the
       | shared state immutable? Every modification will create a new
       | instance, like so (in JAVA):                 class State {
       | private int value;         private String anotherValue;
       | State(int value, String anotherValue) {           this.value =
       | value;           this.anotherValue = anotherValue;         }
       | State changeValue(int newValue) {           return new
       | State(newValue, anotherValue);         }              // same for
       | anotherValue       }              AtomicReference<State> state =
       | new AtomicReference<>(new State(1, "a");              State
       | oldState;       State newState;       for(;;) {         oldState
       | = this.state.get();         newState = oldState.changeValue(2);
       | if(this.state.compareAndSet(oldState, newState)) {
       | break;         }       }
       | 
       | Since object identity is used for comparison, and every
       | transition is a new object, the ABA problem cannot happen because
       | - even if the state are equal - the comparison will fail.
        
         | whizzter wrote:
         | In Java if you hold an reference to an "old" object then that
         | will be kept alive until all references are exhausted, thus an
         | identity comparison comparing pointers will never succeed as
         | long as someone holds an reference (that is compared to).
         | 
         | In languages with more manual management, the pointer itself if
         | used can be reused and succeed in bad cases (though from my
         | cursory knowledge of Rust this seems to indicate quite a bit of
         | unsafe behaviours).
        
         | wat10000 wrote:
         | The problem isn't the mutability of the State object, but the
         | mutability of the pointer to it.
         | 
         | Your approach works in Java, but it only works if you have a
         | garbage collector that can ensure the memory is not freed until
         | there are no extant pointers to it, or you never free the
         | objects. The ABA problem occurs when one writer reads the old
         | pointer, another writer swaps out the pointer and frees the old
         | object, a third writer allocates a new object that happens to
         | be placed at the exact same location as the first one, then the
         | first writer does its compare-and-swap.
        
         | gpderetta wrote:
         | In any language with GC, a lot of complex concurrency problems
         | become much simpler (but of course now you have to deal with
         | the GC).
        
       | amluto wrote:
       | The tagged pointer example says, in the safety notes:
       | 
       | > Pointer Dereferencing: Pointers are only dereferenced after
       | successful CAS operations
       | 
       | I think a rather stronger argument should be needed. But the code
       | doesn't even follow what the notes say:
       | let current = self.head.load(Ordering::Acquire);
       | if current.ptr.is_null() {                     return None;
       | }                      let next = unsafe { (*current.ptr).next };
       | 
       | No CAS. And this sure looks wrong -- the current head could be
       | popped and freed before being dereferenced.
       | 
       | edit: put the correct code in. Thanks, LegionMammal978
        
         | LegionMammal978 wrote:
         | Those lines don't actually dereference anything, except for
         | (*new_node) which is owned by the function. Immediately
         | following them is the CAS to ensure that current is still at
         | self.head. If it gets deallocated, then the CAS fails, and it
         | tries again with the new self.head.
        
           | amluto wrote:
           | Whoops, I pasted entirely the wrong code!
           | let current = self.head.load(Ordering::Acquire);
           | if current.ptr.is_null() {                     return None;
           | }                      let next = unsafe {
           | (*current.ptr).next };
           | 
           | On the one hand, one might say "well, it's fine if we read
           | the wrong value here if the old head was popped in the mean
           | time", but it's still UB, and it can even segfault if the old
           | head is freed and unmapped.
        
             | empath75 wrote:
             | Miri does catch a problem here:
             | 
             | > error: Undefined Behavior: memory access failed: expected
             | a pointer to 8 bytes of memory, but got 0x2f5f40[noalloc]
             | which is a dangling pointer (it has no provenance) -->
             | src/main.rs:263:33 | 263 | let next = unsafe {
             | (*current.ptr).next }; | ^^^^^^^^^^^^^^^^^^^ memory access
             | failed: expected a pointer to 8 bytes of memory, but got
             | 0x2f5f40[noalloc] which is a dangling pointer (it has no
             | provenance) | = help: this indicates a bug in the program:
             | it performed an invalid operation, and caused Undefined
             | Behavior = help: see https://doc.rust-
             | lang.org/nightly/reference/behavior-conside... for further
             | information = note: BACKTRACE on thread `unnamed-2`: =
             | note: inside `LockFreeStack::pop` at src/main.rs:263:33:
             | 263:52
        
               | amluto wrote:
               | I _think_ this is an unrelated problem: https://doc.rust-
               | lang.org/std/ptr/index.html#provenance
               | 
               | Fixing it cleanly seems like it might need a new
               | AtomicPtrAndU64 or the ability to have an atomic 128-bit
               | struct, and I have no idea whether the latter is even
               | conceptually compatible with miri.
        
       | LegionMammal978 wrote:
       | The slotmap crate [0], which stores an unordered object list in a
       | Vec with persistent indices, implements a similar tagging scheme
       | to avoid ABA problems. It's neat to see how this can be applied
       | to raw pointers as well.
       | 
       | [0] https://docs.rs/slotmap/latest/slotmap/
        
       | drivebyhooting wrote:
       | For 32bit or 64bit tagged pointer is there a big risk of version
       | overflow?
        
         | convolvatron wrote:
         | oh certainly. this is one of those slightly unsatisfactory
         | consistency stories where its not a closed answer, its just
         | arbitrarily unlikely that a conflict will arise. we've already
         | got a _really_ small window where one thread has the old
         | pointer and the other thread frees it, it goes through the
         | heap, and is assigned the same address as the old version. now
         | exponentiate the likelihood of that happening with the size of
         | the tag space.
        
           | gpderetta wrote:
           | You just need to make it less likely than random RAM
           | corruption:)
        
         | Someone wrote:
         | There's a risk, but far from a big one. You'd need exactly 232
         | or even 264 pointer updates (or a multiple of it) between the
         | time you read the pointer and the time you try writing it for
         | running into problems. That's not likely to be a problem even
         | on a 1000-core CPU.
         | 
         | So, technically, this isn't _solving_ the ABA problem, it's
         | _just_ hiding it very, very well.
        
       | damon_dam wrote:
       | A much better solution to the ABA problem is to use load-linked
       | and store-conditional instructions [1] on platforms where they
       | are available (ARM, RISC-V, etc.)
       | 
       | Unfortunately x86 is not one of them. That's when CMPXCHG16B on
       | x86-64 comes in handy, as per the linked article. But if you're
       | going to write library code in 2025, it should at least support
       | x86 and ARM.
       | 
       | Which brings up a more interesting argument IMO: platform
       | fragmentation. Rust has approximately 1.5 advantages over C/C++.
       | To displace them in a reasonable time, it probably needed 4 or 5.
       | Its most significant contribution for the foreseeable future is
       | platform fragmentation.
       | 
       | [1] https://en.wikipedia.org/wiki/Load-link/store-conditional
        
         | explodingwaffle wrote:
         | Who says you can't make a library that does both? Rust makes it
         | pretty easy to conditionally compile code based on
         | architecture.
         | 
         | It could even be possible to make some sort of "ABA primitive"
         | and use that for these sort of data structures. This could well
         | exist: I've not looked. These sorts of things really aren't
         | that common in my experience.
         | 
         | On LR/SC: to any atomics experts listening, isn't it
         | technically "obstruction-free" (as per the Wikipedia
         | definitions at least) rather than lock-free? (though in
         | practice this makes basically no difference and still counts as
         | lock-free in the C++ (and Rust) sense) Just something that
         | stuck out last time I got sucked into this rabbit hole.
        
           | damon_dam wrote:
           | > Who says you can't make a library that does both?
           | 
           | Of course you can. I just meant that the linked article
           | didn't.
           | 
           | > On LR/SC: to any atomics experts listening, isn't it
           | technically "obstruction-free" (as per the Wikipedia
           | definitions at least) rather than lock-free?
           | 
           | The better criterion IMO is loop-free, which makes it a
           | little easier to understand. Consider the following spin-
           | locking code (with overabundant memory barriers):
           | do { p = *a; } while (p == 0x1 || !atomic_compare_and_swap(p,
           | 0x1, a));        memory_barrier();        // do stuff that
           | looks at *p        q->next = p;        memory_barrier();
           | atomic_store(q, a);
           | 
           | Here's the equivalent LL/SC version:                  do {
           | p = ll(a);          memory_barrier();          // do stuff
           | that looks at *p          q->next = p;
           | memory_barrier();        } while (!sc(q, a));
           | 
           | The pointer-tagging version is also obviously not loop-free.
           | Which is faster, in which cases, and by how much?
           | 
           | The oversimplified answer is that LL/SC is probably slightly
           | faster than spin-locking on most platforms and cases, but
           | pointer-tagging might not be.
        
           | gpderetta wrote:
           | My understanding is that all architectures that matter do
           | idiom recognition for ll/sc to guarantee forward progress
           | when ll/sc is used to implement CAS and other common lock-
           | free and wait free patterns, at least as a fallback.
        
             | dzaima wrote:
             | e.g. this is the guarantee in RISC-V:
             | https://github.com/riscv/riscv-isa-
             | manual/blob/70040578316b9...
        
         | saghm wrote:
         | > Which brings up a more interesting argument IMO: platform
         | fragmentation. Rust has approximately 1.5 advantages over
         | C/C++. To displace them in a reasonable time, it probably
         | needed 4 or 5. Its most significant contribution for the
         | foreseeable future is platform fragmentation.
         | 
         | I'm honestly not sure what any of this means. How are you
         | counting "advantages", and how did you come up with the
         | requisite number being 4 or 5? It's not clear to me why all
         | advantages would be equal in magnitude. Maybe this is what you
         | intended to convey by not using whole numbers, although it
         | seems like trying to estimate how advantages compare like this
         | in a way that was precise enough to be informative would be
         | difficult, to say the least.
         | 
         | To be clear, I don't disagree with you that C/C++ are not in
         | any immediate risk of being displaced (and I'd go further and
         | argue that C and C++ are distinct enough in how they're used
         | that displacing one wouldn't necessarily imply displacing the
         | other as well). I just don't think I've ever seen things
         | quantified in this way before, and it's confusing enough to me
         | that I don't want to discount the possibility that I'm
         | misunderstanding something.
        
           | damon_dam wrote:
           | > How are you counting "advantages", and how did you come up
           | with the requisite number being 4 or 5?
           | 
           | The vagueness was intentional. There is of course no
           | homogeneous way of combining advantages into a cardinal
           | measure. It's just a rhetorical device.
           | 
           | The point is that it falls short of the amount needed.
           | Another, more subtle point is that I didn't count
           | disadvantages. The argument applies even if you think Rust
           | doesn't have any.
        
         | wyager wrote:
         | > Rust has approximately 1.5 advantages over C/C++
         | 
         | I can think of at least 3 that deserve a whole integral bullet
         | point:
         | 
         | * ADTs
         | 
         | * Hindley-Milner/typeclass type system
         | 
         | * Lifetimes and affine types
         | 
         | And a bunch of minor ones that count for something like 0.1-0.5
         | of an advantage.
         | 
         | I would guess we're on track for a majority-of-new-code
         | switchover point somewhere around 2030-2035.
        
           | milesrout wrote:
           | Maybe he meant net advantages. Rust has some advantages
           | (ADTs, no constructors, no overloading, no preprocessor, well
           | thought out set of string types) and some disadvantages
           | (compile times, debug performance, heaps of wrapper cruft to
           | step through when debugging, explicit lifetimes, pervasive
           | type inference, ad-hoc polymorphism, no ABI stability, poor
           | support for dynamic linking, anemic standard library,
           | pervasive one-element-thinking encouraging millions of
           | separate allocations all over the place, culture of
           | simplifying code by adding copying and reallocation
           | everywhere because the borrow checker makes sharing verbose
           | and difficult ("just use clone"), general complexity,
           | dramatic and dogmatic community with a penchant for social
           | media bullying (eg serde, actix)).
           | 
           | What is an advantage or a disadvantage depends on who you
           | are. Personally, I find the disadvantages (as I see them)
           | outweigh the advantages. Others believe that some of those
           | disadvantages are actually good things. And some of them
           | depend on whether you are comparing to C or C++: compared to
           | C, Rust has many disadvantages, but C++ has many of the same
           | disadvantages, and often is worse.
           | 
           | I don't think Rust will ever have more new code being written
           | in it than C and C++ combined. I doubt it will overtake
           | either individually.
        
       ___________________________________________________________________
       (page generated 2025-02-14 23:01 UTC)