[HN Gopher] Arenas in Rust
       ___________________________________________________________________
        
       Arenas in Rust
        
       Author : welovebunnies
       Score  : 28 points
       Date   : 2025-10-03 19:47 UTC (3 hours ago)
        
 (HTM) web link (russellw.github.io)
 (TXT) w3m dump (russellw.github.io)
        
       | pizlonator wrote:
       | Arenas let you use OOBAs to do data attacks, and those can give
       | you the moral equivalent of remote code execution.
       | 
       | Like if your arena contains both a buffer that OOB's and a buffer
       | passed to a syscall, then having memory safety _around_ the arena
       | doesn 't help you
        
         | Animats wrote:
         | Yes. This is a big headache in graphics, where there are big
         | tables indexed by texture index down at the GPU level. Managing
         | those as the content changes, with the GPU using the data while
         | the CPU alters it, leads to huge complexity in Vulkan graphics.
         | 
         | The two times I've had to deal with a really tough debugging
         | problem in Rust, it's been related to some crate which did
         | that.
        
       | Animats wrote:
       | This is the back-link problem in Rust. You can do back links with
       | weak pointers, but it's rather verbose and involves extra run-
       | time checks.
       | 
       | I've been trying to figure out a compile-time approach to that
       | problem. Here's an early version of a tech note on that.[1] It
       | looks possible, but there are many issues. But it's important to
       | look at now. The C++ people are trying to figure out safe
       | backlinks.
       | 
       | [1] https://github.com/John-
       | Nagle/technotes/blob/main/docs/rust/...
        
       | IshKebab wrote:
       | Insightful article. As they say doubly linked lists are
       | approximately useless and never used. But trees were nodes can
       | refer to their parents are _extremely_ common.
       | 
       | And I agree, a custom arena is usually the best option here. Even
       | though it _seems_ like  "giving up" and going back to C, it's
       | still way better.
       | 
       | There's this great comparison of Arenas in Rust:
       | https://donsz.nl/blog/arenas/
       | 
       | Some of them have keys that can't be reused too, so you can avoid
       | "use after free" style bugs (at runtime) which is _way_ better
       | than C pointer wrangling.
       | 
       | It's not perfect though. I kind of wish Rust did have some kind
       | of support for this stuff natively. Pretty low on my list of Rust
       | desires though - way behind better compile time and fewer async
       | footguns.
        
       | echelon wrote:
       | I'd like to see something powerful for safety and speed in
       | Crates.io and Cargo:
       | 
       | The ability for crates to specify whether they use certain
       | features of Rust:
       | 
       | - unsafe
       | 
       | - arena allocation or other future "useful" stuff
       | 
       | - proc_macros, eg. serde (a huge compile speed hit)
       | 
       | - deep transitive dependency (total #, or depth)
       | 
       | - some proxy for the "number of compiler steps" (compiler
       | workload complexity)
       | 
       | We should be able to set a crate-wide or workspace-wide policy
       | that can ban other crates that violate these rules.
       | 
       | We could make this hand-written early on, but eventually enforced
       | by the compiler. You could then easily guarantee that no
       | downstream dependency uses unsafe code, has a crazy number of
       | dependencies, or has crazy compile times.
       | 
       | You could opt-in or opt-out to these things without having to
       | think too much about them or constantly (manually) be on the
       | lookout.
       | 
       | This would be hugely beneficial to the Rust ecosystem, safety,
       | code quality, and developer productivity / happiness.
        
       | mwkaufma wrote:
       | > Handles are deterministic. If a bug made your program crash on
       | the last run, it will crash the same way on this run.
       | 
       | > Given that the arena will still be subject to array bounds
       | checking, handle bugs won't allow an attacker to overwrite
       | arbitrary memory the way pointer bugs do.
       | 
       | Just because you're in-bounds of the arena, doesn't mean you're
       | not stomping on in-bounds neighboring data -- which can introduce
       | "nondeterminism" as the author defines it. These are hand-waving
       | arguments.
        
         | mwkaufma wrote:
         | Anticipating pushback: yes, you can disallow "pointer
         | arithmetic" on handles, and store fingerprints in the "slots"
         | to ensure they still contain the handle's identity to detect
         | user-after-free, but congrats, you've implemented sparse sets,
         | which there are dozen's of C++ implementations of with the same
         | safety guarantees, so it's unclear what rust is bringing in
         | that case (e.g. https://github.com/skypjack/entt/blob/master/sr
         | c/entt/entity...)
        
       | sfpotter wrote:
       | Here's an implementation of a doubly-linked list which is
       | perfectly fine on modern hardware: an array of structs where each
       | struct contains a pointer to its next and previous elements in
       | addition to whatever other data is being stored.
       | 
       | Here's a situation where a traditional pointer-based double-
       | linked list is fine: when the payload is very large (e.g., an
       | entire document in some app).
        
       | wasmperson wrote:
       | > There are several ways to solve this problem. One way is to
       | avoid using direct references to the particular class of objects
       | at all. Instead, allocate a big array of objects, and refer to
       | them with integer indexes into that array. There are several
       | names that have been used for such arrays and indexes; let's call
       | them arenas and handles.
       | 
       | I thought the term "Arena" referred to linear allocators, but
       | maybe it's not so narrowly defined.
       | 
       | > At one level, this question is not very fair, because an answer
       | to it as stated would be that one simply does not use doubly
       | linked lists. They have been popular in introductory computer
       | science lectures because they are a neat way to explain pointers
       | in a data structure that's easy to draw on a whiteboard, but they
       | are not a good match to modern hardware. The last time I used one
       | was in the nineties. I know the Linux kernel uses them, but that
       | design was also laid down in the nineties; if you were designing
       | a kernel from scratch today, you would probably not do it that
       | way.
       | 
       | The arena data structure you described is inefficient unless it
       | uses a linked list to track empty slots. All general-purpose heap
       | allocators use linked lists in their implementations. Linked
       | lists show up wherever you want pointer stability, low
       | fragmentation, or a way to decouple the storage of objects from
       | their participation in data structures. I struggle to imagine how
       | it would be possible to implement something like Linux without at
       | least one linked list in it.
       | 
       | > Essentially you are bypassing the notion of pointers provided
       | directly by the hardware
       | 
       | Pointers aren't a hardware concept, they're a language concept.
       | Array indices and pointers both rely on indirect addressing,
       | which is the underlying hardware concept. The "handles" strategy
       | in Rust feels like a kludgy approach not because it bypasses the
       | hardware but because Rust's borrow checker isn't actually
       | involved in ensuring safety in that case, just its bounds
       | checking and mandatory initialization (and if all you need is
       | those two things then...).
        
       ___________________________________________________________________
       (page generated 2025-10-03 23:00 UTC)