[HN Gopher] Implementing a safe garbage collector in Rust
___________________________________________________________________
Implementing a safe garbage collector in Rust
Author : celeritascelery
Score : 70 points
Date : 2022-04-26 11:56 UTC (3 days ago)
(HTM) web link (coredumped.dev)
(TXT) w3m dump (coredumped.dev)
| ridiculous_fish wrote:
| I don't understand how the rooting works here. Usually GC
| languages make the distinction between a "pointer" which is a
| reference into the heap, and a "root" which is something known to
| the GC that keeps a pointer live. I don't see that distinction
| here. I think this is what the author is saying:
|
| 1. You can allocate from an arena; you get an object.
|
| 2. Allocated objects may be freed when a GC is triggered, unless
| they are added as a root. It is your responsibility to add your
| object as a root.
|
| 3. Roots are in a LIFO stack and cannot be moved.
|
| Ok, but we never actually see the root _type_. For example:
| let rooted: Vec<Object<'root>> = ...; // we have a vec of root
| objects let object: Object<'arena> = arena.add("new"); //
| object is bound to arena rooted.push(object); // object
| is now bound to rooted arena.garbage_collect(); // Object
| is marked as live and not freed
|
| How is anything rooted here? The lifetime changed from 'arena to
| 'root but I don't see a root being created.
|
| Some other misc questions:
|
| _What if instead of storing them as a map, we store them as a
| stack instead? When something is rooted, it is pushed on the
| stack. When it drops, it is popped from the stack._
|
| But later we see roots not obeying a LIFO order, under
| "Preventing escapes" where roots are dynamically created and
| destroyed in an arbitrary order.
|
| _The function takes a &mut Arena, and at the end it returns an
| Object with the same lifetime_
|
| But earlier it says "we have to make sure the object can't move."
| I'm confused what is being returned here: an object reference, or
| a root.
| celeritascelery wrote:
| > How is anything rooted here? The lifetime changed from 'arena
| to 'root but I don't see a root being created.
|
| In this example, the Vec has been rooted previously. So pushing
| an object into the Vec will make it "transitively" rooted
| (accessible from the root). You would root a struct with the
| root_struct![1] macro, which works very similar to the root!
| macro shown in the post.
|
| However you made you realize one error; The rooted `Vec` in the
| example you pointed is a by value type, but in the
| implementation you can only get references to rooted structs,
| so that example needs to be updated.
|
| > But later we see roots not obeying a LIFO order, under
| "Preventing escapes" where roots are dynamically created and
| destroyed in an arbitrary order.
|
| Objects are just a copyable wrapper around a pointer, so they
| are not the part that has the LIFO semantics. inside the root!
| macro[2] there is a `StackRoot` type that is the actual "root".
| The object just borrows from that so that is has a 'root
| lifetime and is valid post gc. The actual root struct is not
| exposed outside of the macro.
|
| I hope this makes the distinction between "roots" and "objects"
| clearer. Objects are just pointers to heap data. When we root
| an object we store the data it points to on the root stack and
| create a new `StackRoot`. Then we say this object is rooted.
| But the struct that "does the rooting" is inside the macro and
| not exposed. Rooting a struct works similarly.
|
| [1]
| https://github.com/CeleritasCelery/rune/blob/5a616efbed763b9...
|
| [2]
| https://github.com/CeleritasCelery/rune/blob/5a616efbed763b9...
| ridiculous_fish wrote:
| The existence of the StackRoot type is definitely a missing
| piece of the puzzle, thank you. So we have Object and
| StackRoot. But then what is a "root object?"
| let rooted: Vec<Object<'root>> = ...; // we have a vec of
| root objects
|
| I would expect this vector to just contain Objects?
|
| I think conceptually you have two different Object types: a
| "dangling" Object that is subject to collection, and a
| "rooted" Object which is referenced by a root stored in the
| RootSet. Part of my confusion is that there's no vocabulary
| to distinguish these. In this code:
| root!(value, gc)
|
| Visually it appears to "do something" to value but in fact
| there's two different variables, with two different types,
| both named 'value'. It would be clearer at least to me to
| reify that by giving them different types instead of calling
| both Object.
|
| In a moving GC this distinction becomes unavoidable: dangling
| objects are raw pointers, while rooted objects refer to the
| root location itself and require double indirection to
| access.
|
| Anyways it's just a suggestion, take it or leave it. It's
| very cool how Rust's lifetimes let you enforce that no
| collections happen while an Object dangles.
|
| One last thought: because you're already on the hook for
| passing in the GC when dereferencing an Object, you could
| exploit this by making Object references just a 32 bit
| offset, relative to the GC's heap. This would let you save
| some memory.
| amelius wrote:
| This is cute but modern concurrent garbage collectors like the
| ones used in VMs and languages such as Go are way more
| complicated. In fact, the GC can be the most difficult aspect of
| an entire environment including the compiler.
| NeutralForest wrote:
| Very cool project! What motivated you to start a Rust VM for
| Emacs?
| celeritascelery wrote:
| I talk little bit about that in my introductory post[1].
| Basically my interest in Emacs, Rust, and interpreters came to
| together and I decided to see what I could do. I was also
| inspired by the remacs project[2] and I think Rust has a lot to
| offer as dynamic language backend. I don't know where this will
| go, but I am going to continue to work on it.
|
| [1]https://coredumped.dev/2021/10/21/building-an-emacs-lisp-
| vm-...
|
| [2]https://github.com/remacs/remacs
| NeutralForest wrote:
| I use Emacs a lot so it's always a treat to learn more about
| the internals, one way or another, have fun!
| gwbas1c wrote:
| One thing I'm trying to understand is: Is this a garbage
| collector _for_ Rust; or is this a garbage collector _implemented
| in_ Rust for another environment.
|
| It's confusing because the author refers to their Emacs VM; but
| then refers to Rc in Rust.
| nicoburns wrote:
| It's currently the latter, but the author believes that this
| approach could be cleaned up and generalised to also be the
| former. I believe it would then be a general purpose GC library
| in Rust that could be used for Rust code or any environment
| written in Rust.
| steveklabnik wrote:
| The latter, implemented in Rust (in my understanding).
|
| They bring up Rc<T> because using reference counting as their
| implementation is in theory a possibility, which they then
| quickly (and rightfully, imho) dismiss.
| zamalek wrote:
| Although you probably could use it for something else, Boa[1]
| is a good demonstration of this (it uses the GC crate, but
| the same principle probably applies).
|
| [1]: https://github.com/boa-dev/boa
___________________________________________________________________
(page generated 2022-04-29 23:00 UTC)