[HN Gopher] The inconceivable types of Rust: How to make self-bo...
       ___________________________________________________________________
        
       The inconceivable types of Rust: How to make self-borrows safe
       (2024)
        
       Author : birdculture
       Score  : 113 points
       Date   : 2025-11-15 23:31 UTC (23 hours ago)
        
 (HTM) web link (blog.polybdenum.com)
 (TXT) w3m dump (blog.polybdenum.com)
        
       | Animats wrote:
       | This is going to take some serious reading.
       | 
       | I've been struggling with a related problem over at [1]. Feel
       | free to read this, but it's nowhere near finished. I'm trying to
       | figure out how to do back references cleanly and safely. The
       | basic approach I'm taking is
       | 
       | - We can do just about everything useful with Rc, Weak, RefCell,
       | borrow(), borrow_mut(), upgrade, and downgrade. But it's really
       | wordy and there's a lot of run time overhead. Can we fix the
       | ergonomics, for at least the single-owner case? Probably. The
       | general idea is to be able to write a field access to a weak link
       | as
       | 
       |     sometype.name
       | 
       | when what's happening under the hood is
       | 
       |     sometype.upgrade().unwrap().borrow().name
       | 
       | - After fixing the ergonomics, can we fix the performance by
       | hoisting some of the checking? Probably. It's possible to check
       | at the drop of _sometype_ whether anybody is using it, strongly
       | or weakly. That allows removing some of the per-reference
       | checking. With compiler support, we can do even more.
       | 
       | What I've discovered so far is that the way to write about this
       | is to come up with real-world use cases, then work on the
       | machinery. Otherwise you get lost in type theory. The "Why" has
       | to precede the "How" to get buy-in.
       | 
       | I notice this paper is (2024). Any progress?
       | 
       | [1] https://github.com/John-Nagle/technotes/blob/main/docs/rust/...
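
For readers unfamiliar with the pattern described above, here is a minimal sketch of a back reference built from Rc, Weak and RefCell. The `Parent` and `Child` types and both function names are hypothetical, made up for illustration; they are not taken from the linked notes. The point is only to show the long-hand access chain that the comment would like to shorten.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical types for illustration: a parent owns its children, and
// each child holds a weak back reference to the parent.
struct Parent {
    name: String,
    children: Vec<Rc<RefCell<Child>>>,
}

struct Child {
    name: String,
    parent: Weak<RefCell<Parent>>,
}

/// Builds a parent with one child and returns the owning handle.
fn build_family() -> Rc<RefCell<Parent>> {
    let parent = Rc::new(RefCell::new(Parent {
        name: "root".to_string(),
        children: Vec::new(),
    }));
    let child = Rc::new(RefCell::new(Child {
        name: "leaf".to_string(),
        parent: Rc::downgrade(&parent),
    }));
    parent.borrow_mut().children.push(child);
    parent
}

/// The long-hand field access through the back reference.
/// Panics if the parent was dropped or is currently mutably borrowed.
fn parent_name_of(child: &Rc<RefCell<Child>>) -> String {
    let name = child
        .borrow()
        .parent
        .upgrade()
        .unwrap()
        .borrow()
        .name
        .clone();
    name
}
```

Each step in the chain does run-time work: `upgrade()` checks that the parent is still alive and bumps the strong count, and `borrow()` checks and updates the RefCell borrow flag.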
        
         | mustache_kimono wrote:
         | > But it's really wordy and there's a lot of run time overhead.
         | 
         | I'm curious: what do the benchmarks say about this?
        
         | kurante wrote:
         | Have you seen GhostCell[1]? Seems like this could be a solution
         | to your problem.
         | 
         | [1]: https://plv.mpi-sws.org/rustbelt/ghostcell/
        
           | Animats wrote:
           | Yes. There's an implementation at
           | https://github.com/matthieu-m/ghost-cell
           | 
           | Not clear why it never caught on.
           | 
           | There have been many attempts to solve the Rust back
           | reference problem, but nothing has become popular.
        
           | zozbot234 wrote:
           | The qcell crate is perhaps the most popular implementation of
           | GhostCell-like patterns. But the ergonomics are still a bit
           | of a challenge.
        
             | Animats wrote:
             | Right. The whole problem with all this is ergonomics, from
             | the point of view of programmers who don't want to obsess
             | over ownership and type theory. We sort of know how to make
             | this work. It works fine with enough Rc/Weak/etc. But it's
             | a huge pain.
             | 
             | I appreciate people arguing over this. It helps. We've seen
             | proposals from people who are too much into the type theory
             | and not enough into ease of use. I used to do proof of
             | correctness work, where the problem is that proof of
             | correctness people are too into the formalism and not
             | enough into killing bugs.
        
         | zozbot234 wrote:
         | > The general idea is to be able to write a field access to a
         | weak link as
         | 
         | > sometype.name
         | 
         | > when what's happening under the hood is
         | 
         | > sometype.upgrade().unwrap().borrow().name
         | 
         | You could easily implement this with no language-level changes
         | as an auto-fixable compiler diagnostic. The compiler would
         | error out when it sees the type-mismatched .name, but it would
         | give you an easy way of changing it to its proper form. You
         | just avoid making the .name form permanent syntactic sugar
         | (which is way too opaque for a low-level language like Rust),
         | it gets replaced in development.
        
         | SkiFire13 wrote:
         | > when what's happening under the hood is
         | 
         | > sometype.upgrade().unwrap().borrow().name
         | 
         | I suspect a hidden `.unwrap()` like that will be highly
         | controversial.
        
           | Animats wrote:
           | .borrow() already has a hidden unwrap. There's try_borrow(),
           | but the assumption for .borrow() is that it will always
           | succeed.
           | 
           | What I'd like to do is move as much of the checking as
           | possible to the drop() of the owning object, and possibly to
           | compile time. If .borrow() calls are very local, it's not too
           | hard to determine that the lifetimes of the borrowed objects
           | don't overlap.
           | 
           | Upgrade is easy to check cheaply at run time for Rc-type
           | cells. Upgrade only fails if the owning object has been
           | dropped. At drop, if weak_count == 0, no dangling weak
           | references outlive the object. If there are more strong
           | references, drop would not be called. With that check,
           | .upgrade() will never fail.
           | 
           | After all, when a programmer codes a potentially fatal
           | .borrow(), they presumably have some reason to be confident
           | the panic won't trigger.
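
The drop-time check described above can be sketched in today's Rust, without any compiler support. This is only an illustration under assumptions: the `Owner` wrapper and its methods are hypothetical names, and the sketch simply asserts at drop rather than removing any per-reference checks, which would need the language changes the comment hopes for.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical single-owner wrapper: the only long-lived strong reference
/// lives here, and everyone else gets Weak handles.
struct Owner<T> {
    inner: Rc<RefCell<T>>,
}

impl<T> Owner<T> {
    fn new(value: T) -> Self {
        Owner { inner: Rc::new(RefCell::new(value)) }
    }

    /// Hands out a weak back reference.
    fn link(&self) -> Weak<RefCell<T>> {
        Rc::downgrade(&self.inner)
    }

    /// True when dropping the owner now would leave no Weak handle behind.
    fn safe_to_drop(&self) -> bool {
        Rc::strong_count(&self.inner) == 1 && Rc::weak_count(&self.inner) == 0
    }
}

impl<T> Drop for Owner<T> {
    fn drop(&mut self) {
        // The hoisted check: fail once, at drop, instead of at some later
        // upgrade(). Skipped while unwinding to avoid a double panic.
        if !std::thread::panicking() {
            assert!(self.safe_to_drop(), "owner dropped while links remain");
        }
    }
}

/// `borrow()` behaves like `try_borrow()` plus a panic on failure; this
/// shows the failure case without panicking.
fn read_while_writing<T>(owner: &Owner<T>) -> bool {
    let _writer = owner.inner.borrow_mut();
    let ok = owner.inner.try_borrow().is_ok();
    ok
}
```

As long as an `Owner` is alive, `upgrade()` on any of its links succeeds; the assertion in `drop` is what rules out a link outliving it.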
        
       | Ericson2314 wrote:
       | Oh this is really good!
       | 
       | I wrote https://github.com/Ericson2314/rust-papers a decade ago
       | for a slightly different purpose, but fundamentally we agree.
       | 
       | For those trying to grok their stuff after reading the blog post,
       | consider this.
       | 
       | The borrow checker vs type checker distinction is a hack, a hack
       | that works by relegating a bunch of stuff to be "second class".
       | Second class means that the stuff only occurs within functions,
       | and never across function boundaries.
       | 
       | Proper type theories don't have this "within function, between
       | function" distinction. Just as in the lambda calculus, you can
       | slap a lambda around any term, in "platonic rust" you should be
       | able to get any fragment and make it a reusable abstraction.
       | 
       | The author's lens here is async, which is a good point: since
       | we need to be able to slice functions apart into smaller
       | fragments with the boundaries at await, we need this abstraction
       | ability. With today's Rust, in contrast, the only way to do safe
       | manual non-cheating await would instead be to drastically limit
       | where one could "await" in practice, so as never to catch this
       | interesting stuff in action.
       | 
       | In my thing I hadn't considered async at all, but was considering
       | a kind of dual thing. Since these inconceivable types do in fact
       | exist (in a Rust Done Right), and since we can also combine our
       | little functions into a bigger function, the inescapable
       | conclusion is that locations do not have a single fixed type, but
       | have types that vary at different points in the control flow
       | graph. (You can try to model the control flow graph as a bunch of
       | small functions and moves, but this runs afoul of non-movable
       | stuff, including borrowed stuff, the ur-non-movable stuff.)
       | 
       | Finally, if we're again trying to make everything first class to
       | have a language without cheating and frustrating artificial
       | limits on where abstraction boundaries go, we have to consider
       | not just static locations changing type, but also pointers
       | changing type. (We don't want to liberate some types of locations
       | but not others.) That's where my thing comes in -- references
       | that have one type for the pointee at the beginning of the
       | lifetime, and another type at the end.
       | 
       | This stuff might be mind-blowing, but it should be seriously
       | pursued. Having second-class concepts in the language breeds
       | epicycles over time. It's how you get C++. Taking the time to
       | make everything first class like this might be scary, but it
       | yields a much more "stable design" that is much more likely to
       | stand the test of time.
        
         | Ericson2314 wrote:
         | The post concludes by saying it's hopeless to get this stuff
         | implemented because of back compat, but I don't think that is
         | true. (It might be hopeless for other reasons. It certainly
         | felt hopeless in 2015.)
         | 
         | All this is about adding things to the language. That's
         | backwards compatible. E.g. Drop doesn't need to be _changed_,
         | because from every Drop instance a DropVer2 instance can be
         | written instead. async v1 can also continue to exist, just by
         | continuing to generate its existing shitty unsafe code. And if
         | someone wants something better, they can just use async v2
         | instead.
         | 
         | People get all freaked out about _changing_ languages, but IMO
         | the FUD is entirely due to sloppy imperative monkey brain.
         | Languages are ideas, and ideas are immutable. The actual
         | question is always: can we do "safe FFI" between two
         | languages? Safe FFI between Rust Edition 20WX and 20YZ is so
         | trivial that people forget to think about it that way. C and
         | C++ is better since C "continues to exist", but of course the
         | bar for "safe FFI" is so low when the languages themselves are
         | unsafe _within_ themselves that safety _between_ them
         | couldn't mean very much.
         | 
         | With harder edition breaks like this, the "safe FFI" mentality
         | actually yields fruit.
        
       | IshKebab wrote:
       | I think they should just implement position-independent borrows.
       | So instead of the borrow being an absolute pointer that gets
       | broken if you move the self-borrowing struct, you can move it
       | just fine.
       | 
       | Yes it would add like one extra add to every access, but you
       | hardly ever need self-borrows so I think it's probably an
       | acceptable cost in most cases.
        
         | tux3 wrote:
         | Say I have this type:
         | 
         |     struct A {
         |       raw_data: Vec<u8>,
         |       parsed_data: B<&pie raw_data>,
         |       parsed_data2: B<&pie raw_data>
         |     }
         | 
         |     struct B<T> {
         |       foo: &pie T [u8],
         |     }
         | 
         | Ignoring that my made-up notation doesn't make much sense, is
         | the idea that B.foo would be an offset relative to its own
         | address?
         | 
         | So B.method(&self) might do addr(&self.foo) + self.foo, which
         | is stable even if the parent struct A and its raw data field
         | move?
         | 
         | Then I wonder how to handle the case where the relative &pie
         | reference itself moves. Maybe parsed_data is std::mem::replaced
         | with parsed_data2 (or maybe one of them is an Option<B> and we
         | Option.take() it somewhere else.)
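
A safe approximation of the offset idea can be written today, with one important substitution: the sketch below stores offsets relative to the start of the buffer, not relative to the field's own address as the comment asks about. All names (`Span`, `Record`, the toy "key=value" format) are hypothetical. It is meant only to show why an offset survives a move where an absolute pointer would not.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `B`: remembers where its bytes live as an
/// offset range into the buffer instead of an absolute pointer.
struct Span {
    range: Range<usize>,
}

/// Hypothetical stand-in for `A`.
struct Record {
    raw_data: Vec<u8>,
    parsed_data: Span,
    parsed_data2: Span,
}

impl Record {
    /// Splits the buffer at the first b'=' (assumed toy format "key=value").
    fn parse(raw_data: Vec<u8>) -> Option<Record> {
        let eq = raw_data.iter().position(|&b| b == b'=')?;
        let len = raw_data.len();
        Some(Record {
            raw_data,
            parsed_data: Span { range: 0..eq },
            parsed_data2: Span { range: eq + 1..len },
        })
    }

    /// Resolves a span against wherever the buffer currently lives.
    fn resolve(&self, span: &Span) -> &[u8] {
        &self.raw_data[span.range.clone()]
    }

    fn key(&self) -> &[u8] {
        self.resolve(&self.parsed_data)
    }

    fn value(&self) -> &[u8] {
        self.resolve(&self.parsed_data2)
    }
}
```

Because the offsets are relative to the buffer and not to the `Span` field, swapping `parsed_data` and `parsed_data2` does not invalidate either one. That sidesteps the question raised above rather than answering it; a truly self-relative offset would have to be adjusted whenever the field holding it moves.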
        
         | SkiFire13 wrote:
         | This was proposed at the time, but it doesn't work for the
         | case where the borrow points to stable memory (e.g. a `&str`
         | pointing to the contents of a `String` in the same struct). In
         | the general case a reference might point to either stable or
         | unstable memory at runtime, so there's no way to make this
         | always work (e.g. in async functions).
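
The "stable memory" case is easy to observe. In the sketch below (the `Holder` struct and function name are hypothetical), moving the struct moves the `String` header but not its heap bytes, so a self-relative offset to those bytes would point to the wrong place after the move.

```rust
/// Hypothetical struct for illustration: the `String` header (pointer,
/// length, capacity) is stored inline; its bytes live on the heap.
struct Holder {
    text: String,
}

/// Returns the address of the heap bytes before and after moving the struct.
fn heap_addr_across_move() -> (usize, usize) {
    let holder = Holder { text: String::from("stable") };
    let before = holder.text.as_ptr() as usize;
    // Moving into a Box relocates the struct itself to a new address.
    let boxed = Box::new(holder);
    let after = boxed.text.as_ptr() as usize;
    (before, after)
}
```

The two addresses are equal because a move copies only the header and never reallocates the buffer. A reference into an inline field such as an array would instead move with the struct, which is the "unstable" case.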
        
           | IshKebab wrote:
           | Good point.
        
       | shevy-java wrote:
       | Rust is not an easy language.
        
         | Ygg2 wrote:
         | Easy is a relative measure. How close is the language to what
         | you already know?
        
         | Aurornis wrote:
         | The syntax in this post is hypothetical. In common usage you'd
         | never encounter a need to even think about these complexities,
         | let alone a desire to do the manual work discussed in this blog
         | post.
        
         | marcosdumay wrote:
         | No, but it's the easiest language you can use in many niches.
        
       | uecker wrote:
       | The people who say Rust is too complex just do not want to learn.
       | /s
        
       | andrewaylett wrote:
       | I'm very much a fan of the idea that language features -- and
       | especially _library_ features -- should not have privileged
       | access to the compiler.
       | 
       | Rust is generally pretty good at this, unlike (say) Go: most
       | functionality is implemented as part of the standard library, and
       | if I want to write my own `Vec` then (for the most part) I can.
       | Some standard library code relies on compiler features that
       | haven't been marked stable, which is occasionally frustrating,
       | but the nightly compiler will let me use them if I _really_ want
       | to (most of the time I don't). Whereas in Go, I can't implement
       | an equivalent to a goroutine. And even iterating over a container
       | was "special" until generics came along.
       | 
       | This article was a really interesting look at where all that
       | breaks down. There's obviously a trade-off between having to
       | maintain all the plumbing as user-visible and therefore _stable_
       | vs purely magic and able to be changed so long as you don't
       | break the side effects. I think Rust manages to drive a fairly
       | good compromise in allowing library implementations of core
       | functionality while not needing to stabilise everything before
       | releasing anything.
        
       ___________________________________________________________________
       (page generated 2025-11-16 23:01 UTC)