[HN Gopher] The inconceivable types of Rust: How to make self-bo...
___________________________________________________________________
The inconceivable types of Rust: How to make self-borrows safe
(2024)
Author : birdculture
Score : 113 points
Date : 2025-11-15 23:31 UTC (23 hours ago)
(HTM) web link (blog.polybdenum.com)
(TXT) w3m dump (blog.polybdenum.com)
| Animats wrote:
| This is going to take some serious reading.
|
| I've been struggling with a related problem over at [1]. Feel
| free to read this, but it's nowhere near finished. I'm trying to
| figure out how to do back references cleanly and safely. The
| basic approach I'm taking is
|
| - We can do just about everything useful with Rc, Weak, RefCell,
| borrow(), borrow_mut(), upgrade, and downgrade. But it's really
| wordy and there's a lot of run time overhead. Can we fix the
| ergonomics, for at least the single-owner case? Probably. The
| general idea is to be able to write a field access to a weak link
| as sometype.name
|
| when what's happening under the hood is
| sometype.upgrade().unwrap().borrow().name
|
| - After fixing the ergonomics, can we fix the performance by
| hoisting some of the checking? Probably. It's possible to check
| at the drop of _sometype_ whether anybody is using it, strongly
| or weakly. That allows removing some of the per-reference
| checking. With compiler support, we can do even more.
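A minimal sketch of the wordy pattern described above, using only the standard library. The `Parent` and `Child` types and the helper names are hypothetical, chosen for illustration; the point is how much ceremony a single field read through a weak back reference needs today.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical types: a parent owns its children, and each child
// keeps a weak back reference to the parent.
struct Parent {
    name: String,
    children: Vec<Rc<RefCell<Child>>>,
}

struct Child {
    name: String,
    parent: Weak<RefCell<Parent>>,
}

/// Builds a parent with one child and returns handles to both.
fn build_family() -> (Rc<RefCell<Parent>>, Rc<RefCell<Child>>) {
    let parent = Rc::new(RefCell::new(Parent {
        name: "parent".to_string(),
        children: Vec::new(),
    }));
    let child = Rc::new(RefCell::new(Child {
        name: "child".to_string(),
        parent: Rc::downgrade(&parent),
    }));
    parent.borrow_mut().children.push(Rc::clone(&child));
    (parent, child)
}

/// Reads the parent's name through the back reference.
/// upgrade() can return None and borrow() can panic, so this one
/// line hides two run-time checks.
fn parent_name_of(child: &Rc<RefCell<Child>>) -> String {
    let name = child.borrow().parent.upgrade().unwrap().borrow().name.clone();
    name
}
```

Calling `parent_name_of(&child)` on the pair returned by `build_family()` yields the parent's name; the ergonomic goal in the comment is to spell that access as a plain field read.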
|
| What I've discovered so far is that the way to write about this
| is to come up with real-world use cases, then work on the
| machinery. Otherwise you get lost in type theory. The "Why" has
| to precede the "How" to get buy-in.
|
| I notice this paper is (2024). Any progress?
|
| [1] https://github.com/John-Nagle/technotes/blob/main/docs/rust/...
| mustache_kimono wrote:
| > But it's really wordy and there's a lot of run time overhead.
|
| I'm curious: what do the benchmarks say about this?
| kurante wrote:
| Have you seen GhostCell[1]? Seems like this could be a solution
| to your problem.
|
| [1]: https://plv.mpi-sws.org/rustbelt/ghostcell/
| Animats wrote:
| Yes. There's an implementation at
| https://github.com/matthieu-m/ghost-cell
|
| Not clear why it never caught on.
|
| There have been many attempts to solve the Rust back
| reference problem, but nothing has become popular.
| zozbot234 wrote:
| The qcell crate is perhaps the most popular implementation of
| GhostCell-like patterns. But the ergonomics is a bit of a
| challenge still.
| Animats wrote:
| Right. The whole problem with all this is ergonomics, from
| the point of view of programmers who don't want to obsess
| over ownership and type theory. We sort of know how to make
| this work. It works fine with enough Rc/Weak/etc. But it's
| a huge pain.
|
| I appreciate people arguing over this. It helps. We've seen
| proposals from people who are too much into the type theory
| and not enough into ease of use. I used to do proof of
| correctness work, where the problem is that proof of
| correctness people are too into the formalism and not
| enough into killing bugs.
| zozbot234 wrote:
| > The general idea is to be able to write a field access to a
| weak link as sometype.name
|
| > when what's happening under the hood is
| sometype.upgrade().unwrap().borrow().name
|
| You could easily implement this with no language-level changes
| as an auto-fixable compiler diagnostic. The compiler would
| error out when it sees the type-mismatched .name, but it would
| give you an easy way of changing it to its proper form. You
| just avoid making the .name form permanent syntactic sugar
| (which is way too opaque for a low-level language like Rust),
| it gets replaced in development.
| SkiFire13 wrote:
| > when what's happening under the hood is
|
| > sometype.upgrade().unwrap().borrow().name
|
| I suspect a hidden `.unwrap()` like that will be highly
| controversial.
| Animats wrote:
| .borrow() already has a hidden unwrap. There's try_borrow(),
| but the assumption for .borrow() is that it will always
| succeed.
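A small sketch of the difference between the two calls, using `std::cell::RefCell` only. The function name is made up for illustration. `try_borrow()` reports a conflict as an `Err`, where `borrow()` would panic at the same point.

```rust
use std::cell::RefCell;

/// Returns (try_borrow failed while a mutable borrow was live,
///          try_borrow succeeded after that borrow ended).
fn borrow_outcomes() -> (bool, bool) {
    let cell = RefCell::new(5);
    let failed_while_mut = {
        let _guard = cell.borrow_mut(); // exclusive borrow is live in this block
        cell.try_borrow().is_err()      // a plain borrow() would panic here
    };
    // The guard has been dropped, so a shared borrow is allowed again.
    let ok_after = cell.try_borrow().is_ok();
    (failed_while_mut, ok_after)
}
```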
|
| What I'd like to do is move as much of the checking as
| possible to the drop() of the owning object, and possibly to
| compile time. If .borrow() calls are very local, it's not too
| hard to determine that the lifetimes of the borrowed objects
| don't overlap.
|
| Upgrade is easy to check cheaply at run time for Rc-type
| cells. Upgrade only fails if the owning object has been
| dropped. At drop, if weak_count == 0, no dangling weak
| references outlive the object. If there are more strong
| references, drop would not be called. With that check,
| .upgrade() will never fail.
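One way the drop-time check might look in today's Rust, as a sketch under assumptions: `Owner` is a hypothetical single-owner wrapper, not an existing library type, and the check is an ordinary run-time assertion rather than anything the compiler enforces. If the assertion holds whenever an `Owner` is dropped, no `Weak` link ever outlives its target, so `upgrade()` on a link that is still reachable cannot fail.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical wrapper that holds the only long-lived strong reference.
struct Owner<T> {
    inner: Rc<RefCell<T>>,
}

impl<T> Owner<T> {
    fn new(value: T) -> Self {
        Owner { inner: Rc::new(RefCell::new(value)) }
    }

    /// Hands out a weak back reference to the owned value.
    fn link(&self) -> Weak<RefCell<T>> {
        Rc::downgrade(&self.inner)
    }

    /// True if dropping the owner now would leave no weak links behind.
    fn safe_to_drop(&self) -> bool {
        Rc::weak_count(&self.inner) == 0
    }
}

impl<T> Drop for Owner<T> {
    fn drop(&mut self) {
        // The check is paid once here instead of on every access.
        // It is skipped while unwinding to avoid a double panic.
        if !std::thread::panicking() {
            assert!(self.safe_to_drop(), "weak references outlive their owner");
        }
    }
}
```

The design choice here is to turn a failure that could surface at any `upgrade()` call into a single failure at the owner's drop, which is closer to where the actual mistake lies.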
|
| After all, when a programmer codes a potentially fatal
| .borrow(), they presumably have some reason to be confident
| the panic won't trigger.
| Ericson2314 wrote:
| Oh this is really good!
|
| I wrote https://github.com/Ericson2314/rust-papers a decade ago
| for a slightly different purpose, but fundamentally we agree.
|
| For those trying to grok their stuff after reading the blog post,
| consider this.
|
| The borrow checker vs type checker distinction is a hack, a hack
| that works by relegating a bunch of stuff to be "second class".
| Second class means that the stuff only occurs within functions,
| and never across function boundaries.
|
| Proper type theories don't have this "within function, between
| function" distinction. Just as in the lambda calculus, you can
| slap a lambda around any term, in "platonic rust" you should be
| able to get any fragment and make it a reusable abstraction.
|
| The author's lens here is async, which is a good point: since
| we need to be able to slice apart functions into smaller
| fragments with the boundaries at await, we need this abstraction
| ability. With today's Rust, in contrast, the only way to do safe,
| manual, non-cheating await would be to drastically limit
| where one could "await" in practice, so as to never catch this
| interesting stuff in action.
|
| In my thing I hadn't considered async at all, but was considering
| a kind of dual thing. Since these inconceivable types do in fact
| exist (in a Rust Done Right), and since we can also combine our
| little functions into a bigger function, then the inescapable
| conclusion is that locations do not have a single fixed type, but
| have types that vary at different points in the control flow
| graph. (You can try to model the control flow graph as a bunch of
| small functions and moves, but this runs afoul of non-movable
| stuff, including borrowed stuff, the ur-non-movable stuff.)
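A loose analogy in today's Rust, offered as an illustration rather than as the commenter's proposal: after a partial move, the compiler already tracks a different state for the same location at different program points, but that state cannot be written down as a type or passed across a function boundary. The `Record` type and function name are hypothetical.

```rust
/// Hypothetical record used only for illustration.
struct Record {
    title: String,
    body: String,
}

fn split_record() -> (String, usize) {
    let record = Record {
        title: String::from("intro"),
        body: String::from("hello"),
    };
    // Point 1: `record` is fully initialized and usable as a whole.
    let title = record.title; // moves one field out of the location

    // Point 2: `record` is now partially moved. `record.body` is still
    // usable, but the whole value is not. Uncommenting the next line
    // fails with error[E0382] (use of partially moved value):
    // let whole = record;
    let body_len = record.body.len();
    (title, body_len)
}
```

The partially moved state exists only inside `split_record`; there is no way to hand "a `Record` whose `title` is gone" to another function, which is the second-class status the comment describes.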
|
| Finally, if we're again trying to make everything first class to
| have a language without cheating and frustrating artificial
| limits on where abstraction boundaries go, we have to consider
| not just static locations changing type, but also pointers
| changing type. (We don't want to liberate some types of locations
| but not others.) That's where my thing comes in -- references
| that have one type for the pointee at the beginning of the
| lifetime, and another type at the end.
|
| This stuff might be mind-blowing, but it should be seriously
| pursued. Having second-class concepts in the language breeds
| epicycles over time. It's how you get C++. Taking the time to
| make everything first class like this might be scary, but it
| yields a much more "stable design" that is much more likely to
| stand the test of time.
| Ericson2314 wrote:
| The post concludes by saying it's hopeless to get this stuff
| implemented because of back compat, but I don't think that is
| true. (It might be hopeless for other reasons. It certainly
| felt hopeless in 2015.)
|
| All this is about adding things to the language. That's
| backwards compatible. E.g. Drop doesn't need to be _changed_ ,
| because from every Drop instance a DropVer2 instance can be
| written instead. async v1 can also continue to exist, just by
| continuing to generate its existing shitty unsafe code. And if
| someone wants something better, they can just use async v2
| instead.
|
| People get all freaked out about _changing_ languages, but IMO
| the FUD is entirely due to sloppy imperative monkey brain.
| Languages are ideas, and ideas are immutable. The actual
| question is always, can we do "safe FFI" between two
| languages. Safe FFI between Rust Edition 20WX and 20YZ is so
| trivial that people forget to think about it that way. C and
| C++ is better since C "continues to exist", but of course the
| bar for "safe FFI" is so low when the languages themselves are
| unsafe _within_ themselves that safety _between_ them couldn't
| mean very much.
|
| With harder edition breaks like this, the "safe FFI" mentality
| actually yields fruit.
| IshKebab wrote:
| I think they should just implement position-independent borrows.
| So instead of the borrow being an absolute pointer that gets
| broken if you move the self-borrowing struct, you can move it
| just fine.
|
| Yes it would add like one extra add to every access, but you
| hardly ever need self-borrows so I think it's probably an
| acceptable cost in most cases.
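The proposal itself would need compiler support, but a common safe approximation in current Rust is to store an offset into the owned buffer instead of a pointer and to rebuild the borrow on each access. The sketch below assumes a made-up `Message` type and a made-up "header ends at the first colon" format.

```rust
use std::ops::Range;

/// Hypothetical struct whose parsed view is stored as offsets into
/// `raw_data` rather than as a reference into it.
struct Message {
    raw_data: Vec<u8>,
    header_range: Range<usize>,
}

impl Message {
    /// Treats everything before the first b':' as the header (made-up format).
    fn parse(raw_data: Vec<u8>) -> Self {
        let end = raw_data
            .iter()
            .position(|&b| b == b':')
            .unwrap_or(raw_data.len());
        Message { raw_data, header_range: 0..end }
    }

    /// Rebuilds the borrow on demand: base of the buffer plus stored offset.
    /// This costs an extra add and a bounds check per access.
    fn header(&self) -> &[u8] {
        &self.raw_data[self.header_range.clone()]
    }
}
```

Because `header_range` holds no address, a `Message` can be moved or boxed freely and `header()` still returns the same bytes.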
| tux3 wrote:
| Say I have this type:
|
|     struct A {
|         raw_data: Vec<u8>,
|         parsed_data: B<&pie raw_data>,
|         parsed_data2: B<&pie raw_data>
|     }
|
|     struct B<T> {
|         foo: &pie T [u8],
|     }
|
| Ignoring that my made-up notation doesn't make much sense, is
| the idea that B.foo would be an offset relative to its own
| address?
|
| So B.method(&self) might do addr(&self.foo) + self.foo, which
| is stable even if the parent struct A and its raw data field
| move?
|
| Then I wonder how to handle the case where the relative &pie
| reference itself moves. Maybe parsed_data is std::mem::replaced
| with parsed_data2 (or maybe one of them is an Option<B> and we
| Option.take() it somewhere else.)
| SkiFire13 wrote:
| This was proposed at the time, but it doesn't work for the
| case where the borrow points to stable memory (e.g. a `&str`
| pointing to the contents of a `String` in the same struct). In
| the general case a reference might point to either stable or
| unstable memory at runtime, so there's no way to make this
| always work (e.g. in async functions).
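A short sketch of the "stable memory" case, with a hypothetical `Holder` type. Moving the struct moves the `String`'s pointer, length, and capacity, but not the heap buffer they describe, so a reference into the contents should not be adjusted when the struct moves. A self-relative offset would get exactly this case wrong.

```rust
/// Hypothetical struct owning a heap-allocated string.
struct Holder {
    text: String,
}

/// Moves a `Holder` into a Box and reports whether the string's heap
/// buffer stayed at the same address across the move.
fn heap_buffer_is_stable() -> bool {
    let holder = Holder { text: String::from("stable contents") };
    let before = holder.text.as_ptr();
    let moved = Box::new(holder); // the struct itself now lives elsewhere
    let after = moved.text.as_ptr();
    before == after
}
```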
| IshKebab wrote:
| Good point.
| shevy-java wrote:
| Rust is not an easy language.
| Ygg2 wrote:
| Easy is a relative measure. How familiar is a language to
| previous knowledge?
| Aurornis wrote:
| The syntax in this post is hypothetical. In common usage you'd
| never encounter a need to even think about these complexities,
| let alone a desire to do the manual work discussed in this blog
| post.
| marcosdumay wrote:
| No, but it's the easiest language you can use in many niches.
| uecker wrote:
| The people who say Rust is too complex just do not want to learn.
| /s
| andrewaylett wrote:
| I'm very much a fan of the idea that language features -- and
| especially _library_ features -- should not have privileged
| access to the compiler.
|
| Rust is generally pretty good at this, unlike (say) Go: most
| functionality is implemented as part of the standard library, and
| if I want to write my own `Vec` then (for the most part) I can.
| Some standard library code relies on compiler features that
| haven't been marked stable, which is occasionally frustrating,
| but the nightly compiler will let me use them if I _really_ want
| to (most of the time I don't). Whereas in Go, I can't implement
| an equivalent to a goroutine. And even iterating over a container
| was "special" until generics came along.
|
| This article was a really interesting look at where all that
| breaks down. There's obviously a trade-off between having to
| maintain all the plumbing as user-visible and therefore _stable_
| vs purely magic and able to be changed so long as you don't
| break the side effects. I think Rust manages to drive a fairly
| good compromise in allowing library implementations of core
| functionality while not needing to stabilise everything before
| releasing anything.
___________________________________________________________________
(page generated 2025-11-16 23:01 UTC)