https://blog.schichler.dev/pinning-in-plain-english-ckwdq3pd0065zwks10raohh85


Pinning in plain English

Tamme Schichler

Published on Nov 24, 2021

 


    The header image shows an orange bell pepper sitting on a wooden
    cutting board.

    The same bell pepper has been mirrored into the right half of the
    image. It is still clearly the same fruit, but the mirror image
    would largely only be compatible with a fundamentally different
    biology, as many of its more complex molecules are chiral.

    It was fresh and refreshing.

    -----------------------------------------------------------------

Pinning in Rust is a powerful and very convenient pattern that is, in
my eyes, not supported well enough in the wider ecosystem.

A common sentiment is that it's hard to understand and that the pin
module documentation is confusing. (Personally, I think it works well
as a reference to answer questions about edge cases, but it's a dense
read and not necessarily a good intro text.)

This post is my attempt to make the feature more approachable, to
hopefully inspire more developers to make their crates pinning-aware
where that would be helpful.

---------------------------------------------------------------------

License and Translations

See past the end of this post. In short, this post as a whole is
licensed under CC BY-NC-SA 2.0 except for code citations explicitly
marked as such. Code snippets and blocks that are not marked as
citations are also licensed under CC0 1.0.

---------------------------------------------------------------------

Alright, on to the actual content!

Note that I've added links to (mostly) the Rust documentation in
various places.
These are "further reading", so I recommend reading this post
entirely before looking into them. They'll (hopefully) be much easier
to understand that way around.

The Problem

In Rust, any instance is considered trivially movable as long as its
size is known at compile-time.

However, unlike in C++, there is also no way to prevent (or observe)
plain assignments for a particular type: You can't overload the plain
assignment operator =, and there is no concept of a move constructor
that could be called when an instance is moved implicitly.

This makes it tricky to write a memory-safe API that relies on the
location of an instance without taking continuous possession of it.
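
A minimal sketch of the problem, using a hypothetical Tracker type
(made up for this illustration) that remembers its own address. Moving
it into a Box is a plain move that runs no user code, so the recorded
address silently goes stale:

```rust
/// Hypothetical type that remembers where it lived when `register` ran.
struct Tracker {
    registered_at: usize,
}

impl Tracker {
    fn new() -> Self {
        Tracker { registered_at: 0 }
    }

    /// Records the current address of `self`.
    fn register(&mut self) {
        let address = &*self as *const Self as usize;
        self.registered_at = address;
    }

    /// Checks whether `self` still lives at the recorded address.
    fn is_where_registered(&self) -> bool {
        self.registered_at == self as *const Self as usize
    }
}

/// Returns `true` if moving the tracker left the recorded address stale.
fn move_makes_address_stale() -> bool {
    let mut tracker = Tracker::new();
    tracker.register();
    assert!(tracker.is_where_registered());

    // A plain move from the stack to the heap. `Tracker` can't observe it.
    let boxed = Box::new(tracker);
    !boxed.is_where_registered()
}
```

Nothing in safe Rust stops the caller from doing this, which is the
gap that pinning fills.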

Pinning TL;DR (simplified)

Instead, Rust opts to explicitly change the visible type of a
reference (and with that the API) to prevent accidental moves outside
of unsafe code.

While there are some exceptions to this, the default assertion of
pinning is that "If a Pin<&T> is accessible, then that instance of T
will be available at that address until dropped."

It's possible to weaken this somewhat but, aside from when T: Unpin,
only by using unsafe in some way.

However, whenever a type implements Unpin, pinning its instances is
meaningless.

For simplicity, types of pinned values (in this post: T) are implied
to be !Unpin for the rest of this post, unless otherwise noted. This
means some sentences won't be true if you, for example, try to pin an
f64 or a struct where all of its members are Unpin. Please keep this
in mind while reading on.
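
To illustrate why pinning an Unpin value is meaningless, here is a
small sketch that pins an f64 and then moves it right back out using
only safe code. The function name is made up for this example:

```rust
use std::pin::Pin;

/// Pins an `f64` in place, then moves it out again using only safe code.
fn pin_and_move_out() -> f64 {
    let mut number = 1.5_f64;

    // `Pin::new` is safe because `f64: Unpin`.
    let mut pinned: Pin<&mut f64> = Pin::new(&mut number);

    // For `Unpin` targets, `Pin<&mut T>` implements `DerefMut`...
    *pinned += 1.0;

    // ...and `Pin::get_mut` hands back the plain `&mut T`.
    let plain: &mut f64 = pinned.get_mut();
    std::mem::replace(plain, 0.0)
}
```

Neither Pin::new nor Pin::get_mut is available for a T that is not
Unpin, which is what makes the guarantee meaningful for those types.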

"pin" vs. "pinned"

Whenever pinning happens in Rust, there are two components involved:

  * A "pin" type that enforces "the value can't be moved" towards
    safe Rust.

    This is often called a "pinning pointer". Pins are very often
    Unpin, but this isn't relevant to their function.

  * A pinned value, which can't be moved by the consumer of an API.

    This is always a specific value, not all members of a type in
    general, as "pinned" is not an inherent property of any type.

Pins are often compounds

For example, look at the signature of Box::pin:

// Citation.
pub fn pin(x: T) -> Pin<Box<T>> { ... }

This function pins the value x of type T behind a pinning smart
pointer of type Pin<Box<_>>.

Think of Pin<Box<_>> as "pinning Box", and not as "Box inside a Pin".
Pin is not a container that the Box can be safely taken out of
(unless T: Unpin).

Side note: A plain Pin<Box<_>> is pinning but not pinned.
In fact, Pin<Box<_>> is always Unpin (because Box<_> is always Unpin)
and as such can not be meaningfully pinned by itself.
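
As a rough sketch of "pinning but not pinned": the pinning Box below
is moved around freely while the value behind it stays where it is.
Anchored is a hypothetical type, made !Unpin only for illustration:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical `!Unpin` type used only for illustration.
struct Anchored {
    id: u32,
    _pin: PhantomPinned,
}

/// Returns the address of the pinned value.
fn address_of(pinned: &Pin<Box<Anchored>>) -> usize {
    // Dereferencing a pinning pointer to `&T` is always allowed.
    &**pinned as *const Anchored as usize
}

/// Moves the pinning `Box` around and reports whether the value stayed put.
fn value_stays_put() -> bool {
    let pinned: Pin<Box<Anchored>> = Box::pin(Anchored {
        id: 7,
        _pin: PhantomPinned,
    });
    let before = address_of(&pinned);

    // The pin itself is `Unpin` and can be moved freely...
    let moved_pin = pinned;
    let in_a_vec = vec![moved_pin];

    // ...while the pinned value never changes its address.
    let after = address_of(&in_a_vec[0]);
    before == after && in_a_vec[0].id == 7
}
```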

This, including Unpin on the pin, is the same for all standard smart
pointers and Rust references:

Rust        English
Pin<Box<_>> "pinning Box"
Pin<Rc<_>>  "pinning Rc"
Pin<Arc<_>> "pinning Arc"
Pin<&_>     "pinning (shared) reference"
Pin<&mut _> "pinning (exclusive) reference"

I often shorten "pinning Box" to "Pin-Box" for myself when reading
silently, and you should be understood when saying it out loud like
that too.

Unpin is an auto trait

Very few types in Rust are actually !Unpin. As Unpin is an auto
trait, it is implemented for all composed types (structs, enums and
unions) whose members are all Unpin already. It's auto-implemented
for almost all primitive types and also implemented explicitly for
pointers! This means that pointer wrappers like NonNull are also
Unpin!

In fact, the only primitive type that is explicitly !Unpin is
core::marker::PhantomPinned, a marker you can use as a member type to
make your custom type !Unpin in stable Rust.

You can see the full list of (non-)implementors in the "Implementors"
section of the Unpin documentation.
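
The auto trait behaviour can be checked with a small sketch. Probe
and NotUnpin below are hypothetical helpers, not standard library
items: the inherent constant is picked whenever T: Unpin holds, and
the trait default is used otherwise.

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::ptr::NonNull;

// Hypothetical compile-time probe (not part of the standard library).
struct Probe<T>(PhantomData<T>);

trait NotUnpin {
    // Fallback, used when the inherent constant below does not apply.
    const IS_UNPIN: bool = false;
}
impl<T> NotUnpin for Probe<T> {}

impl<T: Unpin> Probe<T> {
    // Takes priority over the trait constant whenever `T: Unpin`.
    const IS_UNPIN: bool = true;
}

/// All members are `Unpin`, so the auto trait applies.
struct Plain {
    number: f64,
    pointer: Option<NonNull<u8>>,
}

/// One `PhantomPinned` member is enough to opt out.
struct Sticky {
    number: f64,
    _pin: PhantomPinned,
}

/// Reports `Unpin` for: `Plain`, `*const Sticky`, `PhantomPinned`, `Sticky`.
fn unpin_report() -> [bool; 4] {
    [
        Probe::<Plain>::IS_UNPIN,
        Probe::<*const Sticky>::IS_UNPIN,
        Probe::<PhantomPinned>::IS_UNPIN,
        Probe::<Sticky>::IS_UNPIN,
    ]
}
```

Note that a raw pointer to a !Unpin type is itself still Unpin.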

Values (mostly) don't start out pinned

Even for T where T is not Unpin, a plain instance of T on the stack
or accessible through a plain &mut T is not yet pinned. This also
means it could be discarded without running its destructor, by
calling mem::forget with it as parameter for example.

An instance of T only becomes pinned when passed to a function like
Box::pin that makes these guarantees (and ideally exposes Pin<&T>
somehow, as necessary).
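
A small sketch of the difference, assuming a hypothetical Noisy type
that reports when its destructor runs:

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;
use std::rc::Rc;

/// Hypothetical `!Unpin` type that reports when its destructor runs.
struct Noisy {
    dropped: Rc<Cell<bool>>,
    _pin: PhantomPinned,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Returns whether `drop` ran for (a forgotten plain value, a pinned value).
fn drop_flags() -> (bool, bool) {
    let forgotten_flag = Rc::new(Cell::new(false));
    let pinned_flag = Rc::new(Cell::new(false));

    // Not pinned yet: safe code may discard the value without dropping it.
    let plain = Noisy {
        dropped: Rc::clone(&forgotten_flag),
        _pin: PhantomPinned,
    };
    std::mem::forget(plain);

    // Pinned: the value is dropped in place when the pinning `Box` goes away.
    let pinned = Box::pin(Noisy {
        dropped: Rc::clone(&pinned_flag),
        _pin: PhantomPinned,
    });
    drop(pinned);

    (forgotten_flag.get(), pinned_flag.get())
}
```

Forgetting the pinning Box itself remains possible in safe code, but
that leaks the allocation, so the memory of the pinned value is never
reused without drop having run.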

Function of Pin<_>

The only differences between Box<T> and Pin<Box<T>> are that:

  * Pin<Box<T>> never exposes &mut T or a plain T,

    so moving the value elsewhere is impossible.

  * Pin<Box<T>> exposes Pin<&T> and Pin<&mut T>.

    This is called "pin projection" (towards the stored value).

    Getting access to these pinning references lets you call methods
    on the value that require self: Pin<&Self> or self: Pin<&mut
    Self>, and also associated functions with similar argument types.

The plain &T value reference is accessible like before, and can also
be found as such by dereferencing Pin<&T>, regardless of whether T is
Unpin.

All smart pointers and references that are Deref<Target = T> (and
optionally, for mutable access, DerefMut) function exactly like this
when pinning.
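
As a sketch of how this looks in practice, the hypothetical Counter
type below has one method per pinning reference type. The unsafe
block is only needed because Counter is !Unpin and the example avoids
helper crates:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical `!Unpin` type with pinning-aware methods.
struct Counter {
    count: u32,
    _pin: PhantomPinned,
}

impl Counter {
    fn new() -> Self {
        Counter { count: 0, _pin: PhantomPinned }
    }

    /// Requires a pinning exclusive reference.
    fn increment(self: Pin<&mut Self>) {
        // SAFETY: only a field is updated; the value itself is not moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.count += 1;
    }

    /// Requires a pinning shared reference.
    fn count_pinned(self: Pin<&Self>) -> u32 {
        self.count
    }
}

/// Returns the count as seen through `Pin<&Counter>` and through `&Counter`.
fn count_twice() -> (u32, u32) {
    let mut counter: Pin<Box<Counter>> = Box::pin(Counter::new());

    // `as_mut` projects to `Pin<&mut Counter>`, `as_ref` to `Pin<&Counter>`.
    counter.as_mut().increment();
    counter.as_mut().increment();
    let via_pin = counter.as_ref().count_pinned();

    // The plain `&Counter` is still available through `Deref`.
    let plain: &Counter = &counter;
    (via_pin, plain.count)
}
```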

In order to keep the rest of the post easy to read:

Definitions (valid in this post only):

Shorthand   implied constraint              English
T           not T: Unpin                    "[arbitrary] type" /
                                            "[arbitrary] value"
a pinned T  not T: Unpin                    "a pinned value"
P           P: Deref<Target = T>,           "pointer"
            optionally P: DerefMut
Pin<P>      P: Deref<Target = T>,           "pinning pointer"
            optionally P: DerefMut

Pinning is a compile-time-only concept

Pin<P> is a #[repr(transparent)] wrapper around its single member, a
private field with type P.

In other words:

Pin<Box<T>> (for example) has the exact same runtime representation
as Box<T>.

Since Box<T> in turn has the same runtime representation as its
derived &T, dereferencing Pin<Box<T>> to arrive at &T is an identity
function that returns the exact same value it started out from, which
means (but only with optimisations!) no runtime operation is
necessary at all.

Converting a pointer or container into its pinning version is an
equally free action.
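
A quick way to convince yourself, sketched with a hypothetical Big
type: the pinning and plain pointer types have the same size, and
converting a Box into its pinning version keeps the allocation where
it is.

```rust
use std::marker::PhantomPinned;
use std::mem::size_of;
use std::pin::Pin;

/// Hypothetical `!Unpin` type; its own size is irrelevant to the pointers.
struct Big {
    bytes: [u8; 64],
    _pin: PhantomPinned,
}

/// Compares pointer sizes with and without the `Pin` wrapper.
fn same_size() -> bool {
    size_of::<Pin<Box<Big>>>() == size_of::<Box<Big>>()
        && size_of::<Pin<&Big>>() == size_of::<&Big>()
        && size_of::<Pin<&mut Big>>() == size_of::<&mut Big>()
}

/// Converts a `Box` into a pinning `Box` and compares the value's address.
fn pinning_keeps_the_allocation() -> bool {
    let boxed = Box::new(Big { bytes: [0; 64], _pin: PhantomPinned });
    let before = &*boxed as *const Big;

    // `From<Box<T>> for Pin<Box<T>>` only changes the visible type.
    let pinned: Pin<Box<Big>> = Pin::from(boxed);
    let after = &*pinned as *const Big;
    before == after
}
```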

This is possible because moves in Rust are already pretty explicit
once references are involved: The underlying assignment may be hidden
inside another method, but there is no system that will move heap
instances around without being told to do so (unlike for example in
C#, where pinning is a runtime operation integrated into the GC API).

The only exception to this are types that are Copy, a trait which
must be derived explicitly for each type for which implicit trivial
copies should be available.

(Side note: Don't implement Copy on types where identity matters at
all. Deriving Clone is usually enough.
Copy is largely a convenience for immutable instances that you want
to pass by value a lot, so for example Cell does not implement it
even if the underlying type does.)

Pinning is a matter of perspective

A value becomes pinned by making it impossible for safe Rust to move
the instance or free its memory without dropping it first. (A pin
giving safe Rust access to Pin<&T> or Pin<&mut T> asserts this
formally, especially towards T's unrelated implementation.)

However, as the type of the pinned instance itself does not change,
it can remain visible "unpinned" inside the module that implements a
pin in the first place.

Pin<_> hides the normal mutable API only through encapsulation, but
can't erase it entirely.

This means that safe code in that module can often move an instance
even after it appears pinned to code outside of it, and extra care
must be taken to avoid such moves.

Collections can pin

This is most obvious with Box<T> or Pin<Box<T>> where the Box acts as
1-item collection of T. The same can be said about these types with
"Arc" and "Rc" instead of "Box".

As such, it makes sense to also pay attention to C where C is a
collection of items of type T.

It's likewise possible to use Pin to create a new type Pin<C> that
behaves similarly to how a pinning smart pointer would, by giving out
neither &mut T nor T.

Definitions (valid in this post only):

Shorthand implied constraint       English
C         owns instances of type T "collection"
Pin<C>    owns instances of type T "pinning collection"

(The standard library doesn't have many utilities for this, as
general collections are much more diverse than smart pointers. If you
decide to write a pinning-aware collection, you will have to
implement much of the API yourself and, as of Rust 1.56, may have to
provide extension methods through traits to make calling it
seamless.)

A collection C "can pin" if it allows some projection from its pinned
form (&Pin<C> or &mut Pin<C>) to Pin<&T> and optionally to Pin<&mut
T>.

A collection may also be inherently pinning, in which case it will
act like Pin<C> without Pin appearing in the type. We won't look at
this kind of collection directly here.
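
For orientation only, here is a minimal sketch of the inherently
pinning kind, since it can be written entirely in safe code. PinList
is a hypothetical type that gives every item its own pinning Box, so
the projection to Pin<&T> and Pin<&mut T> comes for free:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical inherently pinning collection: every item gets its own
/// pinning `Box`, so growing the `Vec` only moves the pointers.
struct PinList<T> {
    items: Vec<Pin<Box<T>>>,
}

impl<T> PinList<T> {
    fn new() -> Self {
        PinList { items: Vec::new() }
    }

    /// Takes ownership of `value`, pins it and returns a pinning reference.
    fn push(&mut self, value: T) -> Pin<&mut T> {
        self.items.push(Box::pin(value));
        self.items.last_mut().expect("just pushed").as_mut()
    }

    /// Projects to a pinning shared reference, never to `&mut T` or `T`.
    fn get(&self, index: usize) -> Option<Pin<&T>> {
        self.items.get(index).map(|item| item.as_ref())
    }
}

/// Hypothetical `!Unpin` item type.
struct Item {
    id: usize,
    _pin: PhantomPinned,
}

/// Pushes many items and reports whether the first one kept its address.
fn first_item_stays_put() -> bool {
    let mut list = PinList::new();
    list.push(Item { id: 0, _pin: PhantomPinned });
    let before = &*list.get(0).unwrap() as *const Item;

    // Force the backing `Vec` to reallocate a few times.
    for id in 1..1000 {
        list.push(Item { id, _pin: PhantomPinned });
    }

    let first = list.get(0).unwrap();
    first.id == 0 && before == &*first as *const Item
}
```

The price of this simple design is one allocation per item. A
collection that stores its items inline has to be more careful about
when it may reallocate.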

Pin<P> vs. Pin<C> vs. T

How plain (non-pinning) pointers and collections behave should be
clear enough, so I'll only compare how their and T's API tend to
differ when they are pinning or pinned:

              Pin<P>          Pin<C>                T behind Pin<&T>
                                                    or Pin<&mut T>

English       "pinning        "pinning              "pinned value"
              pointer"        collection"

: Unpin       nearly always   very often            rarely. If yes, then
                                                    pinning isn't
                                                    meaningful.

APIs          Access to       varies, but often     Functions that
accessible    Pin<&T> and     similar to Pin<P>     require Pin<&T> or
vs. without   optionally                            Pin<&mut T>
pinning       Pin<&mut T>

APIs          Access to       varies, but usually:  Functions that
inaccessible  &mut T,         Access to &mut T,     require &mut T
after         unwrapping T    removing T,
pinning                       anything that would
                              reallocate T

Unchanged     Access to &T    Access to &T,         Functions that
APIs                          dropping T in place   require &T
(examples)

: Clone       usually         possibly where        varies
                              T: Clone [1]

[1] If implemented that way, then
pub fn clone_unpinning(this: &Pin<Self>) -> Self { ... } can also be
implemented. However, if T: Clone, then it's likely that T is also
Unpin, which makes pinning pretty much useless.
See the end of this post for a more useful implementation that can
also clone meaningfully pinned instances.

Which functions require Pin<&T>?

How Pin<&T> and Pin<&mut T> are used varies, but there are three
broad categories most cases fall into:

Avoiding reference-counting

If smart pointers to an instance are copied often but accessed
rarely, and references cannot be used because their lifetime can't be
constrained statically, then it makes sense to shift the runtime cost
from cloning the pointers into a validity check on access instead.
The smart pointers are replaced by copiable handles, in this case.

How do the handles know when their target has disappeared? Pinning a
T asserts that <T as Drop>::drop will run for that particular
instance, so there will be an opportunity to notify a longer-lived
registry.
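
A deliberately simplified sketch of this handle pattern follows. The
LIVE registry, Target and Handle are hypothetical names, and a real
implementation would also have to deal with address reuse (for
example with generation counters) and with multiple threads:

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::marker::PhantomPinned;
use std::pin::Pin;

thread_local! {
    // Hypothetical registry of the addresses of all live `Target`s.
    static LIVE: RefCell<HashSet<usize>> = RefCell::new(HashSet::new());
}

/// Hypothetical pinned target that handles can point at.
struct Target {
    id: u32,
    _pin: PhantomPinned,
}

/// Copiable handle: copying is free, using it costs a lookup.
#[derive(Clone, Copy)]
struct Handle {
    address: usize,
}

impl Target {
    /// Only callable on a pinned instance, so the address stays valid
    /// until `drop` runs.
    fn handle(self: Pin<&Self>) -> Handle {
        let address = &*self as *const Target as usize;
        LIVE.with(|live| live.borrow_mut().insert(address));
        Handle { address }
    }
}

impl Drop for Target {
    fn drop(&mut self) {
        // Pinning asserts that this runs before the memory is reused.
        let address = &*self as *const Target as usize;
        LIVE.with(|live| live.borrow_mut().remove(&address));
    }
}

impl Handle {
    /// The validity check that replaces reference counting.
    fn is_alive(self) -> bool {
        LIVE.with(|live| live.borrow().contains(&self.address))
    }
}

/// Returns whether a handle is alive (before, after) dropping its target.
fn handle_lifecycle() -> (bool, bool) {
    let target = Box::pin(Target { id: 1, _pin: PhantomPinned });
    let handle = target.as_ref().handle();
    let copy = handle; // No reference count to update.
    let alive_before = copy.is_alive();
    drop(target);
    (alive_before, handle.is_alive())
}
```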

This also enables use cases where the handles cannot be dropped
explicitly, like if they are stored directly by an arena allocator
like bumpalo. You can see an example of this pattern in my crate
lignin, which supports (V)DOM callbacks this way.

Embedding externally-managed data

My crate tiptoe stores its smart pointers' reference counts directly
inside the hosted value instances. Pinning allows them to still
expose an exclusive reference as Pin<&mut T>.

You can read more about intrusive reference-counting and the
heap-only pattern it enables in this earlier post:

https://blog.schichler.dev/intrusive-smart-pointers-heap-only-types-ckvzj2thw0caoz2s1gpmi1xm8

Persisting self-references

Consider the following async block:

async {
    let a = "Hello, future!";
    let b = &a;
    yield_once().await;
    println!("{}", b);
}

This code creates an opaquely-typed Future instance that will, at
least formally, contain a reference to another of its fields after it
is polled for the first time.

The instance won't be externally borrowed anymore at that time, but
moving it would break the private reference b to a, so Future::poll
(...) requires self: Pin<&mut Self> as receiver.

This ensures that instances of impl Future will only enter such a
state when they are already guaranteed not to be moved again. If an
executor does need to move Futures, it can require Future + Unpin
instead, which allows converting &mut _ to Pin<&mut _> on the fly.
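
The following sketch polls such a future by hand. YieldOnce is a
hypothetical stand-in for yield_once() (which is not a standard
library function), NoopWake is a do-nothing waker written for this
example, and the async block returns a length instead of printing so
that the result is easy to inspect:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `yield_once()`: pending once, then ready.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            // `YieldOnce: Unpin`, so plain field access works here.
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn yield_once() -> YieldOnce {
    YieldOnce { yielded: false }
}

/// Waker that does nothing, which is enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// `Future + Unpin` lets an executor pin on the fly with `Pin::new`.
fn poll_unpin<F: Future + Unpin>(
    future: &mut F,
    cx: &mut Context<'_>,
) -> Poll<F::Output> {
    Pin::new(future).poll(cx)
}

fn poll_by_hand() -> (bool, Poll<usize>, bool) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // The `async` block is not `Unpin`, so it is pinned before polling.
    let mut future = Box::pin(async {
        let a = String::from("Hello, future!");
        let b = &a;
        yield_once().await;
        b.len()
    });
    let first_is_pending = future.as_mut().poll(&mut cx).is_pending();
    let second = future.as_mut().poll(&mut cx);

    let mut unpin_future = yield_once();
    let unpin_is_pending = poll_unpin(&mut unpin_future, &mut cx).is_pending();
    (first_is_pending, second, unpin_is_pending)
}
```

Between the first and the second poll, the boxed future holds b as a
reference into its own state, which is why it must not move.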

Side note: Pinning is a huge deal for safe async performance! As even
instances of eventually self-referential impl Futures start out
unpinned, they are directly composable without workarounds like
lifting their state onto the heap on demand. This results in less
fragmentation, less unusual control flow and smaller code size
(before inlining, at least) in the generated state machines, all of
which makes it much easier to evaluate them quickly. (Or rather: It
raises the ceiling for what an async runtime can reasonably achieve,
as it won't be held back by generated code it has no control over.
While a simple async runtime is fairly easy to write in Rust, great
ones are schedulers that operate very "close to the metal" and as
such are often strongly affected by hardware quirks. Their
development is quite interesting to follow.)

Weakening non-moveability

For the final section of this post, let's take a step back and look
at the initial pinning guarantee again.

While pinned instances can't be moved safely by other unrelated code,
it's still often possible to provide functions like the following:

/// Swaps two pinned instances, making adjustments as necessary.
pub fn swap(
    a: Pin<&mut Self>,
    b: Pin<&mut Self>,
) { ... }

As swap has access to Self's private fields (and can rely on internal
implementation details regarding how Self makes use of pinning
exactly), it's able to patch any self-referential or global instance
registry pointers as necessary during the exchange.
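
As one possible shape of such a function, the sketch below uses a
hypothetical SelfRef type whose own_value field points at its own
value field. The swap exchanges the contents and then re-aims both
pointers. It is an illustration under these assumptions, not a
vetted pattern for production code:

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr;

/// Hypothetical self-referential type: `own_value` points at `value`.
struct SelfRef {
    value: u32,
    own_value: *const u32,
    _pin: PhantomPinned,
}

impl SelfRef {
    fn new(value: u32) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            value,
            own_value: ptr::null(),
            _pin: PhantomPinned,
        });
        // SAFETY: the value is not moved, only its pointer field is set.
        unsafe {
            let this = boxed.as_mut().get_unchecked_mut();
            this.own_value = ptr::addr_of!(this.value);
        }
        boxed
    }

    /// Reads `value` through the self-reference.
    fn read(self: Pin<&Self>) -> u32 {
        // SAFETY: `own_value` always points at this instance's `value`.
        unsafe { *self.own_value }
    }

    /// Swaps two pinned instances, making adjustments as necessary.
    pub fn swap(a: Pin<&mut Self>, b: Pin<&mut Self>) {
        // SAFETY: both self-references are repaired before returning.
        unsafe {
            let a = a.get_unchecked_mut();
            let b = b.get_unchecked_mut();
            std::mem::swap(a, b);
            a.own_value = ptr::addr_of!(a.value);
            b.own_value = ptr::addr_of!(b.value);
        }
    }
}

/// Swaps two pinned instances and reads both through their self-references.
fn swap_demo() -> (u32, u32) {
    let mut first = SelfRef::new(1);
    let mut second = SelfRef::new(2);
    SelfRef::swap(first.as_mut(), second.as_mut());
    (first.as_ref().read(), second.as_ref().read())
}
```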

It's also possible to similarly recreate the rest of C++'s
address-aware value movement infrastructure, as pointed out by Miguel
Young in April in "Move Constructors in Rust: Is it possible?" and
implemented in the moveit crate.

This lets a Rust program interface more nicely with C++. In addition,
pinning collections can use moveit's MoveNew and CopyNew traits to
port part of their non-pinning API to their pinning interface in a
more Rust-like fashion:

impl<T> C<T> {
    pub fn push_pin(
        this: &mut Pin<Self>,
        value: T,
    ) -> Pin<&mut T>
    where T: MoveNew {
        todo!("Potentially reallocate existing items.")
    }

    pub fn push_pinned(
        this: &mut Pin<Self>,
        value: Pin<MoveRef<'_, T>>,
    ) -> Pin<&mut T>
    where T: MoveNew {
        todo!("Potentially reallocate existing items, move new item.")
    }
}

// Careful: This also enables `Clone` for `Pin<C<T>>`!
impl<T: CopyNew> Clone for C<T> {
    fn clone(&self) -> Self {
        todo!("Clone items in an address-aware way.")
    }
}

Collections with stable backing storage can often accept new values
regardless of whether they are currently pinning. A pinning Vec-like,
for example, sometimes has to reallocate, and as such must allow its
values to patch pointers if it is to expand arbitrarily.

The CopyNew trait can be implemented more broadly than the standard
Clone, which can't be used where internal pointers or certain types
of back-reference may exist (e.g. non-owning "multicast"-like
references to the instance in question).

---------------------------------------------------------------------

Thanks

To Robin Pederson and telios for proof-reading and various
suggestions on how to improve clarity, and to Milou for criticism and
suggestions from a C++ perspective.

License and Translations

This post as a whole, with the exception of citations, is licensed
under CC BY-NC-SA 2.0. All code samples (that is: code blocks and
snippets formatted like this), except for citations, are additionally
licensed under CC0 1.0, so that you can freely use them in your
projects under any license or no license.

Citations from official Rust projects retain their original MIT OR
Apache-2.0 license and are used as permitted at
https://www.rust-lang.org/policies/licenses. Sorry about the
complexity here, unfortunately my country barely recognises fair use.

If you translate this post, please let me know so that I can link it
here. I should be able to post a German translation myself before
long.

(I suggest using the same license structure for code snippets in
translations as here, though this is not something I can enforce. If
a translation uses a different license, you can likely still take the
code you need from the original here under CC0.)


(c) 2021 Abstraction Haven
