https://theta.eu.org/2021/03/08/async-rust-2.html

η (eta)

Why asynchronous Rust doesn't work

2021-03-08

In 2017, I said that "asynchronous Rust programming is a disaster and
a mess". In 2021 a lot more of the Rust ecosystem has become
asynchronous - such that it might be appropriate to just say that
Rust programming is now a disaster and a mess. As someone who used to
really love Rust, this makes me quite sad.

I've had a think about this, and I'm going to attempt to explain how
we got here. Many people have explained the problems with
asynchronous programming - the famous what colour is your function
essay, for example.^1 However, I think there are a number of things
specific to the design of Rust that make asynchronous Rust
particularly messy, on top of the problems inherent to doing any sort
of asynchronous programming.

In particular, I actually think the design of Rust is almost
fundamentally incompatible with a lot of asynchronous paradigms. It's
not that the people designing async were incompetent or bad at their
jobs - they actually did a surprisingly good job given the
circumstances!^2 I just don't think it was ever going to work out
cleanly - and to see why, you're going to have to read a somewhat
long blog post!

A study in async

I'd like to make a simple function that does some work in the
background, and lets us know when it's done by running another
function with the results of said background work.

use std::thread;

/// Does some strenuous work "asynchronously", and calls `func` with the
/// result of the work when done.
fn do_work_and_then(func: fn(i32)) {
    thread::spawn(move || {
        // Figuring out the meaning of life...
        thread::sleep(std::time::Duration::from_millis(1000)); // gee, this takes time to do...
        // ah, that's it!
        let result: i32 = 42;
        // let's call the `func` and tell it the good news...
        func(result)
    });
}

There's this idea called "first-class functions" which says you can
pass around functions as if they were objects. This would be great to
have in Rust, right?

See that func: fn(i32)? fn(i32) is the type of a function that takes
in one singular i32 and returns nothing. Thanks to first-class
functions, I can pass a function to do_work_and_then specifying what
should happen next after I'm done with my work - like this:

fn main() {
    do_work_and_then(|meaning_of_life| {
        println!("oh man, I found it: {}", meaning_of_life);
    });
    // do other stuff
    thread::sleep(std::time::Duration::from_millis(2000));
}

Because do_work_and_then is asynchronous, it returns immediately and
does its thing in the background, so the control flow of main isn't
disrupted. I could do some other form of work, which would be nice
(but here I just wait for 2 seconds, because there's nothing better
to do). Meanwhile, when we do figure out the meaning of life, it gets
printed out. Indeed, if you run this program, you get:

oh man, I found it: 42

This is really exciting; we could build whole web servers and network
stuff and whatever out of this! Let's try a more advanced example: I
have a database I want to store the meaning of life in when I find
it, and then I can run a web server in the foreground that enables
people to get it once I'm done (and returns some error if I'm not
done yet).

struct Database {
    data: Vec<i32>
}
impl Database {
    fn store(&mut self, data: i32) {
        self.data.push(data);
    }
}

fn main() {
    let mut db = Database { data: vec![] };
    do_work_and_then(|meaning_of_life| {
        println!("oh man, I found it: {}", meaning_of_life);
        db.store(meaning_of_life);
    });
    // I'd read from `db` here if I really were making a web server.
    // But that's beside the point, so I'm not going to.
    // (also `db` would have to be wrapped in an `Arc<Mutex<T>>`)
    thread::sleep(std::time::Duration::from_millis(2000));
}

Let's run this...oh.

error[E0308]: mismatched types
  --> src/main.rs:27:22
   |
27 |       do_work_and_then(|meaning_of_life| {
   |  ______________________^
28 | |         println!("oh man, I found it: {}", meaning_of_life);
29 | |         db.store(meaning_of_life);
30 | |     });
   | |_____^ expected fn pointer, found closure
   |
   = note: expected fn pointer `fn(i32)`
                 found closure `[closure@src/main.rs:27:22: 30:6]`

I see.

Hang on a minute...

So, this is actually quite complicated. Before, the function we
passed to do_work_and_then captured nothing: it didn't have any
associated data, so you could just pass it around as a function
pointer (fn(i32)) and all was grand. However, this new function in
that last example is a closure: a function object with a bit of data
(a &mut Database) tacked onto it.
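
To make the distinction concrete, here is a minimal sketch. The helper names apply and apply_generic are made up for illustration; they are not part of the example above.

```rust
// Hypothetical helper: accepts only a plain function pointer.
fn apply(func: fn(i32) -> i32, x: i32) -> i32 {
    func(x)
}

// Hypothetical helper: accepts anything callable, capturing closures included.
fn apply_generic<F: Fn(i32) -> i32>(func: F, x: i32) -> i32 {
    func(x)
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // A named function and a closure that captures nothing both coerce
    // to the function pointer type `fn(i32) -> i32`.
    assert_eq!(apply(double, 21), 42);
    assert_eq!(apply(|x| x + 1, 41), 42);

    // This closure captures `offset`, so it carries data around with it
    // and can no longer be treated as a bare function pointer.
    let offset = 40;
    // apply(|x| x + offset, 2); // rejected: expected fn pointer, found closure
    assert_eq!(apply_generic(|x| x + offset, 2), 42);
}
```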

Closures are kind of magic. We can't actually name their type - as
seen above, the Rust compiler called it a [closure@src/main.rs:27:22:
30:6], but we can't actually write that in valid Rust code. If you
were to write it out explicitly, a closure would look something like
this:

struct Closure<'a> {
    data: &'a mut Database,
    func: fn(i32, &mut Database)
}

impl<'a> Closure<'a> {
    fn call(&mut self, arg: i32) {
        (self.func)(arg, self.data)
    }
}
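
As a rough, runnable sketch of how such a hand-written closure might be used (store_it and run are hypothetical names added for illustration; the compiler's real generated type differs in its details):

```rust
struct Database {
    data: Vec<i32>,
}

impl Database {
    fn store(&mut self, data: i32) {
        self.data.push(data);
    }
}

// The hand-written stand-in for a closure: captured data plus a plain function.
struct Closure<'a> {
    data: &'a mut Database,
    func: fn(i32, &mut Database),
}

impl<'a> Closure<'a> {
    fn call(&mut self, arg: i32) {
        (self.func)(arg, self.data)
    }
}

// Hypothetical body of the closure, written as an ordinary function.
fn store_it(value: i32, db: &mut Database) {
    db.store(value);
}

// Hypothetical driver: builds the closure object, calls it, returns the data.
fn run() -> Vec<i32> {
    let mut db = Database { data: vec![] };
    {
        // While `closure` is alive it holds the only mutable borrow of `db`.
        let mut closure = Closure { data: &mut db, func: store_it };
        closure.call(42);
        closure.call(7);
    } // the borrow ends here, so `db` can be used directly again
    db.data
}

fn main() {
    assert_eq!(run(), vec![42, 7]);
}
```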

There are a number of things to unpack here.

---------------------------------------------------------------------

An aside on naming types

Being able to name types in Rust is quite important. With a regular
old type, like u8, life is easy. I can write a function
fn add_one(x: u8) -> u8 that takes one and returns one without any hassle.

If you can't actually name a type, working with it becomes somewhat
cumbersome. What you end up having to do instead is refer to it using
generics - for example, closures' types can't be named directly, but
since they implement one of the Fn family of traits, I can write
functions like:

fn closure_accepting_function<F>(func: F)
where
    F: Fn(i32), // <-- look!
{
    /* do stuff */
}

If I want to store them in a struct or something, I'll also need to
do this dance with the where clause every time they're used. This is
annoying and makes things harder for me, but it's still vaguely
workable. For now.
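
A small sketch of that dance when a closure is stored in a struct. Notifier is a hypothetical type invented for this illustration:

```rust
// Hypothetical type: holds a callback whose concrete type cannot be written.
struct Notifier<F>
where
    F: Fn(i32) -> String,
{
    callback: F,
}

// The same bound has to be repeated on every impl block that uses it.
impl<F> Notifier<F>
where
    F: Fn(i32) -> String,
{
    fn new(callback: F) -> Self {
        Notifier { callback }
    }

    fn notify(&self, value: i32) -> String {
        (self.callback)(value)
    }
}

fn main() {
    let prefix = String::from("found");
    let notifier = Notifier::new(move |v| format!("{}: {}", prefix, v));
    assert_eq!(notifier.notify(42), "found: 42");
    // The full type of `notifier` cannot be spelled out here, because the
    // `F` inside it is a compiler-generated closure type.
}
```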

[image: from the msql-srv crate, showing an example of many where
clauses as a result of using closures]

---------------------------------------------------------------------

An aside on 'radioactive' types

The way Rust is designed tends to encourage certain patterns while
discouraging others. Because of ownership and lifetimes, having
pieces of data that hold references to other pieces of data becomes a
bit of a problem. If my type has a & or a &mut reference to
something, Rust makes me ensure that:

  * the something in question outlives my type; you can't go and drop
    the thing I'm referring to if I still have a reference to it,
    otherwise my reference will become invalid
  * the something in question doesn't move while I have the reference
  * my reference to the something doesn't conflict with other
    references to the something (e.g. I can't have my & reference if
    something else has a &mut reference)

So types with references in them are almost 'radioactive'; you can
keep them around for a bit (e.g. inside one particular function), but
attempting to make them long-lived is usually a bit of an issue
(requiring advanced tricks such as the Pin<P> type, which was only
stabilised in Rust 1.33, in 2019). Generally Rust doesn't really
like it when you use radioactive types for too long - they make the
borrow checker uneasy, because you're borrowing something for an
extended period of time.
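
A minimal sketch of a 'radioactive' type under these rules. LatestView and short_lived are hypothetical names, and the commented-out function shows the kind of code the borrow checker refuses:

```rust
struct Database {
    data: Vec<i32>,
}

// Hypothetical view type: it holds a reference, so it is 'radioactive'.
struct LatestView<'a> {
    db: &'a Database,
}

impl<'a> LatestView<'a> {
    fn latest(&self) -> Option<i32> {
        self.db.data.last().copied()
    }
}

// Keeping the view inside one small scope is fine.
fn short_lived() -> Option<i32> {
    let mut db = Database { data: vec![1, 2, 42] };
    let latest = {
        let view = LatestView { db: &db };
        view.latest()
    }; // `view` is dropped here, which ends the borrow of `db`
    db.data.push(7); // allowed: nothing refers to `db` any more
    latest
}

// Trying to make the view outlive the thing it borrows is rejected:
//
// fn escapes() -> LatestView<'static> {
//     let db = Database { data: vec![] };
//     LatestView { db: &db } // `db` is dropped when the function returns
// }

fn main() {
    assert_eq!(short_lived(), Some(42));
}
```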

[image: from the RustViz paper, showing borrowing semantics]

---------------------------------------------------------------------

Closures can be pretty radioactive. Look at the thing we just wrote
out: it has a &'a mut Database reference in it! That means while
we're passing our Closure object around, we have to be mindful of the
three rules (outlives, doesn't move, no conflicting) - which makes
things pretty hard. I can't just hand off the Closure to another
function (for example, the do_work_and_then function), because then I
have to make all of those rules work, and that's not necessarily easy
all the time.

(Not all closures are radioactive: if you make them move closures,
they'll take everything by value instead, and create closure objects
that own data instead of having radioactive references to data.
Slightly more of a pain to deal with, but you lose the blue radiation
glow the things give out when you look at them.)
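
For illustration, one way the failing database example could be made to compile, sketched under some assumptions: do_work_and_then becomes generic over F: FnOnce(i32) + Send + 'static, the closure is a move closure, and the database sits in an Arc<Mutex<...>> as the earlier code comment hinted. The JoinHandle return value, the shorter sleep, and the run helper are additions for this sketch, not part of the original function.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

struct Database {
    data: Vec<i32>,
}

impl Database {
    fn store(&mut self, data: i32) {
        self.data.push(data);
    }
}

/// Generic over the closure type, so capturing closures are accepted.
/// Returning the JoinHandle is an addition so the caller can wait.
fn do_work_and_then<F>(func: F) -> thread::JoinHandle<()>
where
    F: FnOnce(i32) + Send + 'static,
{
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(100));
        let result: i32 = 42;
        func(result)
    })
}

// Hypothetical driver for the sketch.
fn run() -> Vec<i32> {
    let db = Arc::new(Mutex::new(Database { data: vec![] }));

    // The move closure owns its own handle to the database, so it holds
    // no borrowed references and can be sent to another thread.
    let db_for_worker = Arc::clone(&db);
    let handle = do_work_and_then(move |meaning_of_life| {
        db_for_worker.lock().unwrap().store(meaning_of_life);
    });

    handle.join().unwrap();
    let data = db.lock().unwrap().data.clone();
    data
}

fn main() {
    assert_eq!(run(), vec![42]);
}
```

The closure is no longer radioactive, but the cost is visible: an Arc, a Mutex, an explicit clone, and three extra bounds on the function.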

Also, remember what I said about being able to name types? We're not
actually dealing with a nice, written-out Closure object here; we're
dealing with something the compiler generated for us that we can't
name, which is annoying. I also lied when I said that it was as
simple as making all of your functions take F, where F: Fn(i32) or
something - there are actually three different Fn-style traits, Fn,
FnMut, and FnOnce. Do you know the difference between them?
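
A short sketch of the difference, using three hypothetical helper functions that each demand a different trait:

```rust
// Hypothetical helpers, one per trait.
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn call_twice_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f();
    f()
}

fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let name = String::from("eta");

    // Only reads `name`: the closure implements Fn (and with it FnMut and
    // FnOnce), so it can be called any number of times.
    assert_eq!(call_twice(|| name.len()), 6);

    // Mutates `counter`: the closure implements FnMut but not Fn.
    let mut counter = 0;
    assert_eq!(call_twice_mut(|| { counter += 1; counter }), 2);

    // Moves `name` out of itself when called: FnOnce only, callable once.
    assert_eq!(call_once(move || name), "eta");
}
```

Which trait a closure gets depends on what its body does with the captured variables, which is why it is not obvious from the || {...} syntax alone.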

So. A closure is this magical, un-nameable type that the compiler
makes for us whenever we use || {...} syntax, which implements one of
three traits (and it's not immediately obvious which), and it also
might be radioactive. Try and use one of these, and the Rust compiler
is probably going to be watching you very carefully.

---------------------------------------------------------------------

The thing I really want to try and get across here is that Rust is
not a language where first-class functions are ergonomic. It's a lot
easier to make some data (a struct) with some functions attached
(methods) than it is to make some functions with some data attached
(closures).

Trying to use ordinary structs is downright easy:

  * they're explicitly written out by the programmer with no funky
    business
  * you choose what traits and methods to implement on them and how
    to set them out / implement them
  * the struct can actually be referred to by other parts of the code
    by its type

Trying to use closures is hard:

  * the compiler does some magic to make a closure type for you
  * it implements some obscure Fn trait (and it's not immediately
    obvious which)
  * it might be radioactive (or force you to use move and maybe
    insert a bunch of clone() calls)^3
  * you can't actually name their type anywhere, so returning them
    from a function means reaching for impl Trait or a Box<dyn Fn>

Importantly, the restrictions applied to using closures infect types
that contain them - if you're writing a type that contains a closure,
you'll have to make it generic over some Fn-trait-implementing type
parameter, and it's going to be impossible for people to name your
type as a result.
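
One common way around the generic parameter, sketched here with a hypothetical Handler type, is to box the closure as a trait object. This is not something the post proposes; it trades the naming problem for a heap allocation and dynamic dispatch.

```rust
// Hypothetical type: the callback is boxed, so `Handler` has no type
// parameter and can be named in signatures.
struct Handler {
    callback: Box<dyn FnMut(i32) -> i32 + Send>,
}

impl Handler {
    fn new(callback: impl FnMut(i32) -> i32 + Send + 'static) -> Self {
        Handler { callback: Box::new(callback) }
    }

    fn handle(&mut self, value: i32) -> i32 {
        (self.callback)(value)
    }
}

// The return type can now be written down, even though a closure is inside.
fn running_total() -> Handler {
    let mut total = 0;
    Handler::new(move |v| {
        total += v;
        total
    })
}

fn main() {
    let mut handler: Handler = running_total();
    assert_eq!(handler.handle(40), 40);
    assert_eq!(handler.handle(2), 42);
}
```

Note the 'static bound: the boxed closure has to own its data, so the radioactive, reference-holding kind is still ruled out.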

(Other languages, like Haskell, flip this upside down: functions are
everywhere, you can pass them around with reckless abandon, etc. Of
course, these other languages usually have garbage collection to make
it all work...)

---------------------------------------------------------------------

Bearing this in mind, it is really quite hard to make a lot of
asynchronous paradigms (like async/await) work well in Rust. As the
what colour is your function post says, async/await (as well as
things like promises, callbacks, and futures) are really a big
abstraction over continuation-passing style - an idea closely related
to the Scheme programming language. Basically, the idea is you take
your normal, garden-variety function and smear it out into a bunch of
closures. (Well, not quite. You can read the blue links for more; I'm
not going to explain CPS here for the sake of brevity.)

Hopefully by now you can see that making a bunch of closures is
really not going to be a good idea (!)
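
As a very small taste of what that smearing looks like (a sketch with made-up function names, not a full treatment of CPS), here is the same computation in direct style and with every step taking a continuation:

```rust
// Direct style: each function returns its result to the caller.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn square(x: i32) -> i32 {
    x * x
}

fn direct(a: i32, b: i32) -> i32 {
    square(add(a, b))
}

// Continuation-passing style: each function is handed a closure `k`
// saying what to do next, and passes its result on to it.
fn add_cps<R, K: FnOnce(i32) -> R>(a: i32, b: i32, k: K) -> R {
    k(a + b)
}

fn square_cps<R, K: FnOnce(i32) -> R>(x: i32, k: K) -> R {
    k(x * x)
}

fn cps<R, K: FnOnce(i32) -> R>(a: i32, b: i32, k: K) -> R {
    // "The rest of the function" becomes a closure that captures `k`.
    add_cps(a, b, move |sum| square_cps(sum, k))
}

fn main() {
    assert_eq!(direct(2, 3), 25);
    assert_eq!(cps(2, 3, |result| result), 25);
    assert_eq!(cps(2, 3, |result| format!("result: {}", result)), "result: 25");
}
```

Even in this toy, every function has picked up two type parameters and a trait bound.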

*wibbly wobbly scene transition*

And then fast forward a few years and you have an entire language
ecosystem built on top of the idea of making these Future objects
that actually have a load of closures inside^4, and all of the
problems listed above (hard to name, can contain references which
make them radioactive, usually require using a generic where clause,
etc) apply to them because of how "infectious" closures are.
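
To show the two kinds of Future side by side, here is a sketch using only the standard library. Countdown is a hypothetical hand-written future (a plain struct with a name), find_meaning is an async fn whose future type is generated by the compiler and cannot be named, and block_on is a toy busy-polling executor written for this example; a real runtime would sleep until woken instead of spinning.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical hand-written future: pending for a few polls, then ready.
struct Countdown {
    remaining: u32,
    value: i32,
}

impl Future for Countdown {
    type Output = i32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i32> {
        if self.remaining == 0 {
            Poll::Ready(self.value)
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

// The future returned here is an anonymous, compiler-generated state
// machine; callers can only refer to it as `impl Future<Output = i32>`.
async fn find_meaning() -> i32 {
    let first = Countdown { remaining: 3, value: 40 }.await;
    let second = Countdown { remaining: 1, value: 2 }.await;
    first + second
}

// Toy executor (an assumption of this sketch): polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    assert_eq!(block_on(find_meaning()), 42);
}
```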

The language people have actually been hard at work to solve some
(some!) of these problems by introducing features like impl Trait and
async fn that make dealing with these not immediately totally
terrible, but trying to use other language features (like traits)
soon makes it clear that the problems aren't really gone; just
hidden.
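
A sketch of where the hiding stops. When this post was written, a trait method could not be an async fn, so the usual workaround was to return a boxed future, which is roughly the shape that crates like async_trait generate. Store, Memory, total and the toy block_on executor below are all hypothetical names for this illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The unnameable future type is erased behind a heap-allocated trait object.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Hypothetical trait with an "async" method written out by hand.
trait Store {
    fn load(&self, key: u32) -> BoxFuture<'_, i32>;
}

struct Memory {
    values: Vec<i32>,
}

impl Store for Memory {
    fn load(&self, key: u32) -> BoxFuture<'_, i32> {
        // The async block borrows `self`, so the returned future is tied
        // to the lifetime of the `&self` reference.
        Box::pin(async move { self.values[key as usize] })
    }
}

async fn total(store: &dyn Store) -> i32 {
    store.load(0).await + store.load(1).await
}

// Toy busy-polling executor, an assumption of this sketch.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    let store = Memory { values: vec![40, 2] };
    assert_eq!(block_on(total(&store)), 42);
}
```

The Pin, the Box, the dyn, the Send bound and the lifetime are all things that async fn normally keeps out of sight.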

Oh, and all the problems from what colour is your function are still
there too, by the way - on top of the Rust-specific ones.

Beginner (and experienced) Rust programmers look at the state of the
world as it is and try and build things on top of these shaky
abstractions, and end up running into obscure compiler errors, and
using hacks like the async_trait crate to glue things together, and
end up with projects that depend on like 3 different versions of
tokio and futures (perhaps some async-std in there if you're feeling
spicy) because people have differing opinions on how to try and avoid
the fundamentally unavoidable problems, and it's all a bit
frustrating, and ultimately, all a bit sad.

---------------------------------------------------------------------

Did it really have to end this way? Was spinning up a bunch of OS
threads not an acceptable solution for the majority of situations?
Could we have explored solutions more like Go, where a
language-provided runtime makes blocking more of an acceptable thing
to do?
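
For comparison, a minimal sketch of the plain-threads approach, where each task simply blocks on its own OS thread. The names blocking_work and run_workers are hypothetical, and the sleep stands in for real blocking I/O.

```rust
use std::thread;
use std::time::Duration;

// Hypothetical blocking task: the sleep stands in for a blocking call.
fn blocking_work(id: u64) -> u64 {
    thread::sleep(Duration::from_millis(10 * id));
    id * 2
}

// Spawns one OS thread per task and collects the results in order.
fn run_workers(count: u64) -> Vec<u64> {
    let handles: Vec<thread::JoinHandle<u64>> = (1..=count)
        .map(|id| thread::spawn(move || blocking_work(id)))
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect()
}

fn main() {
    assert_eq!(run_workers(3), vec![2, 4, 6]);
}
```

No futures, no pinning, and the only closure is a move closure that owns a single integer.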

Maybe we could just have kept Rust as it was circa 2016, and let the
crazy non-blocking folks^5 write hand-crafted epoll() loops like they
do in C++. I honestly don't know, and think it's a difficult problem
to solve.

But as far as my money goes, I'm finding it difficult to justify
starting new projects in Rust when the ecosystem is like this. And,
as I said at the start, that makes me kinda sad, because I do
actually like Rust.

---------------------------------------------------------------------

(Common Lisp is pretty nice, though. We have crazy macros and
parentheses and a language ecosystem that is older than I am and
isn't showing any signs of changing...)

---------------------------------------------------------------------

 1. This is really recommended reading if you aren't already familiar
    with it (as you'll soon see)

 2. Seriously - when I put out the last blog post, the actual async
    core team members commented saying how much they appreciated the
    feedback, and then they actually went and made futures 1.0 better
    as a result. Kudos!

 3. You might be thinking "well, why don't you just only use move
    closures then?" - but that's beside the point; it's often a lot
    harder to do so, because now you might have to wrap your data in
    an Arc or something, by which point the ergonomic gains of using
    the closure are outweighed by the borrow checker-induced pain.

 4. You can actually manually implement Future on a regular old
    struct. If you do this, things suddenly become a lot simpler, but
    also you can't easily perform more async operations inside that
    struct's methods.

 5. (sorry, I mean, the esteemed companies that deign to use Rust for
    their low-latency production services)
