https://davidben.net/2024/01/15/empty-slices.html

Passing nothing is surprisingly difficult

David Benjamin (2024-01-15)

My day job is in browsers and cryptography, not compilers, yet I
often find that I need to spend more of my time working through the
semantics of programming languages than using them. This post
discusses a thorny cross-language issue between C, C++, and Rust. In
short:

  * C's rules around pointers and memcpy leave no good ways to
    represent an empty slice of memory.
  * C++'s pointer rules are fine, but memcpy in C++ inherits the C
    behavior.
  * Rust FFI is not zero-cost. Rust picked a C/C++-incompatible slice
    representation, requiring a conversion in each direction.
    Forgetting the conversion is an easy mistake and unsound.
  * Rust slices appear to also be incompatible with Rust pointer
    arithmetic, to the point that the standard library's slice
    iterator is unsound.

Slices

All three languages allow working with slices, or contiguous
sequences of objects in memory. (Also called spans, but we'll use
"slices" here.) A slice is typically a pointer and a length, (start,
count), giving count objects from start, of some type T.

A slice can also be specified by two pointers, (start, end), giving
the objects from start (inclusive) to end (exclusive). This is better
for iteration because only one value needs to be adjusted to advance,
but the length is less available. C++ iterator pairs are a
generalization of this form, and Rust slice iterators use this
internally. The two forms can be converted with end = start + count
and count = end - start, using C-style pointer arithmetic where
everything is scaled by the object size. We'll primarily discuss
(start, count), but this duality means slices are closely related to
pointer arithmetic.

In C and C++, slices are library types built out of pointers and
lengths. Sometimes functions just take or return pointer and length
separately. In Rust, slices are language primitives, but the
underlying components are exposed for unsafe code and FFIs. To work
with these, we must understand what combinations of pointers and
lengths are valid.

This is straightforward for a positive-length slice: start must point
within an allocation where there are at least count objects of type
T. But suppose we want an empty (length zero) slice. What are the
valid representations of an empty slice?

Certainly we can point start within (or just past) some array of Ts
and set count to zero. But we may want to make an empty slice without
an existing array. For example, a default-constructed std::span<T>()
in C++ or &[] in Rust. What are our options then? In particular:

 1. Can an empty slice be (nullptr, 0)?
 2. Can an empty slice be (alignof(T), 0), or some other aligned
    address that doesn't correspond to an allocation?

The second question may seem odd to C and C++ folks, but Rust folks
may recognize it as std::ptr::NonNull::dangling.

C++

C++ is the easiest to discuss, as it has a formal specification and
is (almost) self-consistent.

First, (nullptr, 0) is a valid empty slice in C++. The STL's types
routinely return it, and the language is compatible with it:

  * nullptr + 0 is defined to be nullptr
  * nullptr - nullptr is defined to be 0

C++ defines APIs like std::span in terms of pointer addition for the
(start, count) form, and iterator pairs in terms of pointer
subtraction for the (start, end) form, so this is both necessary and
sufficient.

Moreover, it would be impractical for C++ to forbid (nullptr, 0). C++
code routinely needs to interact with APIs that specify slices as
individual components. Given such an API, no one writes code like
this:

void takes_a_slice(const uint8_t *in, size_t len);

uint8_t placeholder;
takes_a_slice(&placeholder, 0);

It is much more natural to use nullptr:

void takes_a_slice(const uint8_t *in, size_t len);

takes_a_slice(nullptr, 0);

This means, to be practical, functions like takes_a_slice must accept
(nullptr, 0). For implementing such functions to be practical, the
underlying language primitives must then also accept (nullptr, 0).

As for the (alignof(T), 0) question, pointer addition and subtraction
require the pointers point to some object and that the operation
stays within the bounds of that object (or one past the end). C++
does not define there to be an object at alignof(T), so this is not
allowed, instead producing Undefined Behavior. This has no immediate
usability concern (no one is going to write reinterpret_cast
<uint8_t*>(1) to call takes_a_slice), but we'll see later that it has
some consequences for Rust FFIs.

However, C++ is merely almost self-consistent. C++ picks up memcpy
and other functions from C's standard library, complete with C's
semantics...

C

C is messier. As in C++, it is impractical to reject (nullptr, 0).
However, C does not have C++'s special cases for nullptr + 0 and
nullptr - nullptr. See clauses 8 and 9 of section 6.5.6 of N2310.
memcpy and the rest of the C standard library similarly forbid
(nullptr, 0).

I think this should be considered a bug in the C specification, and
compilers should not optimize based on it. In BoringSSL, we found the
C rules so unusable that we resorted to wrapping the standard library
with n != 0 checks. The pointer arithmetic rules are similarly a tax
on development. Moreover, C++ inherits C's standard library (but not
its pointer rules), including this behavior. In Chromium's C++ code,
memcpy has been the single biggest blocker to enabling UBSan.

Fortunately, there is hope. Nikita Popov and Aaron Ballman have
written a proposal to fix this in C. (Thank you!) While it won't make
C and C++ safe by any stretch of imagination, this is an easy step to
fix an unforced error.

Note that, apart from contrived examples with deleted null checks,
the current rules do not actually help the compiler meaningfully
optimize code. A memcpy implementation cannot rely on pointer
validity to speculatively read because, even though memcpy(NULL,
NULL, 0) is undefined, slices at the end of a buffer are fine:

char buf[16];
memcpy(dst, buf + 16, 0);

If buf were at the end of a page with nothing allocated afterwards, a
speculative read from memcpy would break.

Rust

Rust does not allow (nullptr, 0). Functions like
std::slice_from_raw_parts require the pointer to be non-null. This
comes from Rust treating types like &[T] and *[T] as analogous to &T
and *T. They are "references" and "pointers" that are represented as
(start, count) pairs. Rust requires every pointer type to have a
"null" value outside its reference type. This is used in enum layout
optimizations. For example, Option::<&[T]> has the same size as &[T]
because None uses this null value.

Unfortunately, Rust chose (nullptr, 0) for the null slice pointer,
which means the empty slice, &[], cannot use it. That left Rust
having to invent an unusual convention: some non-null, aligned, but
otherwise dangling pointer, usually (alignof(T), 0).

Is pointer arithmetic defined for this slice? From what I can tell,
the answer appears to be no!

Pointer arithmetic in Rust is spelled with the methods add, sub_ptr,
and offset_from, which the standard library defines in terms of
allocated objects. That means, for pointer arithmetic to work with
alignof(T), there must be zero-size slices allocated at every
non-zero address. Moreover, offset_from requires the two dangling
pointers derived from the same slice to point to the "same" of these
objects. While the third bullet here, second sentence, says casting
literals gives a pointer that is "valid for zero-sized accesses", it
says nothing about allocated objects or pointer arithmetic.

Ultimately, these semantics come from LLVM. The Rustonomicon has more
to say on this (beginning "The other corner-case..."). It concludes
that, while there are infinitely many zero-size types at 0x01, Rust
conservatively assumes alias analysis does not allow offsetting
alignof(T) with zero for positive-sized types. This means Rust
pointer arithmetic rules are incompatible with Rust empty slices. But
recall that slice iteration and pointer arithmetic are deeply
related. The Rustonomicon's sample iterator uses pointer arithmetic,
but needs to guard addition with cap == 0 in into_iter and cast to
usize in size_hint.

This is too easy for programmers to forget. Indeed the real Rust
slice iterator does pointer arithmetic unconditionally (pointer
addition, pointer subtraction, behind some macros). This suggests
Rust slice iterators are unsound.

FFIs

Beyond self-consistency concerns, all this means Rust and C++ slices
are incompatible. Not all Rust (start, count) pairs can be passed
into C++ and vice versa. C's issues make its situation less clear,
but the natural fix is to bring it in line with C++.

This means Rust FFI is not "zero-cost". Passing slices between C/C++
and Rust requires conversions in both directions to avoid Undefined
Behavior.

Beyond performance, this is a safety and ergonomics problem.
Programmers cannot be expected to memorize language specifications.
If given a &[T] and trying to call a C API, the natural option is to
use as_ptr, but that will return a C/C++-incompatible output. Most
Rust crates I've seen which wrap C/C++ APIs do not convert and are
unsound as a result.

This is particularly an issue because C and C++'s (more serious)
safety problems cause real user harm and need addressing. But there
is half a century of existing C and C++ code. We cannot realistically
address this with a new language without good FFI. What makes for
good FFI? At a bare minimum, I think calling a C or C++ function from
Rust should not be dramatically less safe than calling that function
from C or C++.

Wishlist

Empty lists should not be so complicated. We could change C, C++, and
Rust in a few ways to improve things:

Make C accept nullptr

See Nikita Popov and Aaron Ballman's proposal.

Fix Rust's slice representation

All the alignof(T) problems ultimately come from Rust's unusual empty
slice representation. This falls out of Rust's need for a "null" *[T]
value that is not a &[T] value. Rust could have chosen any of a
number of unambiguously unused representations, such as (nullptr, 1),
(nullptr, -1), or (-1, -1).

While this would be a significant change now, with compatibility
implications to work through, I think it is worth seriously
considering. It would address the root cause of this mess, fixing a
soundness hazard in not just Rust FFI, but Rust on its own. This
hazard is real enough that Rust's standard library hits it.

This is also the only option I see that fully meets Rust's
"zero-cost" FFI goals. Even if we make C and C++ accept (alignof(T),
0) from Rust (see below), any slices passed from C/C++ to Rust may
still be (nullptr, 0).

Define pointer arithmetic for invalid pointers

If Rust leaves its slice representation as is, we instead should
define pointer arithmetic for NonNull::dangling(). Expecting
low-level Rust code to guard all pointer offsets is impractical.

Where nullptr is a single value which could just be special-cased,
there are many alignof(T) values. It seems one would need to define
it in terms of the actual allocated objects. This is well beyond my
expertise, so I've likely gotten all the details and terminology
wrong, but one possibility is, in the vein of PNVI-ae-udi:

  * cast_ival_to_ptrval returns a special @empty provenance when
    casting garbage values (unchanged from PNVI-ae-udi)
  * Adding zero to a pointer with the @empty provenance is valid and
    gives back the original pointer
  * Two pointers with @empty provenance can be subtracted to give
    zero if they have the same address

One subtlety, however, is that cast_ival_to_ptrval might not give
back @empty if there is an object at that address. Giving back a
concrete provenance means picking up pointer-zapping semantics, which
would be undesirable here. For alignof(T), that shouldn't happen if
the maximum alignment is under a page and the bottom page is never
allocated. But Rust allows not just alignof(T) but any non-zero
integer literal, even if some allocation happens to exist at that
address. (Perhaps we could use the "user-disambiguation" aspect and
say all integer-to-pointer casts may additionally disambiguate to
@empty? Would that impact the compiler's aliasing analysis?)

I think this complexity demonstrates why nullptr is a much better
choice for an empty slice than a dangling pointer. Pointer arithmetic
with nullptr is easy to define, and nullptr cannot alias a real
allocation.

If Rust (and LLVM) accepted invalid pointers, it would fix the
soundness issues within Rust, but not with FFIs. If the C and C++
standards also picked this up, it would partially fix FFIs. We could
then directly pass Rust slices into C and C++, but not in the other
direction. Directly passing C and C++ slices into Rust can only be
fixed by changing Rust to accept (nullptr, 0) form.

[S:(Outside of Rust FFI, there's no reason to use alignof(T) as a
pointer in C/C++, so I do not know how plausible it would be for C/
C++ to accept it.):S] Update 2024-01-16: Nelson Elhage reminded me
that non-null sentinel pointers are sometimes used to allocate zero
bytes. While C forbids malloc from doing this (malloc(0) must return
either nullptr or a unique non-null pointer), other allocators might
reasonably pick this option. It makes error checks more uniform
without actually reserving address space. So there is a non-Rust
reason to allow these pointers in C and C++.

FFI helpers in Rust standard library

If the languages' slice representations cannot be made compatible,
we're still left with safety hazards in Rust FFI. In that case,
Rust's standard library should do more to help programmers pick the
right operations: Add analogs of slice::from_raw_parts,
slice::as_ptr, etc., that use the C and C++ representation,
converting internally as needed. Document existing functions very
clear warnings that they cannot be used for FFI. Finally, audit all
existing calls in crates.io, as the majority of existing calls are
likely for FFI.

For slice::from_raw_parts, we could go further and fix the existing
function itself. This would be backwards-compatible, but adds
unnecessary conversions to non-FFI uses. That said, if the crates.io
audit reveals mostly FFI uses, that conversion may be warranted. For
non-FFI uses, a type signature incorporating std::ptr::NonNull would
have been more appropriate anyway.

This would improve things, but it's an imperfect solution. We'd still
sacrifice zero-cost FFI, and we'd still rely on programmers to read
the warnings and realize the natural options are incorrect.

Acknowledgements

Thanks to Alex Chernyakhovsky, Alex Gaynor, Dana Jansens, Adam
Langley, and Miguel Young de la Sota for reviewing early iterations
of this post. Any mistakes in here are my own.