https://www.ralfj.de/blog/2018/07/19/const.html

  * ralfj.de
  * 
      + Projects
      + Blog
      + 
          o RSS Feed

Jul 19, 2018 * Internship, Rust * Edits * Permalink

Thoughts on Compile-Time Function Evaluation and Type Systems

For some time now (since the 1.26 release, to be precise), Rust has a
very powerful machinery for CTFE, or compile-time function
evaluation. Since then, there have been various discussions about
which operations should be allowed during CTFE, which checks the
compiler should do, how this all relates to promotion and which kinds
of guarantees we should be able to expect around CTFE. This post is
my take on those topics, and it should not be surprising that I am
going to take a very type-system centric view. Expect something like
a structured brain dump, so there are some unanswered questions
towards the end as well.

Some Background

CTFE is the mechanism used by the compiler, primarily, to evaluate
items like const x: T = ...;. The ... here is going to be Rust code
that must be "run" at compile-time, because it can be used as a
constant in the code - for example, it can be used for array lengths.

Notice that CTFE is not the same as constant propagation: Constant
propagation is an optimization pass done by compilers like LLVM that
will opportunistically change code like 3 + 4 into 7 to avoid
run-time work. Being an optimization, constant propagation must, by
definition, not change program behavior and will not be observable at
all (other than performance). CTFE, on the other hand, is about code
that must be executed at compile-time because the compiler needs to
know its result to proceed - for example, it needs to know the size
of an array to compute how to lay out data in memory. You can
statically see, just from the syntax of the code, whether CTFE
applies to some piece of code or not: CTFE is only used in places
like the value of a const or the length of an array.

fn demo() {
  const X: u32 = 3 + 4; // CTFE
  let x: u32 = 4 + 3; // no CTFE (but maybe constant propagation)
}

We say that the 3 + 4 above is in const context and hence subject to
CTFE, but the 4 + 3 is not.

Const Safety

Not all operations can be used in const context. For example, it
makes no sense to compute your array length as "please go read that
file from disk and compute something" - we can't know what will be on
the disk when the program actually runs. We could use the disk of the
machine compiling the program, but that does not sound very appealing
either. Things get even worse when you consider letting the program
send information to the network. Clearly, we don't want CTFE to have
actually observable side-effects outside of compilation.

In fact, just naively letting programs read files would also be
grossly unsafe: When computing the length of an array twice, it is
important that we obtain the same result. Update: As @eddyb points
out, things get even worse once you consider const generics, traits,
and coherence: At that point, you have to rely on evaluating the same
expression in different crates to produce the same result. /Update

    CTFE must be deterministic.

If not, the compiler could end up thinking that two arrays have the
same length, but then later compute different layouts. That would be
a disaster. So, any kind of external input and any kind of
non-determinism is a complete no-go for CTFE. This does not just
concern I/O, even converting a reference to a usize is not
deterministic.

The compiler will throw a CTFE error if such an operation is ever
attempted to be executed. Those programs that are executable in const
context are called const safe:

    A program is const safe if it can be executed by CTFE without
    hitting an error (panics are allowed).

This is very much in analogy with the idea that a safe (or run-time
safe, to distinguish it from const safe) program is a program that
does not cause any memory errors or data races. In fact, we will see
that this analogy between "programs that are well-behaved under CTFE"
(const safety) and "programs that do not cause UB" (run-time safety)
can carry us very far.

One very interesting question now is whether some given function foo
should be allowed to be called in const context. We could just always
say "yes", and rely on the fact that CTFE will throw an error when
foo does anything fishy. The problem with this approach is that, if
foo is in a library, updating the library might change foo in a way
that makes it no longer const-safe. In other words, making any
function not const-safe any more would be a semver violation because
it could break downstream crates.

The typical mechanism to solve that problem is to have an annotation
that explicitly marks a function as "usable in const context". In
Rust, the proposed mechanism for this purpose is const fn; in C++ it
is called constexpr. The compiler can now reject calling non-const
functions in const context, so library authors can add non-const-safe
operations without breaking semver.

Const Type System and Const Soundness

This leads us to the interesting situation that the compiler will
reject code in const context that it would accept just fine outside
of const context. In particular, the body of a const fn is also
considered to be in const context - otherwise, if we allowed calling
arbitrary functions, we would have the same problem again. One useful
way to think about this is that we have a second type system, a
"const type system", that is used to type-check code in const
context. This type system does not allow calls to non-const
functions.

It should probably also not allow casting a reference to an integer,
because (as discussed above) that is a non-deterministic operation
which cannot be performed during CTFE. What else?

Before we go on and add random additional checks, let us step back
and think about what our goals are here. Typically, the purpose of a
type system is to establish some sort of guarantee for a well-typed
program. For Rust's "main" ("run-time") type system, that guarantee
is "no undefined behavior", which means no memory errors and no data
races. What is the guarantee for our new const type system? We have
already talked about it above: It's const safety! This leads us to
the definition of const soundness:

    Our const type system is sound if well-typed programs are
    const-safe.

Again, notice how this is very similar to the correctness statement
for the run-time type system, which guarantees run-time safety.

However, we have to be a bit careful here. Consider the following
piece of code:

const fn is_eight_mod_256(x: usize) -> bool { x % 256 == 8 }

We will definitely want to allow this code. Why should == or % not be
const-safe? Well, we could call our function as follows:

is_eight_mod_256(Box::into_raw(Box::new(0)) as usize);

That statement is certainly not const-safe as the result depends on
where exactly the allocator puts our Box. However, we want to blame
the as usize for this issue, not the is_eight_mod_256.

The solution is for the const type system to not just have separate
rules about which operations are allowed, we also must change our
notion of which values are "valid" for a given type. An integer
obtained from a pointer is valid for usize at run-time, but it is not
valid for usize in const mode! After all, there are basic arithmetic
operations that we expect all usize to support, that CTFE cannot
support for pointers.

    A function is const-safe if, when executed with const-valid
    arguments, it does not trigger a CTFE error and returns a
    const-valid result (if it returns at all).

Under this definition, is_eight_mod_256 is const-safe because
whenever x is an actual integer, it will evaluate without any error.
At the same time, this shows that converting a reference into usize
is not const-safe, because the input of this operation is
const-valid, but the output is not! This provides a solid
justification for rejecting such casts in const context.

CTFE correctness

In Rust, CTFE is performed by miri, a MIR interpreter that used to be
a separate project but whose core engine has been integrated into
rustc. miri will execute the code in const context step-by-step and
just complain and fail with an error when an operation cannot be
performed. This does not just concern non-determinism; miri does not
support everything it could support because @oli-obk is super careful
about not accidentally stabilizing behavior that should undergo an
RFC.

In fact, right now miri will reject all operations on raw pointers.
They all raise a CTFE error and hence must all be rejected by the
const type system. The plan is to change miri so that it can support
more operations, but we have to be careful in doing so. I have
already mentioned that miri must be deterministic, but there is
another point to consider that you might have expected to play a much
more prominent role: CTFE, at least if it succeeds, should match
run-time behavior!

    CTFE is correct if, when it loops forever, completes with a
    result, or panics, that behavior matches the run-time behavior of
    the same code.

We clearly do not want code to behave differently when it lives in
const context and is run by CTFE, and when it is compiled to
machine-code and executed "for real".

Or, do we? Don't get me wrong, I am not advocating for deliberately
breaking that property, but it sure is worth considering what would
go wrong if miri was not CTFE-correct. Maybe surprisingly, it turns
out that this would not be a soundness issue! All we care about for
the purpose of soundness is for CTFE to be deterministic, as already
discussed. We don't re-run the same code at run-time and rely on it
still doing the same, so nothing actually breaks if CTFE behavior
diverges from run-time behavior.

That said, not being CTFE correct is surely very surprising and we
should avoid it best we can. However, I am told that actually
predicting the result of floating-point operations deterministically
is extremely hard and LLVM isn't exactly helping. So, we will likely
have to live with either considering floating point operations to be
const-unsafe (raising a CTFE error), or not having CTFE correctness
when floating point operations are involved. I think it is possible
to achieve CTFE correctness for all other operations, and I think we
should strive to do so.

Before we go on, notice that CTFE correctness as defined above does
not say anything about the case where CTFE fails with an error, e.g.
because of an unsupported operation. CTFE would be trivially correct
(in the above sense) if it just always immediately returned an error.
However, since const-safe programs cannot error during CTFE, we know
from CTFE correctness that those programs do in fact behave exactly
the same at compile-time and at run-time.

Unsafe Blocks in Const Context

Let's say we want to extend miri to support more operations on raw
pointers. We know we have to be careful about keeping miri
deterministic, and about maintaining CTFE correctness. Which
operations will we be able to support?

Notice that at this point, const soundness and the related const
safety are not a concern yet. Those ideas come into play when we are
changing the const type system to allow more operations. CTFE
determinism and correctness, however, are properties of the CTFE
engine (miri) itself.

Let us look at the following example:

const fn make_a_bool() -> bool {
    let x = Box::new(0);
    let x_ptr = &*x as *const i32;
    drop(x);
    let y = Box::new(0);
    let y_ptr = &*y as *const i32;
    x_ptr == y_ptr
}

At run-time, whether this function returns true or false depends on
whether or not the allocator re-used the space for x when allocating
y. However, due to CTFE being deterministic, we have to pick one
concrete answer at compile-time, and that may not be the right
answer. Hence we cannot allow this program to execute under CTFE if
we want to maintain CTFE correctness. Supporting memory allocation in
a deterministic way is perfectly feasible (in fact, miri already has
that implemented), and casting a reference to a raw pointer changes
nothing but the type. The only actually problematic operation here is
testing two raw pointers for equality: Because one of the pointers is
dangling, we can not deterministically predict the result of this
comparison!

In other words, if/when miri learns how to compare pointers, we must
make it raise a CTFE error if one of the pointers is dangling (points
to unallocated memory), or else we would violate CTFE correctness.

Now, let us go one level up and look at the const type system. We
have seen that comparing raw pointers can raise a CTFE error, so this
is actually not a const-safe operation. Similar to casting pointers
to integers, we have to make the const type system reject code that
compares raw pointers. However, it seems like a shame to not even
allow comparing two references for equality, after converting them to
raw pointers! After all, references never dangle, so this is a
perfectly const-safe operation.

Lucky enough, Rust already has an answer to the need to side-step the
type system in controlled ways: unsafe blocks. Comparing raw pointers
is not allowed by the const type system because it is not const-safe,
but just like we allow run-time-unsafe operations to be performed in
unsafe blocks, we can allow const-unsafe operations as well. So, we
should be able to write the following:

const fn ptr_eq<T>(x: &T, y: &T) -> bool {
    unsafe { x as *const _ == y as *const _ }
}

As usual when writing unsafe code, we have to be careful not to
violate the safety guarantees that are usually upheld by the type
system. We have to manually ensure that, if our inputs are
const-valid, then we will not trigger a CTFE error and return a
const-valid result. For this example, the reason no CTFE error can
arise is that references cannot dangle. We can thus provide ptr_eq as
an abstraction that is entirely safe to use in const context, even
though it contains a potentially const-unsafe operation. This is,
again, in perfect analogy with types like Vec being entirely safe to
use from safe Rust even though Vec internally uses plenty of
potentially unsafe operations.

Whenever I said above that some operation must be rejected by the
const type system, what that really means is that the operation
should be unsafe in const context. Even pointer-to-integer casts can
be used internally in const-safe code, for example to pack additional
bits into the aligned part of a pointer in a perfectly deterministic
way.

Static Promotion

There is one more aspect to CTFE in Rust that I have not yet touched
on: Promotion of static values. This is the mechanism that makes the
following code well-typed:

fn make_a_3() -> &'static i32 { &3 }

This may look like it should be rejected because we are returning a
reference to a locally created value with lifetime 'static, but
instead, Magic (TM) is happening. The compiler determines that 3 is a
static value that can be computed at compile-time and put into static
memory (like a static variable), and hence &3 can have lifetime
'static. This also works with e.g. &(3+4). Static variables, like
const, are computed at compile time and hence CTFE comes into play.

The fundamentally new aspect to this is that the user did not ask for
the value to be promoted. That means we have to be really careful
when deciding which values to promote: If we promote something that
miri cannot evaluate, there will be a CTFE error and we have just
broken compilation for no good reason. We better make sure that we
only promote values that we can expect miri to actually be able to
compute - i.e., we should only promote the results of const-safe
code. You probably already guessed it, but what I am proposing is to
use the const type system for that purpose. Const soundness already
says that this is a way to ensure const safety.

I propose to only ever promote values that are safely
const-well-typed. (So, we will not promote values involving
const-unsafe operations even when we are in an unsafe block.) When
there are function calls, the function must be a safe const fn and
all arguments, again, const-well-typed. For example, &
is_eight_mod_256(13) would be promoted but &is_eight_mod_256
(Box::into_raw(Box::new(0)) as usize) would not. As usual for type
systems, this is an entirely local analysis that does not look into
other functions' bodies. Assuming our const type system is sound, the
only way we could possibly have a CTFE error from promotion is when
there is a safe const fn with an unsound unsafe block.

Notably, we are relying on library authors properly writing unsafe
const fn even for private functions whenever there is a chance that
this function might cause a CTFE error for const-valid inputs. If
there is a const fn that is actually unsafe, arguing "but it is
private and hence that's fine" will not help because the compiler
might decide to promote the result of that function. However, this
can only break code within the same crate and can be fixed locally,
so it seems like a reasonable compromise to me.

Another interesting point to consider is that we probably care much
more about CTFE correctness when thinking about promotion. After all,
the user asked for the run-time behavior; if they are getting a
completely different behavior from CTFE, that would lead to problems.
If miri is CTFE-correct except for obscure floating point issues,
that means "only" people relying on specific behavior of floating
point operations could be affected, and likely LLVM will already
violate whatever assumptions those people are making. (miri's
floating point implementation is perfectly sane and should be
standards compliant, LLVM and the particularities of x87 rounding are
the sources of uncertainty here.) I am not sure which effect that
should or will have for promotion.

Conclusion

I have discussed the notions of CTFE determinism and CTFE correctness
(which are properties of a CTFE engine like miri), as well as const
safety (property of a piece of code) and const soundness (property of
a type system). In particular, I propose that when type-checking safe
code in const context, we guarantee that this code is const-safe,
i.e., that it will not hit a CTFE error (though panics are allowed,
just like they are in "run-time" Rust code).

There are still plenty of open questions, in particular around the
interaction of const fn and traits, but I hope this terminology is
useful when having those discussions. Let the type systems guide us
:)

Thanks to @oli-obk for feedback on a draft of this post, and to
@centril for interesting discussion in #rust-lang that triggered me
into developing these ideas and terminology. If you have feedback or
questions, let's discuss in the internals forum!

Posted on Ralf's Ramblings on Jul 19, 2018.
Comments? Drop me a mail or leave a note in the forum!