[HN Gopher] Translation of Rust's core and alloc crates to Coq f...
___________________________________________________________________
Translation of Rust's core and alloc crates to Coq for formal
verification
Author : p4bl0
Score : 332 points
Date : 2024-05-15 06:32 UTC (16 hours ago)
(HTM) web link (formal.land)
(TXT) w3m dump (formal.land)
| thaliaarchi wrote:
| That's really impressive.
|
| Automatic translation like this shifts the trust to the tool.
| coq-of-rust itself is written in Rust, not in Coq. The recursive
| nature is somewhat mind-boggling, but I think a proof of its
| correctness is possible through a process similar to David A.
| Wheeler's "Countering Trusting Trust through Diverse Double-
| Compiling" (2009) [0] (which circumvents Ken Thompson's trusting
| trust attack by using a second compiler), mixed with a CompCert-
| style approach.
|
| To verify it, you'd use coq-of-rust to convert the coq-of-rust
| translator from Rust to Coq. That translation is not trusted,
| because it was performed in Rust, but it doesn't matter. Once in
| Coq, you prove the desired correctness properties--crucially,
| that it preserves the semantics of the Rust program when it
| translates a program to Coq.
|
| As in the article, it is likely easier to do proofs against more
| functional definitions than against generated ones, so you'd
| follow the same process they use for the stdlib: proving
| equivalence between the two sets of definitions. Since the
| current line count for the coq-of-rust translator (specifically,
| lib/ [1]) is 6350 lines of Rust, it even seems feasible to write
| a full translator in Coq and prove its equivalence to the
| generated one.
|
| Then, you execute the proven-correct Coq coq-of-rust translator
| on the Rust source of the coq-of-rust translator. The Coq
| definitions it outputs should match the output of the Rust coq-
| of-rust translator that you started with.
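|
| A minimal sketch of that final check (the output paths are
| hypothetical, and the real comparison might be per-definition
| rather than byte-for-byte):
|
|     use std::fs;
|
|     fn main() -> std::io::Result<()> {
|         // Translation of coq-of-rust's own sources, produced by
|         // the original Rust-implemented translator.
|         let from_rust = fs::read("out/rust/coq_of_rust.v")?;
|         // The same translation, produced by the proven-correct
|         // Coq implementation.
|         let from_coq = fs::read("out/coq/coq_of_rust.v")?;
|         assert_eq!(from_rust, from_coq, "translations diverge");
|         Ok(())
|     }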
|
| As an aside, it's nice to see industry funding for work like
| this. I'm often cynical of cryptocurrency, but its correctness
| constraints really push for improvements in areas I like (Rust,
| Coq, funding for masters students I know, etc.).
|
| [0]: https://dwheeler.com/trusting-trust/wheelerd-trust.pdf
|
| [1]: https://github.com/formal-land/coq-of-rust/tree/main/lib
| clarus wrote:
| Thanks for the comment! One of the authors here.
|
| Indeed, this would be a nice process to verify coq-of-rust.
| Also, although the code is rather short, we depend on the Rust
| compiler to parse and type-check the input Rust code. So that
| would need to be verified as well, or at least formally
| specified without doing the proofs, and the API of rustc is
| rather large and unstable. It could still be a way to get more
| assurance.
| deredede wrote:
| Is there a Coq formalisation of enough of Rust to be
| usable here? I thought people were still figuring that out.
| clarus wrote:
| The formalization work for Rust was done mostly at the MIR
| level, which is one step lower than the THIR level we use
| here. See, for example, the https://plv.mpi-sws.org/rustbelt/
| project. MIR should be more amenable to formal specification,
| as this language is more compact than THIR and aims to be
| more stable.
|
| However, we also lose some information going to MIR, as
| there are no expressions/loops from the original code
| anymore. There are still ways to reconstruct these, but we
| preferred to use the THIR representation directly.
| deredede wrote:
| That makes sense and fits what I had in mind: there is no
| formalization at the source level. Thanks for the
| details!
| thaliaarchi wrote:
| I didn't touch on that, but I did assume trust of the Rust
| toolchain, verifying starting at THIR. Verifying rustc would
| be a monumental undertaking, though I think some people are
| working on it.
|
| Since we don't have a verified rustc (a la CompCert [0]), I
| wonder if an approach like the translation validation of seL4
| [1] would work. They prove that the artifact (ARM machine
| code) produced by an existing compiler (gcc) for a chosen
| program (seL4) matches the source semantics (C). Thus you
| could circumvent trusting rustc, but it only works to verify
| a specific output of a chosen program. If the chosen program
| was coq-of-rust, I don't think this would be easier than the
| approach I detailed above. The seL4 kernel is 9,500 lines of
| C, while their Isabelle/HOL specification is over 200,000
| lines, so the technique doesn't seem to scale to a large
| chosen source like rustc.
|
| Isn't bootstrapping fun?
|
| [0]: Xavier Leroy. 2008. "Formal verification of a realistic
| compiler". https://xavierleroy.org/publi/compcert-CACM.pdf
|
| [1]: Thomas Sewell, Magnus Myreen, and Gerwin Klein. PLDI
| 2013. "Translation Validation for a Verified OS Kernel".
| https://sci-hub.st/10.1145/2491956.2462183
| deredede wrote:
| Since you contrast the two approaches, you might be
| interested to learn that CompCert also uses translation
| validation in some parts of the compiler (notably for
| register allocation); see Jean-Baptiste Tristan's papers
| with Xavier.
| Gajurgensen wrote:
| Very interesting work! I'm curious how you handle
| loops/recursion? I imagine the `M` monad seen in the examples
| has a special primitive for loops?
| clarus wrote:
| Yes, there is a special primitive for loops in the monad.
| The primitives are uninterpreted; we define valid finite
| traces of execution of a program and reason about these
| traces.
|
| If a program has a loop we show that it terminates by
| constructing an execution trace. Note that we do not yet
| consider concurrency, so programs are deterministic.
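|
| A toy model in Rust of that idea (not coq-of-rust's actual
| encoding): the monad is a syntax tree with an uninterpreted
| loop primitive, and a finite trace that reaches a result is
| the termination witness.
|
|     // Computations: either a pure value, or a loop over an
|     // integer state whose body continues or breaks.
|     enum M<A> {
|         Pure(A),
|         Loop { state: i64, body: fn(i64) -> Ctrl<A> },
|     }
|
|     enum Ctrl<A> {
|         Continue(i64),
|         Break(A),
|     }
|
|     // Unfold the loop step by step, recording intermediate
|     // states; Some(..) is a valid finite execution trace.
|     fn trace<A>(m: M<A>, fuel: usize) -> Option<(Vec<i64>, A)> {
|         match m {
|             M::Pure(a) => Some((vec![], a)),
|             M::Loop { mut state, body } => {
|                 let mut steps = vec![state];
|                 for _ in 0..fuel {
|                     match body(state) {
|                         Ctrl::Continue(s) => {
|                             state = s;
|                             steps.push(s);
|                         }
|                         Ctrl::Break(a) => return Some((steps, a)),
|                     }
|                 }
|                 None // no finite trace found within `fuel` steps
|             }
|         }
|     }
|
|     fn main() {
|         // A countdown loop; the trace shows it terminates.
|         let prog = M::Loop {
|             state: 3,
|             body: |s| if s == 0 {
|                 Ctrl::Break(())
|             } else {
|                 Ctrl::Continue(s - 1)
|             },
|         };
|         let (steps, ()) = trace(prog, 100).expect("terminates");
|         assert_eq!(steps, vec![3, 2, 1, 0]);
|     }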
| noneeeed wrote:
| It reminds me of when I used to work in SPARK Ada. On a number
| of projects where there was no supported Ada target (especially
| very small devices), the code would be converted to C and then
| compiled for the target. That allowed us to perform the various
| forms of static analysis in SPARK land.
|
| It obviously introduced issues around verifying the output of
| the transpiler, but I think the resulting C code was quite
| readable and limited in style for verification purposes, and
| the tools had a high degree of trust.
|
| The SPARK static analysis was only ever a part of the whole
| verification and validation process, and there was plenty of
| testing and other activities that provided additional levels of
| trust.
|
| The whole approach seemed to work pretty well.
| im3w1l wrote:
| > As an aside, it's nice to see industry funding for work like
| this. I'm often cynical of cryptocurrency, but its correctness
| constraints really push for improvements in areas I like (Rust,
| Coq, funding for masters students I know, etc.).
|
| That's part of it, but another part, I think, is that
| cryptocurrencies simply transferred a lot of wealth to people
| interested in CS (and with a wouldn't-it-be-cool-if mindset in
| general). And they are spending that money on CS research not
| just because it benefits them but because it's philanthropy
| aligned with their hobby.
| epolanski wrote:
| You reminded me of this blog post by Stephen Diehl:
|
| https://www.stephendiehl.com/posts/crypto.html
| hedora wrote:
| I'm not sure how that approach prevents the Rust compiler
| that's used to build binaries from injecting its malicious
| payload. (You could build A with B and B with A, and diff the
| binaries, I guess.)
|
| Another approach is proof carrying code: The rust compiler
| would emit a coq proof that its executable output matches the
| semantics of the source code input.
|
| Of course, you could also write a compiler of a subset of rust
| (perhaps without the borrow checker or optimizations) in
| machine code for one architecture, then bootstrap everything
| with that.
| thaliaarchi wrote:
| A compiler that injects backdoors in targeted programs and
| self-propagates the meta-backdoor (to avoid detection in the
| source) is exactly the trusting trust attack and it can be
| mitigated by diverse double-compiling (paper linked above).
| It requires a second compiler and we have mrustc, a Rust
| compiler in C++ built specifically for circumventing the
| unverified bootstrap chain of rustc.
|
| The process is: compile mrustc with a C++ compiler. Compile
| rustc's sources with the untrusted rustc binary, and compile
| the same sources with mrustc (these two compilers have
| identical behavior but different codegen). Then compile rustc's
| sources with rustc-by-rustc and with rustc-by-mrustc (these
| will have identical behavior and identical codegen). Those
| outputs will match, and they keep matching if you compile once
| more. Since mrustc is never compiled by rustc, such a backdoor
| would have to also exist in gcc/clang and propagate with
| exactly identical behavior into mrustc. The process could be
| repeated for gcc/clang.
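|
| A toy model of the DDC argument in Rust (nothing here is the
| real toolchain): a binary is characterized by the semantics of
| the source it implements plus the codegen style of whatever
| produced it, assuming compilation is honest. A trusting-trust-
| style binary would violate this model and make stage 2 diverge,
| which is exactly what the comparison detects.
|
|     #[derive(Clone, Debug, PartialEq)]
|     struct Bin {
|         semantics: &'static str,
|         codegen: &'static str,
|     }
|
|     // Compiling `src` with binary `b`: the result implements
|     // `src` and inherits `b`'s codegen style.
|     fn compile(b: &Bin, src: &'static str) -> Bin {
|         Bin { semantics: src, codegen: b.semantics }
|     }
|
|     fn main() {
|         let gxx = Bin { semantics: "g++-src", codegen: "boot" };
|         let rustc_bin =
|             Bin { semantics: "rustc-src", codegen: "unknown" };
|         let mrustc = compile(&gxx, "mrustc-src");
|         // Stage 1: same behavior, different codegen.
|         let s1a = compile(&rustc_bin, "rustc-src");
|         let s1b = compile(&mrustc, "rustc-src");
|         assert_eq!(s1a.semantics, s1b.semantics);
|         assert_ne!(s1a.codegen, s1b.codegen);
|         // Stage 2: both stage-1 compilers now behave like
|         // rustc-src, so the artifacts match bit-for-bit.
|         let s2a = compile(&s1a, "rustc-src");
|         let s2b = compile(&s1b, "rustc-src");
|         assert_eq!(s2a, s2b);
|     }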
| Havoc wrote:
| Didn't know that was even feasible.
|
| I wonder if efforts like this could speed up the adoption of
| Rust in key parts of the kernel.
| weinzierl wrote:
| What I don't quite understand with these efforts is this: if we
| have to translate the code to Coq manually or semi-manually,
| isn't the likelihood that we make mistakes there much higher than
| what we ultimately gain with the formal verification? In other
| words, how do we know that what we prove is still valid for the
| original code?
| clarus wrote:
| The code is translated automatically with coq-of-rust! When
| issues are found in the translation they can be fixed once in
| the coq-of-rust tool, and all the translations are updated.
| weinzierl wrote:
| OK, what I think I do not understand is what they mean by
| _"tedious and error prone"_. Is it tedious to write the
| automated translation (aka the coq-of-rust tool in this case),
| or does the translation of a concrete piece of code (e.g. the
| core crate) still involve manual work?
| clarus wrote:
| The _" tedious and error prone"_ code was what we were
| doing before, when the translation of the standard library
| was not yet working automatically with coq-of-rust. Now,
| this is automatic. Maybe the explanations on the blog post
| were not clear.
| bennofs wrote:
| Yes, that is a limitation. But this limitation is not too bad.
|
| In many cases, a bug in the translation simply makes the proof
| impossible. Somebody then investigates why the proof does not
| go through and finds the bug in the translation.
|
| We only have a problem if the bug in the translation
| specifically cancels out a bug in the original code. If there
| is no systematic risk, it is quite unlikely that two bugs
| cancel each other in this way.
| user2342 wrote:
| In the case of Coq-to-OCaml: is it feasible to run an
| extraction to OCaml on the translated code and compare it
| with the original?
| cccbbbaaa wrote:
| You can write programs in Coq and extract them to OCaml
| with the `Extraction' command:
| https://coq.inria.fr/doc/v8.19/refman/addendum/extraction.ht...
|
| This is used by CompCert: https://compcert.org/
| user2342 wrote:
| Yes, I know, I mentioned the extraction.
|
| My question was whether it can help detect translation
| errors from the first step.
| cccbbbaaa wrote:
| I'm not sure which first step you are talking about.
| Typically, one would write the program directly in Coq
| and use the extracted code as-is.
| im3w1l wrote:
| Let's say you want to check if use after free can ever occur.
| Your translation is bugged and translates the whole program
| to a single "nop". Then the verifier happily asserts that the
| translated program does not cause use after free.
|
| I doubt the translation would be _that_ bad but it could have
| more subtle issues of the same kind.
| muldvarp wrote:
| > In other words, how do we know that what we prove is still
| valid for the original code?
|
| We don't. We will always have to trust _something_ (the
| hardware, the compiler, the specification, Coq's trusted
| kernel, ...). Formal verification is sadly often discussed as
| if it removed the possibility of bugs entirely. In practice,
| formal verification only makes bugs much less likely.
|
| That said, yes, automatic translation between Rust and Coq
| massively reduces the complexity of the things you have to
| trust and should therefore be preferred over manual
| translation.
| epolanski wrote:
| Every tool can only guarantee the absence of some categories of
| bugs.
|
| The most common ones, business-logic bugs, often escape
| verification simply because the assumptions are wrong.
|
| Simple example: some websites use the browser's default
| language to set your locale, some use your IP. Thus I travel
| to Poland and suddenly YouTube shows me Polish content and
| ads (this can often be changed, but not always; see the Google
| Docs idiocy). Such heuristics unavoidably lead to business-
| logic bugs because the assumptions are wrong in the first
| place.
| jlokier wrote:
| That's why some industries have separate "verification" and
| "validation" (V&V) activities.
|
| Verification is checking against formal or at least known
| requirements like those being discussed in this thread.
|
| Validation is a set of processes for ensuring the
| specifications actually correspond to how the product ought
| to work. "How it ought to work" is open-ended and meant to
| include real-world aspects you didn't know beforehand. For
| example business logic assumptions. But despite being open-
| ended it is possible to formalise and track the process of
| looking for issues systematically.
|
| In your example of YouTube showing Polish content when you
| travel, verification is the process of checking the website
| uses the IP correctly because using the IP is specified
| somewhere, but validation is the process of asking "what's
| the right thing to do when a user travels, is it really to
| use the IP?" and "let's systematically research what issues
| are affecting users due to incorrect specifications that we
| should change".
| p4bl0 wrote:
| Note to interested readers: I submitted this blog post because
| it is less directly tied to cryptocurrency-related stuff than
| other posts from the same blog, but there are many more
| technically interesting posts there, especially the last two,
| which discuss the same approach but for Python instead of Rust.
| zem wrote:
| wonder how much work it would be to add a Rust backend to F*
| [https://github.com/FStarLang/FStar]
| clarus wrote:
| It should be possible. A specific feature of Coq that we use is
| impredicative Set. I do not know whether F* has it.
| xavxav wrote:
| Can you explain how your approach compares to Aeneas or
| RustHornBelt? How do you handle pointers and mutable borrows?
| clarus wrote:
| I do not know how RustHornBelt works. We are focusing on safe
| code, although we still generate a translation for unsafe
| blocks as a "best effort".
|
| Compared to Aeneas the goal is very similar as we want to
| verify Rust programs using interactive theorem provers.
| However, with coq-of-rust we write the purely functional
| version of the code (the one on which we make proofs) by hand,
| or with the help of GitHub Copilot, as this is rather
| repetitive, and prove it equivalent to the automatic
| translation. In Aeneas the aim is to directly generate a
| functional version.
|
| We handle all pointers as if they were mutable pointers
| (the `*` type). We do not use any information from Rust's
| borrow checker, which simplifies the translation, but we pay
| for that at proof time.
|
| To reason about pointers in the proofs, we let the user provide
| a custom allocator that can be designed depending on how the
| memory will be used. For example, if the program uses three
| global mutable variables, the memory can be a record with three
| entries. These entries are initially `None` to represent that
| they are not yet allocated. We do not yet know how this
| technique can scale, but at least we can avoid separation logic
| reasoning for now. We hope that most of the programs we will
| verify have a rather "simple" memory discipline, especially on
| the application side.
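|
| A sketch of that memory shape, written in Rust for
| illustration (the field names are made up): one slot per
| global mutable variable instead of a general heap, each `None`
| until allocated.
|
|     #[derive(Default)]
|     struct Memory {
|         counter: Option<u64>,
|         buffer: Option<Vec<u8>>,
|         flag: Option<bool>,
|     }
|
|     impl Memory {
|         // "Allocation" just fills the slot; a proof would track
|         // which slots are `Some` at each program point.
|         fn alloc_counter(&mut self, v: u64) {
|             assert!(self.counter.is_none(), "already allocated");
|             self.counter = Some(v);
|         }
|     }
|
|     fn main() {
|         let mut mem = Memory::default();
|         mem.alloc_counter(0);
|         *mem.counter.as_mut().unwrap() += 1;
|         assert_eq!(mem.counter, Some(1));
|     }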
| fsloth wrote:
| Totally not an expert on formal verification. If Rust's base
| libraries get formally verified, and one does not use unsafe
| code, does that mean all Rust programs that use the formally
| verified libs get de facto formally verified quality regarding
| their memory handling?
| hu3 wrote:
| I'm no Rust expert but it seems to have a lot of unsafe in core
| code. And understandably so:
|
| https://github.com/search?q=repo%3Arust-lang%2Frust+unsafe+l...
|
| I don't think Coq, as presented here, can verify all
| unsafe calls.
| steveklabnik wrote:
| Most of the interest in formalizing Rust is specifically
| about formalizing the model that unsafe code is supposed to
| uphold.
| Galanwe wrote:
| Rust safety has very little to do with bugs.
|
| Rust has its own definition of what "safe" means, which is
| arguably a subset of memory safety.
|
| You can have perfectly safe Rust code with data races,
| deadlocks, memory exhaustion, etc. Not even talking about
| logical errors.
| dist1ll wrote:
| > You can have perfectly safe Rust code with data races
|
| No, you can't. Data races are explicitly one type of race
| condition that Rust protects you from. Anything else would
| indicate an unsound library or compiler bug. For reference,
| you can take a look at the non-exhaustive list of undefined
| behaviors[0]. None of those behaviours should be possible to
| trigger from safe Rust.
|
| [0] https://doc.rust-lang.org/reference/behavior-considered-unde...
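|
| For instance, the only way for two threads to write the same
| location in safe Rust is through a synchronization primitive.
| A sketch (replacing the atomic with an unsynchronized shared
| `&mut` counter is rejected by the compiler):
|
|     use std::sync::atomic::{AtomicUsize, Ordering};
|     use std::thread;
|
|     static COUNTER: AtomicUsize = AtomicUsize::new(0);
|
|     fn main() {
|         let handles: Vec<_> = (0..4)
|             .map(|_| thread::spawn(|| {
|                 for _ in 0..1000 {
|                     // Synchronized access: no data race.
|                     COUNTER.fetch_add(1, Ordering::Relaxed);
|                 }
|             }))
|             .collect();
|         for h in handles {
|             h.join().unwrap();
|         }
|         assert_eq!(COUNTER.load(Ordering::Relaxed), 4000);
|     }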
| hu3 wrote:
| As per https://doc.rust-lang.org/nomicon/races.html
|
| > However Rust does not prevent general race conditions.
|
| > This is mathematically impossible in situations where you
| do not control the scheduler, which is true for the normal
| OS environment.
|
| > For this reason, it is considered "safe" for Rust to get
| deadlocked or do something nonsensical with incorrect
| synchronization: this is known as a general race condition
| or resource race. Obviously such a program isn't very good,
| but Rust of course cannot prevent all logic errors.
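|
| To illustrate the distinction, here is a sketch in 100% safe
| Rust with no data race that can nonetheless deadlock (two
| threads take the same two locks in opposite orders):
|
|     use std::sync::{Arc, Mutex};
|     use std::thread;
|
|     fn main() {
|         let a = Arc::new(Mutex::new(0));
|         let b = Arc::new(Mutex::new(0));
|         let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
|         let t = thread::spawn(move || {
|             let _x = a2.lock().unwrap();
|             let _y = b2.lock().unwrap();
|         });
|         // Opposite lock order: may hang forever, yet no UB and
|         // no data race -- "safe" by Rust's definition.
|         let _y = b.lock().unwrap();
|         let _x = a.lock().unwrap();
|         t.join().unwrap();
|     }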
| couchand wrote:
| On the one hand, we like to encourage learning here. On
| the other, we prefer not to copy-paste a bunch of
| possibly-irrelevant content. Well, forgive me for pasting
| in a StackOverflow answer that may be relevant here:
| https://stackoverflow.com/questions/11276259/are-data-races-...
|
| > Are "data races" and "race condition" actually the same
| thing in context of concurrent programming
|
| > No, they are not the same thing. They are not a subset
| of one another. They are also neither the necessary, nor
| the sufficient condition for one another.
|
| The really curious thing here is that the Nomicon page
| also describes this distinction in great detail.
| Galanwe wrote:
| You can have both data races and race conditions in a
| perfectly safe Rust program.
|
| Rust can only offer its safety for the lifetimes it tracks,
| and this tracking is limited to a single instance of your
| process.
|
| In other words, if you perform concurrent accesses through
| multiple threads spawned from the same process, you're safe
| from data races but at risk of race conditions. If you perform
| concurrent accesses across multiple processes, you're at risk
| of both.
|
| That implies that even in a world where everything is safe
| Rust, you cannot guarantee that two Rust processes will not
| data race with each other.
| oconnor663 wrote:
| > If you perform concurrent accesses across multiple
| processes
|
| How would you do that in safe code? If I understand
| correctly, the problem you're referring to is the exact
| reason that mmap APIs in Rust are unsafe:
| https://docs.rs/memmap2/latest/memmap2/struct.Mmap.html#safe...
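|
| For reference, mapping a file with the memmap2 crate looks
| roughly like this (the path is hypothetical); the `unsafe`
| block is the caller's promise about what other processes may
| do to the underlying file:
|
|     use memmap2::MmapOptions;
|     use std::fs::File;
|
|     fn main() -> std::io::Result<()> {
|         let file = File::open("/tmp/shared.bin")?;
|         // Unsafe precisely because another process could mutate
|         // the file, turning this into unsynchronized
|         // cross-process shared memory.
|         let map = unsafe { MmapOptions::new().map(&file)? };
|         println!("first byte: {:?}", map.first());
|         Ok(())
|     }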
| nindalf wrote:
| I don't agree with the point GP is making, but presumably
| something like two processes writing a file and reading
| their changes back. If the writes and reads interleave
| you'd get a race condition.
| couchand wrote:
| That's decidedly not a data race, however.
| bigstrat2003 wrote:
| I don't think most people expect language safety
| guarantees to extend to other processes. That's a pretty
| unrealistic expectation you are bringing.
| Galanwe wrote:
| I don't think that's unrealistic; it all depends on which
| sphere of development you're navigating in.
|
| In highly concurrent/long-lived systems, you often end up
| using a multiprocess approach between your producers and
| consumers rather than multithreading. This is because it
| allows you to dynamically spawn new processes to consume
| from shared memory without going through a bastion
| process managing threaded consumers. E.g. it allows you
| to spawn newly developed consumers at will without
| redeploying and restarting the producers. Note that this
| does not mean a different codebase.
|
| As for the expectations, I think it's fair to highlight
| this, because people tend to think that a world of
| applications fully developed in Rust would somehow imply
| no data races. That's not the case: on a fully fault-free
| operating system, with only 100% safe Rust applications
| running, you can, and will, still run into data races,
| because in the real world applications cross process
| boundaries.
| steveklabnik wrote:
| What is your definition of "data race"?
|
| While I'm aware of some differences in opinion in the
| details, all of the definitions I'm aware of exclusively
| refer to multiple threads of execution writing to the
| same memory location, which makes the concept of a multi-
| process data race impossible. For example, Regehr:
| https://blog.regehr.org/archives/490
|
| A data race happens when there are two memory accesses in
| a program where both:
|
| target the same location
|
| are performed concurrently by two threads
|
| are not reads
|
| are not synchronization operations
| Galanwe wrote:
| The subtle differences (or lack thereof) between threads and
| processes don't really matter, IMHO. A data race happens
| when two concurrent accesses to the same memory location
| occur, one of these accesses is a write, and there is no
| (correct) synchronization in place. The means by which
| this memory location was shared don't really matter.
| Whether both processes share the same initial memory
| space, whether this memory space was mapped later on, or
| whether both accesses were from a mapped space is really
| beside the point. You could have two threads mmap the same
| shared memory to communicate, for all it matters.
| steveklabnik wrote:
| Okay, at least there's definitional agreement here.
| However,
|
| > The means by which this memory location was shared
| don't really matter.
|
| It does in the context of this conversation, because you
| are claiming that safe Rust can create data races. Is
| there a safe API that lets you map addresses between
| processes? The closest thing I'm aware of is mmap, which
| is unsafe, for exactly this reason.
| dist1ll wrote:
| That's my thinking as well. Any library that exposes such
| shared-memory IPC via shared references as a safe API seems
| fundamentally unsound.
| Galanwe wrote:
| Well, see, that's the point of the conversation where I
| just don't follow the Rust logic.
|
| "Safe" and "unsafe" are scoped properties. They apply to
| the block of code that they encompass, by very definition
| of the fact that they are scoped. As long as an unsafe
| block respects the constraints that would be enforced if
| Rust could understand them, then that unsafe block can be
| wrapped in a safe block and exposed.
|
| In other words, "unsafe" is not transitive as long as
| what is done in the unsafe block respects the
| constraints. Mapping a block of memory is not unsafe.
| AFAIK Rust is able to allocate memory, which means Rust
| calls mmap, and that is safe.
|
| Thus, mmap being marked unsafe is not relevant here,
| because the actual unsafe block using mmap can be known to
| be safe by the programmer, while still inducing a data
| race later on (or not).
|
| If your argument is that unsafety is transitive by virtue
| of calling something that might have unsafe side effects
| later on, then that is seriously broken. Because by the
| same logic, every file access could use /proc to
| overwrite your process memory or change your memory maps;
| should we mark them all unsafe? Or worse, you propose a
| "maybe unsafe" approach, where things are maybe
| transitively unsafe, in which case why bother; let's
| rename "unsafe" to "shady", and unsafety just becomes some
| qualitative measure.
| steveklabnik wrote:
| > If your argument is that unsafety is transitive
|
| It is not. I am in full agreement with this:
|
| > As long as an unsafe block respects the constraints
| that would be enforced if Rust could understand them,
| then that unsafe block can be wrapped in a safe block and
| exposed.
|
| But I am not sure what you mean by
|
| > AFAIK Rust is able to allocate memory, which means Rust
| calls mmap, and that is safe.
|
| liballoc is not required to call mmap, and even if it
| did, that mmap isn't shared with other processes, as far
| as I'm aware.
|
| > while still inducing a data race later on
|
| If this is possible, then it is not safe, otherwise you
| lose the properties we're in agreement about above: no
| data races are allowed in safe Rust.
|
| > Because with the same logic, every file access could
| use /proc to overwrite your process memory
|
| It is true that Linux specifically exposes the ability to
| mess with process memory via the filesystem API, and you
| can use that to subvert Rust's guarantees. But this one
| specific hole on one specific platform is considered to
| be outside of the scope of the ability of any programming
| language to guarantee. An operating system corrupting
| your memory is a completely different thing.
|
| I doubt we're going to come to an agreement here, but I
| appreciate you elaborating, as now I think I understand
| your position here. It's not the one in general use
| though, so you're going to encounter these sorts of
| arguments in the future.
| pcwalton wrote:
| mmap is unsafe in Rust.
| raphlinus wrote:
| I think that to some extent represents the dream, with caveats.
| A few pieces need to fall into place.
|
| 1. You need to formalize the semantics of unsafe Rust. Ralf
| Jung's pioneering RustBelt[1] work was a huge step in that
| direction, but it is not yet complete. One element that's
| emerging as tricky is "pointer provenance."
|
| 2. Part of that is making a formal model of the borrow checker.
| Stacked borrows[2] was a good attempt, but has some flaws. Tree
| borrows[3] may correct those flaws, but it's possible something
| even more refined may emerge.
|
| 3. Another part of that is a formal memory model, which is
| mostly about the behavior of atomics and synchronization (thus,
| is critically important for standard library components such as
| Arc). It is widely understood that Rust's memory model should
| be similar to the C++ one (and that they should interoperate),
| but there are some remaining flaws in that, such as the "out of
| thin air" problem and also some missing functionality like
| seqlocks (part of why the Linux kernel still has its own).
|
| 4. You need a proof that the safety guarantees compose well; in
| particular if you have _sound_ code written in unsafe Rust,
| composed with other code written in safe Rust, the safety
| guarantees still hold. There are good results so far but it
| needs to be proved over the entire system.
|
| 5. You do need to close all the remaining soundness bugs in
| Rust[4], for such a proof to hold for _all_ code. Progress on
| this is slow, as many of these problems are theoretical, or are
| only really relevant for adversarial code[5].
|
| When all this is done, you still only have a partial guarantee.
| Huge amounts of the surface area to interface with the system
| are based on unsafe code. If you're doing only pure
| computation, that's one thing, but if you're building, say, a
| graphical UI, there's a massive amount of stuff that can still
| go wrong.
|
| Even so, there is a practical path forward, and that leads to a
| much better situation than the usual state of things today,
| which is of course systems riddled with vulnerabilities.
|
| [1]: https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf
|
| [2]: https://plv.mpi-sws.org/rustbelt/stacked-borrows/
|
| [3]: https://www.ralfj.de/blog/2023/06/02/tree-borrows.html
|
| [4]: https://github.com/rust-lang/rust/issues?q=is%3Aissue+is%3Ao...
|
| [5]: https://github.com/Speykious/cve-rs
| clarus wrote:
| Thanks for the explanations and all the links!
| tialaramex wrote:
| Provenance is a serious problem for a language which has
| pointers and optimisation.
|
| It's interesting to me that serious people propose "bag of
| bits" semantics (basically "what is provenance, let's just say
| pointers are raw addresses") for C++, which seems like a
| terrible idea, and I can't tell whether they really mean it.
| P2414R3:
| https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p24...
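|
| A minimal, purely observational illustration of why addresses
| alone are not enough (nothing here is UB; we only compare
| pointers): two allocations may abut, so a one-past-the-end
| pointer into `x` can carry the same address as a pointer to
| `y` while still not being a valid way to reach `y`. Optimizers
| rely on that distinction; "bag of bits" semantics erase it.
|
|     fn main() {
|         let x = 1u8;
|         let y = 2u8;
|         let end_of_x = (&x as *const u8).wrapping_add(1);
|         let start_of_y = &y as *const u8;
|         // May print true or false depending on stack layout;
|         // equal addresses would not make the two pointers
|         // interchangeable.
|         println!("same address: {}", end_of_x == start_of_y);
|     }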
| pron wrote:
| The size of _programs_ (as opposed to libraries, which could be
| easier because their code is not nearly as self-interacting)
| that have been end-to-end verified with semi-automated
| deductive proof systems like Coq is quite small. In fact, the
| size of programs that can be verified in this way has grown at
| a slower pace than the growth in the average size of programs
| overall. While sound software verification (i.e. 100%
| conformance to a spec) certainly has its place (usually to
| prove the correctness of some core algorithms), it doesn't
| scale well, which is why there's been a shift in research
| toward unsound methods: the cost of 100% assurance can be 10x
| that of 99.9999% assurance, and the difference in probability
| may fall below that of non-software failures (after all, a
| physical system cannot be proven to conform to a spec) or the
| probability of a spec that doesn't conform well enough to
| reality.
| pcwalton wrote:
| The point is that, by proving the unsafe parts of the standard
| library correct and Rust's safety guarantees correct, we
| _transitively_ prove the memory safety (as well as data race
| freedom, etc.) of _all_ safe Rust code (that uses only the
| standard library). Only proving the unsafe code correct is far
| less effort than proving all Rust code correct would be. This
| addresses the common criticism of "what about unsafe code?"
| when people talk about Rust's safety guarantees. We can use a
| more powerful system, like Coq, to fill in the gaps that the
| Rust type system itself isn't powerful enough to handle.
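|
| The standard library's `split_at_mut` is the textbook example
| of that shape: a safe signature over an `unsafe` body, sound
| because of a non-overlap invariant that a proof could
| establish once and for all. A simplified sketch:
|
|     fn split_at_mut(v: &mut [i32], mid: usize)
|         -> (&mut [i32], &mut [i32])
|     {
|         let len = v.len();
|         assert!(mid <= len);
|         let ptr = v.as_mut_ptr();
|         // Sound because the two halves never overlap; this is
|         // the fact one would prove about the unsafe core.
|         unsafe {
|             (
|                 std::slice::from_raw_parts_mut(ptr, mid),
|                 std::slice::from_raw_parts_mut(
|                     ptr.add(mid),
|                     len - mid,
|                 ),
|             )
|         }
|     }
|
|     fn main() {
|         let mut xs = [1, 2, 3, 4];
|         let (a, b) = split_at_mut(&mut xs, 2);
|         a[0] = 10;
|         b[0] = 30;
|         assert_eq!(xs, [10, 2, 30, 4]);
|     }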
| pron wrote:
| Sure (assuming Rust's type system is sound, which, I
| believe, it isn't yet). But my general point -- which isn't
| at all specific to Rust -- is that with deductive methods,
| _sometimes_ the effort required may not be worth the added
| assurance they give over unsound methods. In other cases,
| deductive methods may be cost-effective, but the question of
| cost-effectiveness is one that always needs to be asked in
| software assurance, and deductive methods shouldn't be
| selected _merely_ because they're sound.
|
| The level of confidence that software requires (as opposed to
| an abstract algorithm) is never much higher than one minus
| the probability of an "unaccounted" failure (such as a
| hardware failure). Sometimes the extra confidence beyond that
| point is free, and sometimes it can be very expensive.
| konstantinua00 wrote:
| I remember a lecture [1] that mentioned a fuzzer finding bugs
| in a formally verified C compiler, because the formal
| verification didn't cover the frontend and backend.
|
| I know people raise questions about trust in Coq itself and in
| the translation (and I question how it will keep pace with
| Rust updates), but even perfect formal verification doesn't
| mean 100% correctness from start to finish.
|
| [1] https://youtu.be/Ux0YnVEaI6A
| the8472 wrote:
| Did you find any bugs in the crates?
| clarus wrote:
| No, we have not found bugs! We have only scratched the surface
| and bugs are probably well hidden as the standard library is
| very well tested. We do not expect to find bugs there,
| especially as we also assume the unsafe code is safe.
|
| Our main hope with the translation to Coq of the standard
| library is to get a formalization of it, so that we can
| precisely verify Rust programs calling core and alloc.
| vlovich123 wrote:
| Is writing a formal verification specification similar to
| writing a more complicated property test? I've found writing
| property tests to be kind of difficult and time-consuming for
| non-trivial programs.
| Jtsummers wrote:
| Similar, not always the same. Property tests generally work at
| the interface of a system (either down to the
| function/procedure level or higher level) where you specify a
| description of the inputs and test the outputs satisfy some
| property or set of properties.
|
| This will look a lot like what you do with formal verification
| at the same level. Formal verification tools can go deeper,
| though, like answering questions about the state of the system
| internal to a computation. Like "I have this loop invariant,
| can I prove (or have a system prove) that it actually holds
| across all iterations?" or "Can I prove that while doing this
| computation there is never underflow/overflow in any
| intermediate computation?" The former could be done with
| property testing if you pull the loop kernel out into its own
| procedure and then test the invariant property by executing the
| kernel with a large number of intermediate states. The latter
| is harder to do without really, really tearing your program
| apart down to the level of making each line separately
| executable.
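|
| A sketch of the loop-invariant idea in plain Rust: the
| invariant a formal tool would prove for every iteration is
| written here as a runtime assertion that a property test could
| exercise on many inputs.
|
|     fn sum_to(n: u32) -> u32 {
|         let mut i = 0u32;
|         let mut acc = 0u32;
|         while i < n {
|             // Invariant: acc == i * (i + 1) / 2, with no
|             // intermediate overflow.
|             debug_assert_eq!(
|                 acc as u64,
|                 (i as u64) * (i as u64 + 1) / 2
|             );
|             i += 1;
|             acc = acc.checked_add(i).expect("overflow");
|         }
|         acc
|     }
|
|     fn main() {
|         assert_eq!(sum_to(10), 55);
|     }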
| vlovich123 wrote:
| I think my main point is that writing the specification for a
| non-trivial program is difficult enough with property tests;
| many places tend to avoid it if they can. If formal
| verification is even more difficult, it doesn't matter that it
| has more power and can make the code even more correct -- the
| cost is a huge blocker.
| Jtsummers wrote:
| It depends on how you do the formal verification. There are
| systems like SPARK/Ada, which brings the specification into
| the language you're developing in (it supports only a subset
| of Ada, but every year they're expanding what it covers);
| JML, which does something similar for Java, though the
| specification language is expressed in its own notation in
| Java comments; and others.
|
| If you use these kinds of systems you can be more judicious
| in what you try to prove. Whole program proofs are
| infeasible (or at least impractical) generally, but you
| pick the subsets you care about most and you can add
| annotations and prove them (automatically ideally, but you
| can bring in an external proof assistant as well). You
| leave the rest of the system to testing, which has proven
| remarkably effective over the decades (when taken
| seriously, at least).
|
| This isn't an all-or-nothing proposition, you choose how to
| spend your time and effort with formal verification just
| like you do with testing or other work.
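|
| An approximation of that "specification in the language"
| style, written in Rust for illustration (SPARK and JML
| discharge such contracts statically; here they are only
| runtime checks in the same spirit):
|
|     /// Pre:  divisor != 0
|     /// Post: q * divisor <= dividend < (q + 1) * divisor
|     fn div_floor(dividend: u32, divisor: u32) -> u32 {
|         assert_ne!(divisor, 0); // precondition
|         let q = dividend / divisor;
|         debug_assert!(q * divisor <= dividend); // postcondition
|         debug_assert!(
|             (dividend as u64) < (q as u64 + 1) * divisor as u64
|         );
|         q
|     }
|
|     fn main() {
|         assert_eq!(div_floor(7, 2), 3);
|     }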
| clarus wrote:
| There are some specifications that are always there, like
| the absence of reachable panics or the backward
| compatibility between releases on stable entry points.
|
| Otherwise you can outsource this work to specialized
| companies such as Formal Land or others!
___________________________________________________________________
(page generated 2024-05-15 23:01 UTC)