[HN Gopher] SeL4 on RISC-V Verified to Binary Code
       ___________________________________________________________________
        
       SeL4 on RISC-V Verified to Binary Code
        
       Author : kryptiskt
       Score  : 145 points
       Date   : 2021-05-05 08:31 UTC (14 hours ago)
        
 (HTM) web link (microkerneldude.wordpress.com)
 (TXT) w3m dump (microkerneldude.wordpress.com)
        
       | ephaeton wrote:
       | Quote regarding their previous work: "However, there is still the
       | risk of security holes resulting from (...) the compiler and
       | kernel developers interpreting the C semantics differently."
       | 
       | Was that really a "risk" ? I think it's a given.
       | 
       | "Beware of bugs in the above code; I have only proved it correct,
       | not tried it." - Knuth
       | 
       | also: https://cacm.acm.org/magazines/2018/7/229036-c-is-not-a-
       | low-...
       | 
       | In light of that, I'm not convinced by the current increment
       | either.
        
         | saithound wrote:
         | That was really a risk. It's no longer a risk. In any situation
         | where the developer's interpretation of the semantics differs
         | from the binary semantics, their tool chain will reject the
         | compiled binary. That's the point.
         | 
         | The gcc git repository receives hundreds of contributions every
         | week. Despite the heroic efforts of the open-source
         | communities, it's still very easy to submit buggy or malicious
         | patches (just think about the University of Minnesota scandal
         | [1]). One of those could easily be used to target seL4,
         | especially since it's hard to target seL4 in any other way,
         | given all the other security guarantees. A tool that double-
         | checks the compiler will buy the stakeholders significant peace
         | of mind.
         | 
         | [1] https://news.ycombinator.com/item?id=26887670
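
The check described above is, loosely, translation validation: rather than trusting the compiler, a separate tool compares the meaning of the source against the emitted binary and rejects any mismatch. A toy Python analogue is below; it brute-forces a tiny input space, where the real seL4 tool chain reasons symbolically with SMT solvers, and all function names here are invented for illustration.

```python
# Toy translation validation: compare a reference "source semantics"
# against a "compiled" implementation and reject on any divergence.
# The real seL4 tool chain does this symbolically with SMT solvers;
# here we brute-force a small input space instead.

def source_semantics(x: int) -> int:
    # What the developer believes the C code means.
    return (x * 2) % 256

def compiled_binary(x: int) -> int:
    # What the compiler actually emitted (modelled as a function).
    return (x << 1) & 0xFF

def validate(src, binary, domain) -> bool:
    """Accept the binary only if it agrees with the source on every input."""
    return all(src(x) == binary(x) for x in domain)

print(validate(source_semantics, compiled_binary, range(256)))  # True
```

The key property is the one saithound names: a divergent interpretation of the semantics makes `validate` return False, so the bad binary is rejected rather than shipped.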
        
           | darksaints wrote:
           | I believe seL4 no longer uses gcc, but rather uses CompCert,
           | which is also formally verified for adherence to standardized
           | C semantics.
        
             | ephaeton wrote:
              | my reticle is aimed at the creation of the formal C
              | semantics being used to verify the output. If both you and
              | I make an off-by-one, we can always agree on the (say,
              | index) value, but we'll both always be wrong. And C
              | semantics is especially finicky, given how often we delve
              | into "undefined behaviour" territory, or how today's
              | hardware does things that were unthinkable "back then".
        
               | vilhelm_s wrote:
               | Yes, making a mistake when writing down the C semantics
               | is definitely possible. The point of the work announced
                | here is to get rid of that risk, so that the proof only
                | relies on the semantics of the RISC-V instructions (and
                | the C semantics become an intermediate thing which is
                | proven rather than trusted). It's still possible that
                | they made a mistake when writing down the RISC-V
                | semantics, but machine code is much simpler than C so the
                | risk is smaller.
        
               | mondoshawan wrote:
               | The formal C semantics are defined in Isabelle/HOL,
               | checked, and then the compiler output is also checked.
               | They use standard GCC compilers for all of this.
               | 
               | See https://riscv.org/blog/2021/05/sel4-on-risc-v-
               | verified-to-bi...
        
       | sel4groupie wrote:
        | Man they've been working on porting it to Qubes for a minute now
        
       | awelkie wrote:
       | I've been wondering recently to what extent formal verification
       | could be used to simplify CPU hardware and ISAs. For example,
        | MMUs are unnecessary if the OS verifies that all programs only
        | access memory that's been allocated to them. This paper[0] seems
        | to imply that the savings would be non-negligible. Interrupts
        | and context switching could also be avoided if programs are
        | cooperatively scheduled and verified to always yield after an
        | appropriate amount of time.
       | 
       | Those two examples are the only ones I could come up with, but I
       | suspect that's because I don't know very much about hardware
       | design. Are people working on this or am I just off-base here?
       | 
       | [0]: https://arxiv.org/abs/2009.06789
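
The "verify instead of MMU" idea in the comment above can be sketched concretely: a loader statically checks every memory access of a (toy) program against the region the OS allocated to it, and refuses to run anything that steps outside. All names and the instruction format here are hypothetical; real systems do this over a typed IL or proof-carrying code, not concrete addresses.

```python
# Sketch: software-enforced memory isolation. A loader checks, before
# running a program, that every load/store falls inside the region the
# OS allocated to it. Toy instruction format: ("load"|"store", address).
# Everything here is invented for illustration; real verifiers work on
# typed IL or proof-carrying code, not raw addresses.

ALLOCATED = range(0x1000, 0x2000)   # region granted to this program

program_ok = [("load", 0x1000), ("store", 0x1FFF)]
program_bad = [("load", 0x1000), ("store", 0x2000)]  # one past the end

def verify(program) -> bool:
    """Accept only programs whose every access stays in ALLOCATED."""
    return all(addr in ALLOCATED for _, addr in program)

print(verify(program_ok))    # True
print(verify(program_bad))   # False
```

A program that passes this check never needs a hardware fault to be contained, which is the trade the parent comment is asking about.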
        
         | opnitro wrote:
         | Check out theseus! It's exploring some of these ideas.
         | https://github.com/theseus-os/Theseus
        
         | mcguire wrote:
         | I can't recall any names at the moment, but you might check out
         | some of the Unikernel research. I seem to recall results of
         | combining a unikernel and a safe language to be exactly what
         | you describe.
        
         | nynx wrote:
         | I tried this with nebulet (https://github.com/nebulet/nebulet).
         | Unfortunately the MMU cannot be turned off on modern x86_64
         | processors.
        
         | tachyonbeam wrote:
         | Memory protection isn't just used to protect processes against
         | one another, you can also use it to implement write barriers
         | for a garbage collector, or to map files to memory, for
        | example. I think you'd be hard pressed, with any kind of
        | compiler, to prove that an object you're writing to isn't part
        | of an old GC generation; that seems like an intractable problem
        | unless you're in a pure functional programming language.
        
           | ingenieroariel wrote:
            | seL4 has a Haskell version that they write before the C
            | (granted, I'm not sure how a GHC-generated binary would be
            | verified), but would the problem be tractable in that case?
        
         | zozbot234 wrote:
         | > For example, MMUs are unnecessary if the OS verifies that all
         | programs only access memory that's been allocated to it.
         | 
         | You'd need some sort of proof-carrying code to allow
         | verification of these properties in binaries. Also, features
          | like MMUs and interrupts are needed anyway to implement
         | virtual memory, or user tasks with adjustable priorities.
        
           | fuklief wrote:
           | Would virtual memory still be necessary on a capability
            | machine architecture [0]? My understanding is that it would
            | not be the case, since programs can only access memory that's
            | been allocated to them. By definition, capabilities cannot be
            | created out of thin air the way pointers can, right?
           | 
           | [0]: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
        
         | gmueckl wrote:
          | Cooperative scheduling would require provable upper limits on
          | the number of instructions run by a process between calls to a
          | yield function. When the process is a compute task with an
          | input of variable size, this would result in the need to count
          | iterations and branch in inner loops that you want to be fast.
          | I am not sold on the idea that such overhead is lower in that
          | case than interrupt-driven preemptive scheduling.
         | 
         | Also, task prioritization can be a problem: the rate at which
         | an interrupt-less cooperatively scheduled system must
         | reschedule depends on the response time for the task with the
         | shortest maximum response time. You must guarantee that the
         | scheduler can switch to it in time to meet its deadline. This
         | can increase the scheduling frequency for everything, requiring
         | all components of the system to call the scheduler more often.
         | So you'd have pretty tight coupling between everything.
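
The overhead concern in the first paragraph above can be made concrete: in a cooperative task, the inner loop has to carry a counter and a branch just to decide when to yield, which is exactly the cost being weighed against a hardware timer interrupt. A minimal Python sketch, with generators standing in for processes and all names invented:

```python
# Toy cooperative scheduling: each task must yield at least every
# YIELD_BUDGET iterations, so its inner loop carries an extra counter
# and branch -- the per-iteration overhead described above.

YIELD_BUDGET = 1000

def sum_task(data):
    """Sum a variable-sized input, yielding every YIELD_BUDGET items."""
    total, budget = 0, YIELD_BUDGET
    for x in data:
        total += x
        budget -= 1            # bookkeeping a hardware timer would
        if budget == 0:        # otherwise do for free
            budget = YIELD_BUDGET
            yield None         # give the scheduler a chance to run
    yield total                # final result

def round_robin(tasks):
    """Run (name, generator) tasks cooperatively; keep last results."""
    results = {}
    while tasks:
        name, task = tasks.pop(0)
        try:
            value = next(task)
            if value is not None:
                results[name] = value
            tasks.append((name, task))
        except StopIteration:
            pass
    return results

print(round_robin([("a", sum_task(range(10_000)))]))  # {'a': 49995000}
```

The `budget` decrement and branch run on every single iteration, which is the cost a provable yield bound imposes on loops you want to be fast.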
        
           | awelkie wrote:
           | Both great points. Regarding your second paragraph, note that
           | this isn't so bad for multi-core processors. For randomly
           | scheduled processes, the (expected) maximum response time is
           | the maximum time between yield points divided by the number
           | of cores. And the OS could maybe do something smarter by
           | making sure that at least one core has a time-to-yield which
           | is very small. This would mean that you can still have very
           | large time-between-yields for some processes without
           | affecting the maximum response time.
           | 
           | And my hope was that simplifying the hardware would allow us
           | to fit more cores on the same chip. So your first point means
           | that the per-core performance is worse but maybe for some
           | workloads the processor performance as a whole is better. But
           | yeah not sure if that's realistic.
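
A quick back-of-the-envelope version of the multi-core point above, with numbers invented purely for illustration:

```python
# If yield points are scattered randomly across cores, the expected
# worst-case wait for some core to reach a yield point shrinks with the
# core count (max gap divided by number of cores, per the comment
# above). The 1000us figure is made up for illustration.

max_gap_us = 1000        # worst-case time between yield points, one core
for cores in (1, 4, 16):
    print(cores, max_gap_us / cores)  # 1000.0, 250.0, 62.5
```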
        
           | vbezhenar wrote:
            | You can run the verifier once (e.g. at installation time)
            | and remember that the code is good.
        
           | jhgb wrote:
            | > Cooperative scheduling would require provable upper limits
            | on the number of instructions run by a process between calls
            | to a yield function. When the process is a compute task with
            | an input of variable size, this would result in the need to
            | count iterations and branch in inner loops that you want to
            | be fast. I am not sold on the idea that such overhead is
            | lower in that case than interrupt-driven preemptive
            | scheduling.
           | 
            | Surely something like branch-on-random could be made to work
            | with such code. It's effectively an interrupt point, with
            | the added benefit that you can place it exactly where you
            | want it.
        
         | zamalek wrote:
         | > For example, MMUs are unnecessary if the OS verifies that all
         | programs only access memory that's been allocated to it.
         | 
         | That's what Microsoft was attempting with Singularity. Because
         | binaries were IL, they could be reasonably verified for
         | isolation guarantees during JIT compilation.
        
       | moonchild wrote:
        | This approach is interesting; it's also used by CompCert for its
        | register allocator: the allocator itself is not formally
        | verified, but its results are. It's not complete in a general
        | sense, but I wonder if this sort of 'brute-force' approach to
        | formal verification will become more popular over time, and
        | lower the barrier to entry for formally verifying nontrivial
        | programs.
        
         | fuklief wrote:
         | In CompCert's case though, the register allocation verifier
          | itself is verified. I'm not so sure the SMT solvers used here
          | are.
        
           | MaxBarraclough wrote:
           | > In CompCert's case though, the register allocation verifier
           | itself is verified.
           | 
           | I don't think so, I think moonchild's account of it is
           | accurate. At least, it was accurate in 2012. From a 2012 blog
           | post: [0]
           | 
           | > _In CompCert, this algorithm is not proven correct.
           | Instead, after generating the graph, the compiler calls an
           | OCaml implementation of IRC and then checks that the coloring
           | is valid. If the checker confirms that the coloring is valid,
           | the compiler uses it and proceeds along. If it does not
            | confirm that it's valid, compilation is aborted. This
           | checker is proven correct, and because it is simpler, its
           | proof of correctness is easier._
           | 
           | [0] http://gallium.inria.fr/blog/register-allocation/
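
The pattern the quoted post describes, run an unverified algorithm and then check its result with a small proven checker, is easy to illustrate: register allocation reduces to graph coloring, and while a good allocator is complex, checking a proposed coloring is a few lines. A toy Python version (CompCert proves its checker in Coq; the graph and names below are invented):

```python
# "Verify the result, not the algorithm": edges in the interference
# graph join values that are live at the same time and so must get
# different registers. The allocator producing the coloring may be
# arbitrarily clever and unverified; only this checker needs trust.

interference = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}

def check_coloring(edges, coloring) -> bool:
    """Accept a register assignment iff no interfering pair shares one."""
    return all(coloring[u] != coloring[v] for u, v in edges)

good = {"a": "r0", "b": "r1", "c": "r2", "d": "r0"}
bad  = {"a": "r0", "b": "r1", "c": "r0", "d": "r1"}  # a and c clash

print(check_coloring(interference, good))  # True
print(check_coloring(interference, bad))   # False
```

Because the checker is so much smaller than the allocator, proving it correct is far cheaper, which is the whole appeal of the approach.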
        
             | fuklief wrote:
             | By "register allocation verifier", I meant the checker as
             | mentioned in the post you quote.
        
             | archi42 wrote:
             | > This checker is proven correct, and because it is
             | simpler, its proof of correctness is easier.
             | 
             | Still correct ;)
             | 
                | What I believe GP (fuklief) meant was: the SMT solver
                | might still be incorrect, like the register allocator
                | _itself_ in CompCert. _But_ for CompCert, the result is
                | checked by a formally verified checker. I do _not_ know
                | whether that's true for the result of the SMT solver.
        
               | creata wrote:
               | Don't SMT solvers output proof objects so that the
               | results can be easily independently verified?
        
               | syrak wrote:
               | Even if you had a verified SMT checker, you also need to
               | prove that the encoding of the problem in SMT is correct,
               | i.e., that a proof does translate to a valid register
               | allocation solution, to achieve comparable guarantees to
               | CompCert.
        
       | darksaints wrote:
       | One of the discussions that always forms when talking about
       | formally verified software is the whole verifying-the-verifier
       | scenario. AKA, all of the reasons why you still can't trust it.
       | And while that is an interesting conversation to have, it does a
       | disservice to the practical benefits of the software verification
       | process.
       | 
       | The Linux CVE database was recently analyzed and it was
       | determined that 96% of the vulnerabilities would be mitigated to
       | at least non-critical status, if not completely eliminated, if
        | Linux were a microkernel architecture. Now throw in capability-
        | oriented security, and most non-critical exploits become non-
       | exploits. Throw in formal verification of the kernel, and you've
       | basically eliminated everything except for hardware exploits.
       | 
       | Assuming mass adoption, what seL4 effectively does is turn
       | million dollar black market zero days into billion dollar zero
       | days...possibly more. A million dollar exploit probably isn't
       | even a line-item in the FBI budget. A billion dollar exploit
       | would be a "needs-congressional-approval" type of line-item in
       | the national US intelligence budget. Unless your job revolves
       | around protecting the nuclear arsenal of a major nation-state,
       | you won't ever have to worry about OS vulnerabilities, and you
       | can rely on the security characteristics of the OS to drastically
       | narrow down your attack surface.
       | 
       | To me, that is a pretty big deal. It is slowly becoming a moral
       | imperative that we start adopting this sort of security as a core
       | standard.
        
         | lallysingh wrote:
          | Summary acronyms: IoT should be on seL4 on RISC-V.
         | 
         | RISC-V fits the embedded space, and IoT usually doesn't need
         | much more software on top than a kernel and a dumb control
         | service. If an open-source control service (effectively just
         | controlling some I/O pins depending on a config) can get
         | verified, the entire threat vector of IoT would be devastated.
        
         | dmos62 wrote:
          | I'm fuzzy on what it would take for there to be a mainstream
          | formally verified, microkernel, capability-based OS. Is it
          | plausible for Linux to be gradually refactored into that, or
          | will it take a fundamentally new OS?
        
           | CameronNemo wrote:
           | I think it would be possible to adapt some Linux drivers to
           | run in a sandboxed capacity within a capability/microkernel
           | system.
           | 
           | IIRC there was once an L4 based project that integrated with
           | Linux.
           | 
           | https://en.wikipedia.org/wiki/L4Linux
        
             | shandor wrote:
              | L4Linux is just the usual Linux paravirtualized on top of
              | one of the L4 microkernels. We previously ran two instances
              | of it on top of L4/Fiasco.
              | 
              | The problem with that is that you don't actually get much
              | benefit from the underlying microkernel. Linux is still
              | the same old monolith, just better isolated from the
              | resources not exposed to this particular (para)virtualized
              | kernel instance, so you end up isolating more or less
              | complete systems from each other rather than getting the
              | normal benefits of a capability-based microkernel
              | architecture inside a single system.
             | 
             | Now that's not nothing of course, but not usually what
             | people are after when they're hoping for a usable, modern
             | microkernel system.
        
           | darksaints wrote:
           | There are already some attempts at creating new operating
           | systems off of seL4. The most complete at the moment is
           | Genode. There are also attempts to port existing OSes to
            | seL4, such as Qubes. None of these are particularly mature.
           | 
           | In terms of creating a Linux replacement, drivers are the
           | biggest hurdle. There are some attempts to programmatically
           | port the netbsd suite of drivers, which seems to be the most
           | promising. Apart from that, drivers will likely need to be
           | ported manually.
           | 
            | The GNU userland is also unlikely to be ported easily en
            | masse due to the capability-oriented security model,
            | although some projects would definitely be easier than
            | others. Things like systemd would be a monstrous task.
           | 
           | I personally think that apart from drivers, which can use
           | traditional microkernel approaches, it's probably time to
           | start implementing some of the better OS ideas that have come
           | out in the last 30 years, as opposed to just trying to
            | replicate a POSIX environment. For example, Nushell has a
            | ton of promise, but isn't particularly POSIX-y.
        
             | [deleted]
        
         | anp wrote:
         | I'd love to dig into that CVE analysis if you have a link!
        
           | darksaints wrote:
           | http://ts.data61.csiro.au/publications/csiro_full_text/Biggs.
           | ..
        
         | SkyMarshal wrote:
         | Are there any projects anywhere developing a microkernel with a
         | capability security model?
        
           | y7 wrote:
           | Yes, the seL4 project that is being discussed.
        
         | pjmlp wrote:
          | The irony is that in cloud environments with dozens of layers
          | on top of the Linux kernel, the whole microkernel vs. monolith
          | debate is a moot point.
          | 
          | So much for monolith performance, when one needs to pile layer
          | upon layer of security on top.
        
         | mcguire wrote:
         | " _The Linux CVE database was recently analyzed and it was
         | determined that 96% of the vulnerabilities would be mitigated
         | to at least non-critical status, if not completely eliminated,
         | if linux were a microkernel architecture._ "
         | 
         | If Linux were a microkernel architecture, or if Linux were
         | based on seL4?
         | 
         | If Linux were based on the OSI Mk microkernel (I think that was
         | the name; it was Mach-based), every system operation would be
         | roughly an order of magnitude slower[1].
         | 
         | As for formal methods, I'm singing in the choir.
         | 
         | [1] I've got the performance paper from the early '90s
         | somewhere; it's hilarious. One benchmark was dhrystones: they
         | didn't manage to physically slow down the processor. :-)
        
           | darksaints wrote:
           | The mach microkernel was definitely slow, but I think you'd
           | be surprised at how fast the L3 and L4 family of microkernels
            | are. L4 IPC calls are some 20x faster than Mach's, IIRC.
           | L4Linux, which is _Linux running on top of L4_ , is only 6-7%
           | slower than native Linux.
           | 
           | Cache locality is one hell of a drug. With seL4, the entire
           | kernel consumes ~160KiB, which easily fits in most CPU
           | caches.
        
       | zibzab wrote:
       | I might be wrong, but isn't their approach similar to what
       | another team did some years back?
       | 
       | https://www.researchgate.net/publication/262367164_Machine_c...
        
         | saithound wrote:
         | I think they're doing different things. What they have in
         | common is that they both use SMT solvers for automation.
         | 
         | Glancing at the abstract, it seems like Dam, Guanciale and
         | Nemati proved correctness theorems about a tiny hypervisor in
         | ARMv7 assembly, using SMT solvers to automate some of the
         | proof. The C compiler is not involved.
         | 
         | Whereas seL4 is written in C, and the correctness theorems are
         | proved using the C code and the specification. There is a
         | compiler that turns the C code into RISC-V machine code. Then
         | the new tool chain proves the equivalence of the C program and
         | the RISC-V binary emitted by the compiler, using SMT solvers to
         | automate some of that proof.
        
           | zibzab wrote:
           | The Dam paper mentions compiling C code to assembly first,
            | but not many details are given.
           | 
           | I think there is another step not covered in this paper that
           | connects back to the C code. Maybe they saved that for
           | another publication?
        
       ___________________________________________________________________
       (page generated 2021-05-05 23:01 UTC)