hngopher.com

       [HN Gopher] The C Bounded Model Checker: Criminally Underused
       ___________________________________________________________________
        
       The C Bounded Model Checker: Criminally Underused
        
       Author : philzook
       Score  : 60 points
       Date   : 2024-01-30 15:51 UTC (7 hours ago)
        
 (HTM) web link (www.philipzucker.com)
 (TXT) w3m dump (www.philipzucker.com)
        
       | gaudat wrote:
       | I miss this kind of stuff in computer programming languages.
       | 
       | In hardware design, verification is done simultaneously with
       | design, and semiconductor companies would be bankrupt if they did
       | not verify the hell out of their designs before committing
       | millions in manufacturing these chips.
       | 
       | Even among hobbyists this is getting traction with yosys.perhaps
       | its time for programmers to adapt this kind of tooling so there
       | will be less buggy software released...
        
         | vacuity wrote:
         | There are probably orders of magnitude in difference between
         | hardware design output and programming output. At least at the
         | time, seL4's verification was quite impressive for a codebase
         | on the order of 10,000 lines. But we should work towards the
         | goal of improved formal modelling and checking all the same.
        
       | rwmj wrote:
       | Nice but can you run it on non-toy programs?
       | 
       | I gave a talk about Frama-C a few years ago. It's very
       | interesting technology, but not too useful at any scale
       | (http://oirase.annexia.org/tmp/2020-frama-c-tech-talk.mp4) One
       | aim we had with Bounds Checking GCC back in '95 was to make it
       | possible to use with real programs. Although it was quite slow,
       | any open source program of the era could use it
       | (https://www.doc.ic.ac.uk/~phjk/BoundsChecking.html).
        
         | philzook wrote:
         | Non trivial example projects https://model-
         | checking.github.io/cbmc-training/projects.html Some crypto, AWS
         | common C, FreeRTOS. Other interesting applications
         | https://www.cprover.org/cbmc/applications/
        
           | philzook wrote:
           | Frama-C is also very cool. Also Coq VST and however seL4 does
           | it. I think out of all of these, CBMC requires the least
           | expertise / has highest automation. I also don't see how it
           | can scale beyond the unit test level, or patch together
           | verification conditions into global properties. Perhaps this
           | could be done by some meta framework plugging together calls
           | to CBMC. Given the level of penetration of formal methods
           | generally, I think the relatively low ambition of CBMC
           | compared to these higher ceiling techniques is good.
        
             | rwmj wrote:
             | Thanks, very interesting stuff!
        
         | jcranmer wrote:
         | One of the examples they gave was an HTTP client, which would
         | be a surprisingly non-toy example, so I looked at what they
         | actually did in the code
         | (https://github.com/FreeRTOS/coreHTTP/tree/main/test/cbmc).
         | 
         | Not that I'm an expert in processing what exactly is being
         | tested, but it basically looks only able to prove that an
         | individual function doesn't overrun buffers. If you tell it to
         | assume that integer overflows can't happen (!). So I'm not
         | impressed.
        
         | nanolith wrote:
         | I've used CBMC in large commercial projects ranging around half
         | a million lines of code. The secret is to break down every one
         | of those functions into small pieces that the model checker can
         | analyze.
         | 
         | CBMC works best with a contract-based programming style, where
         | contracts are enforced by assertions and shadow functions exist
         | to simplify model checking.
         | 
         | A single reply on HN is hardly a place where idiomatic styles
         | can be expounded, but it is quite possible to build a resource-
         | oriented idiomatic style that is reinforced with CBMC.
        
           | philzook wrote:
           | I would love to hear more
        
             | nanolith wrote:
             | In particular, one of the more important rules I made when
             | using CBMC was to keep the search depth for each invocation
             | as shallow as possible. If CBMC runs in less than five
             | minutes, you're doing it right. If it takes more than that,
             | then you're asking it to do too much.
             | 
             | This led to the creation of function contracts and shadow
             | functions. When evaluating functions, we actually want any
             | calls that it makes to go to shadow functions instead of
             | real functions. When we write a function, we include a
             | contract that includes anything it may return, any side-
             | effects it may have, and any memory it may touch / allocate
             | / initialize. We then write a twin function -- its shadow
             | function -- that technically follows the same contract in
             | the most simplified terms, randomly choosing whether to
             | succeed or fail. From CBMC's perspective, it's equivalent
             | enough for model checking. But, it removes additional stack
             | depth or recursion.
             | 
             | A good example of this would be a shadow function for the
             | POSIX read function. Its contract should verify that the
             | descriptor it is passed is valid. It should also assert
             | that the memory region it is passed is bounded by the size
             | it is given. The shadow function picks a non-deterministic
             | state based on how it should set errno and how it should
             | return. This state follows the POSIX standard, but the
             | underlying system calls don't matter. Likewise, depending
             | on the return code, it should initialize a number of bytes
             | in the read buffer with non-deterministic values.
             | 
             | I used this shadow read function to find a buffer overflow
             | in networking code and also to find a DOS infinite loop
             | error in a tagged binary file format reader we were using.
             | 
             | CBMC isn't perfect, but when coupled with a good linter and
             | -Wall / -Werror / -Wpedantic, it is a very useful layer in
             | a defense in depth strategy for safer software.
        
               | lmm wrote:
               | > CBMC isn't perfect, but when coupled with a good linter
               | and -Wall / -Werror / -Wpedantic, it is a very useful
               | layer in a defense in depth strategy for safer software.
               | 
               | I agree to a certain extent, but at what point are you
               | putting in more effort than it would take to migrate to a
               | stricter language that offers more safety by default?
        
               | nanolith wrote:
               | I'm not aware of any language that can't be improved by
               | this strategy. Amazon even added SMT solving tooling
               | around Rust, because Rust's built-in safety guarantees
               | aren't enough.
               | 
               | I'd say that the added effort is minimal. The compiler
               | spits out more errors. We think a little more carefully
               | about how we write functions so CBMC doesn't complain.
               | There are no silver bullets.
        
       | WalterBright wrote:
       | Unfortunately,                   int main(){             char
       | buffer[10];             buffer[10] = 0;         }
       | 
       | are so rare they are hardly worth bothering with. The more usual
       | case is:                   int get(char* buffer)         {
       | return buffer[10];         }              void test() {
       | char buffer[10];             get(buffer);         }
       | 
       | I.e. the array bounds for buffer get lost in the function call. I
       | have proposed a fix:
       | 
       | https://www.digitalmars.com/articles/C-biggest-mistake.html
       | 
       | that is compatible with existing code.
       | 
       | And yet, nobody cares. Oh well! Instead, we have overly complex
       | solutions like this:
       | 
       | https://developers.redhat.com/articles/2022/09/17/gccs-new-f...
        
         | philzook wrote:
         | Your example does not pass the verifier                  cbmc
         | /tmp/walter.c --bounds-check --pointer-check --function test
         | ...        [get.pointer_dereference.5] line 3 dereference
         | failure: pointer outside object bounds in buffer[(signed long
         | int)10]: FAILURE        ...
         | 
         | There are many footguns. People love their guns. Makes them
         | feel powerful.
        
           | WalterBright wrote:
           | int get(char* buffer, int i)         {             return
           | buffer[i];         }              #include <stdio.h>
           | void test() {             char buffer[10];
           | get(buffer, getc(stdin));         }
        
             | nanolith wrote:
             | CBMC's default contract for getc returns a non-
             | deterministic integer value. A non-deterministic integer
             | value basically counts as "pick a value that would cause
             | this de-reference to crash the machine model".
             | 
             | So, it will find a counter-example.
        
               | philzook wrote:
               | Yes, it returned the same "FAILURE" along with a couple
               | new checks about getc that do pass.
        
               | WalterBright wrote:
               | Sounds like it's doing a solid job of data flow analysis.
               | But I can still produce cases where it cannot determine
               | either the range of the index or the length of the array
               | being pointed to.
               | 
               | The programmer will have to add code to keep track of the
               | length of the array, and add code to check the value of
               | the index. The good thing is the analyzer will tell you
               | where you'll need to add those checks.
               | 
               | With the proposal I made, none of this is necessary. It
               | just works. Over two decades of experience with this
               | approach in D is it-just-works.
        
               | nanolith wrote:
               | I'm curious, have you used CBMC?
               | 
               | The machine model tracks the array with its size and
               | compares this to any index. This is loaded into an SMT
               | solver. This is not a standard static analyzer. It's
               | compiling the code to an abstract machine model and then
               | running through this model with an SMT solver.
               | 
               | CBMC isn't perfect, but the strategies you are talking
               | about are meant to defeat a linter or a shallow analyzer
               | in a compiler. CBMC isn't a linter.
        
               | WalterBright wrote:
               | I have never heard of CBMC before this thread.
               | 
               | SMT cannot solve cases where it simply cannot know what
               | the length of an array is. For example, when array is
               | allocated in code not available to the solver, or is set
               | by the environment.
               | 
               | I remember trying out Spark a few years ago, which
               | advertised its compile time checking abilities. I tried
               | using an integer equation that used an OR, and it gave up
               | on it.
               | 
               | If you need to add in range checking code to help the
               | solver along, then the range check itself is a source of
               | bugs, and the range limit has to be made available
               | somewhere and be correct.
               | 
               | In my proposal, the array length is set at the time of
               | creation of the array, and the length is carried along
               | with the pointer. The solver's role then becomes one of
               | _eliminating_ the bounds check in the cases where it can
               | prove the index is within range. The user doesn 't have
               | to do anything.
               | 
               | We've been using it in the implementation of the D
               | compiler for a very long time now, and problems with
               | buffer overflows are a faded memory.
               | 
               | P.S. I also added manual integer overflow checks when
               | passing the size to malloc(), no matter how unlikely an
               | overflow would be :-)
        
           | Joker_vD wrote:
           | > There are many footguns. People love their guns. Makes them
           | feel powerful.
           | 
           | Well, duh: if you remove the most egregious footguns from C,
           | you end up basically with Pascal, and here is Why Pascal Is
           | Not My Favorite Programming Language.
           | 
           | And when you try to explain to them that C's abstract machine
           | has semantics of "when you omit a safety check, you actually
           | makes a promise to the compiler that this check is redundant,
           | that you'll promise that you'll make sure in some other way
           | that in no possible execution will the illegal values ever
           | show up in this place, and the compiler will actually hold
           | you to this promise" -- they throw tantrums and try to argue
           | that no, it hasn't this semantics, and even if has, it's not
           | the semantics as originally intended by K&R and it is the
           | authors of the C compilers who are wrong (even though the
           | authors of C compilers are the people who actually write both
           | the C standards _and_ the C compilers...)
           | 
           | Must have to do something with imprintment: I've learned C as
           | my 3rd language, and from the start I was taught that its UB
           | is absolutely insane -- and I've never felt "nah, it should
           | just do what I meant, dangit" denial about it, only "this is
           | sad, how can I even know while writing code when I am making
           | a baseless promise especially when it's made by _omitting_
           | something? " But like you've said, people like their
           | footguns.
        
       | obl wrote:
       | Weird that this treats uninitialized variables as unknown values.
       | For example in ex3.c, the program                 int main(){
       | int x;           if (x <= 42){                   assert(x !=
       | 12345);           }       }
       | 
       | is of course UB in C, even though under a "uninitialized is
       | random" model the program is valid and does not assert (as the
       | model checker concludes).
       | 
       | (even in O1 clang gets rid of the whole function, including even
       | the ret instruction, I'm surprised it does not at least leave an
       | ud2 for an empty function to help debugging since it would not
       | cost anything https://godbolt.org/z/eK8cz3EPe )
        
         | philzook wrote:
         | I see the calling an undefined prototype style more often,
         | perhaps for this reason. You are probably right this is
         | undefined behavior, but it's subtle.
         | https://stackoverflow.com/questions/11962457/why-is-using-an...
         | I suspect CBMC just picks some concrete behavior for undefined
         | behavior. It may not be a good detector for that. I'm not sure.
         | This gets into shaky territory of understanding for me.
        
           | jcranmer wrote:
           | Speaking as a compiler writer:
           | 
           | Any invocation of undefined behavior should be considered an
           | assertion failure as far as any model checker should be
           | concerned. Compilers can-- _and will_ --treat undefined
           | behavior as license to alter the semantics of your program
           | without any constraint, and most instances of undefined
           | behavior are clearly programmer error (there is no good
           | reason to read uninitialized memory, for example).
           | 
           | Reading an uninitialized variable is _not_ "subtle" undefined
           | behavior. It's one of the most readily accessible examples
           | not only of what undefined behavior can exist, but also the
           | ways compilers will mutilate your code just because you did
           | it. To be honest, if something as simple as the consequences
           | of reading uninitialized memory are shaky understanding for
           | someone trying to prove code correct, that will completely
           | undermine any trust I have in the validity of your proofs.
        
             | philzook wrote:
             | Yes, I do not trust my understanding of C. Which is why I
             | want mechanized assistance.
        
               | philzook wrote:
               | I apologize for using the word subtle. What I should have
               | said is I don't understand the topic and reading about it
               | has left me a sense that one should be very careful.
        
             | tialaramex wrote:
             | Also, even _if_ the compiler does what a  "Real programmer"
             | type thinks it "should" do for this case (and I agree with
             | you that you're not entitled to expect that), you aren't
             | guaranteed that there's some particular value since you
             | never initialized it.
             | 
             | Your operating system likely feels entitled to assume that
             | if you never wrote to this page of RAM you don't care what
             | exactly is in it. After all what kind of lunatic _reads_ a
             | bunch of unknown data, says  "Yeah, that's coincidentally
             | what I wanted" and just leaves it unmodified? No, almost
             | anybody would _write_ data they want to keep instead. So,
             | if you never wrote to this particular page of RAM and your
             | OS finds it convenient to swap that page for a different
             | one, no harm no foul right? But now the contents of your
             | uninitialized variable changed!
        
               | addaon wrote:
               | > So, if you never wrote to this particular page of RAM
               | and your OS finds it convenient to swap that page for a
               | different one, no harm no foul right? But now the
               | contents of your uninitialized variable changed!
               | 
               | No sane OS will do this. Any page that's handed to a
               | process that was last written by a different process must
               | be zero'd (or otherwise have every address initialized)
               | by the OS to avoid leaking information across process
               | boundaries. You could, in theory, have a page that was
               | munmap'd by /this/ process be handed back to the same
               | process to fill a request for a different virtual address
               | without zeroing it, but I can't imagine that any OS
               | tracks the last writer to enable this "optimization" in
               | the few cases it would apply.
        
               | jcranmer wrote:
               | glibc marks freed pages with MADV_FREE, and on Linux, an
               | MADV_FREE page won't be unmapped immediately. It's
               | possible to read a value of an MADV_FREE'd page, have
               | Linux swap the page out, and then read it again and now
               | observe a zero-init'd page. If malloc returns a new
               | allocation to a freed page, it isn't written to by the
               | allocator (which would prevent the page from being
               | subsequently zeroed if it hasn't already done so), so it
               | doesn't guarantee that the contents won't be randomly
               | zeroed.
               | 
               | Whether or not this is a sane OS I leave an exercise to
               | the reader, but it is nonetheless the property of a
               | common OS.
        
               | amluto wrote:
               | Neither Linux nor Windows have this sort of uninitialized
               | memory, and it would be quite the security hole if they
               | did.
               | 
               | In general, the things your compiler thinks are UB are
               | not the same things your OS or CPU thinks are undefined.
        
               | SkiFire13 wrote:
               | At least Linux does this, and it is not a security hole
               | because it only reuses pages with memory of the current
               | process (as in the worst case you end up reading data
               | from some other part of your program, but not other
               | programs).
               | 
               | And yes, it does happen in practice, most famously it has
               | been mentioned in the "The strange details of std::string
               | at Facebook" talk at CppCon 2016
               | https://youtu.be/kPR8h4-qZdk?t=1150&si=2R358wniZfxTJLmc
        
           | philzook wrote:
           | For undefined behavior detection, I have heard of these:
           | 
           | - UB sanitizer
           | https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
           | 
           | - Cerberus Semantics
           | https://www.cl.cam.ac.uk/~pes20/cerberus/
           | 
           | - https://github.com/kframework/c-semantics
           | 
           | - https://github.com/TrustInSoft/tis-interpreter
        
         | nanolith wrote:
         | Well, specifically, it counts uninitialized variables as being
         | set to a non-deterministic value. The point of this tool isn't
         | to optimize functions as a compiler would, but rather to find
         | bad behavior based on its machine model.
         | 
         | This isn't perfect, of course. In this case, the compiler can
         | rightly treat x as uninitialized, meaning that its value could
         | be <= 42 at one point, and not be <= 42 at another point. Since
         | the uninitialized variable isn't "pinned", it could technically
         | be different values in different locations.
         | 
         | CBMC's machine model works differently. In this case, x is
         | assigned a non-deterministic value. The branch condition
         | creates a refinement of this value within the scope of that
         | statement. If the branch is taken, then by SMT rules, x can't
         | equal 12345, because it was already refined as being <= 42.
         | 
         | On its own, a model checker can miss situations like these.
         | It's why I recommend -Wall -Werror -Wpedantic in conjunction
         | with CBMC. The compiler should catch this as a warning, and it
         | should be upgraded as an error.
        
           | zzo38computer wrote:
           | > In this case, the compiler can rightly treat x as
           | uninitialized, meaning that its value could be <= 42 at one
           | point, and not be <= 42 at another point. Since the
           | uninitialized variable isn't "pinned", it could technically
           | be different values in different locations.
           | 
           | LLVM has a "freeze" command to stop propagation of undefined
           | values (although I think that command was added later than
           | the first version), so that the value is "pinned" as you say.
           | However, the "undef" command, if you do not use "freeze",
           | will not do this.
           | 
           | I think that the C compiler should "pin" such undefined
           | values where they are used, but I don't know which compilers
           | have an option to do this. (Perhaps CBMC should also have
           | such a switch, so that you can use the same options that you
           | will use with the C compiler.)
           | 
           | With this ex.3 file, the optimizer should be allowed to
           | result in a function that does nothing and has an unspecified
           | return value (which might or might not be the same each time
           | it is executed). (If optimizations are disabled, then it
           | should actually compile the conditional branch instruction.)
        
             | nanolith wrote:
             | This is one reason why I explored model checking machine
             | code output, since at this point, the behavior is very much
             | defined, even if it differs from implied source behavior.
             | But, this gets complicated for other reasons.
        
               | philzook wrote:
               | This is actually one of the reasons I've been tinkering
               | with CBMC https://www.philipzucker.com/pcode2c/ The idea
               | is to use Ghidra to lift a specialized C interpreter of
               | the machine code and then compare to source or ask other
               | questions via CBMC
        
               | nanolith wrote:
               | This kind of analysis is excellent.
               | 
               | I do recommend standardizing on hardware and toolchain
               | for projects like these, as it can ensure that the
               | machine code is matched.
               | 
               | The third phase of my work will round the corner with
               | machine code analysis. Currently, I'm working on
               | constructive proofs and equivalence proofs of imported C
               | code to handle the pieces that CBMC doesn't do so well,
               | as part of my second phase. So far, I can extract
               | efficient C from Lean 4, but I want to import C directly
               | into Lean 4 to prove it equivalent to extracted code.
               | Hand-written C is easier to read than the line noise I
               | can extract.
               | 
               | In particular, model checking doesn't fare well with data
               | structures, algorithms, and cryptography. These can be
               | torn down quite a bit, and at least we can verify that
               | the algorithms don't leak memory or make bad integer
               | assumptions. But, if we want to verify other properties,
               | this requires constructive proofs.
               | 
               | Between 80 and 95 percent of the time, depending on the
               | type of code being written, CBMC is enough. I'm now
               | solving for that 5 - 20 percent of code that CBMC can't
               | easily tackle.
        
               | philzook wrote:
               | ooooooh. link?
               | 
               | What do you mean by standardizing on hardware and
               | toolchain? I am currently choosing ghidra and cbmc, and
               | it seems like the approach is applicable to any compiler
               | or arch that ghidra supports without too much change. I
               | have only been doing x86-64 and gcc so far though
        
               | nanolith wrote:
               | Sadly, I don't have a link for this yet. My goal is to
               | write this up once I have a compelling example.
               | 
               | In the meantime, check out my github.
               | https://github.com/nanolith
               | 
               | Currently, I'm feeding commits into libcparse, which is
               | the start of this effort. That stream is currently about
               | 180 days out of phase with what's in my local
               | repositories, but it is slowly trickling in. The first
               | step of this second phase will be to use libcparse to
               | verify itself via constructive proofs in Lean. Currently,
               | and by that I mean what hits in March or April, libcparse
               | has a mostly C18 compliant lexical scanner for the C
               | preprocessor. The goal is to have a SAX-like C18 parser
               | that can detect all C declarations and definitions.
               | Tooling will include a utility that imports C
               | declarations and definitions into both Lean 4 and Coq.
               | This is a moonlighting effort for an upcoming book I
               | intend to write.
        
               | nanolith wrote:
               | As for standardizing on hardware and toolchain, what I
               | mean is that I try to get projects to commit to using
               | specific hardware and a specific toolchain. The goal is
               | to limit variables for automated testing and analysis.
               | 
               | OpenBSD, in particular, is adamant about testing their OS
               | on real hardware for platforms that they support. Their
               | definition of "working" is on real hardware. I think
               | that's a reasonable goal.
               | 
               | It helps with analysis like this, as you don't have to
               | worry about different compilers or different hardware.
               | GCC, in particular, can produce different output
               | depending on which hardware it was compiled on, unless a
               | specific configuration is locked down. If there is a flaw
               | in how it does vector operations on a particular Xeon
               | processor, this won't be caught if testing locally on a
               | Core i5 processor that produces different code.
        
       | IshKebab wrote:
       | This is also the backend for Kani - Amazon's formal verification
       | tool for Rust.
       | 
       | https://github.com/model-checking/kani
        
       | jvanderbot wrote:
       | > however runs all possible executions,
       | 
       | No, it does not. It probably does constraint based verification
       | or looks for "proofs" of possible error conditions or asserts. In
       | the case shown, an uninitialized variable is trivial to prove
       | that it could equal the asseted value.
        
         | philzook wrote:
         | Yes, correct. It does not and could not _actually_ run all
         | possible executions unless the program state space is quite
         | small. But from a user's perspective, it is similar. I have
         | tried this pedagogical approach to describing verifiers like
         | these as "infinite" unit tests before (to mixed results). It
         | feels to me that what symbolic or constraint based reasoning
         | (algebraic identities, programs, integrals, what have you) in
         | general is doing is finding a way to finitely reason about a
         | very large or actually infinite number of concretized cases.
        
       | p0w3n3d wrote:
       | I believe valgrind shows uninitialised variables
        
       ___________________________________________________________________
       (page generated 2024-01-30 23:00 UTC)