[HN Gopher] The C Bounded Model Checker: Criminally Underused
___________________________________________________________________
The C Bounded Model Checker: Criminally Underused
Author : philzook
Score : 60 points
Date : 2024-01-30 15:51 UTC (7 hours ago)
(HTM) web link (www.philipzucker.com)
(TXT) w3m dump (www.philipzucker.com)
| gaudat wrote:
| I miss this kind of stuff in computer programming languages.
|
| In hardware design, verification is done simultaneously with
| design, and semiconductor companies would be bankrupt if they did
| not verify the hell out of their designs before committing
| millions in manufacturing these chips.
|
| Even among hobbyists this is getting traction with yosys.perhaps
| its time for programmers to adapt this kind of tooling so there
| will be less buggy software released...
| vacuity wrote:
| There are probably orders of magnitude in difference between
| hardware design output and programming output. At least at the
| time, seL4's verification was quite impressive for a codebase
| on the order of 10,000 lines. But we should work towards the
| goal of improved formal modelling and checking all the same.
| rwmj wrote:
| Nice but can you run it on non-toy programs?
|
| I gave a talk about Frama-C a few years ago. It's very
| interesting technology, but not too useful at any scale
| (http://oirase.annexia.org/tmp/2020-frama-c-tech-talk.mp4) One
| aim we had with Bounds Checking GCC back in '95 was to make it
| possible to use with real programs. Although it was quite slow,
| any open source program of the era could use it
| (https://www.doc.ic.ac.uk/~phjk/BoundsChecking.html).
| philzook wrote:
| Non trivial example projects https://model-
| checking.github.io/cbmc-training/projects.html Some crypto, AWS
| common C, FreeRTOS. Other interesting applications
| https://www.cprover.org/cbmc/applications/
| philzook wrote:
| Frama-C is also very cool. Also Coq VST and however seL4 does
| it. I think out of all of these, CBMC requires the least
| expertise / has highest automation. I also don't see how it
| can scale beyond the unit test level, or patch together
| verification conditions into global properties. Perhaps this
| could be done by some meta framework plugging together calls
| to CBMC. Given the level of penetration of formal methods
| generally, I think the relatively low ambition of CBMC
| compared to these higher ceiling techniques is good.
| rwmj wrote:
| Thanks, very interesting stuff!
| jcranmer wrote:
| One of the examples they gave was an HTTP client, which would
| be a surprisingly non-toy example, so I looked at what they
| actually did in the code
| (https://github.com/FreeRTOS/coreHTTP/tree/main/test/cbmc).
|
| Not that I'm an expert in processing what exactly is being
| tested, but it basically looks only able to prove that an
| individual function doesn't overrun buffers. If you tell it to
| assume that integer overflows can't happen (!). So I'm not
| impressed.
| nanolith wrote:
| I've used CBMC in large commercial projects ranging around half
| a million lines of code. The secret is to break down every one
| of those functions into small pieces that the model checker can
| analyze.
|
| CBMC works best with a contract-based programming style, where
| contracts are enforced by assertions and shadow functions exist
| to simplify model checking.
|
| A single reply on HN is hardly a place where idiomatic styles
| can be expounded, but it is quite possible to build a resource-
| oriented idiomatic style that is reinforced with CBMC.
| philzook wrote:
| I would love to hear more
| nanolith wrote:
| In particular, one of the more important rules I made when
| using CBMC was to keep the search depth for each invocation
| as shallow as possible. If CBMC runs in less than five
| minutes, you're doing it right. If it takes more than that,
| then you're asking it to do too much.
|
| This led to the creation of function contracts and shadow
| functions. When evaluating functions, we actually want any
| calls that it makes to go to shadow functions instead of
| real functions. When we write a function, we include a
| contract that includes anything it may return, any side-
| effects it may have, and any memory it may touch / allocate
| / initialize. We then write a twin function -- its shadow
| function -- that technically follows the same contract in
| the most simplified terms, randomly choosing whether to
| succeed or fail. From CBMC's perspective, it's equivalent
| enough for model checking. But, it removes additional stack
| depth or recursion.
|
| A good example of this would be a shadow function for the
| POSIX read function. Its contract should verify that the
| descriptor it is passed is valid. It should also assert
| that the memory region it is passed is bounded by the size
| it is given. The shadow function picks a non-deterministic
| state based on how it should set errno and how it should
| return. This state follows the POSIX standard, but the
| underlying system calls don't matter. Likewise, depending
| on the return code, it should initialize a number of bytes
| in the read buffer with non-deterministic values.
|
| I used this shadow read function to find a buffer overflow
| in networking code and also to find a DOS infinite loop
| error in a tagged binary file format reader we were using.
|
| CBMC isn't perfect, but when coupled with a good linter and
| -Wall / -Werror / -Wpedantic, it is a very useful layer in
| a defense in depth strategy for safer software.
| lmm wrote:
| > CBMC isn't perfect, but when coupled with a good linter
| and -Wall / -Werror / -Wpedantic, it is a very useful
| layer in a defense in depth strategy for safer software.
|
| I agree to a certain extent, but at what point are you
| putting in more effort than it would take to migrate to a
| stricter language that offers more safety by default?
| nanolith wrote:
| I'm not aware of any language that can't be improved by
| this strategy. Amazon even added SMT solving tooling
| around Rust, because Rust's built-in safety guarantees
| aren't enough.
|
| I'd say that the added effort is minimal. The compiler
| spits out more errors. We think a little more carefully
| about how we write functions so CBMC doesn't complain.
| There are no silver bullets.
| WalterBright wrote:
| Unfortunately, int main(){ char
| buffer[10]; buffer[10] = 0; }
|
| are so rare they are hardly worth bothering with. The more usual
| case is: int get(char* buffer) {
| return buffer[10]; } void test() {
| char buffer[10]; get(buffer); }
|
| I.e. the array bounds for buffer get lost in the function call. I
| have proposed a fix:
|
| https://www.digitalmars.com/articles/C-biggest-mistake.html
|
| that is compatible with existing code.
|
| And yet, nobody cares. Oh well! Instead, we have overly complex
| solutions like this:
|
| https://developers.redhat.com/articles/2022/09/17/gccs-new-f...
| philzook wrote:
| Your example does not pass the verifier cbmc
| /tmp/walter.c --bounds-check --pointer-check --function test
| ... [get.pointer_dereference.5] line 3 dereference
| failure: pointer outside object bounds in buffer[(signed long
| int)10]: FAILURE ...
|
| There are many footguns. People love their guns. Makes them
| feel powerful.
| WalterBright wrote:
| int get(char* buffer, int i) { return
| buffer[i]; } #include <stdio.h>
| void test() { char buffer[10];
| get(buffer, getc(stdin)); }
| nanolith wrote:
| CBMC's default contract for getc returns a non-
| deterministic integer value. A non-deterministic integer
| value basically counts as "pick a value that would cause
| this de-reference to crash the machine model".
|
| So, it will find a counter-example.
| philzook wrote:
| Yes, it returned the same "FAILURE" along with a couple
| new checks about getc that do pass.
| WalterBright wrote:
| Sounds like it's doing a solid job of data flow analysis.
| But I can still produce cases where it cannot determine
| either the range of the index or the length of the array
| being pointed to.
|
| The programmer will have to add code to keep track of the
| length of the array, and add code to check the value of
| the index. The good thing is the analyzer will tell you
| where you'll need to add those checks.
|
| With the proposal I made, none of this is necessary. It
| just works. Over two decades of experience with this
| approach in D is it-just-works.
| nanolith wrote:
| I'm curious, have you used CBMC?
|
| The machine model tracks the array with its size and
| compares this to any index. This is loaded into an SMT
| solver. This is not a standard static analyzer. It's
| compiling the code to an abstract machine model and then
| running through this model with an SMT solver.
|
| CBMC isn't perfect, but the strategies you are talking
| about are meant to defeat a linter or a shallow analyzer
| in a compiler. CBMC isn't a linter.
| WalterBright wrote:
| I have never heard of CBMC before this thread.
|
| SMT cannot solve cases where it simply cannot know what
| the length of an array is. For example, when array is
| allocated in code not available to the solver, or is set
| by the environment.
|
| I remember trying out Spark a few years ago, which
| advertised its compile time checking abilities. I tried
| using an integer equation that used an OR, and it gave up
| on it.
|
| If you need to add in range checking code to help the
| solver along, then the range check itself is a source of
| bugs, and the range limit has to be made available
| somewhere and be correct.
|
| In my proposal, the array length is set at the time of
| creation of the array, and the length is carried along
| with the pointer. The solver's role then becomes one of
| _eliminating_ the bounds check in the cases where it can
| prove the index is within range. The user doesn 't have
| to do anything.
|
| We've been using it in the implementation of the D
| compiler for a very long time now, and problems with
| buffer overflows are a faded memory.
|
| P.S. I also added manual integer overflow checks when
| passing the size to malloc(), no matter how unlikely an
| overflow would be :-)
| Joker_vD wrote:
| > There are many footguns. People love their guns. Makes them
| feel powerful.
|
| Well, duh: if you remove the most egregious footguns from C,
| you end up basically with Pascal, and here is Why Pascal Is
| Not My Favorite Programming Language.
|
| And when you try to explain to them that C's abstract machine
| has semantics of "when you omit a safety check, you actually
| makes a promise to the compiler that this check is redundant,
| that you'll promise that you'll make sure in some other way
| that in no possible execution will the illegal values ever
| show up in this place, and the compiler will actually hold
| you to this promise" -- they throw tantrums and try to argue
| that no, it hasn't this semantics, and even if has, it's not
| the semantics as originally intended by K&R and it is the
| authors of the C compilers who are wrong (even though the
| authors of C compilers are the people who actually write both
| the C standards _and_ the C compilers...)
|
| Must have to do something with imprintment: I've learned C as
| my 3rd language, and from the start I was taught that its UB
| is absolutely insane -- and I've never felt "nah, it should
| just do what I meant, dangit" denial about it, only "this is
| sad, how can I even know while writing code when I am making
| a baseless promise especially when it's made by _omitting_
| something? " But like you've said, people like their
| footguns.
| obl wrote:
| Weird that this treats uninitialized variables as unknown values.
| For example in ex3.c, the program int main(){
| int x; if (x <= 42){ assert(x !=
| 12345); } }
|
| is of course UB in C, even though under a "uninitialized is
| random" model the program is valid and does not assert (as the
| model checker concludes).
|
| (even in O1 clang gets rid of the whole function, including even
| the ret instruction, I'm surprised it does not at least leave an
| ud2 for an empty function to help debugging since it would not
| cost anything https://godbolt.org/z/eK8cz3EPe )
| philzook wrote:
| I see the calling an undefined prototype style more often,
| perhaps for this reason. You are probably right this is
| undefined behavior, but it's subtle.
| https://stackoverflow.com/questions/11962457/why-is-using-an...
| I suspect CBMC just picks some concrete behavior for undefined
| behavior. It may not be a good detector for that. I'm not sure.
| This gets into shaky territory of understanding for me.
| jcranmer wrote:
| Speaking as a compiler writer:
|
| Any invocation of undefined behavior should be considered an
| assertion failure as far as any model checker should be
| concerned. Compilers can-- _and will_ --treat undefined
| behavior as license to alter the semantics of your program
| without any constraint, and most instances of undefined
| behavior are clearly programmer error (there is no good
| reason to read uninitialized memory, for example).
|
| Reading an uninitialized variable is _not_ "subtle" undefined
| behavior. It's one of the most readily accessible examples
| not only of what undefined behavior can exist, but also the
| ways compilers will mutilate your code just because you did
| it. To be honest, if something as simple as the consequences
| of reading uninitialized memory are shaky understanding for
| someone trying to prove code correct, that will completely
| undermine any trust I have in the validity of your proofs.
| philzook wrote:
| Yes, I do not trust my understanding of C. Which is why I
| want mechanized assistance.
| philzook wrote:
| I apologize for using the word subtle. What I should have
| said is I don't understand the topic and reading about it
| has left me a sense that one should be very careful.
| tialaramex wrote:
| Also, even _if_ the compiler does what a "Real programmer"
| type thinks it "should" do for this case (and I agree with
| you that you're not entitled to expect that), you aren't
| guaranteed that there's some particular value since you
| never initialized it.
|
| Your operating system likely feels entitled to assume that
| if you never wrote to this page of RAM you don't care what
| exactly is in it. After all what kind of lunatic _reads_ a
| bunch of unknown data, says "Yeah, that's coincidentally
| what I wanted" and just leaves it unmodified? No, almost
| anybody would _write_ data they want to keep instead. So,
| if you never wrote to this particular page of RAM and your
| OS finds it convenient to swap that page for a different
| one, no harm no foul right? But now the contents of your
| uninitialized variable changed!
| addaon wrote:
| > So, if you never wrote to this particular page of RAM
| and your OS finds it convenient to swap that page for a
| different one, no harm no foul right? But now the
| contents of your uninitialized variable changed!
|
| No sane OS will do this. Any page that's handed to a
| process that was last written by a different process must
| be zero'd (or otherwise have every address initialized)
| by the OS to avoid leaking information across process
| boundaries. You could, in theory, have a page that was
| munmap'd by /this/ process be handed back to the same
| process to fill a request for a different virtual address
| without zeroing it, but I can't imagine that any OS
| tracks the last writer to enable this "optimization" in
| the few cases it would apply.
| jcranmer wrote:
| glibc marks freed pages with MADV_FREE, and on Linux, an
| MADV_FREE page won't be unmapped immediately. It's
| possible to read a value of an MADV_FREE'd page, have
| Linux swap the page out, and then read it again and now
| observe a zero-init'd page. If malloc returns a new
| allocation to a freed page, it isn't written to by the
| allocator (which would prevent the page from being
| subsequently zeroed if it hasn't already done so), so it
| doesn't guarantee that the contents won't be randomly
| zeroed.
|
| Whether or not this is a sane OS I leave an exercise to
| the reader, but it is nonetheless the property of a
| common OS.
| amluto wrote:
| Neither Linux nor Windows have this sort of uninitialized
| memory, and it would be quite the security hole if they
| did.
|
| In general, the things your compiler thinks are UB are
| not the same things your OS or CPU thinks are undefined.
| SkiFire13 wrote:
| At least Linux does this, and it is not a security hole
| because it only reuses pages with memory of the current
| process (as in the worst case you end up reading data
| from some other part of your program, but not other
| programs).
|
| And yes, it does happen in practice, most famously it has
| been mentioned in the "The strange details of std::string
| at Facebook" talk at CppCon 2016
| https://youtu.be/kPR8h4-qZdk?t=1150&si=2R358wniZfxTJLmc
| philzook wrote:
| For undefined behavior detection, I have heard of these:
|
| - UB sanitizer
| https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
|
| - Cerberus Semantics
| https://www.cl.cam.ac.uk/~pes20/cerberus/
|
| - https://github.com/kframework/c-semantics
|
| - https://github.com/TrustInSoft/tis-interpreter
| nanolith wrote:
| Well, specifically, it counts uninitialized variables as being
| set to a non-deterministic value. The point of this tool isn't
| to optimize functions as a compiler would, but rather to find
| bad behavior based on its machine model.
|
| This isn't perfect, of course. In this case, the compiler can
| rightly treat x as uninitialized, meaning that its value could
| be <= 42 at one point, and not be <= 42 at another point. Since
| the uninitialized variable isn't "pinned", it could technically
| be different values in different locations.
|
| CBMC's machine model works differently. In this case, x is
| assigned a non-deterministic value. The branch condition
| creates a refinement of this value within the scope of that
| statement. If the branch is taken, then by SMT rules, x can't
| equal 12345, because it was already refined as being <= 42.
|
| On its own, a model checker can miss situations like these.
| It's why I recommend -Wall -Werror -Wpedantic in conjunction
| with CBMC. The compiler should catch this as a warning, and it
| should be upgraded as an error.
| zzo38computer wrote:
| > In this case, the compiler can rightly treat x as
| uninitialized, meaning that its value could be <= 42 at one
| point, and not be <= 42 at another point. Since the
| uninitialized variable isn't "pinned", it could technically
| be different values in different locations.
|
| LLVM has a "freeze" command to stop propagation of undefined
| values (although I think that command was added later than
| the first version), so that the value is "pinned" as you say.
| However, the "undef" command, if you do not use "freeze",
| will not do this.
|
| I think that the C compiler should "pin" such undefined
| values where they are used, but I don't know which compilers
| have an option to do this. (Perhaps CBMC should also have
| such a switch, so that you can use the same options that you
| will use with the C compiler.)
|
| With this ex.3 file, the optimizer should be allowed to
| result in a function that does nothing and has an unspecified
| return value (which might or might not be the same each time
| it is executed). (If optimizations are disabled, then it
| should actually compile the conditional branch instruction.)
| nanolith wrote:
| This is one reason why I explored model checking machine
| code output, since at this point, the behavior is very much
| defined, even if it differs from implied source behavior.
| But, this gets complicated for other reasons.
| philzook wrote:
| This is actually one of the reasons I've been tinkering
| with CBMC https://www.philipzucker.com/pcode2c/ The idea
| is to use Ghidra to lift a specialized C interpreter of
| the machine code and then compare to source or ask other
| questions via CBMC
| nanolith wrote:
| This kind of analysis is excellent.
|
| I do recommend standardizing on hardware and toolchain
| for projects like these, as it can ensure that the
| machine code is matched.
|
| The third phase of my work will round the corner with
| machine code analysis. Currently, I'm working on
| constructive proofs and equivalence proofs of imported C
| code to handle the pieces that CBMC doesn't do so well,
| as part of my second phase. So far, I can extract
| efficient C from Lean 4, but I want to import C directly
| into Lean 4 to prove it equivalent to extracted code.
| Hand-written C is easier to read than the line noise I
| can extract.
|
| In particular, model checking doesn't fare well with data
| structures, algorithms, and cryptography. These can be
| torn down quite a bit, and at least we can verify that
| the algorithms don't leak memory or make bad integer
| assumptions. But, if we want to verify other properties,
| this requires constructive proofs.
|
| Between 80 and 95 percent of the time, depending on the
| type of code being written, CBMC is enough. I'm now
| solving for that 5 - 20 percent of code that CBMC can't
| easily tackle.
| philzook wrote:
| ooooooh. link?
|
| What do you mean by standardizing on hardware and
| toolchain? I am currently choosing ghidra and cbmc, and
| it seems like the approach is applicable to any compiler
| or arch that ghidra supports without too much change. I
| have only been doing x86-64 and gcc so far though
| nanolith wrote:
| Sadly, I don't have a link for this yet. My goal is to
| write this up once I have a compelling example.
|
| In the meantime, check out my github.
| https://github.com/nanolith
|
| Currently, I'm feeding commits into libcparse, which is
| the start of this effort. That stream is currently about
| 180 days out of phase with what's in my local
| repositories, but it is slowly trickling in. The first
| step of this second phase will be to use libcparse to
| verify itself via constructive proofs in Lean. Currently,
| and by that I mean what hits in March or April, libcparse
| has a mostly C18 compliant lexical scanner for the C
| preprocessor. The goal is to have a SAX-like C18 parser
| that can detect all C declarations and definitions.
| Tooling will include a utility that imports C
| declarations and definitions into both Lean 4 and Coq.
| This is a moonlighting effort for an upcoming book I
| intend to write.
| nanolith wrote:
| As for standardizing on hardware and toolchain, what I
| mean is that I try to get projects to commit to using
| specific hardware and a specific toolchain. The goal is
| to limit variables for automated testing and analysis.
|
| OpenBSD, in particular, is adamant about testing their OS
| on real hardware for platforms that they support. Their
| definition of "working" is on real hardware. I think
| that's a reasonable goal.
|
| It helps with analysis like this, as you don't have to
| worry about different compilers or different hardware.
| GCC, in particular, can produce different output
| depending on which hardware it was compiled on, unless a
| specific configuration is locked down. If there is a flaw
| in how it does vector operations on a particular Xeon
| processor, this won't be caught if testing locally on a
| Core i5 processor that produces different code.
| IshKebab wrote:
| This is also the backend for Kani - Amazon's formal verification
| tool for Rust.
|
| https://github.com/model-checking/kani
| jvanderbot wrote:
| > however runs all possible executions,
|
| No, it does not. It probably does constraint based verification
| or looks for "proofs" of possible error conditions or asserts. In
| the case shown, an uninitialized variable is trivial to prove
| that it could equal the asseted value.
| philzook wrote:
| Yes, correct. It does not and could not _actually_ run all
| possible executions unless the program state space is quite
| small. But from a user's perspective, it is similar. I have
| tried this pedagogical approach to describing verifiers like
| these as "infinite" unit tests before (to mixed results). It
| feels to me that what symbolic or constraint based reasoning
| (algebraic identities, programs, integrals, what have you) in
| general is doing is finding a way to finitely reason about a
| very large or actually infinite number of concretized cases.
| p0w3n3d wrote:
| I believe valgrind shows uninitialised variables
___________________________________________________________________
(page generated 2024-01-30 23:00 UTC)