[HN Gopher] Go: Fuzzing Is Beta Ready
       ___________________________________________________________________
        
       Go: Fuzzing Is Beta Ready
        
       Author : ingve
       Score  : 290 points
       Date   : 2021-06-04 06:02 UTC (16 hours ago)
        
 (HTM) web link (blog.golang.org)
 (TXT) w3m dump (blog.golang.org)
        
       | thethimble wrote:
       | How does fuzzing compare to something like Quickcheck? Are they
       | basically equivalent?
        
         | pdpi wrote:
         | QuickCheck gives you a mechanism for a form of unit testing
         | where you build valid values as test cases and test that your
         | code maintains specific expected properties.
         | 
         | Fuzzing is similar but typically involves starting from a
         | known-good input then randomising it at the byte level
         | (irrespective of validity). This project allows for property
         | testing-like unit tests, but tools like american fuzzy lop
         | focus on detecting whole application crashes.
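         | 
         | To make the contrast concrete, here is a minimal Go sketch,
         | assuming the testing.F API as it later shipped in Go 1.18 (the
         | beta's details may differ). Reverse, roundTrips and
         | CheckWithQuick are hypothetical names. testing/quick builds
         | its strings from random runes, so it only tries valid UTF-8,
         | while the fuzzer mutates the seed at the byte level and can
         | also hand the target invalid sequences.

```go
package main

import (
	"testing"
	"testing/quick"
	"unicode/utf8"
)

// Reverse returns s with its runes in reverse order.
// Hypothetical function under test, not taken from the thread.
func Reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// roundTrips is the property: reversing twice gives back the input.
// It is only claimed for valid UTF-8, because []rune(s) replaces
// invalid bytes with U+FFFD and the original bytes are lost.
func roundTrips(s string) bool {
	if !utf8.ValidString(s) {
		return true // outside the property's domain
	}
	return Reverse(Reverse(s)) == s
}

// CheckWithQuick is the property-testing style: testing/quick
// generates well-formed strings and checks the property on each.
func CheckWithQuick() error {
	return quick.Check(roundTrips, nil)
}

// FuzzReverse is the fuzzing style: start from a known-good seed and
// let the fuzzer mutate it, valid or not.
func FuzzReverse(f *testing.F) {
	f.Add("hello, world")
	f.Fuzz(func(t *testing.T, s string) {
		if !roundTrips(s) {
			t.Errorf("Reverse(Reverse(%q)) != %q", s, s)
		}
	})
}
```

         | Placed in a _test.go file, something like
         | `go test -fuzz=FuzzReverse` starts the mutation loop, while a
         | plain `go test` only replays the seeds.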
        
       | kubb wrote:
       | What are the benefits of building out a whole separate part of
       | the test framework that handles this? Is there a way to fuzz non-
       | string inputs?
        
         | tjpnz wrote:
         | >What are the benefits of building out a whole separate part of
         | the test framework that handles this?
         | 
         | I'm guessing that you won't always want to execute these
         | alongside other tests. Go has also taken the same approach with
         | benchmarks.
        
         | masklinn wrote:
         | > What are the benefits of building out a whole separate part
         | of the test framework that handles this?
         | 
         | Instead of making it part of the base runner? Maybe so the beta
         | bits can work and it'll be folded in more directly afterwards?
         | 
         | Also, if it includes coverage guiding, being able to know that
         | a run is fuzzing or non-fuzzing would avoid having to include
         | the fuzzing instrumentation for non-fuzzing runs, mayhaps?
         | 
         | > Is there a way to fuzz non-string inputs?
         | 
         | From the design doc it looks like you can have any number of
         | parameters and
         | 
         | > Fuzzing of built-in types (e.g. simple types, maps, arrays)
         | and types which implement the BinaryMarshaler and TextMarshaler
         | interfaces are supported.
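         | 
         | As a hedged illustration of several typed parameters: the
         | sketch below assumes the testing.F API as released in Go 1.18,
         | where fuzz arguments are limited to string, []byte, bool and
         | the basic numeric types (the maps and marshaler types from the
         | draft quote above are not accepted there). Truncate is a
         | hypothetical function under test.

```go
package main

import (
	"strings"
	"testing"
)

// Truncate returns at most the first n bytes of s.
// Hypothetical function under test; it cuts on byte boundaries, so it
// may split a multi-byte rune.
func Truncate(s string, n int) string {
	if n < 0 {
		n = 0
	}
	if n > len(s) {
		return s
	}
	return s[:n]
}

// FuzzTruncate takes a string and an int. The seed passed to f.Add
// must match the parameter types of the fuzz function exactly.
func FuzzTruncate(f *testing.F) {
	f.Add("hello", 2)
	f.Add("", -1)
	f.Fuzz(func(t *testing.T, s string, n int) {
		got := Truncate(s, n)
		if !strings.HasPrefix(s, got) {
			t.Errorf("Truncate(%q, %d) = %q is not a prefix", s, n, got)
		}
		if n >= 0 && len(got) > n {
			t.Errorf("Truncate(%q, %d) = %q is longer than %d", s, n, got, n)
		}
	})
}
```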
        
           | kubb wrote:
           | I was thinking a library to generate fuzzed inputs that you
           | could use in normal tests.
           | 
           | It's good to hear that it's not limited to strings.
        
             | masklinn wrote:
             | > I was thinking a library to generate fuzzed inputs that
             | you could use in normal tests.
             | 
             | The fuzzer input is random data. You don't need a special
             | library to generate random data. In fact that's what the
             | post tells you with respect to type compatibility:
             | 
             | > types which implement the BinaryMarshaler and
             | TextMarshaler interfaces are supported
             | 
             | these are just tools to convert from/to binary
             | (unstructured or utf8).
        
               | kubb wrote:
               | Are you sure it's random bytes? Some fuzzers start with a
               | given input and then mutate it to increase coverage of
               | the code under test.
        
               | masklinn wrote:
               | > Some fuzzers start with a given input and then mutate
               | it to increase coverage of the code under test.
               | 
               | Yes, and this one does, but you requested a library to
               | generate inputs didn't you? A library can't get coverage
               | feedback from the SUT, which as I wrote in my previous
               | comment would be why you'd be building a dedicated test
               | framework.
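               | 
               | A toy sketch of that feedback loop, in Go. This is
               | not the Go fuzzer's actual mutator: target, mutate and
               | Fuzz are hypothetical names, and target reports its
               | own branch coverage as a bitmask, standing in for the
               | counters a compiler would insert. A mutated input is
               | kept only if it reaches a branch not seen before,
               | which is the part a plain input-generating library
               | cannot do on its own.

```go
package main

import "math/rand"

// target is a hypothetical system under test. Each bit of the result
// marks one branch that was executed.
func target(in []byte) (cov uint8) {
	if len(in) > 0 && in[0] == 'F' {
		cov |= 1
		if len(in) > 1 && in[1] == 'U' {
			cov |= 2
			if len(in) > 2 && in[2] == 'Z' {
				cov |= 4
			}
		}
	}
	return cov
}

// mutate returns a copy of in with one randomly chosen byte replaced.
func mutate(rng *rand.Rand, in []byte) []byte {
	if len(in) == 0 {
		return []byte{byte(rng.Intn(256))}
	}
	out := append([]byte(nil), in...)
	out[rng.Intn(len(out))] = byte(rng.Intn(256))
	return out
}

// Fuzz mutates corpus entries iters times, starting from seed. An
// input joins the corpus only when it covers a branch that no earlier
// input covered, so later mutations build on earlier progress.
func Fuzz(seed []byte, iters int, rngSeed int64) (corpus [][]byte, seen uint8) {
	rng := rand.New(rand.NewSource(rngSeed))
	corpus = [][]byte{seed}
	seen = target(seed)
	for i := 0; i < iters; i++ {
		cand := mutate(rng, corpus[rng.Intn(len(corpus))])
		if cov := target(cand); cov&^seen != 0 {
			seen |= cov
			corpus = append(corpus, cand)
		}
	}
	return corpus, seen
}
```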
        
         | alephnan wrote:
         | > Is there a way to fuzz non-string inputs?
         | 
         | Fuzzing usually revolves around strings because of escape
         | characters, escape sequences. There is a much larger set of
         | string characters than there are for the 10 or so numeric
         | digits. Numbers don't have the same problems that strings do,
         | because numbers are usually interpreted only as data, whereas
         | strings can be interpreted as data or computation.
        
           | pdpi wrote:
           | > Fuzzing usually revolves around strings because of escape
           | characters, escape sequences.
           | 
           | Not always. AFL has been used to detect issues around
           | processing plain old binary data (eg
           | https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-8637)
           | 
           | I would argue that anything involving a parser of some
           | description (either binary or text-based) is a good candidate
           | for fuzzing.
        
         | jlouis wrote:
         | Fuzz testing is expensive.
         | 
         | And you gotta start somewhere. String inputs are a good start,
         | and you can use those to test other inputs by factoring through
         | conversion functions.
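         | 
         | A minimal sketch of factoring through a conversion function,
         | assuming the testing.F API as released in Go 1.18. decodeInts
         | and SortedUnique are hypothetical names: the fuzzer only ever
         | produces a string, decodeInts turns it into a []int, and
         | inputs that do not decode are skipped.

```go
package main

import (
	"sort"
	"strconv"
	"strings"
	"testing"
)

// decodeInts is the conversion function: it parses whitespace-separated
// integers and reports false if any field is not an integer.
func decodeInts(s string) ([]int, bool) {
	var xs []int
	for _, field := range strings.Fields(s) {
		n, err := strconv.Atoi(field)
		if err != nil {
			return nil, false
		}
		xs = append(xs, n)
	}
	return xs, true
}

// SortedUnique returns the distinct values of xs in increasing order.
// Hypothetical function under test; it does not modify xs.
func SortedUnique(xs []int) []int {
	out := append([]int(nil), xs...)
	sort.Ints(out)
	n := 0
	for i, x := range out {
		if i == 0 || x != out[n-1] {
			out[n] = x
			n++
		}
	}
	return out[:n]
}

// FuzzSortedUnique fuzzes a string and tests a function on []int.
func FuzzSortedUnique(f *testing.F) {
	f.Add("3 1 2 3")
	f.Fuzz(func(t *testing.T, s string) {
		xs, ok := decodeInts(s)
		if !ok {
			t.Skip("input is not a list of integers")
		}
		got := SortedUnique(xs)
		for i := 1; i < len(got); i++ {
			if got[i-1] >= got[i] {
				t.Fatalf("SortedUnique(%v) = %v is not strictly increasing", xs, got)
			}
		}
	})
}
```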
        
       | diegs wrote:
       | Interesting. I've used gopter a lot for property-based testing,
       | though it's very complex (and impressive), and can get slow or
       | require hacks for complex types.
       | 
       | I'm glad this is being made, but like many other things that have
       | been added to Go, it shows the limitations of the language that
       | you can't just build this inside the language. (I might be wrong,
       | as I haven't had a chance to look at the design
       | docs/implementation yet, but the installation instructions imply
       | that's the case).
        
         | icholy wrote:
         | > it shows the limitations of the language that you can't just
         | build this inside the language.
         | 
         | Not sure why you'd make that assumption.
         | https://github.com/dvyukov/go-fuzz
        
           | nemo1618 wrote:
           | go-fuzz requires an instrumented binary. It's essentially a
           | forked Go compiler. So I think it's fair to call this a
           | limitation of the language. But I don't see that as a
           | negative. :)
        
           | stevekemp wrote:
           | Yeah go-fuzz is an awesome tool, which I've used extensively
           | on some of my own projects.
           | 
           | When writing parsers and compilers it has proven eerily good
           | at identifying corner-cases (panics, and infinite loops).
           | 
           | I'm looking forward to trying the new approach out. Anything
           | that makes fuzz-testing easier to configure/maintain and
           | spreads awareness is a good thing in my book.
        
       | moomin wrote:
       | Someone (Hillel Wayne?) observed that fuzz testing and property
       | testing are basically the same thing, but the communities are
       | almost entirely disjoint so the tools are completely separate.
        
         | pag wrote:
         | DeepState [1] is a tool that lets you write Google Test-style
         | unit tests, as well as property tests, in either C or C++, and
         | plug in fuzzers and symbolic executors. That is, DeepState
         | bridges this gap between fuzz testing and property testing.
         | 
         | [1] https://github.com/trailofbits/deepstate
        
         | Drup wrote:
         | My understanding is that the difference between fuzz testing
         | and property testing is how the input is crafted. Both can be
         | viewed as a pair of things: a function to generate series of
         | bits as input, and a way to turn these bits into the
         | appropriate data under test.
         | 
         | Property testing generates these bits using a specified
         | distribution, and that's about it. Fuzz testing generates these
         | bits by looking at how the program is executed, and uses a
         | black box to try to explore all paths in the program.
         | 
         | Most libraries for property testing come with very convenient
         | ways to craft the "input to data" part. Fuzz tools come with an
         | almost magically effective way to craft interesting inputs. The
         | two combine very well (and have been combined in several
         | libraries).
        
           | dnautics wrote:
           | > Property testing generates these bits using a specified
           | distribution, and that's about it
           | 
           | I think most property testing frameworks also come with the
           | concept of "shrinkage", which is a way to walk back a failed
           | condition to try to find the "minimum requirement of
           | failure". Though I am sure there are PT frameworks that
           | haven't implemented this.
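           | 
           | A minimal byte-level sketch of the shrinking idea, in Go.
           | Real property-testing frameworks usually shrink the
           | structured value rather than raw bytes, so treat this as
           | an illustration only. Shrink and containsEmptyTag are
           | hypothetical names, and the input passed to Shrink is
           | assumed to fail already.

```go
package main

import "bytes"

// containsEmptyTag is a hypothetical failure predicate: pretend the
// code under test misbehaves whenever its input contains "<>".
func containsEmptyTag(in []byte) bool {
	return bytes.Contains(in, []byte("<>"))
}

// Shrink greedily drops one byte at a time and keeps the shorter input
// whenever it still fails. The result is a failing input from which no
// single byte can be removed without losing the failure.
func Shrink(input []byte, fails func([]byte) bool) []byte {
	cur := append([]byte(nil), input...)
	for i := 0; i < len(cur); {
		// Candidate: cur without the byte at index i.
		cand := append(append([]byte(nil), cur[:i]...), cur[i+1:]...)
		if fails(cand) {
			cur = cand // byte i was not needed; retry the same index
		} else {
			i++ // byte i is needed for the failure; keep it
		}
	}
	return cur
}
```

           | For example, shrinking "ab<>cd" against containsEmptyTag
           | leaves just "<>".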
        
           | jlouis wrote:
           | Yep, but in the dual, they have the same goal: produce
           | interesting inputs to the program which might exhibit
           | trouble.
           | 
           | This is why you can use one of the approaches to help the
           | other side of the approach.
           | 
           | The 3rd solution is concolic testing: use an SMT/SAT solver
           | to flip branches. The path down to the branch imposes a
           | formula. By inverting the formula, we can pick a certain
           | branch path. Now you ask the SMT solver to check that
           | there's no way to go down that branch. If it finds
           | satisfiability, you have a counterexample and can explore
           | that path.
        
             | UncleMeat wrote:
             | Being thematically similar is not that interesting, if one
             | works better than the others.
             | 
             | Coverage guided fuzzing is eating symbolic execution and
             | concolic testing for breakfast. It isn't even close. As
             | much as I love these more principled approaches, the
             | strategy of throwing bits at programs and applying careful
             | coverage metrics is just way more effective, even for cases
             | that seem to be hand picked for SMT-based approaches to
             | win.
        
         | staticassertion wrote:
         | They are fundamentally the same in a lot of important ways. Any
         | fuzz test is almost certainly (if not entirely certainly) a
         | property test.
         | 
         | In Rust the driver for property and fuzz testing can be shared,
         | which is nice[0].
         | 
         | [0] https://docs.rs/arbitrary/1.0.1/arbitrary/trait.Arbitrary.ht...
         | 
         | By describing my data with arbitrary I've written programs that
         | had traditional property testing and fuzz-testing as well,
         | without any additional effort.
         | 
         | It's only really post-AFL where fuzzing has suddenly become
         | synonymous with instrumentation guided data generation, which
         | is a totally fine distinction to make, but they're still
         | fundamentally equivalent. The first fuzzers were virtually
         | identical to prop test frameworks today.
         | 
         | I'd be fine with dropping the "fuzzing" name entirely and
         | instead just having us use the term property testing, with
         | "data generation" being the thing we start differentiating ie:
         | "random prop testing" or "instrumentation guided prop testing"
         | or "type based prop testing" etc.
        
         | px43 wrote:
         | I've heard that a bit recently too. I think it's more that
         | people selling property testing tools are trying to sell them
         | as fuzzing tools to unsuspecting suckers.
        
         | dllthomas wrote:
         | I expect the observation has been made many times, but one
         | particular example of note is https://danluu.com/testing/
        
         | rtpg wrote:
         | I think Prop testing is great, but stuff like AFL (with
         | instrumentation to basically find your conditionals for you,
         | and working backwards to modify data for it) is very different
         | from property testing.
         | 
         | My experience (real world, for actually existing code) is that
         | property tests often require a lot of fiddling with the data
         | generation, in order to actually stress your system in
         | interesting ways. If you just throw "totally random" data into
         | your system, you won't be testing very interesting properties.
         | Amazing assurances and payoff from doing it, of course! Just
         | like... "just generate random arbitraries" works a lot less
         | when you're working with 100 field structs and only 10 or so
         | "matter" in senses you care about.
         | 
         | To my knowledge I have not seen a property testing toolkit that
         | leverages code coverage in the way fuzzing does.
        
           | gsg wrote:
           | There are some examples, for example Crowbar is an OCaml tool
           | that uses AFL to drive property based tests.
        
         | typical182 wrote:
         | People can have different definitions and still communicate
         | usefully, and I think there is not 100% agreement on the exact
         | boundaries between the two.
         | 
         | That said, for me: they are distinct but related, and that
         | distinction is useful.
         | 
         | For example, Hypothesis[1] is a popular property testing
         | framework. The authors have more recently created HypoFuzz[2],
         | which includes this sentence in the introduction:
         | 
         |  _"HypoFuzz runs your property-based test suite, using cutting-
         | edge fuzzing techniques and coverage instrumentation to find
         | even the rarest inputs which trigger an error."_
         | 
         | Being able to talk about fuzzing and property testing as
         | distinct things seems useful -- saying something like "We added
         | fuzzing techniques to our property testing framework" is more
         | meaningful than "We added property testing techniques to our
         | property testing framework" ;-)
         | 
         | My personal hope is there will be more convergence, and work to
         | add convenient first-class fuzzing support in a popular
         | language like Go will hopefully help move the primary use case
         | for fuzzing to be about correctness, with security moving to an
         | important but secondary use case.
         | 
         | [1] https://hypothesis.works
         | 
         | [2] https://hypofuzz.com
        
         | Milner08 wrote:
         | Reading about fuzz testing all I could think of was 'is this
         | not property testing?'... That is strange. It sounds like both
         | communities could learn a lot from each other unless there is
         | something I am missing (there probably is..)
        
           | Kototama wrote:
           | Fuzzing is more to ensure safety and property checking to
           | ensure correctness IMHO. Both are related but not similar.
        
           | masklinn wrote:
           | In my mind fuzz testing is external, black-boxed, and often
           | profile-guided, while proptesting is more structured,
           | internal (language aware) but generally less guided or
           | entirely unguided.
           | 
           | This here is closer to what I see as property testing than
           | fuzzing, although it looks like they plan on coverage
           | feedback (so guided generation).
        
             | ahelwer wrote:
             | Property-based testing requires you to define the condition
             | of success or failure (the "property" you're testing for)
             | too, right? Whereas fuzzing just looks for crashes?
        
               | masklinn wrote:
               | > Property-based testing requires you to define the
               | condition of success or failure (the "property" you're
               | testing for) too, right?
               | 
               | That property can be "does not crash".
               | 
               | And I'd say this is the structured / language-awareness
               | part: with fuzzing you can't generally build an oracle.
               | 
               | And if you can it's of course trivial: just have a
               | wrapper script check the result against the oracle, and
               | trigger whatever the fuzzer looks for indicating
               | "failure" whether it's a return code or a segfault or...
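               | 
               | With an in-process fuzzer the "wrapper" can simply be
               | the fuzz function itself. A minimal sketch, assuming
               | the testing.F API as released in Go 1.18: PopCount
               | is a hypothetical implementation under test, and
               | math/bits.OnesCount64 from the standard library plays
               | the oracle.

```go
package main

import (
	"math/bits"
	"testing"
)

// PopCount counts the set bits in x by clearing the lowest set bit
// until none remain. Hypothetical implementation under test.
func PopCount(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1 // clears the lowest set bit
		n++
	}
	return n
}

// FuzzPopCount compares the implementation against the oracle on every
// generated input; any disagreement is reported as a failure.
func FuzzPopCount(f *testing.F) {
	f.Add(uint64(0))
	f.Add(uint64(0xff))
	f.Fuzz(func(t *testing.T, x uint64) {
		got, want := PopCount(x), bits.OnesCount64(x)
		if got != want {
			t.Errorf("PopCount(%#x) = %d, oracle says %d", x, got, want)
		}
	})
}
```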
        
           | bottled_poe wrote:
           | Surely the coverage of fuzz testing is a superset of property
           | testing?
        
             | amw-zero wrote:
             | Right. That's the point - they are both conceptually the
             | same thing, testing via automatically generated inputs.
        
         | agumonkey wrote:
         | The common point being covering the input space, right?
        
       | sriku wrote:
       | Is there a law of some kind that says the sum of time taken to
       | compile and test code is a language independent constant? So
       | either our testing tools take time to run, or our compiler does,
       | or a mix of both... if we want robust software, that is.
        
         | throwaway894345 wrote:
         | I don't think this is true. Python tests take a loooong time to
         | run while (on the other extreme) Go tests can compile and run
         | nearly instantly, at least based on my 10-15 years experience
         | with both languages.
        
       | CGamesPlay wrote:
       | Any details about how the mutator works? The design doc hints at
       | a "coverage-based" mutator, but I can't see anything specific
       | about how it works or even if that was implemented.
        
         | jrockway wrote:
         | Here's the plan for the mutator:
         | https://go.googlesource.com/proposal/+/master/design/draft-f...
        
           | greyface- wrote:
           | And the code: https://github.com/golang/go/blob/dev.fuzz/src/
           | internal/fuzz...
           | 
           | Edit: and here's the mechanism that guides mutations towards
           | increased coverage: https://github.com/golang/go/blob/5542c10
           | fbf19cb199d1659c189...
        
       | dorian-graph wrote:
       | IIRC there was this [1] issue that some people pushed for a
       | couple of years. Then at some point, this other one [2] became
       | the new one for it (which has Katie Hockman as the issue creator).
       | 
       | It's been a multi-year effort, so congrats to those who've made
       | it happen.
       | 
       | [1] https://github.com/golang/go/issues/19109
       | 
       | [2] https://github.com/golang/go/issues/44551
        
         | typical182 wrote:
         | There is a good LWN article that gives a useful overview of the
         | current proposal as well as briefly hits on some of the
         | history:
         | 
         | https://lwn.net/Articles/829242/
        
           | throwaway894345 wrote:
           | That is a good description. Rather than building a corpus
           | genetically (mutating a random input and adding it to the
           | corpus if it gives new coverage), I wonder if we could use
           | static analysis to generate a corpus in a single shot? I.e.,
           | statically identify the branches and pick inputs that cover
           | each branch?
        
       ___________________________________________________________________
       (page generated 2021-06-04 23:02 UTC)