[HN Gopher] Go: Fuzzing Is Beta Ready
___________________________________________________________________
Go: Fuzzing Is Beta Ready
Author : ingve
Score : 290 points
Date : 2021-06-04 06:02 UTC (16 hours ago)
(HTM) web link (blog.golang.org)
(TXT) w3m dump (blog.golang.org)
| thethimble wrote:
| How does fuzzing compare to something like Quickcheck? Are they
| basically equivalent?
| pdpi wrote:
| QuickCheck gives you a mechanism for a form of unit testing
| where you build valid values as test cases and test that your
| code maintains specific expected properties.
|
| Fuzzing is similar but typically involves starting from a
| known-good input then randomising it at the byte level
| (irrespective of validity). This project allows for property
| testing-like unit tests, but tools like american fuzzy lop
| focus on detecting whole application crashes.
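|
| A minimal sketch of the QuickCheck-style side of this comparison,
| assuming Go's standard testing/quick package as the generator. The
| names reverseBytes and reverseTwiceIsIdentity are hypothetical and
| chosen only for illustration; the property is "reversing twice
| returns the original input".

```go
package main

import (
	"bytes"
	"fmt"
	"testing/quick"
)

// reverseBytes returns a reversed copy of b (hypothetical code under test).
func reverseBytes(b []byte) []byte {
	out := make([]byte, len(b))
	for i, c := range b {
		out[len(b)-1-i] = c
	}
	return out
}

// reverseTwiceIsIdentity is the property: reversing twice gives back the input.
func reverseTwiceIsIdentity(b []byte) bool {
	return bytes.Equal(reverseBytes(reverseBytes(b)), b)
}

func main() {
	// quick.Check generates random []byte values and returns an error
	// describing the first input for which the property is false.
	if err := quick.Check(reverseTwiceIsIdentity, nil); err != nil {
		fmt.Println("property failed:", err)
		return
	}
	fmt.Println("property held for all generated inputs")
}
```

| Here the inputs are generated from the parameter type with no
| feedback from the code under test, which is the main contrast with
| a coverage-guided fuzzer.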
| kubb wrote:
| What are the benefits of building out a whole separate part of
| the test framework that handles this? Is there a way to fuzz non-
| string inputs?
| tjpnz wrote:
| >What are the benefits of building out a whole separate part of
| the test framework that handles this?
|
| I'm guessing that you won't always want to execute these
| alongside other tests. Go has also taken the same approach with
| benchmarks.
| masklinn wrote:
| > What are the benefits of building out a whole separate part
| of the test framework that handles this?
|
| Instead of making it part of the base runner? Maybe so the beta
| bits can work and it'll be folded in more directly afterwards?
|
| Also if it includes coverage guiding being able to know that a
| run is fuzzing or non-fuzzing would avoid having to include the
| fuzzing instrumentation for non-fuzzing runs, mayhaps?
|
| > Is there a way to fuzz non-string inputs?
|
| From the design doc it looks like you can have any number of
| parameters and
|
| > Fuzzing of built-in types (e.g. simple types, maps, arrays)
| and types which implement the BinaryMarshaler and TextMarshaler
| interfaces are supported.
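|
| A rough sketch of a fuzz target with more than one non-string
| parameter, assuming the testing.F API (f.Add, f.Fuzz) as it later
| shipped in Go 1.18; the beta on the dev.fuzz branch may differ in
| detail. The names repeatLenOK and FuzzRepeat are hypothetical. In a
| real project the fuzz function would live in a _test.go file and be
| run with go test -fuzz=FuzzRepeat; main is here only so the file
| runs on its own.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// repeatLenOK is the hypothetical property: repeating s n times yields a
// string of length n*len(s). Out-of-range inputs are skipped to keep runs cheap.
func repeatLenOK(n int, s string) bool {
	if n < 0 || n > 100 || len(s) > 1000 {
		return true
	}
	return len(strings.Repeat(s, n)) == n*len(s)
}

// FuzzRepeat takes an int and a string; the seed passed to f.Add must
// match the parameter types of the function passed to f.Fuzz.
func FuzzRepeat(f *testing.F) {
	f.Add(3, "ab")
	f.Fuzz(func(t *testing.T, n int, s string) {
		if !repeatLenOK(n, s) {
			t.Errorf("Repeat(%q, %d) has unexpected length", s, n)
		}
	})
}

func main() {
	// Check the seed input directly, without the fuzzing engine.
	fmt.Println(repeatLenOK(3, "ab"))
}
```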
| kubb wrote:
| I was thinking a library to generate fuzzed inputs that you
| could use in normal tests.
|
| It's good to hear that it's not limited to strings.
| masklinn wrote:
| > I was thinking a library to generate fuzzed inputs that
| you could use in normal tests.
|
| The fuzzer input is random data. You don't need a special
| library to generate random data. In fact that's what the
| post tells you with respect to type compatibility:
|
| > types which implement the BinaryMarshaler and
| TextMarshaler interfaces are supported
|
| these are just tools to convert from/to binary
| (unstructured or utf8).
| kubb wrote:
| Are you sure it's random bytes? Some fuzzers start with a
| given input and then mutate it to increase coverage of
| the code under test.
| masklinn wrote:
| > Some fuzzers start with a given input and then mutate
| it to increase coverage of the code under test.
|
| Yes, and this one does, but you requested a library to
| generate inputs didn't you? A library can't get coverage
| feedback from the SUT, which as I wrote in my previous
| comment would be why you'd be building a dedicated test
| framework.
| alephnan wrote:
| > Is there a way to fuzz non-string inputs?
|
| Fuzzing usually revolves around strings because of escape
| characters, escape sequences. There is a much larger set of
| string characters than there are for the 10 or so numeric
| digits. Numbers don't have the same problems that strings do,
| because numbers are usually interpreted only as data, whereas
| strings can be interpreted as data or computation.
| pdpi wrote:
| > Fuzzing usually revolves around strings because of escape
| characters, escape sequences.
|
| Not always. AFL has been used to detect issues around
| processing plain old binary data (eg
| https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-8637)
|
| I would argue that anything involving a parser of some
| description (either binary or text-based) is a good candidate
| for fuzzing.
| jlouis wrote:
| Fuzz testing is expensive.
|
| And you gotta start somewhere. String inputs are a good start,
| and you can use those to test other inputs by factoring through
| conversion functions.
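|
| One way to read "factoring through conversion functions", as a
| sketch: the fuzzer supplies a string or byte slice, and a small
| decoder turns it into the structured value the code under test
| expects. Point, pointFromBytes and manhattan are hypothetical names
| used only for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Point is a hypothetical structured input.
type Point struct{ X, Y int32 }

// pointFromBytes converts raw fuzz data into a Point.
// It reports false when there are too few bytes to decode.
func pointFromBytes(data []byte) (Point, bool) {
	if len(data) < 8 {
		return Point{}, false
	}
	return Point{
		X: int32(binary.LittleEndian.Uint32(data[0:4])),
		Y: int32(binary.LittleEndian.Uint32(data[4:8])),
	}, true
}

// manhattan is the hypothetical code under test.
func manhattan(p Point) int64 {
	abs := func(v int32) int64 {
		if v < 0 {
			return -int64(v)
		}
		return int64(v)
	}
	return abs(p.X) + abs(p.Y)
}

func main() {
	// A string as a fuzzer might supply it: eight arbitrary bytes.
	raw := "\x01\x00\x00\x00\xfe\xff\xff\xff"
	if p, ok := pointFromBytes([]byte(raw)); ok {
		fmt.Println(p, manhattan(p))
	}
}
```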
| [deleted]
| diegs wrote:
| Interesting. I've used gopter a lot for property-based testing,
| though it's very complex (and impressive), and can get slow or
| require hacks for complex types.
|
| I'm glad this is being made, but like many other things that have
| been added to Go, it shows the limitations of the language that
| you can't just build this inside the language. (I might be wrong,
| as I haven't had a chance to look at the design
| docs/implementation yet, but the installation instructions imply
| that's the case).
| icholy wrote:
| > it shows the limitations of the language that you can't just
| build this inside the language.
|
| Not sure why you'd make that assumption.
| https://github.com/dvyukov/go-fuzz
| nemo1618 wrote:
| go-fuzz requires an instrumented binary. It's essentially a
| forked Go compiler. So I think it's fair to call this a
| limitation of the language. But I don't see that as a
| negative. :)
| stevekemp wrote:
| Yeah go-fuzz is an awesome tool, which I've used extensively
| on some of my own projects.
|
| When writing parsers and compilers it has proven eerily good
| at identifying corner cases (panics, and infinite loops).
|
| I'm looking forward to trying the new approach out. Anything
| that makes fuzz-testing easier to configure/maintain and
| spreads awareness is a good thing in my book.
| moomin wrote:
| Someone (Hillel Wayne?) observed that fuzz testing and property
| testing are basically the same thing, but the communities are
| almost entirely disjoint so the tools are completely separate.
| pag wrote:
| DeepState [1] is a tool that lets you write Google Test-style
| unit tests, as well as property tests, in either C or C++, and
| plug in fuzzers and symbolic executors. That is, DeepState
| bridges this gap between fuzz testing and property testing.
|
| [1] https://github.com/trailofbits/deepstate
| Drup wrote:
| My understanding is that the difference between fuzz testing
| and property testing is how the input is crafted. Both can be
| viewed as a pair of things: a function to generate series of
| bits as input, and a way to turn these bits into the
| appropriate data under test.
|
| Property testing generates these bits using a specified
| distribution, and that's about it. Fuzz testing generates these
| bits by looking at how the program is executed, and uses a
| black box to try to explore all paths in the program.
|
| Most libraries for property testing come with very convenient
| ways to craft the "input to data" part. Fuzz tools come with an
| almost magically effective way to craft interesting inputs. The
| two combine very well (and have been combined in several
| libraries).
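|
| A toy sketch of the coverage-guided idea described here, not how
| AFL or the Go fuzzer is actually implemented. The target records
| branch IDs by hand (real tools instrument the binary), and the loop
| keeps any mutated input that reaches a branch not seen before. All
| names (target, fuzz, findCrash) are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// target is the hypothetical code under test. It records reached branches
// in cov and reports true when the "bug" (input "FUZZ") is hit.
func target(data []byte, cov map[int]bool) bool {
	if len(data) < 4 {
		return false
	}
	if data[0] == 'F' {
		cov[1] = true
		if data[1] == 'U' {
			cov[2] = true
			if data[2] == 'Z' {
				cov[3] = true
				if data[3] == 'Z' {
					return true
				}
			}
		}
	}
	return false
}

// fuzz mutates one byte of a random corpus entry per iteration and adds the
// mutant to the corpus whenever it reaches a new branch. It returns the
// crashing input, or nil if none is found within maxIters.
func fuzz(seed []byte, maxIters int, rng *rand.Rand) []byte {
	corpus := [][]byte{seed}
	seen := map[int]bool{}
	for i := 0; i < maxIters; i++ {
		parent := corpus[rng.Intn(len(corpus))]
		child := append([]byte(nil), parent...)
		child[rng.Intn(len(child))] = byte(rng.Intn(256))

		cov := map[int]bool{}
		if target(child, cov) {
			return child
		}
		isNew := false
		for id := range cov {
			if !seen[id] {
				seen[id] = true
				isNew = true
			}
		}
		if isNew {
			corpus = append(corpus, child)
		}
	}
	return nil
}

// findCrash runs the loop from a fixed seed input with a fixed random source.
func findCrash() []byte {
	return fuzz([]byte("AAAA"), 1000000, rand.New(rand.NewSource(1)))
}

func main() {
	fmt.Printf("crashing input: %q\n", findCrash())
}
```

| Without the coverage feedback, each attempt would need to guess all
| four bytes at once; with it, the search can lock in one byte at a
| time.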
| dnautics wrote:
| > Property testing generates these bits using a specified
| distribution, and that's about it
|
| I think most property testing frameworks also come with the
| concept of "shrinkage", which is a way to walk back a failed
| condition to try to find the "minimum requirement of
| failure". Though I am sure there are PT frameworks that
| haven't implemented this.
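|
| A minimal sketch of shrinking, assuming a greedy strategy over
| integer slices; real frameworks use more elaborate, type-aware
| shrinkers. The names fails, smaller and shrink are hypothetical.
| Each step keeps a simpler candidate only if it still fails.

```go
package main

import "fmt"

// fails is the hypothetical property violation: an input "fails" whenever
// it contains a value greater than 100.
func fails(xs []int) bool {
	for _, x := range xs {
		if x > 100 {
			return true
		}
	}
	return false
}

// smaller returns candidate replacements for v that are closer to zero.
func smaller(v int) []int {
	if v == 0 {
		return nil
	}
	step := v - 1
	if v < 0 {
		step = v + 1
	}
	return []int{0, v / 2, step}
}

// shrink repeatedly drops elements and moves values toward zero, keeping a
// change only when the candidate input still fails.
func shrink(xs []int, fails func([]int) bool) []int {
	cur := append([]int(nil), xs...)
	for changed := true; changed; {
		changed = false
		// Try removing each element in turn.
		for i := 0; i < len(cur); {
			cand := append(append([]int(nil), cur[:i]...), cur[i+1:]...)
			if fails(cand) {
				cur, changed = cand, true
			} else {
				i++
			}
		}
		// Try replacing each element with a value closer to zero.
		for i := range cur {
			for _, v := range smaller(cur[i]) {
				cand := append([]int(nil), cur...)
				cand[i] = v
				if fails(cand) {
					cur, changed = cand, true
					break
				}
			}
		}
	}
	return cur
}

func main() {
	fmt.Println(shrink([]int{7, 250, 3, 999}, fails)) // prints [101]
}
```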
| jlouis wrote:
| Yep, but in the dual, they have the same goal: produce
| interesting inputs to the program which might exhibit
| trouble.
|
| This is why you can use one of the approaches to help the
| other side of the approach.
|
| The 3rd solution is concolic testing: use an SMT/SAT solver
| to flip branches. The path down to the branch imposes a
| formula. By inverting the formula, we can pick a certain
| branch path. Now you ask the SMT solver to check that
| there's no way to go down that branch. If it finds
| satisfiability, you have a counterexample and can explore
| that path.
| UncleMeat wrote:
| Being thematically similar is not that interesting, if one
| works better than the others.
|
| Coverage guided fuzzing is eating symbolic execution and
| concolic testing for breakfast. It isn't even close. As
| much as I love these more principled approaches, the
| strategy of throwing bits at programs and applying careful
| coverage metrics is just way more effective, even for cases
| that seem to be hand picked for SMT-based approaches to
| win.
| staticassertion wrote:
| They are fundamentally the same in a lot of important ways. Any
| fuzz test is almost certainly (if not entirely certainly) a
| property test.
|
| In Rust the driver for property and fuzz testing can be shared,
| which is nice[0].
|
| [0] https://docs.rs/arbitrary/1.0.1/arbitrary/trait.Arbitrary.ht...
|
| By describing my data with arbitrary I've written programs that
| had traditional property testing and fuzz-testing as well,
| without any additional effort.
|
| It's only really post-AFL where fuzzing has suddenly become
| synonymous with instrumentation guided data generation, which
| is a totally fine distinction to make, but they're still
| fundamentally equivalent. The first fuzzers were virtually
| identical to prop test frameworks today.
|
| I'd be fine with dropping the "fuzzing" name entirely and
| instead just having us use the term property testing, with
| "data generation" being the thing we start differentiating ie:
| "random prop testing" or "instrumentation guided prop testing"
| or "type based prop testing" etc.
| px43 wrote:
| I've heard that a bit recently too. I think it's more that
| people selling property testing tools are trying to sell them
| as fuzzing tools to unsuspecting suckers.
| dllthomas wrote:
| I expect the observation has been made many times, but one
| particular example of note is https://danluu.com/testing/
| rtpg wrote:
| I think Prop testing is great, but stuff like AFL (with
| instrumentation to basically find your conditionals for you,
| and working backwards to modify data for it) is very different
| from property testing.
|
| My experience (real world, for actually existing code) is that
| property tests often require a lot of fiddling with the data
| generation, in order to actually stress your system in
| interesting ways. If you just throw "totally random" data into
| your system, you won't be testing very interesting properties.
| Amazing assurances and payoff from doing it, of course! Just
| like... "just generate random arbitraries" works a lot less
| when you're working with 100 field structs and only 10 or so
| "matter" in senses you care about.
|
| To my knowledge I have not seen a property testing toolkit that
| leverages code coverage in the way fuzzing does.
| gsg wrote:
| There are some examples, for example Crowbar is an OCaml tool
| that uses AFL to drive property based tests.
| [deleted]
| typical182 wrote:
| People can have different definitions and still communicate
| usefully, and I think there is not 100% agreement on the exact
| boundaries between the two.
|
| That said, for me: they are distinct but related, and that
| distinction is useful.
|
| For example, Hypothesis[1] is a popular property testing
| framework. The authors have more recently created HypoFuzz[2],
| which includes this sentence in the introduction:
|
| _"HypoFuzz runs your property-based test suite, using cutting-
| edge fuzzing techniques and coverage instrumentation to find
| even the rarest inputs which trigger an error."_
|
| Being able to talk about fuzzing and property testing as
| distinct things seems useful -- saying something like "We added
| fuzzing techniques to our property testing framework" is more
| meaningful than "We added property testing techniques to our
| property testing framework" ;-)
|
| My personal hope is there will be more convergence, and work to
| add convenient first-class fuzzing support in a popular
| language like Go will hopefully help move the primary use case
| for fuzzing to be about correctness, with security moving to an
| important but secondary use case.
|
| [1] https://hypothesis.works
|
| [2] https://hypofuzz.com
| Milner08 wrote:
| Reading about fuzz testing all I could think of was 'is this
| not property testing?'... That is strange. It sounds like both
| communities could learn a lot from each other unless there is
| something I am missing (there probably is..)
| Kototama wrote:
| Fuzzing is more to ensure safety and property checking to
| ensure correctness IMHO. Both are related but not similar.
| masklinn wrote:
| In my mind fuzz testing is external, black-boxed, and often
| profile-guided, while proptesting is more structured,
| internal (language aware) but generally less guided or
| entirely unguided.
|
| This here is closer to what I see as property testing than
| fuzzing, although it looks like they plan on coverage
| feedback (so guided generation).
| ahelwer wrote:
| Property-based testing requires you to define the condition
| of success or failure (the "property" you're testing for)
| too, right? Whereas fuzzing just looks for crashes?
| masklinn wrote:
| > Property-based testing requires you to define the
| condition of success or failure (the "property" you're
| testing for) too, right?
|
| That property can be "does not crash".
|
| And I'd say this is the structured / language-awareness
| part: with fuzzing you can't generally build an oracle.
|
| And if you can it's of course trivial: just have a
| wrapper script check the result against the oracle, and
| trigger whatever the fuzzer looks for indicating
| "failure" whether it's a return code or a segfault or...
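|
| A sketch of the oracle idea when the test does have language-level
| access: compare the implementation under test against a trusted
| reference on generated inputs. insertionSort stands in for the
| hypothetical code under test and sort.Ints for the oracle; the
| other names are likewise made up for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// insertionSort is the hypothetical implementation under test.
func insertionSort(xs []int) []int {
	out := append([]int(nil), xs...)
	for i := 1; i < len(out); i++ {
		for j := i; j > 0 && out[j-1] > out[j]; j-- {
			out[j-1], out[j] = out[j], out[j-1]
		}
	}
	return out
}

// agreesWithOracle compares the implementation against sort.Ints.
func agreesWithOracle(xs []int) bool {
	want := append([]int(nil), xs...)
	sort.Ints(want)
	got := insertionSort(xs)
	if len(got) != len(want) {
		return false
	}
	for i := range want {
		if got[i] != want[i] {
			return false
		}
	}
	return true
}

// checkRandomInputs tries n random slices and returns the first input on
// which the implementation and the oracle disagree, or nil if there is none.
func checkRandomInputs(n int) []int {
	rng := rand.New(rand.NewSource(42))
	for i := 0; i < n; i++ {
		xs := make([]int, rng.Intn(20))
		for j := range xs {
			xs[j] = rng.Intn(100) - 50
		}
		if !agreesWithOracle(xs) {
			return xs
		}
	}
	return nil
}

func main() {
	if bad := checkRandomInputs(1000); bad != nil {
		fmt.Println("disagreement on input:", bad)
		return
	}
	fmt.Println("implementation matched the oracle on all inputs")
}
```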
| bottled_poe wrote:
| Surely the coverage of fuzz testing is a superset of property
| testing?
| amw-zero wrote:
| Right. That's the point - they are both conceptually the
| same thing, testing via automatically generated inputs.
| agumonkey wrote:
| The common point being covering the input space, right?
| sriku wrote:
| Is there a law of some kind that says the sum of time taken to
| compile and test code is a language independent constant? So
| either our testing tools take time to run or your compiler or a
| mix of both .. if we want robust software, that is.
| throwaway894345 wrote:
| I don't think this is true. Python tests take a loooong time to
| run while (on the other extreme) Go tests can compile and run
| nearly instantly, at least based on my 10-15 years experience
| with both languages.
| CGamesPlay wrote:
| Any details about how the mutator works? The design doc hints at
| a "coverage-based" mutator, but I can't see anything specific
| about how it works or even if that was implemented.
| jrockway wrote:
| Here's the plan for the mutator:
| https://go.googlesource.com/proposal/+/master/design/draft-f...
| greyface- wrote:
| And the code: https://github.com/golang/go/blob/dev.fuzz/src/
| internal/fuzz...
|
| Edit: and here's the mechanism that guides mutations towards
| increased coverage: https://github.com/golang/go/blob/5542c10
| fbf19cb199d1659c189...
| dorian-graph wrote:
| IIRC there was this [1] issue that some people pushed for a
| couple of years. Then at some point, this other one [2] became
| the new one for it (which has Katie Hockman as the issue creator).
|
| It's been a multi-year effort, so congrats to those who've made
| it happen.
|
| [1] https://github.com/golang/go/issues/19109
|
| [2] https://github.com/golang/go/issues/44551
| typical182 wrote:
| There is a good LWN article that gives a useful overview of the
| current proposal as well as briefly hits on some of the
| history:
|
| https://lwn.net/Articles/829242/
| throwaway894345 wrote:
| That is a good description. Rather than building a corpus
| genetically (mutating a random input and adding it to the
| corpus if it gives new coverage), I wonder if we could use
| static analysis to generate a corpus in a single shot? I.e.,
| statically identify the branches and pick inputs that cover
| each branch?
___________________________________________________________________
(page generated 2021-06-04 23:02 UTC)