[HN Gopher] Go Fuzzing
___________________________________________________________________
Go Fuzzing
Author : 0xedb
Score : 149 points
Date : 2022-01-01 18:27 UTC (4 hours ago)
(HTM) web link (tip.golang.org)
(TXT) w3m dump (tip.golang.org)
| staticassertion wrote:
| Great to see fuzzing becoming more mainstream. Ultimately we have
| absurd program states, with even a trivial program's state space vastly
| exceeding the number of particles in the universe. We need to
| start finding order-of-magnitude-better approaches for testing.
|
| I almost always write generated tests at this point with unit
| tests being a fallback for slow code or niche cases. What I
| _don't_ generally write though is fuzz tests, which would really
| be a 'next step'. In Rust it's not very hard to do so, but it
| hasn't quite hit the "trivial" mark yet for me, whereas
| quickcheck is virtually the same amount of work to use as to not
| use.
|
| Languages like Go adopting and mainstreaming these practices will
| be a benefit to everyone.
|
| I'm curious if there's documentation on:
|
| a) The coverage approach taken
|
| b) The mutation approach taken
|
| Can you configure these? Plug in different fuzzing backends?
| fpopa wrote:
| What kind of generated tests are you writing?
|
| Is it more similar to 'golden files'? Generate expected output
| and assert versus current implementation output?
| adamgordonbell wrote:
| Go stdlib has property testing built in. It's not as powerful
| as some quick check frameworks, but it's built right in. I
| wrote an article on it.
|
| https://earthly.dev/blog/property-based-testing/
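|
| A minimal sketch of the stdlib property testing mentioned above,
| using `testing/quick`. The function under test (`sortedCopy`) and
| the property are hypothetical and chosen only for illustration;
| `quick.Check` calls the property with randomly generated slices.
|

```go
package main

import (
	"fmt"
	"sort"
	"testing/quick"
)

// sortedCopy returns a sorted copy of xs. It stands in for the code under
// test and is a hypothetical example, not something from the thread.
func sortedCopy(xs []int) []int {
	out := append([]int(nil), xs...)
	sort.Ints(out)
	return out
}

// sortProperty is the property being checked: the output is ordered and has
// the same length as the input.
func sortProperty(xs []int) bool {
	out := sortedCopy(xs)
	return len(out) == len(xs) && sort.IntsAreSorted(out)
}

// checkSortProperty runs the property against randomly generated slices.
// A nil config means quick.Check uses its default number of iterations.
func checkSortProperty() error {
	return quick.Check(sortProperty, nil)
}

func main() {
	if err := checkSortProperty(); err != nil {
		fmt.Println("property failed:", err)
		return
	}
	fmt.Println("property held for all generated inputs")
}
```

|
| In a real project the `quick.Check` call would normally sit inside
| a `TestXxx` function and report failures with `t.Error`.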
| staticassertion wrote:
| Just taking what would normally be a unit test and having the
| input values be generated. Some examples:
|
| 1. I have a test for encryption/decryption functions. The
| data that's provided for the plaintext, additional data, and
| key, is generated. The assertions are:
|
| assert_ne!(plaintext, encrypted_data);
|
| assert_eq!(plaintext, decrypted_data);
|
| assert_eq!(aad, decrypted_aad);
|
| etc
|
| 2. I have some generated integration tests. For example, in
| our product, there are certain properties that should always
| hold for a given database entry. I generate a new entry on
| every test and have the fields for that entry provided by
| quickcheck, then I perform the operation, query the database,
| and assert that properties on those values hold.
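|
| The first example above (the encryption round trip) could look
| roughly like this in Go. This is a sketch under assumptions, not
| the commenter's code: it uses AES-256-GCM from the standard
| library, lets `testing/quick` generate the plaintext and
| additional data, and draws the key and nonce from `crypto/rand`.
| Go's AEAD API does not return the additional data, so the sketch
| checks instead that a modified AAD is rejected.
|

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"testing/quick"
)

// sealOpenRoundTrip encrypts plaintext under a fresh random key and nonce,
// then checks the round-trip properties. It returns nil when they all hold.
func sealOpenRoundTrip(plaintext, aad []byte) error {
	key := make([]byte, 32) // 32 bytes selects AES-256
	if _, err := rand.Read(key); err != nil {
		return err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return err
	}

	// The ciphertext carries an authentication tag, so it can never be
	// byte-for-byte equal to the plaintext; the check mirrors assert_ne!.
	ciphertext := gcm.Seal(nil, nonce, plaintext, aad)
	if bytes.Equal(ciphertext, plaintext) {
		return errors.New("ciphertext equals plaintext")
	}

	// Decrypting with the same nonce and AAD must return the plaintext.
	decrypted, err := gcm.Open(nil, nonce, ciphertext, aad)
	if err != nil {
		return fmt.Errorf("open failed: %w", err)
	}
	if !bytes.Equal(decrypted, plaintext) {
		return errors.New("decrypted data differs from plaintext")
	}

	// Tampering with the additional data must make authentication fail.
	wrongAAD := append(append([]byte(nil), aad...), 0x01)
	if _, err := gcm.Open(nil, nonce, ciphertext, wrongAAD); err == nil {
		return errors.New("open succeeded with modified additional data")
	}
	return nil
}

// checkRoundTrip feeds generated plaintext and AAD into the property.
func checkRoundTrip() error {
	return quick.Check(func(plaintext, aad []byte) bool {
		return sealOpenRoundTrip(plaintext, aad) == nil
	}, nil)
}

func main() {
	if err := checkRoundTrip(); err != nil {
		fmt.Println("property failed:", err)
		return
	}
	fmt.Println("round trip held for all generated inputs")
}
```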
|
| So to answer your question, yes. Sometimes you want to check
| a concrete output (i.e. "this base64-encoded string should
| always equal this other value") for sanity, but in general
| property tests give me more confidence.
|
| I find it works particularly well with a 'given, when, then'
| approach, personally.
|
| edit: I'll also note that for the base64 case I'd suggest:
|
| a) A hardcoded suite of values.
|
| b) Generate property tests.
|
| assert_eq!(value, base64decode(base64encode(value)));
|
| As well as things like "contains only these characters" and
| "ends with [=a-zA-Z]" etc.
|
| c) Oracle tests against a "known good" implementation.
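|
| Points (a) and (b) above can be sketched in Go as follows. This
| is an illustration, not the commenter's code: the hardcoded
| values are the RFC 4648 test vectors, the generated inputs come
| from `testing/quick`, and the helper names are made up. The
| oracle comparison in (c) is left out because it needs a second
| implementation to compare against.
|

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"strings"
	"testing/quick"
)

// base64Alphabet lists every character the standard padded encoding may emit.
const base64Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="

// knownVectors is the hardcoded suite (test vectors from RFC 4648).
var knownVectors = map[string]string{
	"":       "",
	"f":      "Zg==",
	"fo":     "Zm8=",
	"foo":    "Zm9v",
	"foobar": "Zm9vYmFy",
}

// roundTrips is the property value == decode(encode(value)).
func roundTrips(data []byte) bool {
	encoded := base64.StdEncoding.EncodeToString(data)
	decoded, err := base64.StdEncoding.DecodeString(encoded)
	return err == nil && bytes.Equal(decoded, data)
}

// wellFormed checks that the output uses only the base64 alphabet and that
// its length is a multiple of four, as the padded encoding requires.
func wellFormed(data []byte) bool {
	encoded := base64.StdEncoding.EncodeToString(data)
	for _, c := range encoded {
		if !strings.ContainsRune(base64Alphabet, c) {
			return false
		}
	}
	return len(encoded)%4 == 0
}

// checkBase64 runs the hardcoded suite, then both properties on generated input.
func checkBase64() error {
	for in, want := range knownVectors {
		if got := base64.StdEncoding.EncodeToString([]byte(in)); got != want {
			return fmt.Errorf("encode(%q) = %q, want %q", in, got, want)
		}
	}
	if err := quick.Check(roundTrips, nil); err != nil {
		return err
	}
	return quick.Check(wellFormed, nil)
}

func main() {
	if err := checkBase64(); err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println("all base64 checks passed")
}
```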
| krobelus wrote:
| Sounds like a sensible mix. There is really no single
| silver bullet.
|
| We at https://symflower.com/ are working on a product to
| generate unit tests. Unlike quickcheck/proptest we promise
| to find errors, even if they are unlikely (for example
| [this input](https://github.com/AltSysrq/proptest/blob/mast
| er/proptest/RE...) would be trivial for Symflower). Also,
| unlike fuzzing our technology is deterministic.
|
| Here's one of our blog posts that explains the approach:
| https://symflower.com/en/company/blog/2021/symflower-
| finds-m...
| 2OEH8eoCRo0 wrote:
| > Great to see fuzzing becoming more mainstream.
|
| Agreed. I wrote a fuzzer at my last job and it found a bunch of
| bugs right before a release. Nobody knew what fuzzing was so I
| was attacked by the program owner for trying to break the
| software and given an insulting performance review for it. Then
| I had all the fuzzing results and coredumps deleted out of
| their directories by the program owner so the release looked
| immaculate. Defense software ftw
| staticassertion wrote:
| Yikes, sounds pretty toxic on their part, but good on you for
| taking a strong approach to software stability.
|
| Also, writing fuzzers is super fun.
| masklinn wrote:
| > I almost always write generated tests at this point with unit
| tests being a fallback for slow code or niche cases. What I
| don't generally write though is fuzz tests, which would really
| be a 'next step'. In Rust it's not very hard to do so, but it
| hasn't quite hit the "trivial" mark yet for me, whereas
| quickcheck is virtually the same amount of work to use as to
| not use.
|
| Did you mean genera _tive_ tests? You're talking about
| quickcheck and that's what it does.
|
| "Generated tests" would usually be interpreted as codegen'd
| test which you commit.
| staticassertion wrote:
| Tests with generated input. Call it what you like.
| cinntaile wrote:
| What would you say are the main differences between a fuzzer and
| QuickCheck? The authors of quickcheck don't call it a fuzzer so I
| assume there is some difference but both seem to randomize
| inputs?
| ryanschneider wrote:
| Anyone seen good articles on converting go-fuzz tests to native
| fuzzing? Specifics on the new corpus format and a converter from
| go-fuzz would be really useful.
|
| It's great to hear that the fuzzer is built on go-fuzz so
| hopefully the conversion process won't be too bad:
| https://github.com/dvyukov/go-fuzz/issues/329
| aleksi wrote:
| https://pkg.go.dev/golang.org/x/tools@v0.1.8/cmd/file2fuzz
| morelisp wrote:
| I've pre-emptively migrated a couple projects and found that
| loading the old corpus files wherever you already had them and
| then `Add`ing them as whatever new appropriate type was the
| easiest way. The inclusion of types necessitates at least a
| minor migration. I did not find any official documentation on
| the format, though it's trivial to read, e.g.:
|     go test fuzz v1
|     string("\xff0")
|
| Overall while the API (and of course tooling) is a huge step
| forward, corpus management feels like a small step backwards
| compared to go-fuzz - I didn't find a way to pull non-crashers
| into an in-repo corpus other than manually copying them out of
| my cache directory. And one-file-per-case still blows up a lot
| of repo management tools.
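|
| A rough sketch of the migration described above, assuming the
| old go-fuzz corpus is a directory of raw input files and the new
| fuzz target takes a single `[]byte` argument. The format is
| inferred from the sample shown in the comment (a version line,
| then one typed value per line), and the function names here are
| hypothetical. The `file2fuzz` tool linked earlier in the thread
| is the maintained way to do this conversion.
|

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// corpusHeader is the version line that starts a native corpus file.
const corpusHeader = "go test fuzz v1"

// encodeBytesEntry renders one raw input as a native corpus file body for a
// fuzz target that takes a single []byte argument.
func encodeBytesEntry(data []byte) string {
	return fmt.Sprintf("%s\n[]byte(%q)\n", corpusHeader, data)
}

// convertCorpus reads every regular file in srcDir (assumed to be a go-fuzz
// corpus directory) and writes a converted file with the same name to dstDir,
// for example testdata/fuzz/FuzzParse. It returns the number of files written.
func convertCorpus(srcDir, dstDir string) (int, error) {
	entries, err := os.ReadDir(srcDir)
	if err != nil {
		return 0, err
	}
	if err := os.MkdirAll(dstDir, 0o755); err != nil {
		return 0, err
	}
	n := 0
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		data, err := os.ReadFile(filepath.Join(srcDir, e.Name()))
		if err != nil {
			return n, err
		}
		out := filepath.Join(dstDir, e.Name())
		if err := os.WriteFile(out, []byte(encodeBytesEntry(data)), 0o644); err != nil {
			return n, err
		}
		n++
	}
	return n, nil
}

func main() {
	// With two arguments, convert a directory; otherwise print a sample entry.
	if len(os.Args) == 3 {
		n, err := convertCorpus(os.Args[1], os.Args[2])
		if err != nil {
			fmt.Fprintln(os.Stderr, "convert:", err)
			os.Exit(1)
		}
		fmt.Println("converted", n, "files")
		return
	}
	fmt.Print(encodeBytesEntry([]byte("\xff0")))
}
```

|
| The alternative the comment describes, reading the old files
| inside the fuzz function and passing each one to `f.Add`, avoids
| rewriting any files at all.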
| dang wrote:
| Past related thread:
|
| _Go: Fuzzing Is Beta Ready_ -
| https://news.ycombinator.com/item?id=27391048 - June 2021 (53
| comments)
| xiaq wrote:
| Fuzzing is awesome. I just discovered an accidental O(2^n) code
| path in my project with fuzzing and fixed it:
| https://github.com/elves/elvish/commit/9cda3f643efafce2df567...
|
| Edit: shortly after I wrote this comment, fuzzing discovered
| another pathological input - and that was fixed in
| https://github.com/elves/elvish/commit/04173ee8ab3c7fc4a9e79...
|
| (In case people are curious, the project is a Unix shell, Elvish:
| https://elv.sh)
| damagednoob wrote:
| I will never understand why this has been included in the
| standard library instead of as a standalone library available for
| download. Now it's locked to the Go release cycle and has the
| potential to languish because of backward compatibility concerns.
|
| The decision to include it is perplexing when other language
| ecosystems have chosen to keep this kind of functionality out of
| the standard lib, e.g. requests in python[1]. To quote Kenneth
| Reitz: "...the standard library is where a library goes to die."
|
| [1] https://github.com/psf/requests/issues/2424
| cube2222 wrote:
| Seems fairly standard for Go.
|
| You mentioned requests - the Go net/http library is widely
| used, even though it's in the standard library. It didn't
| languish, it didn't die. The interfaces are also used in most
| 3rd party libraries and work well.
|
| Moreover, Go's quality standard library is often cited as one
| of its main strengths.
|
| Thus, the inclusion of fuzzing in the stdlib isn't surprising
| to me. Not saying the other way around would be bad. It's just
| not surprising, and I don't think it's a bad choice, looking at
| Go historically.
| dainiusse wrote:
| I think it is great in general. OTOH, nobody is prevented from
| using a third-party library if they want to. Third-party
| libraries also die, e.g. https://github.com/go-check/check
| Scaevolus wrote:
| Python had an 18 month release cycle for most of its life (now
| 12 month), while Go has a 6 month release cycle.
|
| Many Python devs use the OS packaged Python versions, while Go
| devs tend to use the latest release.
|
| Integrating this in the stdlib means that more people will use
| basic fuzzing functionality. There's nothing preventing third
| party fuzzers from continuing to develop.
| smasher164 wrote:
| The Go fuzzing tool takes advantage of compiler
| instrumentation. It can also work with built-in Go types, in
| comparison to traditional fuzzing tools that just work with
| bytes. Additionally, integrating it into the testing tool
| allows it to be as easy to write as a unit test. This can help
| provide a batteries-included fuzzing experience.
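|
| To make "as easy to write as a unit test" concrete, here is a
| minimal sketch of a native fuzz target that takes a `string`
| rather than raw bytes. The `reverse` function and its property
| are illustrative only. In a real project `FuzzReverse` would
| live in a `_test.go` file and be run with
| `go test -fuzz=FuzzReverse`; it is shown next to `main` here only
| so the example is a single self-contained file.
|

```go
package main

import (
	"fmt"
	"testing"
	"unicode/utf8"
)

// reverse returns s with its runes in reverse order (the code under test).
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// reverseProperty checks that reversing twice gives back the input and that
// the reversed string is still valid UTF-8. Inputs that are not valid UTF-8
// are skipped, because converting them to runes is lossy.
func reverseProperty(s string) error {
	if !utf8.ValidString(s) {
		return nil
	}
	if got := reverse(reverse(s)); got != s {
		return fmt.Errorf("reverse twice: got %q, want %q", got, s)
	}
	if rev := reverse(s); !utf8.ValidString(rev) {
		return fmt.Errorf("reverse produced invalid UTF-8: %q", rev)
	}
	return nil
}

// FuzzReverse is the fuzz target. f.Add registers seed inputs, and the
// function passed to f.Fuzz receives typed, mutated inputs.
func FuzzReverse(f *testing.F) {
	f.Add("hello")
	f.Add("héllo, 世界")
	f.Fuzz(func(t *testing.T, s string) {
		if err := reverseProperty(s); err != nil {
			t.Error(err)
		}
	})
}

func main() {
	// Exercise the property on the seed inputs without the fuzzing engine.
	for _, s := range []string{"hello", "héllo, 世界", ""} {
		fmt.Printf("%q -> %q, property error: %v\n", s, reverse(s), reverseProperty(s))
	}
}
```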
| staticassertion wrote:
| This is just a workaround for:
|
| a) A lack of strong typing
|
| b) A custom compiler toolchain
|
| llvm already has instrumentation/ coverage support and
| generics make it easy to work with higher level constructs
| than bytes, although you generally do want to just work with
| bytes when fuzzing imo.
|
| The language is weak, therefore the language has to add more
| and more batteries-included because extending it is
| purposefully difficult.
| masklinn wrote:
| > This is just a work around for:
|
| > [...]
|
| > llvm already has instrumentation/ coverage support
|
| I mean, that supports the idea of having fuzzing support in
| whatever core you have.
| staticassertion wrote:
| My point is that between Go's custom compiler backend and
| inexpressive typing there's much more need to build
| things like this in directly vs what other languages can
| do by just using llvm/gcc. Like if Go developers want
| sanitizers equivalent to what llvm packages they'll have
| to build that themselves, although that won't have the
| same issue with inexpressive types.
| aleksi wrote:
| > Like if Go developers want sanitizers equivalent to
| what llvm packages they'll have to build that themselves
|
| Go has used LLVM's ThreadSanitizer since 1.1.
| staticassertion wrote:
| I don't think that really addresses my point, it just
| shows that they did the work for one sanitizer already.
| smasher164 wrote:
| > llvm already has instrumentation
|
| The Go toolchain actually supports emitting instrumentation
| for LLVM's libFuzzer with -gcflags=all=-d=libfuzzer.
|
| > therefore the language has to add more
|
| Okay, and so? If this ends up making fuzzing more popular
| and easy-to-use, I frankly don't care if it was added as a
| library or deeply integrated into the toolchain.
| staticassertion wrote:
| > The Go toolchain actually supports emitting
| instrumentation for LLVM's libFuzzer with
| -gcflags=all=-d=libfuzzer.
|
| Sweet, that's a smart approach.
|
| > Okay, and so? If this ends up making fuzzing more
| popular and easy-to-use, I frankly don't care if it was
| added as a library or deeply integrated into the
| toolchain.
|
| I don't care either because I don't write Go, so just the
| fact that it's supported is nice for me since it
| encourages this in languages I do care about.
|
| But if I were a go developer I might care a lot about how
| my language evolves, what's built in, what's a library,
| what the capabilities are, what tools I can integrate
| with, etc.
|
| It sounds like they've done a pretty good job with
| regards to this implementation though, happy to see it.
| morelisp wrote:
| > The Go fuzzing tool takes advantage of compiler
| instrumentation.
|
| This is the main benefit. I've been using go-fuzz for years
| and compiler upgrades (especially any changes related to
| modules/GOROOT/GOPATH) were a pain because it always behaved
| slightly differently.
|
| > It can also work with built-in Go types, in comparison to
| traditional fuzzing tools that just work with bytes.
|
| This could have been done just as efficiently without
| upstream integration.
| throwaway894345 wrote:
| Maybe, but I wouldn't support that argument by holding
| up Python and its HTTP situation as exemplary. The standard
| HTTP libraries are a nightmare, requests proves that Python
| packages can and will languish even outside of the standard
| library, and writing even a simple HTTP script in Python means
| you now need to choose between the standard HTTP libraries or
| tackle dependency management and multi-file deployment issues.
|
| By contrast Go ships with a high quality standard HTTP library
| that has lasted a decade and no "requests" equivalent has risen
| up to challenge it.
|
| Note also that Go's testing situation in general is much nicer
| than many other languages precisely because things are baked
| into the standard library--no need to quibble over which test
| framework to use or to memorize each framework's equivalent for
| "run tests that match this pattern" and so on.
| staticassertion wrote:
| The fact that requests has "languished" (not sure how tbh)
| doesn't really change the fact that the Python stdlib is a
| bit of a hilarious disaster from afar. Tons of cases of "that
| shouldn't be in std" where libraries have quirky, locked in
| behaviors, or an entire major breaking release with decades
| of work to migrate has to be made to clean up the mistakes.
|
| Python should be a case study in the many ways not to build a
| language.
|
| > By contrast Go ships with a high quality standard HTTP
| library that has lasted a decade and no "requests" equivalent
| has risen up to challenge it.
|
| Yeah this also means that you need to update your compiler
| when there's a vulnerability instead of just a single point
| release in a library. This happens with some frequency.
|
| The parent poster is right, in my opinion.
| cube2222 wrote:
| > Yeah this also means that you need to update your
| compiler when there's a vulnerability instead of just a
| single point release in a library.
|
| Has updating the Go compiler actually been an issue for you
| in the past? To me, with Go's stability, it's never been
| more disruptive than updating a library in practice, so I
| don't see much of a difference.
| staticassertion wrote:
| I don't know that I'd hate it if I were a go dev, it
| would just be a bit annoying for a number of reasons.
|
| For one thing I update libraries all the time so it's a
| very fast, simple, well worn operation. Updating the
| compiler is a bit more of a chore and I'm going to worry
| a bit more about the impact (since it's global to all
| code vs local to one package).
|
| For another, I would want to make sure I had tooling that
| could tell me "is this library in use by service X". I
| don't know Go's story there, but I would hope it's
| trivial to do so for a library but I suspect if it's part
| of the standard library that may be trickier. If not,
| nbd.
|
| It's a bad smell to me, but if I were a Go developer it
| wouldn't break me.
|
| Perhaps ironically, until this native fuzzing package,
| upgrading the compiler if you had fuzz tests would be one
| case where things would likely break.
| throwaway894345 wrote:
| > Updating the compiler is a bit more of a chore and I'm
| going to worry a bit more about the impact (since it's
| global to all code vs local to one package).
|
| This is indeed a chore in other languages. In Go, the
| compiler is trivially installed. Typically this just
| means bumping the version in your Dockerfile and "gvm use
| $newVersion --default".
|
| > For another, I would want to make sure I had tooling
| that could tell me "is this library in use by service X".
| I don't know Go's story there, but I would hope it's
| trivial to do so for a library but I suspect if it's part
| of the standard library that may be trickier. If not,
| nbd.
|
| This is supported out of the box by Go's tooling. `go mod
| graph` is what you're looking for.
| staticassertion wrote:
| > This is indeed a chore in other languages. In Go, the
| compiler is trivially installed. Typically this just
| means bumping the version in your Dockerfile and "gvm use
| $newVersion --default".
|
| The issue isn't with installing the new compiler, that's
| trivial in our use case as well (for Rust at least,
| Python's a disaster, but I accept that). The issue is
| ensuring compatibility, ensuring no new bugs are
| introduced, etc. It's just a much heavier change to your
| produced binary vs changing a package.
|
| > This is supported out of the box by Go's tooling. `go
| mod graph` is what you're looking for.
|
| Cool, thanks.
| rat9988 wrote:
| It is for me, I can't just go build in the new version.
| So I'm keeping the software with the old compiler.
| TheDong wrote:
| > Has updating the Go compiler actually been an issue for
| you in the past? To me, with Go's stability, it's never
| been more disruptive than updating a library in practice
|
| I've run into issues with several go version updates.
|
| Off the top of my head, all of the following caused
| breakages:
|
| 1. go 1.4 making directories named 'internal' special and
| un-importable. Cross-package imports that used to work no
| longer would compile with a compiler error.
|
| 2. go 1.9 adding monotonic clock readings in a breaking
| way, i.e. this program changed output from 1.8 to 1.9:
| https://go.dev/play/p/Mi6cGCPd0rS (I know it looks
| contrived, but I'm not digging up the actual code that
| broke)
|
| 3. The change of the http.Server default to serving http2
| instead of http/1.1 broke stuff. Of course it did. How
| can that possibly _not_ break stuff?
|
| 4. The changes in 'GO111MODULE' defaults broke many
| imports which had either malformed or incorrect go.mod
| files. This one was quite painful for the whole
| ecosystem.
|
| 5. go 1.17 switched to silently truncating a lot of query
| strings. Of course that broke stuff, how could it not?
| https://go.dev/play/p/azODBvkb-zK
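|
| Items 2 and 5 can be illustrated with a short program. This is
| not the code behind the playground links, which is not
| reproduced here; it is a sketch of behaviour I understand to be
| documented: since Go 1.9 `time.Now()` carries a monotonic clock
| reading that `==` compares and `Round(0)` strips, and since Go
| 1.17 `net/url` no longer treats `;` as a query separator. The
| URL and helper names are made up for the example.
|

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// monotonicComparison compares a time.Now() value with a copy whose monotonic
// clock reading has been stripped, first with == and then with Equal.
func monotonicComparison() (sameByOperator, sameByEqual bool) {
	t := time.Now() // carries a monotonic reading since Go 1.9
	u := t.Round(0) // Round(0) strips the monotonic reading
	return t == u, t.Equal(u)
}

// queryValue parses rawURL and returns the first value for key. URL.Query
// discards parse errors, so settings it cannot parse are dropped silently.
func queryValue(rawURL, key string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return u.Query().Get(key), nil
}

func main() {
	op, eq := monotonicComparison()
	fmt.Println("t == t.Round(0):", op)
	fmt.Println("t.Equal(t.Round(0)):", eq)

	// example.com is a placeholder host; only the query string matters here.
	const raw = "http://example.com/?a=1;b=2&c=3"
	a, _ := queryValue(raw, "a")
	c, _ := queryValue(raw, "c")
	fmt.Printf("a=%q c=%q\n", a, c)
}
```

|
| On Go 1.17 or later the `a=1;b=2` pair should be dropped while
| `c=3` survives, and the two time values should compare unequal
| with `==` yet equal with `Equal`.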
|
| Those are all intentional breaking changes which were not
| fixed upstream (i.e. are "working as intended"). The
| unintentional breaking changes, from changing error
| messages to cause string-based error detection to fail
| (because so many stdlib errors aren't exported so you
| have to do string matching), to just plain dumb bugs in
| the stdlib.... those are vastly more common. Those
| usually do get fixed in point releases. Take a gander at
| those release notes, many of the issues highlighted in
| those changelogs come from pain people hit during
| upgrades.
|
| I think the majority of go version upgrades have had some
| amount of pain, and most of them have been far more
| disruptive than updating a well-built library.
|
| I would much rather update just my fuzz-testing library
| in a commit, and be confident that it's only used in
| tests so CI is good enough to validate it, than have to
| update that and my http package and my tls package and my
| os package all at once and have to look for bugs
| _everywhere_.
| cube2222 wrote:
| I admit I wasn't bit by these changes and had a much
| better experience overall. Thank you for the long write-
| up.
|
| However, I think you only mentioned changes in major
| releases, whereas in this scenario (vulnerability fix) a
| minor release would suffice (the parent mentioned
| updating to a point release of a library). Did you also
| have issues with minor releases?
| staticassertion wrote:
| > a minor release would suffice
|
| Does the Go compiler have LTS releases? Like if I'm on
| 1.0, but 1.5 is out, are they going to release a 1.0.1
| for a vuln that impacts 1.0+ ?
|
| It seems unlikely but I'd be curious to know.
|
| Libraries release patches more frequently, and it's also
| generally easier to apply a patch yourself if you need
| to.
|
| Otherwise a point release may still imply a major
| release.
| cube2222 wrote:
| The most recent 2 major releases get the fix in case of
| security issues[0]. This means you can be up to 6 months
| behind the newest release to never be forced to do a
| major version update under time pressure.
|
| [0]:https://github.com/golang/go/wiki/MinorReleases
| throwaway894345 wrote:
| > The fact that requests has "languished" (not sure how
| tbh) doesn't really change the fact that the Python stdlib
| is a bit of a hilarious disaster from afar.
|
| Agreed that the Python stdlib is a disaster, but my point
| was that the OP contradicts himself by arguing that
| stability guarantees hold the standard library back while
| pointing to requests which itself hasn't made many/any
| intrepid breaking changes or even sensible non-breaking
| changes a la async support. Note that "stability is bad" is
| the OP's point of view and not mine.
|
| > Tons of cases of "that shouldn't be in std" where
| libraries have quirky, locked in behaviors, or an entire
| major breaking release with decades of work to migrate has
| to be made to clean up the mistakes.
|
| But the parent pointed to the requests library which is not
| in the stdlib. Note also that Go has been around for a
| decade and has needed no such major migration initiative.
|
| > Yeah this also means that you need to update your
| compiler when there's a vulnerability instead of just a
| single point release in a library. This happens with some
| frequency.
|
| The frequency is very low and updating the compiler is
| minimally risky due to Go's strong compatibility guarantees
| (precisely the kind of stability the parent opposes). This
| is a much lesser problem than dependency management in
| Python (I have 15 years of experience in Python and 10 in
| Go).
| makapuf wrote:
| Note that Python was already almost two decades old when v3
| came out, and is now three decades old.
| masklinn wrote:
| FWIW older design documents have (some) reasoning for
| integrating fuzzing natively:
|
| -
| https://docs.google.com/document/d/1N-12_6YBPpF9o4_Zys_E_ZQn...
|
| -
| https://go.googlesource.com/proposal/+/master/design/draft-f...
|
| One of the original proposals
| (https://docs.google.com/document/u/1/d/1zXR-TFL3BfnceEAWytV8...),
| apparently written by the go-fuzz maintainers, explains the
| reasoning further (per issue 329[0], none of them seems in any
| way broken-hearted about eventually deprecating go-fuzz):
|
| > go-fuzz suffers from several problems:
|
| > - It breaks multiple times per Go release because it's tied
| to the way go build works, std lib package structure and
| dependencies, etc. It broke due to internal packages (multiple
| times), vendoring (multiple times), changed dependencies in std
| lib, etc.
|
| > - It tries to do compiler work regarding coverage
| instrumentation without compiler help. This leads to build
| breakages on corner case code; poor performance; suboptimal
| quality of coverage instrumentation (missed edges).
|
| > - Considerable difficulty in integrating it into other build
| systems and non-standard contexts as it uses source pre-
| processing.
|
| > Goal of this proposal is to make fuzzing as easy to use as
| unit testing.
|
| [0] https://github.com/dvyukov/go-fuzz/issues/329
___________________________________________________________________
(page generated 2022-01-01 23:00 UTC)