[HN Gopher] You can use C-Reduce for any language
___________________________________________________________________
You can use C-Reduce for any language
Author : Tomte
Score : 212 points
Date : 2024-11-27 17:56 UTC (5 hours ago)
(HTM) web link (bernsteinbear.com)
(TXT) w3m dump (bernsteinbear.com)
| jchw wrote:
| I did not know about C-Reduce until just now, and I'm already
| hooked. This is basically like discovering git bisect for the
| first time.
|
| I'll have to stuff that into the back of my mind until I run into
| something where it might fit.
| __MatrixMan__ wrote:
| Same story, I can't wait to try this.
| geon wrote:
| I ran into some issues with what might be compiler bugs in
| cc65, a C compiler for the 6502 processor (c64, NES, Apple 1).
|
| I might see if I can get this set up. Apparently VICE supports
| "printing" to a file on the host OS, so I could run my test on
| the emulator.
| brightball wrote:
| ...so I just discovered git bisect for the first time and this
| is pretty cool.
| furyofantares wrote:
| Without an explanation of _why_ it works on languages other than
| C, that is a hard claim to believe! I mean, I don 't think
| they're lying but (given it doesn't use an LLM) I am bewildered.
| johnisgood wrote:
| Yeah, I would have liked to see more examples and an
| explanation.
| evmar wrote:
| I glanced at the code. It seems to have multiple possible
| "passes" which reduce the code in various ways, and the
| passes here not tagged with "C"=>1 are used in the not-c mode
| recommended in the post.
|
| https://github.com/csmith-
| project/creduce/blob/31e855e290970...
|
| You can see the pass implementation in files on the left.
| wat10000 wrote:
| The key thing is that the transforms it makes aren't
| required to always produce a program that can actually run,
| or even build.
|
| The user provides a script which determines if a program is
| "interesting." A program with build-time errors should be
| considered "not interesting" in most cases. (And if you're
| hunting an incorrectly reported error, you'll want to make
| sure you only consider that error to be interesting.)
|
| Then it yoinks out parts of your program and checks if the
| result is still interesting. If it's not, that attempt is
| discarded and it tries something else. If the reduced
| program is still interesting, it will try yoinking out more
| stuff. Repeat until you like the result.
|
| There doesn't need to be any understanding of the program
| in order for this to work. You just need something where
| removing some bit of the program has a chance of producing
| a new program that's still interesting. That works in most
| programming languages.
|
| This process can take a long time, and it's slower when
| there's a lower probability of a removal producing an
| interesting program. Thus heuristics are added: don't
| remove random character ranges, but work at a token level.
| When removing a brace, find the matching brace and remove
| the insides. That sort of thing. Most modern languages are
| similar enough to C that many of these rules are helpful
| for them too. And even if they don't help, this just slows
| the process down, but it still works.
| bqmjjx0kac wrote:
| Yes, and without understanding how it works, I'm left wondering
| whether it's even safe to use this way. Will creduce execute
| mutated versions of the input script, potentially deleting my
| files or eating my lunch?
| None4U wrote:
| it will execute whatever script you pass it (immutable) and
| mutate whatever other files you pass it
| defen wrote:
| I think the question is, how do you know C-reduce isn't
| going to mutate your test case into calling
| `remove("/path/to/my/important/file");`
| frabert wrote:
| C-reduce is meant to... Reduce your files, it would not
| add stuff that was not there in the first place. Also, I
| think it's meant to only be run against the "frontend" of
| most languages, not full execution
| chubot wrote:
| Yeah but you could have a shell script that does
| rm -rf /foo/bar
|
| and in theory it could reduce it to rm
| -rf /
|
| The passes are very heuristic, so that doesn't seem like
| something you can rule out.
|
| (and I actually want to run it on shell scripts! )
| Cieric wrote:
| While I've used c-reduce before, I've never done it in a
| way where it could be destructive. However speculating
| based on what I do know. I think the 2 things I would do
| would be in the interesting-ness test, grep for already
| known harmful instructions and force them to be
| uninteresting (returning 1). And then if I was still
| unsure of the potential harm of the program and I had to
| run it to determine if it's interesting or not (segfault
| or something similar). I think I would hoist the
| binary/script into a docker container and run it there.
| That way the log and result can still be checked, but
| it's access to actual file system is minimized as much as
| possible.
|
| TLDR; C-Reduce just gives you text to run/compile, if
| you're worried about things like that sandbox as much as
| possible.
| Kuinox wrote:
| Without looking at the paper, I would guess it would be like a
| fuzzer that apply mutations that reduce the size of the input.
| gnulinux wrote:
| Isn't that potentially unsafe though? Random permutations can
| reduce arbitrary programs to destructive programs, in
| particular any program can be mutated to `rm -rf /` given
| enough time. Also even the problem of finding syntactically
| valid programs is combinatoric, it's pretty surprising that
| it can go toward a local optima in such a short time without
| having any understanding of the function or its derivatives.
| jewel wrote:
| For compiled languages it should be fine, as you're only
| going to compile the permuted source code, not execute it.
|
| Given a sufficiently bad compiler bug anything is possible,
| but I think given that it's trying to minimize the size of
| an input that gives a particular output I don't think it's
| likely to explore distant branches.
| Quekid5 wrote:
| > Given a sufficiently bad compiler bug anything is
| possible, ...
|
| I can't help but wonder about the consteval/constexpr
| machinery in the various C++ compilers... It'd be
| interesting to see how much adversarial input has been
| tested for those.
|
| (I guess Zig's comptime might also be relevant, but I'm
| not sure what that's allowed to do. Maybe execute
| arbitrary code?)
|
| ... anyway just a stray thought.
| adastra22 wrote:
| It runs the executed code in order to determine if the
| bug exists, does it not?
| Cieric wrote:
| That depends completely on the interesting-ness test
| that's provided. If you're looking for a compiler bug
| (like I do often for my language), then the interesting-
| ness test checks the compile logs for information like
| the "Segmentation Fault" text, no need to run the actual
| executable. You could also hoist everything into docker
| if you really want to, or ship it off to another device
| to run.
| 0x457 wrote:
| > For compiled languages it should be fine, as you're
| only going to compile the permuted source code, not
| execute it.
|
| Only if you'te going for compiler bug. If you're working
| on minimal reproducible example. You need to make sure
| your code isn't reduced to:
|
| int main() { return 0; }
|
| in that case.
| bawolff wrote:
| It seems pretty unlikely to mutate in a malicious
| direction, random chance probably wouldn't get there, and
| there doesn't seem like any reason it would be guided in
| that direction.
|
| "Enough time" does a lot of work here and warrants further
| analysis. With enough time it might produce the works of
| shakespare (if you ignore its designed to minimize), but we
| should probably view anything that would take more than a
| year as too much times.
| wiml wrote:
| I recommend skimming the PLDI paper linked by asmeurer, it has
| a good summary. Some of its transforms are pretty C-specific,
| using the Clang frontend; some are pretty generic and probably
| work on any Algol-descended language. It's meant to be a
| modular tool, so you could add transforms that understand other
| languages if you like.
| keybored wrote:
| > I mean, I don't think they're lying but (given it doesn't use
| an LLM) I am bewildered.
|
| I'm pretty sure (based on the last time I saw this) that this
| is just good old fashioned computer science.[1]
|
| I really hope that HN hasn't forgotten about good old fashioned
| computer science stuff already.
|
| [1] All the algorithm and whatnot stuff, not the spooky machine
| learning stuff. Even things like Prolog-for-AI, although that
| has the slight downside of not working (for the purposes of
| making AI).
| furyofantares wrote:
| To be clear my comment was meant to be an awareness that it
| is good old fashioned computer science. Without LLMs, which
| this predates, it is surprising to me that you'd have a lot
| of success reducing a program in an arbitrary language and
| still having something that's valid syntax!
|
| I do get that it will reject a lot of stuff as not working
| (and has to even in the target language) and I also get that
| algol-like languages are all very similar, but I am still
| surprised that it works well enough to function on ~arbitrary
| languages.
|
| These are very LLM-era properties for a program to have. The
| question is not "does it work for language x" but "how well
| does it work for language x", and the answer is not "is it
| one of the languages it was designed for" but instead "idunno
| try it out and see".
| graypegg wrote:
| Found this article with examples of before and after:
| https://pramodkumbhar.com/2024/01/c-reduce-systematically-ta...
|
| But still having trouble understanding how it knows WHAT to
| remove when trying each iteration. There must be some
| tokenization going on, but then I don't know how that would work
| across programming languages
| wood_spirit wrote:
| Creduce is awesome.
|
| I used a test script that spent hours using CSmith to generate
| random test programs when developing an esoteric llvm target.
| When they crashed it automatically Creduced and left them for
| inspection. Invaluable!
| asmeurer wrote:
| A paper by the authors John Regehr et al. from 2012 explaining
| how it works https://fsl.cs.illinois.edu/publications/regehr-
| chen-cuoq-ei....
| gnulinux wrote:
| I read this paper and I still feel lost as to _how can this
| even be possible_. It seems to understand how to tokenize,
| merge lines, remove tokens etc for arbitrary programming
| languages. Is there another paper that explains this algorithm
| alone?
| IshKebab wrote:
| It basically runs transformation passes over the code until
| they stop changing the code or break its behaviour.
|
| It seems like a small number of the passes are not specific
| to C:
|
| https://github.com/csmith-
| project/creduce/blob/master/creduc...
|
| `"C" => 1,` means it is only for C.
|
| I would guess `pass_lines` is the most important for non-C
| code; I'm guessing (it's written in unreadable Perl) that it
| removes lines.
|
| So while it _can_ work for languages other than C, most of
| its features are C-specific so it 's not going to work nearly
| as well. Still I'd never heard of C-Reduce; pretty cool tool!
|
| Someone make one based on Tree Sitter, quick!
| nyrikki wrote:
| This paper is not about C-Reduce as a whole but 3 new "domain-
| specific test-case reducers" that the authors added to the
| project.
|
| Most of the non-domain specific reductions in C-Reduce is
| simply brute force IIRC.
| lambda wrote:
| Note that as recommended by John Regehr, author of C-Reduce, for
| this use case you might also want to try Shrinkray, a tool that
| was written to be format independent and works well for cases
| that C-Reduce dow not:
| https://mastodon.social/@regehr/113489759789563570
| canu7 wrote:
| Looking forward to try it! A shout out to cvise
| (https://github.com/marxin/cvise) as a python alternative that
| works well too for non c languages.
| andrewchambers wrote:
| I wrote a general purpose delta debugger in python -
| https://github.com/andrewchambers/ddmin-python. Can be used to do
| the same thing.
| judofyr wrote:
| Well, aren't you going to share the reduced file?
|
| Okay fine, let's see for ourselves: # Setup
| git clone git@github.com:RustPython/RustPython.git && cd
| RustPython && cargo build --release git clone
| git@github.com:tekknolagi/scrapscript.git && cp
| scrapscript/scrapscript.py . # Download
| interesting.sh, replace the path to RustPython chmod +x
| interesting.sh # Command in article: nix run
| nixpkgs#creduce -- --not-c interesting.sh scrapscript.py
|
| Niiice: (93.2 %, 13717 bytes) (93.2 %,
| 13683 bytes) (93.3 %, 13614 bytes) (93.3 %, 13592
| bytes) (93.3 %, 13571 bytes) (93.3 %, 13517
| bytes) (93.3 %, 13449 bytes) (93.4 %, 13412
| bytes) (93.4 %, 13365 bytes) (93.4 %, 13333
| bytes) (93.4 %, 13313 bytes)
|
| It seems to have stopped at "(96.4 %, 7347 bytes)" with the
| following output:
| https://gist.github.com/judofyr/47cba8a20cb2cd5798943ef975d0...
| def- wrote:
| Works well for SQL too, I've been using it at my day job, found
| out via https://github.com/sqlancer/sqlancer?tab=readme-ov-
| file#redu...
| gmueckl wrote:
| How does this compare to dustmite
| (https://dlang.org/blog/2020/04/13/dustmite-the-general-
| purpo...)?
| dswilkerson wrote:
| Delta debugging is not new:
| https://en.wikipedia.org/wiki/Delta_debugging
|
| My implementation of delta debugging, "delta" is 19+ years old:
| https://github.com/dsw/delta
|
| I released it as Open Source because Microsoft Research sent
| someone to my office to ask me to, back when Microsoft was
| calling Open Source a "cancer".
|
| The LLVM introduction by Latner refers to the "standard delta
| debugging tool", so it is rather well-known:
| https://aosabook.org/en/v1/llvm.html 'unlike the standard "delta"
| command line tool.'
| arxanas wrote:
| For other readers' benefit: C-Reduce is a little more
| sophisticated than plain delta-debugging. From the abstract of
| Test-Case Reduction for C Compiler Bugs (2012):
|
| > [...] [C-Reduce] produces outputs that are, on average, more
| than 25 times smaller than those produced by our other reducers
| or by the existing reducer that is most commonly used by
| compiler developers. We conclude that effective program
| reduction requires more than straightforward delta debugging.
|
| (Of course, this means that C-Reduce is 12 years old now.)
|
| At the same time, C-Reduce seems to be more general than the
| LLVM tool you linked ("BugPoint", dating to 2002), since that
| one works with LLVM IR specifically.
|
| I think most developers are generally unfamiliar with automatic
| test case minimization tools and techniques, so the post may be
| helpful even if the ideas have been known in their respective
| circles for quite some time.
___________________________________________________________________
(page generated 2024-11-27 23:00 UTC)