[HN Gopher] You can use C-Reduce for any language
       ___________________________________________________________________
        
       You can use C-Reduce for any language
        
       Author : Tomte
       Score  : 212 points
       Date   : 2024-11-27 17:56 UTC (5 hours ago)
        
 (HTM) web link (bernsteinbear.com)
 (TXT) w3m dump (bernsteinbear.com)
        
       | jchw wrote:
       | I did not know about C-Reduce until just now, and I'm already
       | hooked. This is basically like discovering git bisect for the
       | first time.
       | 
       | I'll have to stuff that into the back of my mind until I run into
       | something where it might fit.
        
         | __MatrixMan__ wrote:
         | Same story, I can't wait to try this.
        
         | geon wrote:
         | I ran into some issues with what might be compiler bugs in
         | cc65, a C compiler for the 6502 processor (c64, NES, Apple 1).
         | 
         | I might see if I can get this set up. Apparently VICE supports
         | "printing" to a file on the host OS, so I could run my test on
         | the emulator.
        
         | brightball wrote:
         | ...so I just discovered git bisect for the first time and this
         | is pretty cool.
        
       | furyofantares wrote:
       | Without an explanation of _why_ it works on languages other than
       | C, that is a hard claim to believe! I mean, I don 't think
       | they're lying but (given it doesn't use an LLM) I am bewildered.
        
         | johnisgood wrote:
         | Yeah, I would have liked to see more examples and an
         | explanation.
        
           | evmar wrote:
           | I glanced at the code. It seems to have multiple possible
           | "passes" which reduce the code in various ways, and the
           | passes here not tagged with "C"=>1 are used in the not-c mode
           | recommended in the post.
           | 
           | https://github.com/csmith-
           | project/creduce/blob/31e855e290970...
           | 
           | You can see the pass implementation in files on the left.
        
             | wat10000 wrote:
             | The key thing is that the transforms it makes aren't
             | required to always produce a program that can actually run,
             | or even build.
             | 
             | The user provides a script which determines if a program is
             | "interesting." A program with build-time errors should be
             | considered "not interesting" in most cases. (And if you're
             | hunting an incorrectly reported error, you'll want to make
             | sure you only consider that error to be interesting.)
             | 
             | Then it yoinks out parts of your program and checks if the
             | result is still interesting. If it's not, that attempt is
             | discarded and it tries something else. If the reduced
             | program is still interesting, it will try yoinking out more
             | stuff. Repeat until you like the result.
             | 
             | There doesn't need to be any understanding of the program
             | in order for this to work. You just need something where
             | removing some bit of the program has a chance of producing
             | a new program that's still interesting. That works in most
             | programming languages.
             | 
             | This process can take a long time, and it's slower when
             | there's a lower probability of a removal producing an
             | interesting program. Thus heuristics are added: don't
             | remove random character ranges, but work at a token level.
             | When removing a brace, find the matching brace and remove
             | the insides. That sort of thing. Most modern languages are
             | similar enough to C that many of these rules are helpful
             | for them too. And even if they don't help, this just slows
             | the process down, but it still works.
        
         | bqmjjx0kac wrote:
         | Yes, and without understanding how it works, I'm left wondering
         | whether it's even safe to use this way. Will creduce execute
         | mutated versions of the input script, potentially deleting my
         | files or eating my lunch?
        
           | None4U wrote:
           | it will execute whatever script you pass it (immutable) and
           | mutate whatever other files you pass it
        
             | defen wrote:
             | I think the question is, how do you know C-reduce isn't
             | going to mutate your test case into calling
             | `remove("/path/to/my/important/file");`
        
               | frabert wrote:
               | C-reduce is meant to... Reduce your files, it would not
               | add stuff that was not there in the first place. Also, I
               | think it's meant to only be run against the "frontend" of
               | most languages, not full execution
        
               | chubot wrote:
               | Yeah but you could have a shell script that does
               | rm -rf /foo/bar
               | 
               | and in theory it could reduce it to                   rm
               | -rf /
               | 
               | The passes are very heuristic, so that doesn't seem like
               | something you can rule out.
               | 
               | (and I actually want to run it on shell scripts! )
        
               | Cieric wrote:
               | While I've used c-reduce before, I've never done it in a
               | way where it could be destructive. However speculating
               | based on what I do know. I think the 2 things I would do
               | would be in the interesting-ness test, grep for already
               | known harmful instructions and force them to be
               | uninteresting (returning 1). And then if I was still
               | unsure of the potential harm of the program and I had to
               | run it to determine if it's interesting or not (segfault
               | or something similar). I think I would hoist the
               | binary/script into a docker container and run it there.
               | That way the log and result can still be checked, but
               | it's access to actual file system is minimized as much as
               | possible.
               | 
               | TLDR; C-Reduce just gives you text to run/compile, if
               | you're worried about things like that sandbox as much as
               | possible.
        
         | Kuinox wrote:
         | Without looking at the paper, I would guess it would be like a
         | fuzzer that apply mutations that reduce the size of the input.
        
           | gnulinux wrote:
           | Isn't that potentially unsafe though? Random permutations can
           | reduce arbitrary programs to destructive programs, in
           | particular any program can be mutated to `rm -rf /` given
           | enough time. Also even the problem of finding syntactically
           | valid programs is combinatoric, it's pretty surprising that
           | it can go toward a local optima in such a short time without
           | having any understanding of the function or its derivatives.
        
             | jewel wrote:
             | For compiled languages it should be fine, as you're only
             | going to compile the permuted source code, not execute it.
             | 
             | Given a sufficiently bad compiler bug anything is possible,
             | but I think given that it's trying to minimize the size of
             | an input that gives a particular output I don't think it's
             | likely to explore distant branches.
        
               | Quekid5 wrote:
               | > Given a sufficiently bad compiler bug anything is
               | possible, ...
               | 
               | I can't help but wonder about the consteval/constexpr
               | machinery in the various C++ compilers... It'd be
               | interesting to see how much adversarial input has been
               | tested for those.
               | 
               | (I guess Zig's comptime might also be relevant, but I'm
               | not sure what that's allowed to do. Maybe execute
               | arbitrary code?)
               | 
               | ... anyway just a stray thought.
        
               | adastra22 wrote:
               | It runs the executed code in order to determine if the
               | bug exists, does it not?
        
               | Cieric wrote:
               | That depends completely on the interesting-ness test
               | that's provided. If you're looking for a compiler bug
               | (like I do often for my language), then the interesting-
               | ness test checks the compile logs for information like
               | the "Segmentation Fault" text, no need to run the actual
               | executable. You could also hoist everything into docker
               | if you really want to, or ship it off to another device
               | to run.
        
               | 0x457 wrote:
               | > For compiled languages it should be fine, as you're
               | only going to compile the permuted source code, not
               | execute it.
               | 
               | Only if you'te going for compiler bug. If you're working
               | on minimal reproducible example. You need to make sure
               | your code isn't reduced to:
               | 
               | int main() { return 0; }
               | 
               | in that case.
        
             | bawolff wrote:
             | It seems pretty unlikely to mutate in a malicious
             | direction, random chance probably wouldn't get there, and
             | there doesn't seem like any reason it would be guided in
             | that direction.
             | 
             | "Enough time" does a lot of work here and warrants further
             | analysis. With enough time it might produce the works of
             | shakespare (if you ignore its designed to minimize), but we
             | should probably view anything that would take more than a
             | year as too much times.
        
         | wiml wrote:
         | I recommend skimming the PLDI paper linked by asmeurer, it has
         | a good summary. Some of its transforms are pretty C-specific,
         | using the Clang frontend; some are pretty generic and probably
         | work on any Algol-descended language. It's meant to be a
         | modular tool, so you could add transforms that understand other
         | languages if you like.
        
         | keybored wrote:
         | > I mean, I don't think they're lying but (given it doesn't use
         | an LLM) I am bewildered.
         | 
         | I'm pretty sure (based on the last time I saw this) that this
         | is just good old fashioned computer science.[1]
         | 
         | I really hope that HN hasn't forgotten about good old fashioned
         | computer science stuff already.
         | 
         | [1] All the algorithm and whatnot stuff, not the spooky machine
         | learning stuff. Even things like Prolog-for-AI, although that
         | has the slight downside of not working (for the purposes of
         | making AI).
        
           | furyofantares wrote:
           | To be clear my comment was meant to be an awareness that it
           | is good old fashioned computer science. Without LLMs, which
           | this predates, it is surprising to me that you'd have a lot
           | of success reducing a program in an arbitrary language and
           | still having something that's valid syntax!
           | 
           | I do get that it will reject a lot of stuff as not working
           | (and has to even in the target language) and I also get that
           | algol-like languages are all very similar, but I am still
           | surprised that it works well enough to function on ~arbitrary
           | languages.
           | 
           | These are very LLM-era properties for a program to have. The
           | question is not "does it work for language x" but "how well
           | does it work for language x", and the answer is not "is it
           | one of the languages it was designed for" but instead "idunno
           | try it out and see".
        
       | graypegg wrote:
       | Found this article with examples of before and after:
       | https://pramodkumbhar.com/2024/01/c-reduce-systematically-ta...
       | 
       | But still having trouble understanding how it knows WHAT to
       | remove when trying each iteration. There must be some
       | tokenization going on, but then I don't know how that would work
       | across programming languages
        
       | wood_spirit wrote:
       | Creduce is awesome.
       | 
       | I used a test script that spent hours using CSmith to generate
       | random test programs when developing an esoteric llvm target.
       | When they crashed it automatically Creduced and left them for
       | inspection. Invaluable!
        
       | asmeurer wrote:
       | A paper by the authors John Regehr et al. from 2012 explaining
       | how it works https://fsl.cs.illinois.edu/publications/regehr-
       | chen-cuoq-ei....
        
         | gnulinux wrote:
         | I read this paper and I still feel lost as to _how can this
         | even be possible_. It seems to understand how to tokenize,
         | merge lines, remove tokens etc for arbitrary programming
         | languages. Is there another paper that explains this algorithm
         | alone?
        
           | IshKebab wrote:
           | It basically runs transformation passes over the code until
           | they stop changing the code or break its behaviour.
           | 
           | It seems like a small number of the passes are not specific
           | to C:
           | 
           | https://github.com/csmith-
           | project/creduce/blob/master/creduc...
           | 
           | `"C" => 1,` means it is only for C.
           | 
           | I would guess `pass_lines` is the most important for non-C
           | code; I'm guessing (it's written in unreadable Perl) that it
           | removes lines.
           | 
           | So while it _can_ work for languages other than C, most of
           | its features are C-specific so it 's not going to work nearly
           | as well. Still I'd never heard of C-Reduce; pretty cool tool!
           | 
           | Someone make one based on Tree Sitter, quick!
        
         | nyrikki wrote:
         | This paper is not about C-Reduce as a whole but 3 new "domain-
         | specific test-case reducers" that the authors added to the
         | project.
         | 
         | Most of the non-domain specific reductions in C-Reduce is
         | simply brute force IIRC.
        
       | lambda wrote:
       | Note that as recommended by John Regehr, author of C-Reduce, for
       | this use case you might also want to try Shrinkray, a tool that
       | was written to be format independent and works well for cases
       | that C-Reduce dow not:
       | https://mastodon.social/@regehr/113489759789563570
        
         | canu7 wrote:
         | Looking forward to try it! A shout out to cvise
         | (https://github.com/marxin/cvise) as a python alternative that
         | works well too for non c languages.
        
       | andrewchambers wrote:
       | I wrote a general purpose delta debugger in python -
       | https://github.com/andrewchambers/ddmin-python. Can be used to do
       | the same thing.
        
       | judofyr wrote:
       | Well, aren't you going to share the reduced file?
       | 
       | Okay fine, let's see for ourselves:                   # Setup
       | git clone git@github.com:RustPython/RustPython.git && cd
       | RustPython && cargo build --release         git clone
       | git@github.com:tekknolagi/scrapscript.git && cp
       | scrapscript/scrapscript.py .              # Download
       | interesting.sh, replace the path to RustPython         chmod +x
       | interesting.sh              # Command in article:         nix run
       | nixpkgs#creduce -- --not-c interesting.sh scrapscript.py
       | 
       | Niiice:                   (93.2 %, 13717 bytes)         (93.2 %,
       | 13683 bytes)         (93.3 %, 13614 bytes)         (93.3 %, 13592
       | bytes)         (93.3 %, 13571 bytes)         (93.3 %, 13517
       | bytes)         (93.3 %, 13449 bytes)         (93.4 %, 13412
       | bytes)         (93.4 %, 13365 bytes)         (93.4 %, 13333
       | bytes)         (93.4 %, 13313 bytes)
       | 
       | It seems to have stopped at "(96.4 %, 7347 bytes)" with the
       | following output:
       | https://gist.github.com/judofyr/47cba8a20cb2cd5798943ef975d0...
        
       | def- wrote:
       | Works well for SQL too, I've been using it at my day job, found
       | out via https://github.com/sqlancer/sqlancer?tab=readme-ov-
       | file#redu...
        
       | gmueckl wrote:
       | How does this compare to dustmite
       | (https://dlang.org/blog/2020/04/13/dustmite-the-general-
       | purpo...)?
        
       | dswilkerson wrote:
       | Delta debugging is not new:
       | https://en.wikipedia.org/wiki/Delta_debugging
       | 
       | My implementation of delta debugging, "delta" is 19+ years old:
       | https://github.com/dsw/delta
       | 
       | I released it as Open Source because Microsoft Research sent
       | someone to my office to ask me to, back when Microsoft was
       | calling Open Source a "cancer".
       | 
       | The LLVM introduction by Latner refers to the "standard delta
       | debugging tool", so it is rather well-known:
       | https://aosabook.org/en/v1/llvm.html 'unlike the standard "delta"
       | command line tool.'
        
         | arxanas wrote:
         | For other readers' benefit: C-Reduce is a little more
         | sophisticated than plain delta-debugging. From the abstract of
         | Test-Case Reduction for C Compiler Bugs (2012):
         | 
         | > [...] [C-Reduce] produces outputs that are, on average, more
         | than 25 times smaller than those produced by our other reducers
         | or by the existing reducer that is most commonly used by
         | compiler developers. We conclude that effective program
         | reduction requires more than straightforward delta debugging.
         | 
         | (Of course, this means that C-Reduce is 12 years old now.)
         | 
         | At the same time, C-Reduce seems to be more general than the
         | LLVM tool you linked ("BugPoint", dating to 2002), since that
         | one works with LLVM IR specifically.
         | 
         | I think most developers are generally unfamiliar with automatic
         | test case minimization tools and techniques, so the post may be
         | helpful even if the ideas have been known in their respective
         | circles for quite some time.
        
       ___________________________________________________________________
       (page generated 2024-11-27 23:00 UTC)