[HN Gopher] Emitting Safer Rust with C2Rust
___________________________________________________________________
Emitting Safer Rust with C2Rust
Author : dtolnay
Score : 96 points
Date : 2023-03-14 05:32 UTC (1 days ago)
(HTM) web link (immunant.com)
(TXT) w3m dump (immunant.com)
| Animats wrote:
| DARPA is funding this. Good.
|
| They haven't reached inter-procedural static analysis yet, which
| means they can't solve the big problem: how big is an array? Most
| of the troubles in C come from that. Whoever creates the array
| knows how big it is. Everybody else is guessing.
|
| A bit of machine learning might help here. If you see
|
|     void dosomethingwitharray(int arr[], size_t n) {}
|
| a good conjecture is that _n_ is the length of _arr_. So, the
| question is, if this is translated to
|
|     fn dosomethingwitharray(arr: &[i32]) {}
|
| does it break anything? Both caller and callee have to be
| analyzed. The C caller has the constraint
|
|     assert_eq!(arr.len(), n);
|
| That's a proof goal. If a simple SMT-type prover can prove that
| true, then the call can be simplified to just use an ordinary
| Rust slice. If not, conversion to Rust has to drop to those ugly
| C pointer forms, preferably with a comment inserted. So you need
| something that makes good guesses, which is a large language
| model kind of thing, and something which checks them, which is a
| formalism kind of thing.
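|
| A minimal sketch of the two shapes discussed above, assuming a
| function that simply sums its input. The names `sum_raw`,
| `sum_slice`, and `call_both` are hypothetical and this is not
| actual c2rust output: `sum_raw` only imitates the pointer-plus-
| length form, and `sum_slice` is what the signature could become
| once `arr.len() == n` is proved for every caller.

```rust
use std::os::raw::c_int;

/// Pointer-plus-length form, imitating a direct translation (hypothetical).
///
/// Safety: `arr` must point to at least `n` readable `c_int` values.
pub unsafe extern "C" fn sum_raw(arr: *const c_int, n: usize) -> c_int {
    let mut total: c_int = 0;
    for i in 0..n {
        // No bounds check: correctness depends entirely on the caller's `n`.
        total = total.wrapping_add(unsafe { *arr.add(i) });
    }
    total
}

/// Slice form: the length travels with the data, so `n` disappears.
pub fn sum_slice(arr: &[c_int]) -> c_int {
    let mut total: c_int = 0;
    for &x in arr {
        total = total.wrapping_add(x);
    }
    total
}

/// A caller for which the proof goal holds by construction, because
/// the pointer and the length come from the same slice.
pub fn call_both(data: &[c_int]) -> (c_int, c_int) {
    let raw = unsafe { sum_raw(data.as_ptr(), data.len()) };
    (raw, sum_slice(data))
}
```

| When a caller passes an `n` that cannot be tied back to the
| allocation, only the `sum_raw` form remains expressible, which is
| the fallback described above.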
|
| The process can be assisted by putting asserts in the original C,
| as checks on the C and hints to the conversion process. That's
| probably the cleanest way to provide human assistance.
|
| I've wanted this for conversion of OpenJPEG code to Rust. That's
| a tangle of code doing wavelet transforms, with long blocks of
| touchy subscripting and arithmetic, plus encoders and decoders
| for an overly complex binary format containing offsets and
| lengths. Someone recently ran it through c2rust. The unsafe Rust
| code works. It's compatible with the original C - it segfaults
| for the same test cases which cause the C code to segfault. This
| is why a naive transpiler isn't too helpful.
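|
| A small sketch of why a faithful translation keeps the crash,
| using hypothetical names (`read_unchecked`, `read_checked`) that
| are not taken from OpenJPEG or from c2rust output. The first
| function mirrors C-style pointer arithmetic; the second shows
| what a slice-based rewrite would buy.

```rust
/// C-style access: pointer arithmetic with no bounds check.
///
/// Safety: `p.offset(i)` must stay inside a live allocation of `i32`s.
/// An out-of-range `i` is undefined behavior, just as in the C original.
pub unsafe fn read_unchecked(p: *const i32, i: isize) -> i32 {
    unsafe { *p.offset(i) }
}

/// Slice-based access: an out-of-range index yields `None` instead of
/// reading past the end of the buffer.
pub fn read_checked(s: &[i32], i: usize) -> Option<i32> {
    s.get(i).copied()
}
```

| The bounds check only exists in the second form, so the safety
| gain comes from the refactoring step, not from the translation
| itself.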
|
| (The date at the bottom of the article is 2022-06-13. Has there
| been further progress?)
| meepmorp wrote:
| > The date at the bottom of the article is 2022-06-13. Has
| there been further progress?
|
| The article links to their github repo:
|
| https://github.com/immunant/c2rust
|
| There are commits from the last hour, so at least some sign of
| life.
| mtlmtlmtlmtl wrote:
| Has anyone put this to serious use? I played around with it at
| some point when it was fairly new and at that time I was able to
| transpile the C into Rust just fine, but that didn't help me
| much. The idea was to be able to use the Rust toolchain to better
| understand the code, but the resulting Rust code was even less
| understandable, and also much harder to refactor. In this case I
| wasn't attempting a rewrite per se, just trying to understand a C
| codebase plagued with memory safety issues. Quickly gave up on
| this avenue at that point and just started carefully refactoring
| the C to make the bugs easier to shake out.
|
| Would love to see a technical write up of someone outside
| Immunant using this on a real world codebase for whatever
| purpose.
| diego_moita wrote:
| I am very curious to see how these transpiler problems will be
| handled by GPT-4 in the upcoming months.
| boredumb wrote:
| C2Rust is really cool, but if you're familiar with writing Rust
| and run even a trivial C function through it, it produces
| something absolutely terrifying. I really enjoy Rust and pray I
| don't find myself working in a code base someone just ran c2rust
| against.
| FridgeSeal wrote:
| Isn't the point to generate _semantically_ equivalent Rust code
| from C, so that you can just get it re-compiling under Rust,
| and then from there you have a working base from which to start
| rewriting into safer Rust?
| masklinn wrote:
| Yes, it's literally spelled out in TFA:
|
| > this provides a starting point for manual refactoring into
| idiomatic and safe Rust
| FpUser wrote:
| I do not know this particular tool, but some automated
| language-to-language transpilers I have seen produce code that
| one cannot comprehend, never mind edit if the need arises.
| masklinn wrote:
| The goal of C2rust is not to provide a usable code base per se,
| it's to provide a convenient base for conversion: once the
| project is in unsafe rust it can be managed entirely via rust
| tooling and is hopefully a lot easier to finish up than if you
| keep having to redefine bindings as you move code from C to
| Rust.
|
| C2Rust is a springboard; if you move C2Rust-ed code to
| production you're doing it very wrong.
| 0cf8612b2e1e wrote:
| On the other hand, if I have some working C dependency which
| I never intend to modify (owing to its complexity or
| stability), plopping in the autogenerated Rust code simplifies
| my build step.
| anticrymactic wrote:
| What problem does C2Rust solve exactly? Isn't it just gonna
| produce "garbage" Rust?
|
| Calling C directly is already possible in Rust.
| kelnos wrote:
| This isn't about calling external C code from Rust; it helps
| people "rewrite" their C code in Rust.
|
| You can debate the merits of doing so, of course, but some
| people do want to do that, and a tool to generate safe,
| somewhat idiomatic Rust from C code would seem to be useful.
| pohl wrote:
| From c2rust.com:
|
| _The C2Rust project is being developed by Galois and Immunant.
| This tool is able to translate most C modules into semantically
| equivalent Rust code. These modules are intended to be compiled
| in isolation in order to produce compatible object files. We
| are developing several tools that help transform the initial
| Rust sources into idiomatic Rust.
|
| The translator focuses on supporting the C99 standard. C source
| code is parsed and typechecked using clang before being
| translated by our tool._
| eptcyka wrote:
| It helps by lowering the barrier to entry when working on
| rewriting a codebase in rust.
| masklinn wrote:
| It moves the project directly into Rust land and tooling, which
| hopefully makes it easier to convert it without needing to set
| up multi-language tooling and a moving barrier / interface
| between the two languages.
| dureuill wrote:
| From reading the article, I get that the latest version can
| transform some C into _safe_ Rust.
|
| This gains us machine-proved memory safety. This is huge.
| kccqzy wrote:
| The article shows what improvements they are thinking of so
| that it _doesn't_ produce garbage Rust. (If by garbage Rust
| you mean unsafe Rust.)
| hardwaregeek wrote:
| The post does address this and shows their attempt to produce
| higher quality Rust. I've also seen it used to move off of a C
| toolchain and onto a pure Rust toolchain by porting C code to
| Rust.
| jandrese wrote:
| It makes it easier to get your project on the front page of HN
| as you can claim it is written in Rust.
| hardwaregeek wrote:
| I'm very excited at the possibilities for C2Rust! Dynamic
| analysis to fill in the gaps of static analysis makes a lot of
| sense. I've wanted something similar for inferring TypeScript
| types via runtime analysis (would not be surprised if it exists
| already).
|
| I could see a really compelling use case in cross-compilation
| where you compile your C code to Rust, then use a Rust toolchain
| to cross compile. Or avoiding interop as well.
| CharlesW wrote:
| This seems like an interesting project to bridge the "boil the
| ocean" approach of rewriting in Rust wholesale.
|
| (For anyone else who found it slightly difficult to read, you can
| remove the added 0.06em `letter-spacing` using your browser's
| developer tools.)
___________________________________________________________________
(page generated 2023-03-15 23:00 UTC)