[HN Gopher] Translating All C to Rust (TRACTOR)
___________________________________________________________________
Translating All C to Rust (TRACTOR)
Author : steveklabnik
Score : 201 points
Date : 2024-07-30 15:42 UTC (7 hours ago)
(HTM) web link (www.darpa.mil)
(TXT) w3m dump (www.darpa.mil)
| steveklabnik wrote:
| See also
| https://sam.gov/opp/1e45d648886b4e9ca91890285af77eb7/view
| thesuperbigfrog wrote:
| Direct link to Proposer's Day info [PDF]:
| https://sam.gov/api/prod/opps/v3/opportunities/resources/fil...
|
| "The purpose of this event is to provide information on the
| TRACTOR technical goals and challenges, address questions from
| potential proposers, and provide an opportunity for potential
| proposers to consider how their research may align with the
| TRACTOR program objectives."
| PreInternet01 wrote:
| The _one_ link for those who think that 'Rewrite it All in Rust'
| will, well, settle _any_ debates:
| https://github.com/rust-lang/miri/
| calebpeterson wrote:
| Genuine question:
|
| Would you mind explaining to a dev that doesn't know much
| (anything) about Rust, how does this settle any debate?
| jerf wrote:
| I believe it goes something like, "I have constructed a
| strawman that Rust claims that all code written in it is
| automatically safe by all conceivable definitions of safe,
| but look, ha ha, here's something that detects unsafe code in
| Rust!", and I don't mean "code marked in unsafe blocks".
|
| It's a concatenation of several logical fallacies in a row;
| equivocation, straw manning, binary thinking about safety,
| several others. It's hard to pick the main one, but I'd go
| with the dominant problem being a serious case of binary
| thinking about what "safety" is. Of course, if the commentor
| is using anything other than Idris for all their programming,
| they're probably not actually acting on their own
| accusations.
| marcosdumay wrote:
| > Of course, if the commentor is using anything other than
| Idris
|
| I'm sure the Idris compiler has bugs somewhere too. If the
| OP actually programs, they are violating their rationale
| (I'm quite sure assembly or assembled binary aren't ok
| either).
| mrweiden wrote:
| From the original post > It's not enough to rely on bug-
| finding tools
|
| From the Miri github: > Miri is an Undefined Behavior
| detection tool for Rust.
| keybored wrote:
| Darpa is already ahead of you all with the hedging:
|
| > The preferred approach is to use "safe" programming
| languages
|
| "Safe". Terms and conditions may apply.
| Sharlin wrote:
| There is no contradiction. The fact that UB-finding tools
| alone are not _sufficient_ doesn't mean they aren't
| _useful_ even with a safe(r) language.
|
| In other words, from "safer languages are necessary" it
| does not follow that "safer languages are sufficient".
| PreInternet01 wrote:
| Well, the general 'Rewrite All in Rust' consensus is that it
| solves _all_ general programming problems, _ever_.
|
| Yet, the linked repository shows a huge list of cases in
| which simple, documented use of Rust can cause Undefined
| Behavior (a.k.a. 'UB')
|
| Pretty much every argument of Rust advocates against C/C++
| boils down to either 'but memory safety' or 'but UB'.
|
| Yet there are many convincing counter-arguments that boil
| down to 'but CompCert' or similar, and, as the linked
| repository shows, there might be at least _some_ truth in
| there?
| steveklabnik wrote:
| No serious person claims that Rust solves every problem
| ever.
|
| Also, many people cite things like Cargo as a reason to
| prefer Rust over C and C++, as well as other things. UB is
| a big part of it, of course, but it isn't the only thing.
| galangalalgol wrote:
| I selected it for performance reasons myself, the UB
| protection was a nice benefit that was expected, cargo
| wasn't expected and is extremely nice coming from the
| cmake,conan,vcpkg and duct tape world I came from.
| leftyspook wrote:
| > Well, the general 'Rewrite All in Rust' consensus is that
| it solves all general programming problems, ever.
|
| a) There is no such consensus. The actual consensus is that
| even if Rust solved all problems, it would not be
| financially feasible to rewrite pretty much any substantial
| project.
|
| b) While Rust does solve many problems, it is nowhere close
| to solving all safety, otherwise there would be no `unsafe`
| keyword. Alas, fully proving safety in an impure, Turing-
| complete language is mathematically impossible.
|
| c) The only reason you would think that there's some sort
| of woke Rust lobby, is if you spend way too much time
| subjecting yourself to opinions of literal sixteen year
| olds on twitter.
| superb_dev wrote:
| > Well, the general 'Rewrite All in Rust' consensus is that
| it solves all general programming problems, ever.
|
| No, that's not the consensus. This is a strawman.
| timeon wrote:
| > Well, the general 'Rewrite All in Rust' consensus is that
| it solves all general programming problems, ever.
|
| This is an obvious example of a strawman. Why are you doing
| this?
| PreInternet01 wrote:
| Towards general mental health. I'm just a C# wage slave,
| and I'll admit, when being prompted, that my language,
| its vendor, its runtime environment, and its general
| approach are, to put it kindly, _flawed_.
|
| However, as evidenced by the arguments and voting in this
| thread, Rust proponents will take _no_ criticism,
| _whatsoever_.
|
| I linked to a GitHub repository that documents many, many
| instances in which _generally safe_ Rust causes UB.
|
| The same kind of UB that recently hit one of my
| coworkers, caused a 3-day outage and now (despite all my
| counseling to the contrary!) will burn them out
| permanently.
|
| My only request: can you guys please _back off_ just a
| _little bit_? Programming is already hard enough without
| the purity wars you're stoking all the time...
| keybored wrote:
| Stoking language flame wars based on hysterical
| exaggeration has never promoted mental health.
| fargle wrote:
| to be fair, from his perspective, it's often the rusty
| crowd who is stoking the flame wars - this sounds like a
| reaction to them.
|
| how often do we hear something like "C and C++ are
| horribly flawed and completely unsafe. it's basically a
| crime against humankind and gross negligence to use
| them"?
|
| i get weary of that kind of thing too. i wouldn't
| approach it by reacting in the same way as the GP
| comment, but i get it. and it's not really that much of a
| strawman. it's more exasperation and sarcasm.
|
| personally, i'm very interested in rust. but every time
| someone at best "overhypes" it or at worst, outright dogs
| on other languages, it's a negative point toward dealing
| with the whole rust ecosystem.
| leftyspook wrote:
| It is a tool for checking that your unsafe code doesn't cause
| UB. It doesn't really settle anything, but the commenter uses
| it as a gotcha to say "rust is no better than C, because you
| still can compile code that contains UB".
| spease wrote:
| They are claiming that because code in 'unsafe' blocks in
| Rust can have undefined behavior, that the language is no
| safer than C.
|
| This does not settle the debate because unsafe is rarely
| needed for a typical Rust program. In addition, the presence
| of an unsafe block also alerts the reader that the set of
| possible errors is greatly increased for that part of the
| code and more careful auditing is needed.
|
| It's a little like saying traffic lights are useless because
| emergency responders need to drive through them sometimes, so
| we should just leave intersections completely unsignaled and
| expect drivers to do better.
|
| Rust is by default restrictive and requires you to explicitly
| make it unsafe, C/++ are by default unsafe and require you to
| explicitly make them restrictive.
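|
| A minimal sketch of the "restrictive by default, explicitly unsafe"
| point, assuming two hypothetical helpers (`read_checked` and
| `read_unchecked` are illustrative names, not from the thread). The
| checked path cannot read out of bounds; the unchecked path has to be
| spelled with `unsafe`, which is the keyword an auditor searches for.

```rust
/// Safe by default: an out-of-range index yields `None` instead of UB.
fn read_checked(data: &[u8], i: usize) -> Option<u8> {
    data.get(i).copied()
}

/// Opting out of the bounds check must be written with `unsafe`.
///
/// # Safety
/// The caller must guarantee `i < data.len()`.
unsafe fn read_unchecked(data: &[u8], i: usize) -> u8 {
    // SAFETY: the caller promised that `i` is in bounds.
    unsafe { *data.get_unchecked(i) }
}
```

| Calling `read_unchecked` also requires an `unsafe` block at the call
| site, so the risky spots stay visible on both sides of the call.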
| keybored wrote:
| You linked an interpreter for some kind of internal compiler
| representation that the Rust compiler uses.
|
| What on Earth do you mean?
| nequo wrote:
| It's the old trope that some Rust code uses unsafe blocks so
| all Rust code is as unsafe as C.
| keybored wrote:
| Of course. I should have expected the Nirvana Fallacy. :)
| melling wrote:
| I don't know Rust but even if the Rust is just as unsafe in
| certain blocks, simply being translated to Rust removes a
| lot of corporate resistance to adopt the language.
|
| Getting people to adopt a new language can be a lot of
| work. I remember people claiming they missed header files
| in Swift so they wanted to stick with Objective C.
| PreInternet01 wrote:
| > What on Earth do you mean?
|
| That _documented_ use of _safe_ Rust can easily lead to UB,
| which this infernal 'internal compiler representation'
| demonstrates.
|
| I'm not even sure what is even remotely confusing about that?
| woodruffw wrote:
| Miri is a MIR interpreter aimed at _unsafe_ Rust, not safe
| Rust. Using the fact that it operates on an internal
| representation is a very weird swipe; almost all static and
| dynamic analysis tools work on some kind of IR or
| decomposed program representation.
| commodoreboxer wrote:
| > Miri is an Undefined Behavior detection tool for Rust. It
| can run binaries and test suites of cargo projects and
| detect unsafe code that fails to uphold its safety
| requirements.
|
| > ... detect unsafe code that fails ...
|
| Show me the documented safe Rust code that causes UB
| without using any unsafe blocks outside of the standard
| library.
| steveklabnik wrote:
| There are some soundness holes in the implementation that
| can cause this. Just like any project, the compiler can
| have bugs. They'll be fixed just like any bug.
| commodoreboxer wrote:
| Yes, in particular some interactions with LLVM have
| caused some frustrating UB. But those are considered
| implementation bugs, rather than user bugs, and all the
| conditions Miri states at the top are relevant primarily
| in unsafe code, which contradicts the OP's point, which
| is that there are tons of documented cases of UB in safe
| Rust. This is not true. There are a few documented cases,
| and most have been fixed. It's nowhere close to the world
| of C or C++'s UB minefield.
| steveklabnik wrote:
| For sure, just making sure to acknowledge this is the
| case, before someone responded to your post with cve-rs.
| :)
| PreInternet01 wrote:
| Ah, a voice of sort-of sanity, at long last.
|
| So, the reason I posted my original reply, is that at one
| of my $DAYJOBs, we recently had a 3-day outage on some
| service, related to Rust. Something like using AVX to
| read, like, up to 7 bytes too many from an array.
|
| Nothing really major -- we have a 10-day backup window,
| and the damage was limited to 4 days, so we were able to
| identify and fix all identified cases. But the person-to-
| Git-blame for this issue happened to be one of my
| mentees, and... they were blown away by it.
|
| As in: literally heartbroken. Unable to talk about it.
| "But the compiler said it was okay!", crying. One of my
| coworkers pointed at MIRI, which correctly warned about
| the issue-at-hand, at which point I recommended
| incorporating that tool into the build pipeline, as well
| as (the usual advice in cases such as this) improving
| unit tests and focusing on X-1 and X+1 cases that might
| be problematic.
|
| To this day, I'm _truly_ worried about my mentee. I'm
| just a C# wagie, and I fully accept that my code, my
| language, my compiler, and my runtime environment are all
| shit.
|
| But, as evidenced by my experience and supported by the
| voting in this thread, it seems that Rust users seem to
| self-identify with the absolute infallibility of anything
| related to the language, and react quite violently and
| self-destructively to any evidence to the contrary.
|
| As a community leader, do you see any room for
| improvement there? And if not, what would it take to
| convince you?
| n_plus_1_acc wrote:
| The Rust community as a whole very much promotes the idea
| of trusting the Compiler. Which is a very useful thing,
| especially for folks coming from other languages like C.
| It's not perfect of course as the compiler has bugs, but
| I think it's still a good thing to teach.
| steveklabnik wrote:
| > using AVX
|
| This would require using unsafe code.
|
| > As in: literally heartbroken. Unable to talk about it.
|
| I would hope that this person improves as an engineer,
| because this isn't particularly professional behavior,
| from the way you describe it.
|
| > "But the compiler said it was okay!"
|
| Given that you'd have to use unsafe to do this, the
| compiler _can't_ say it was okay. It sounds like this
| person may not fully understand Rust either.
|
| > it seems that Rust users seem to self-identify with the
| absolute infallibility of anything related to the
| language, and react quite violently and self-
| destructively to any evidence to the contrary.
|
| I don't see how this generalizes. You had one (apparently
| junior, given "mentee"?) person make a mistake and
| respond poorly to feedback. You also barged into this
| thread and made incorrect statements about Rust, and were
| downvoted for it. That doesn't mean that Rust users think
| everything is perfect.
|
| > As a community leader, do you see any room for
| improvement there?
|
| I _do_ think sometimes enthusiastic people who don't
| understand things misrepresent the thing they're
| enthusiastic about, but that's a human problem, not a
| Rust problem. I do not think there's a way to fix that,
| no.
| neonsunset wrote:
| Don't worry, your language and _especially_ the runtime
| and compiler are great. Particularly so in the last few
| years. I wouldn't worry about the noise, maybe it
| concerns C++, but C# is a strict productivity upgrade for
| general-purpose applications despite _some_ * of the
| dated bits in the language (but not the runtime).
|
| * like un-unified representation of nullable reference
| types and structs under generics for example, or just the
| weight of features over the years, still makes most other
| alternatives look abysmal in comparison
| keybored wrote:
| > I'm just a C# wagie, and I fully accept that my code,
| my language, my compiler, and my runtime environment are
| all shit.
|
| What is shit about those things for C#? That's the
| application programming language that seems to get the
| least flak out of all of them.
|
| If I'm using an alpha or beta compiler, I might suspect a
| compiler bug from time to time... not really when I'm
| working in a decades-old, very established language.
| keybored wrote:
| Indeed. There have been UB bugs in the standard library
| caused by unsafe blocks.
|
| Those are bugs. They are faults in the code. They need to
| be fixed. They are not UB-as-a-feature like in C/C++. "Well
| watch out for those traps every time you use this."
|
| This is like getting mad that a programming language boasts
| that it produces great binaries and yet the compiler has a
| test suite to catch bugs in the emitted assembly. That's
| literally what you are doing.
| Calavar wrote:
| > Those are bugs. They are faults in the code. They need
| to be fixed. They are not UB-as-a-feature like in C/C++.
|
| Rust has UB-as-a-feature too. They could have eliminated
| UB from the language entirely, but they chose not to (for
| very valid reasons in my opinion).
|
| UB is a set of contracts that you as the author agree to
| never violate. In return, you get faster code under the
| assumption that you never actually encounter a UB
| condition. If you violate those contracts in Rust and
| actually encounter UB, that's a bug, that's a fault in
| the code. If you violate those contracts in C++, that's a
| bug, that's a fault in the code. This is the same in both
| languages.
|
| It's true that Rust UB can only _arise_ from unsafe
| blocks, but it is not _limited_ to unsafe blocks. Rust UB
| has "spooky action at a distance" the same way C++ UB
| does. In other words, you can write UB free code in Rust,
| but if any third party code encounters UB (including the
| standard library), your safe code is now potentially
| infected by UB as well. This is also the same in both
| languages.
|
| There are good reasons to favor Rust's flavor of UB over
| C++'s, but I keep seeing these same incorrect arguments
| getting repeated everywhere, which is frustrating.
| keybored wrote:
| > There are good reasons to favor Rust's flavor of UB
| over C++'s, but I keep seeing these same incorrect
| arguments getting repeated everywhere, which is
| frustrating.
|
| Tell me what I wrote that was incorrect. I called them UB
| bugs in the standard library. If they were trivial bugs
| that caused some defined-behavior logic bug while used
| outside of the standard library then it wouldn't rise to
| the level of being called an UB bug.
| Calavar wrote:
| > They are not UB-as-a-feature like in C/C++.
|
| That's the part that's incorrect. That, plus the
| implication that UB is a bug in Rust, but not in C++. As
| I said, the existence of UB is a feature in both
| languages and actually encountering UB is a bug in both
| languages. You can play with the semantics of the word
| "feature" but I don't think it's possible to find a
| definition that captures C++ UB and excludes Rust UB
| without falling into a double standard. Unfortunately
| double standards on UB are pretty common in conversations
| about C++ and Rust.
| keybored wrote:
| You're done editing the comment now?
|
| Do you think UB-as-feature is something that someone
| would honestly describe C or C++ as? It's a pretty
| demeaning way of framing things. Indeed it's a tongue-in-
| cheek remark, a whimsical exaggeration/description of the
| by-default UB of those languages which was added to the
| end of the completely factual description of the role
| that finding UB in the Safe Rust subset of the standard
| library of Rust serves.
|
| Of course one cannot, from the Rust Side so to speak, use
| tongue in cheek, off-hand remarks in these discussions;
| one must painstakingly add footnotes and caveats, list
| and mention every trivial fact like "you can get UB in
| unsafe blocks"[1] or else you have a "double standard".
|
| [1] Obligatory footnote: even though all participants in
| the discussion clearly know this already.
| oconnor663 wrote:
| > It's true that Rust UB can only arise from unsafe
| blocks, but it is not limited to unsafe blocks.
|
| This is correct, and it's hard to teach, and I agree that
| a lot of folks get it wrong. (Here's my attempt:
| https://jacko.io/safety_and_soundness.html.) But I think
| this comment is understating how big of a difference this
| makes:
|
| 1. Rust has a large, powerful safe subset, which includes
| lots of real-world programs. Unsafe code is an advanced
| topic, and beginners don't need to learn about it to
| start getting their work done. Beginners can contribute
| to big projects without touching the unsafe parts (as you
| clarified, that means the _module privacy boundaries_
| that include unsafe code, not just the unsafe blocks),
| and reviewers don 't need to be paranoid about every
| line.
|
| 2. A lot of real-world unsafe Rust is easy to audit,
| because you can grep for `unsafe` in a big codebase and
| zoom right to the parts you need to look at. Again, as
| you pointed out, those blocks might not be the whole
| story, and you do need to read what they're doing to see
| how much code they "infect". But an experienced Rust
| programmer can audit a well-written codebase in
| _minutes_. It's not always that smooth of course, but
| it's a totally different world that that's even possible.
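|
| A hedged sketch of the "module privacy boundary" point above, using
| a hypothetical `SmallBuf` type (not from the thread). The only
| `unsafe` block is in `as_slice`, but its soundness depends on the
| safe code in `push` keeping `len <= 8`. Code outside the module
| cannot touch the private `len` field, so the audit surface is the
| module rather than the whole program.

```rust
mod small_buf {
    /// Fixed-capacity byte buffer. Invariant: `len <= 8`.
    pub struct SmallBuf {
        bytes: [u8; 8],
        len: usize, // private: only code in this module can break the invariant
    }

    impl SmallBuf {
        pub fn new() -> Self {
            SmallBuf { bytes: [0; 8], len: 0 }
        }

        /// Safe code, but it guards the invariant the `unsafe` block relies on.
        pub fn push(&mut self, b: u8) -> bool {
            if self.len == self.bytes.len() {
                return false; // full; removing this check would make `as_slice` unsound
            }
            self.bytes[self.len] = b;
            self.len += 1;
            true
        }

        pub fn as_slice(&self) -> &[u8] {
            // SAFETY: `len <= 8` is maintained by every method in this module.
            unsafe { self.bytes.get_unchecked(..self.len) }
        }
    }
}
```

| A bug in `push` (say, incrementing `len` twice) would be written
| entirely in safe code and would still lead to UB in `as_slice`.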
| estebank wrote:
| > That _documented_ use of _safe_ Rust can easily lead to
| UB
|
| The only thing that comes to mind that this could be
| referring to are the open bugs at
| https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Ais....
| Are these what
| you're referring to?
|
| > this infernal 'internal compiler representation'
|
| What makes MIR "infernal"?
|
| > I'm not even sure what is even remotely confusing about
| that?
|
| You posted a link to a tool that executes pure rust
| libraries and evaluates memory accesses (both from safe and
| unsafe rust code) to assert whether they conform to the
| rust memory model. It sits in the same space as valgrind.
| You left it open to interpretation with really no other
| context. We can be excused for not knowing what you were
| trying to say. I personally still don't.
| the8472 wrote:
| The trophy cases in miri are about bugs in _unsafe_ code. Yes,
| you can write UB with unsafe code. This should not be news.
|
| And miri is a blessing. There even is a known case where
| someone found a bug in C by translating it to rust and then
| running it through miri.
| mike_hearn wrote:
| That sounds ... hard. Especially as idiomatic Rust as written by
| skilled programmers looks nothing like C, and most interesting
| code is written in C++ anyway.
|
| Isn't it equivalent to statically determining the lifetimes of
| all allocations in the C program, including those that are
| implemented using custom allocators or which cross into
| proprietary libraries? There's been a lot of research into this
| sort of thing over the years without much success. C/C++ programs
| can do things like tie allocation lifetimes to what buttons a
| user clicks, without ref counting or other mechanisms to ensure
| safety. It's not a good idea, but, they can do it.
|
| The other obvious problem with trying to write such a static
| analysis is that the programs you're analyzing are by definition
| buggy and the lifetimes might not make sense (if they did, they
| wouldn't have memory safety holes and wouldn't need to be
| replaced). The only research I've seen on this problem of
| statically detecting what lifetimes should be does assume the
| code being analyzed is actually correct to begin with. I guess
| you could try and aim for a program that detects where lifetimes
| can't be worked out and asks the developer for help though.
| mattgreenrocks wrote:
| Projects are termed DARPA-hard for a reason.
| woodruffw wrote:
| It's very hard; DARPA likes to fund hard things[1] :-).
|
| This isn't, however, DARPA's first foray into automatic program
| translation, or even automatic translation into Rust[2].
|
| [1]:
| https://www.urbandictionary.com/define.php?term=DARPA%20hard
|
| [2]: https://c2rust.com/
| the_snooze wrote:
| DARPA is basically a state-sponsored VC that optimizes for
| completely different things. Instead of looking for 100x
| financial returns, they want technical advantages for the
| United States. The "moat" is the hardness of developing and
| operationalizing those technologies first.
| woodruffw wrote:
| DARPA's commercialization track record is decidedly mixed,
| so the VC comparison is unexpectedly apt :-)
|
| (But yes: DARPA's mandate is explicitly to discover and
| develop the next generation of emerging technologies for
| military use.)
| pfdietz wrote:
| Decades ago, as my father explained to me, ARPA (no "D"
| at that time) was happy if 1% of their projects went all
| the way through to successful deployment. If they had a
| higher success rate it would mean they weren't aiming
| high enough.
| VikingCoder wrote:
| > DARPA's commercialization track record is decidedly
| mixed...
|
| If you count by number of attempts, sure.
|
| If you count by impact, it's hard to come up with many
| things more impactful than the Internet...?
| woodruffw wrote:
| Yeah, I meant by number. But also: ARPA didn't
| commercialize the Internet! They explicitly refused to
| commercialize it; commercialization only happened after
| an Act of Congress induced interconnections between
| NSFNET and commercial networks.
| mburns wrote:
| To be pedantic, In-Q-Tel is the literal state-sponsored VC.
|
| DARPA is a step closer to traditional research labs but
| there is obviously some overlap.
|
| https://en.wikipedia.org/wiki/In-Q-Tel
| throwup238 wrote:
| _> DARPA is a step closer to traditional research labs
| but there is obviously some overlap._
|
| It's more like the NSF but focused on commercial grantees
| with project management thrown on top to orchestrate
| everything.
|
| The really unique part is how much independence each
| program manager has and the term limits that prevent
| empire building.
| fsckboy wrote:
| in this case it seems to me the hard task that DARPA has
| chosen is to get me to forget how much they spent on pushing
| Ada.
| woodruffw wrote:
| I can't find any clear references to DARPA (or ARPA) being
| involved in Ada's development. It was a DoD program but,
| well, the DoD is notoriously large and multi-headed.
|
| (But even if DARPA was involved in Ada: I think it's clear,
| at this point, that Ada has been a resounding success in a
| _small_ number of domains without successfully breaking
| into general-purpose adoption. I don't have a particular
| value judgment associated with that, but from a strategic
| perspective it makes a _lot_ of sense for DARPA to focus
| program analysis research on popular general-purpose
| languages -- there's just more labor and talent
| available.)
| reaperducer wrote:
| _in this case it seems to me the hard task that DARPA has
| chosen is to get me to forget how much they spent on
| pushing Ada._
|
| You hate jumbo jets, high-speed trains, air traffic
| control, and satellites?
| 9659 wrote:
| Do you know what fear is? Getting in an airplane where
| the flight controls use NPM.
| warkdarrior wrote:
|     npm ERR! install Couldn't read dependencies
|     npm ERR! package.json ENOENT, open '/boeing/787-9/flaps-up.json'
|     npm ERR! package.json This is most likely not a problem with npm itself.
|     npm ERR! package.json npm can't find a package.json file in your
|     current directory.
| 9659 wrote:
| ada does not require 'pushing'.
|
| once the maturity of the users advances to a sufficient
| point, then ada is the only solution.
|
| "ada. used in creating reliable software since 1983"
|
| when i first saw ada, i didn't understand the why. now i
| understand the why, but ada is effectively gone.
|
| -- old fortran / C / Assembly programmer
| 01HNNWZ0MV43FF wrote:
| Can't most C++ be machine-lowered to C?
| woodruffw wrote:
| _Lowering_ is typically easier than _lifting_ (or
| brightening). When you lower, you can erase higher-level
| semantics that aren't relevant; when you lift, you generally
| _want_ to compose lower-level program behaviors into their
| idiomatic (and typically safer) equivalent.
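|
| To illustrate what "lifting" asks for, here is a small sketch with
| two hypothetical functions (names are illustrative). The first is
| what a literal translation of a C loop tends to look like; the
| second is the idiomatic form a lifting tool would have to recognize
| and recover.

```rust
/// Literal, "unlifted" translation of a C loop: index arithmetic survives.
fn max_lowlevel(xs: &[i32]) -> Option<i32> {
    if xs.is_empty() {
        return None;
    }
    let mut best = xs[0];
    let mut i = 1;
    while i < xs.len() {
        if xs[i] > best {
            best = xs[i];
        }
        i += 1;
    }
    Some(best)
}

/// Lifted version: the same behavior expressed through the iterator idiom.
fn max_lifted(xs: &[i32]) -> Option<i32> {
    xs.iter().copied().max()
}
```

| Going from the second form to the first is mechanical; going the
| other way requires recognizing the intent behind the loop.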
| childintime wrote:
| Hard for humans. But it's DARPA, is it hard for AI? Image
| classification used to be hard also, today cars drive
| themselves.
|
| I'd say it's good timing.
| mike_hearn wrote:
| Well, Claude 3.5 can do translation from one language to
| another in a fairly competent manner if the languages are
| close enough. I've used it for that task myself with success
| (Java -> JavaScript).
|
| But, this isn't just about rewriting code from one language
| to another. It's about reverse engineering complex
| information out of the code, which may not be immediately
| visible in it, and then finding a way to make it "safe"
| according to Rust's type system. Where's the training data
| for that? It'd be really hard even for skilled humans.
|
| Personally I think the most pragmatic way to make C/C++
| memory safe quicker is one of two approaches:
|
| 1. Incrementally. Make std::vector[] properly bounds checked
| (still not done even in chrome!), convert allocations to
| allocations that know their own size and do bounds checking
| e.g. https://issues.chromium.org/issues/40285824
|
| 2. Or, go the whole hog and use runtime techniques like
| garbage collection and runtime bounds checks.
|
| A good example of approach (2) is Managed Sulong, which
| extends the JVM to execute LLVM bitcode directly whilst
| exposing to the C/C++/FORTRAN a virtualized Linux syscall
| interface. The whole piece of code can be sandboxed with
| permissions, and memory safety errors are caught at runtime.
| The compiler tries to optimize out as many bounds checks as
| possible. The interesting thing about this approach is it
| doesn't require big changes to the source code (as long as
| it's already been ported to Linux), which means the work of
| making something safe can be done by teams independent of the
| original authors. In practice "rewrite it in Rust" will
| usually mean a fork, which introduces lots of complicated
| technical, cultural and economic issues.
|
| Managed Sulong is also a research project and has a bunch of
| problems to solve, for instance it needs to lose the JITC
| dependency and go fully AOT compiled (doable, there's no
| theoretical issue with it and much of the needed infra
| already exists). And performance/memory usage can always be
| improved of course, it regresses vs the original C. But those
| are "just" systems engineering problems, not rewrite-the-
| world and solve-static-analysis problems.
|
| Disclosure: I do work part time at Oracle Labs which
| developed Managed Sulong, but I don't work on it.
| TinkersW wrote:
| std::vector [] has had bounds checking since forever if you
| set the correct compiler flag. Since they aren't using it
| this is a choice, presumably they prefer the speed gain.
| mike_hearn wrote:
| You mean _GLIBCXX_DEBUG? It's got some issues. Linux
| only, it doesn't always work [1] and it's all or nothing.
| What's really needed is the ability to selectively opt-
| out on a per-instantiation level so very hot paths can
| keep the needed performance whilst all the rest gets
| opted into safety checks.
|
| Microsoft has this:
|
| https://learn.microsoft.com/en-us/cpp/standard-library/safe-...
|
| but it doesn't seem to actually make std::vector[] safe.
|
| It's frustrating that low hanging fruit like this doesn't
| get harvested.
|
| [1] "although there are precondition checks for some
| string operations, e.g. operator[], they will not always
| be run when using the char and wchar_t specializations
| (std::string and std::wstring)."
| Calavar wrote:
| As far as I am aware, the standard doesn't mandate bounds
| checking for std::vector::operator[] and probably never
| will for backwards compatibility reasons. Most standard
| library implementations have opt-out std::vector[] bounds
| checking in unoptimized builds, but not in optimized
| builds.
|
| I tried a toy example with GCC [1], Clang [2], and MSVC
| [3], and none of them emit bounds checks with basic
| optimization flags.
|
| [1] https://godbolt.org/z/W5e3n5oWM
|
| [2] https://godbolt.org/z/Pe8nPPvEd
|
| [3] https://godbolt.org/z/YTdv3nabn
| Animats wrote:
| > But, this isn't just about rewriting code from one
| language to another. It's about reverse engineering complex
| information out of the code, which may not be immediately
| visible in it, and then finding a way to make it "safe"
| according to Rust's type system. Where's the training data
| for that? It'd be really hard even for skilled humans.
|
| That might not be too bad.
|
| A combination of a formal system and an LLM might work
| here. Suppose we see a C function
|
|     void somefn(char* buf, int n);
|
| First question: is "buf" a pointer to an array, or a
| pointer to a single char? That can be answered by looking
| at what the function does with "buf", and what callers pass
| to it.
|
| If it's an array, how big is it? We don't have enough info
| to know that yet. But a reasonable guess, and one that an
| LLM might make, is that the length of buf is "n".
|
| Following that assumption, it's reasonable to translate
| this to Rust as
|
|     fn somefn(buf: &[u8])
|
| and, if n is needed within the function, use
|
|     buf.len()
|
| The next step is to validate that guess. The run-time
| approach is to write all calls to "somefn" with
|
|     assert!(buf.len() == n);
|     somefn(buf, n);
|
| Maybe formal methods can prove the assert true, and we can
| take it out. Or if a SAT solver or a fuzz tester can
| generate a counterexample, we know that the guess was wrong
| and this has to be done the hard way, as
| fn somefn(buf: &[u8], n: usize)
|
| implying more subscript checks inside "somefn".
|
| The idea is to recognize common C idioms and do clean
| translations to Rust for them. This should handle a high
| percentage of cases.
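|
| A minimal sketch of the translation described above, in Rust. The
| function bodies and the names somefn_checked and somefn_with_n are
| made up for illustration; only the signatures follow the comment.
| It shows the clean slice version, a call-site shim that asserts
| the "n is the length" guess, and the fallback that keeps n.

```rust
/// Idiomatic translation of `void somefn(char* buf, int n)` under the
/// guess that `n` is always the length of `buf`. The slice carries its
/// own length, so `n` drops out of the signature.
fn somefn(buf: &[u8]) -> usize {
    // Stand-in body (hypothetical): count the zero bytes.
    buf.iter().filter(|&&b| b == 0).count()
}

/// Call-site shim used while the guess is still being validated.
/// Panics if a caller ever passes an `n` that is not the buffer length.
fn somefn_checked(buf: &[u8], n: usize) -> usize {
    assert!(buf.len() == n, "guess violated: n is not the length of buf");
    somefn(buf)
}

/// "The hard way": `n` is independent of the buffer length, so every
/// use of it needs a bounds check. Returns None if `n` is too large.
fn somefn_with_n(buf: &[u8], n: usize) -> Option<usize> {
    let prefix = buf.get(..n)?; // checked sub-slice instead of raw indexing
    Some(somefn(prefix))
}
```

| During validation, call sites would go through somefn_checked;
| once the assert is proven (or never fires under fuzzing), they
| can call somefn directly.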
| Calavar wrote:
| > today cars drive themselves
|
| You can attach about a hundred asterisks to that.
|
| If anything, I think the failure to hit L5 self-driving after
| billions of dollars and millions of man hours invested is
| probably reflective of how automatic C to Rust translation
| will go. We'll cruise 90% of the way, but the last 10% will
| prove insurmountable with current technology.
|
| Think about the number of C programs in the wild that rely on
| compiler-specific or libc-specific or platform-specific
| behavior, or even undefined behavior plus the dumb luck of a
| certain brittle combination of {compiler version} × {libc
| version} × {linker version} × {build flags} emitting
| workable machine code. There's a huge chunk of C software
| where there's not enough context within the source itself (or
| even source plus build scripts) to understand the behavior.
| It's not even clear that this is a solvable problem in the
| abstract.
|
| None of that is to say that DARPA shouldn't fund this.
| Research isn't always about finding an industrial strength
| end product; the knowledge and expertise gained along the way
| is important too.
| psychoslave wrote:
| OK, but if something like 90% of small projects can use it as
| a direct, painless bridge, that can be a huge win.
|
| Even if it can handle only 90% of the transition well for
| any given project, this is still interesting. Unlike cars on
| the road, most code transition projects out there don't need
| to be 100% fine to provide some useful value.
| 0cf8612b2e1e wrote:
| Even if every project can only be 90% done, that's a huge
| win. Best would be if it could just wrap the C equivalent
| code into an unsafe block which would be automatically
| triaged for human review.
|
| Just getting something vaguely Rust shaped which can
| compile is the first step in overcoming the inertia to
| leave the program in its current language.
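|
| A rough sketch of what such output could look like, in Rust. The
| function last_or and the REVIEW marker are invented for
| illustration; this is not the output of any existing tool. The
| pointer arithmetic inherited from C sits in one unsafe block that
| a triage pass could find by searching for the marker.

```rust
/// Hypothetical translator output for a C function
/// `int last_or(const int *buf, int n, int dflt)` that returns
/// `buf[n - 1]`, or `dflt` when `n` is 0.
fn last_or(buf: &[i32], dflt: i32) -> i32 {
    if buf.is_empty() {
        return dflt;
    }
    // REVIEW(unsafe): pointer arithmetic carried over from the C code.
    // SAFETY: `buf` is non-empty, so index `len() - 1` is in bounds.
    unsafe { *buf.as_ptr().add(buf.len() - 1) }
}
```

| Everything outside the marked block is ordinary safe Rust, so a
| reviewer only has to check that one justification.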
| programd wrote:
| > > today cars drive themselves
|
| > You can attach about a hundred asterisks to that.
|
| Not in San Francisco. There are about 300 Waymo cars safely
| driving in one of the most difficult urban environments
| around (think steep hills, fog, construction, crazy
| traffic, crazy drivers, crazier pedestrians). Five years
| ago this was "someday" science-fiction. Frankly I trust
| them much more than human drivers and envision a future
| utopia where human drivers are banned from urban centers.
|
| To get back on topic, I don't think automatic programming
| language translation is nearly as hard, especially since we
| have a deterministic model of the machines it runs on. I
| can see a possible approach where AI systems take the
| assembler code of a C++ program, then translate that into
| Rust, or anything else. Can they get 100% accuracy and bit-
| for-bit compatibility on output? I would not bet against
| it.
| creata wrote:
| Isn't 100% accuracy (relatively) easy? c2rust already
| does that, or at least comes close, as far as I know.
|
| Getting identical outputs on safe executions, catching
| any unsafe behavior (at translation-time or run-time),
| and producing efficient, maintainable code all at once is
| a million times harder.
| m0llusk wrote:
| Opinions about automated driving systems vary. Just from
| my own experience doing business all around San Francisco
| I have seen at least a half dozen instances of Waymo
| vehicles making unsafe maneuvers. Responders have told me
| and local government officials that Waymo vehicles
| frequently fail to acknowledge emergency situations or
| respond to driving instructions. Driving is a social
| exercise which requires understanding of a number of
| abstractions.
| saagarjha wrote:
| San Francisco, for all its challenges, mostly has traffic
| laws that people follow. This is not true throughout the
| world.
| D-Coder wrote:
| In addition to the other replies, this is a one-time
| project. After everything (or almost everything) has been
| translated, you're done, you won't be running into new edge
| cases.
| sqeaky wrote:
| This is the exact formulation of the argument before
| computers beat humans at chess, or drew pictures, or
| represented color correctly, or... Self driving cars will
| be solved. There is at least one general purpose computer
| that can solve it already (a human brain), so a
| purpose-built computer can also be made to solve it.
|
| In 10 (or 2 or 50 or X) years when Chevy, Ford, and others
| are rolling out cheap self driving this argument stops
| working. The important thing is that this argument stops
| working with no change in how hard C to Rust conversion is.
|
| We really should be looking at the specifics of both
| problems. What makes computer language translation hard?
| Why is driving hard? One needs to be correct while
| inferring intent and possibly reformulating code to meet
| new restrictions. The other needs to be able to make snap
| judgments and in realtime avoid hitting things even if it
| just means stopping to prefer safety over motion. One
| problem can be solved piecewise without significant regard
| to time and the other solved in realtime as it happens
| without producing unsafe output.
|
| These problems really aren't analogous.
|
| I think you picked self driving cars just because it is a
| big and only partially solved problem. One could just as
| easily pick a big solved problem or a big unstarted problem
| and formulate equally bad arguments.
|
| I am not saying this problem is easy, just that it seems
| solvable with sufficient effort.
| mywittyname wrote:
| > These problems really aren't analogous.
|
| I'd put money on the solutions to said problems looking
| largely the same though - big ass machine learning
| models.
|
| My prediction is that a tool like copilot (but
| specialized to this domain) will do the bulk of source
| code conversions, with a really smart human coming behind
| to validate.
| eesmith wrote:
| As a reminder, DARPA funded self-driving car research since
| at least the 1980s with the Autonomous Land driven Vehicle
| (ALV) project, plus the DARPA Grand Challenges, and more.
| sam0x17 wrote:
| speaking of hard, the DOE actually funds a project that has
| been around for 20+ years now (ROSE) that involves (among other
| things) doing static analysis on and automatically translating
| between C/C++/Cuda and even high level languages like Python as
| well as HPC variants of C/C++. They have a combined AST that
| supports all of those languages with the same set of node types
| essentially. Quite cool. I got to work on it when I was an
| intern at Livermore, summer of 2014.
|
| and it's open source as well!
| http://rosecompiler.org/ROSE_HTML_Reference/index.html
| jandrese wrote:
| I have to think the approach will be something like "AI
| summarizes the features of the program into some kind of
| technical language, then the AI synthesizes Rust code that
| covers the same feature set".
|
| It would be most interesting if the approach was not to feed
| the program the original program but rather the manual for the
| program. That said it's rare that a manual captures all of the
| nuances of the program so a view into the source code is
| probably necessary, at least for getting the ground truth.
| munificent wrote:
| More like:
|
| "AI more or less sort of summarizes the features of the
| program into some approximate kind of technical language,
| then the AI synthesizes something not too far from Rust code
| that hopefully covers aspirationally the same feature set".
| downrightmike wrote:
| If the IRS could have more timely funding, all their Cobol
| would be translated to Java by now
| psunavy03 wrote:
| COBOL migrations are tar pits of replicating 40+ years of
| undocumented niche business logic for a given field, edge
| cases included, that was "commonly understood" by people who
| are now retired or dead. Don't get your hopes up.
| the8472 wrote:
| Write tests for your C code. Run c2rust (mechanical
| translation), including the tests. Let a LLM/MCTS/verifier loop
| go to town. Verifier here means it passes compiler checks,
| tests, sanitizers and miri.
|
| Additional training data can be generated by running mrustc or
| by inlining unsafe code (from std/core/leaf crates) into safe
| code and running semantics-preserving mechanical refactorings
| on the code.
|
| This can be closer to AlphaProof than ChatGPT.
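|
| One piece of such a verifier loop might be a differential check
| between the mechanical translation and a candidate safe rewrite.
| A sketch in Rust, with a made-up summing function: sum_raw only
| imitates the shape of mechanically translated code and is not
| actual c2rust output. Wrapping arithmetic is an arbitrary choice
| here, since signed overflow is undefined in C.

```rust
/// Stand-in for a mechanical translation: raw pointer plus length,
/// same shape as the C original.
///
/// # Safety
/// `buf` must be valid for reads of `n` consecutive `i32` values.
unsafe fn sum_raw(buf: *const i32, n: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i = 0;
    while i < n {
        // SAFETY: the caller guarantees `buf` is valid for `n` reads.
        total = total.wrapping_add(unsafe { *buf.add(i) });
        i += 1;
    }
    total
}

/// Candidate safe rewrite, as a refactoring step might propose it.
fn sum_safe(buf: &[i32]) -> i32 {
    buf.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

/// One verifier check: do both versions agree on this input?
fn agrees_on(input: &[i32]) -> bool {
    // SAFETY: pointer and length come from the same live slice.
    let old = unsafe { sum_raw(input.as_ptr(), input.len()) };
    old == sum_safe(input)
}
```

| A real loop would feed agrees_on from the project's test suite
| or a fuzzer and reject any rewrite that disagrees.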
| rectang wrote:
| I have to imagine that in the general case it will be a
| translation to unsafe Rust, with occasional isolated leaf nodes
| being translated to safe Rust.
|
| If you think it's hard wrestling with the borrow checker, just
| imagine how much harder it is to write automatic translation to
| borrow-checker-approved code that accounts for all the possible
| program space of C and all its celebrated undefined behavior.
| A classic problem of writing compilers is that the space of
| valid programs is much larger than the space of programs which
| will compile.
|
| A quick web search reveals some other efforts, such as c2rust
| [1]. I wonder how TRACTOR differs.
|
| [1] https://github.com/immunant/c2rust
| Someone wrote:
| > have to imagine that in the general case it will be a
| translation to unsafe Rust, with occasional isolated leaf
| nodes being translated to safe Rust.
|
| That's not what they are aiming for. FTA: _"The goal is to
| achieve the same quality and style that a skilled Rust
| developer would produce"_
|
| > just imagine how much harder it is to write automatic
| translation to borrow-checker-approved code that accounts for
| all the possible program space of C and all it's celebrated
| undefined behavior
|
| Nitpick: undefined behavior gives the compiler leeway in
| deciding what a program does, so the more undefined behavior
| a C program invokes, the easier it is to translate its code
| to rust.
|
| (Doing that translation in such a way that the behavior
| remains what gcc, clang or "most C compilers" do may be
| harder, but I'm not sure of that)
| rectang wrote:
| > _undefined behavior gives the compiler leeway in deciding
| what a program does, so the more undefined behavior a C
| program invokes, the easier it is to translate its code to
| rust._
|
| That's the kind of language lawyer approach that caused a
| rebellion in the last decade amongst C programmers against
| irresponsible compiler optimizations. "Who cares if your
| program actually works as intended? My optimization is
| legal according to the standard, it's _your_ program
| that's written to exploit loopholes".
|
| I don't see any evidence that that's the attitude being
| taken by TRACTOR -- I sure hope it isn't. But hell, even if
| the result is unreliable in practice, I suppose that if
| somebody gets to claim "it works" then the incentives are
| aligned to produce garbage.
| atiedebee wrote:
| > "Who cares if your program actually works as intended?
| My optimization is legal according to the standard, it's
| your program that's written to exploit loopholes".
|
| If your program invokes undefined behaviour, it's invalid
| and non-portable. Out of bounds array accesses are UB,
| yet a program containing them may just happen to work. It
| won't be portable even between different compiler
| versions.
|
| The C standard is a two-way contract: the programmer
| doesn't produce code that invokes undefined behaviour,
| and the compiler returns a standard-conforming executable.
| rectang wrote:
| The C standard with its extensive undefined behavior
| causes programmers and compiler writers to be at odds. In
| a sane world, "undefined behavior" wouldn't be assumed to
| mean "the programmer must have meant for me to optimize
| this whole section of code away". We aren't on the same
| team, even if I believe that all parties are acting with
| the best of intentions.
|
| I don't feel that the Rust language situation
| incentivizes such awful conflict, and it's one of many
| reasons I now try _really_ hard to avoid C and use Rust
| instead.
| Asooka wrote:
| Doing one funny thing on platform A and a different funny
| thing on platform B when an edge case arises is way
| better than completely deleting the code on all platforms
| with no warning.
| derdi wrote:
| > undefined behavior gives the compiler leeway in deciding
| what a program does, so the more undefined behavior a C
| program invokes, the easier it is to translate its code to
| rust.
|
| You assume that the compiler can determine what behavior is
| undefined. It can't. C compilers don't just look at some
| individual line of the program and say "oh, that's
| undefined, unleash the nasal demons". C compilers look at
| code, reason that _if_ such-and-such variable has a certain
| value (say, a null or invalid pointer), then such-and-such
| operation is undefined (say, dereferencing that variable),
| and _therefore_ on the next line that variable can be
| assumed not to have that bad value. Despite all the FUD,
| this is a very limited power. C compilers don't usually
| know the actual values in question, all they do is exclude
| some invalid ones.
| kragen wrote:
| presumably dan wouldn't have gotten darpa funding if it were
| obviously feasible, and success wouldn't give him anything
| publishable academically
| dgacmu wrote:
| Just to be clear to others, Dan is the darpa PM on this - he
| convinced darpa internally it was worth funding other people
| to do the work, so he himself / his research group won't be
| doing this work. He's on leave from Rice for a few years to
| be a PM at DARPA's I2O.
|
| And while DARPA doesn't directly care about research
| publications as an outcome, there's certainly a publishable
| research component to this, as well as a lot of lower papers-
| per-$ engineering and validation work. A lot of the contracts
| they hand out end up going to some kind of contractor prime
| (BBN, Raytheon, that kind of company) with one or more
| academic subs. The academic subs publish.
| kragen wrote:
| thank you for the correction; I didn't realize he was the
| darpa pm
|
| what you describe is exactly my experience as a darpa
| performer (on a program which dan is apparently now the pm
| for!)
| niemandhier wrote:
| Is this supposed to be automatic? And if so, wouldn't any
| program that can automatically port C to Rust, by necessity,
| contain all the functionality to make the C code itself safe?
| gpm wrote:
| I don't think a reasonable reading of the statement implies
| "fully automated", at which point the answer to the question is
| no.
|
| Obviously some C code isn't just "not verifiably correct" but
| "actually wrong in a memory unsafe way". That code isn't going
| to be automatically translated without human intervention
| because, how could it be, there is no correct equivalent code.
| The tooling is going to have to have an escape hatch where it
| says "I don't know what this code is _meant_ to do, and I know
| it isn't meant to do what it does do (violate promises to the
| compiler), help me human".
|
| On a theoretical level it's not _possible_ for that escape
| hatch to only be used when undefined behaviour _does_ occur
| (Rice's theorem). On a practical level it's probably not even
| desirable to try because obtuse enough code shouldn't just be
| blindly translated.
|
| So what I imagine the tooling ends up looking like is an
| interactive tool that does the vast majority of the work for
| you, but is guided by a human, and ultimately as a result of
| that human guidance doesn't end up with _exactly_ equivalent
| code, just code that serves the same purpose.
| nanolith wrote:
| I'm personally not a fan of "rewrite the world in Rust"
| mentality, but that being said, if one is planning to port a
| project to a new language or platform, mechanical translation is
| a poor means of doing so. Spend the time planning better
| architecture and designing a better software system, and find a
| way to replace it piece by piece. Don't build a castle in the
| sky, because it will never reach the ground. If you've decided to
| use Rust for this system, that's fine. But, write Rust. Don't try
| to back-port C into Rust.
|
| I think a far better and more mature process is to update C to
| modern C and use a model checker such as CBMC to verify memory,
| resource, and integer math safety. One gets the same safety as a
| gradual Rust rewrite, but the code base, knowledge base, and
| developers can be maintained.
| pdimitar wrote:
| > _I'm personally not a fan of "rewrite the world in Rust"
| mentality_
|
| There is no such mentality anywhere. There is a ton of software
| that's much better off left alone in a dynamic language, or a
| statically typed language with a garbage collector (like
| Golang). Good engineers understand the idea of using the right
| tool for the job.
|
| The push is to start reducing those memory safety CVEs because
| they have been proven to be a real problem, many times over.
|
| > _mechanical translation is a poor means of doing so_
|
| Agreed. If we could automatically and reliably translate C/C++
| to Rust it would have been done already.
|
| > _Spend the time planning better architecture and designing a
| better software system, and find a way to replace it piece by
| piece._
|
| OK, I am just saying that somewhere along that process people
| might get a bout of confidence and tell themselves "oh, we're
| doing C much better now, we no longer write memory safety bugs,
| can't we stop here?" and they absolutely will. Cue another
| hilarious buffer overflow CVE 6 months later.
|
| > _I think a far better and more mature process is to update C
| to modern C and use a model checker such as CBMC to verify
| memory, resource, and integer math safety._
|
| A huge investment. If you are going to do that then you might
| as well just move to Rust.
|
| > _One gets the same safety as a gradual Rust rewrite_
|
| Maybe, but that sounds fairly uncertain or far from a clear
| takeaway to me.
| nanolith wrote:
| > A huge investment. If you are going to do that then you
| might as well just move to Rust.
|
| People say that, but the people who say this rarely have any
| practical experience using CBMC. It's very straight-forward
| to use. I could teach a developer to use it reliably, on
| practical software, in a month.
| pdimitar wrote:
| I am not denying it, nor am I claiming that "just move to
| Rust" is a universal escape hatch.
|
| What I am saying is that if it were as simple as "just
| learn CBMC" then maybe Microsoft and Google would have not
| published their studies demonstrating that 60% - 75% of all
| CVEs are memory safety errors like buffer under-/overflows.
| nanolith wrote:
| These studies aren't wrong. But, that's _also_ because
| neither Microsoft nor Google make use of practical formal
| methods in practice. Both have research teams and pie-in-
| the-sky projects, not dissimilar to this DARPA project.
| But, when it comes down to the nitty-gritty development
| cycle, both companies use decades old software
| development practices.
| uecker wrote:
| Rewriting is rarely a good idea in general. Rust proponents
| like to pretend that it is impossible to avoid safety issues
| in C while it is automatically given in Rust. But this is not
| so simple in reality.
| pdimitar wrote:
| I don't like generalizations... in general. :D
| (Addressing your "rewrites are rarely a good idea in
| general" here.)
|
| My experience tells me that if a tech stack supports
| certain safety guarantees by default that this leads to
| measurable reduction of those safety problems when you
| switch to the stack. People love convenient defaults,
| that's a fact of life.
|
| The apparently inconvenient truth is that most programmers
| are quite average and you can't rely on them going above
| and beyond to reduce memory safety errors.
|
| So I don't buy the good old argument of "just hire better C
| programmers". We still have a ton of buffer overflow CVEs
| regardless.
|
| And I never "pretended it's impossible to avoid safety
| issues in C". I'd appreciate it if you didn't lump me into
| some imaginary group of "Rust proponents".
|
| What I'm saying is this: _use the right tool for the job_.
| The C devs have been given _decades_ and yet memory safety
| CVEs are still prevalent.
|
| What conclusion would you arrive at if you were in my place
| -- i.e. not coding C for a living for like 18 years now but
| still witnessing it periodically crapping the bed?
|
| I'm curious of your take on this. Again, what other
| conclusion would you arrive at?
| uecker wrote:
| I am complaining about the usual phrases which are part
| of the Rust marketing, like the "just hire better C
| programmers did not work" or the "why are there still
| CVEs" pseudo arguments, etc.
|
| For example, let's look at the "hire better C programmers
| does not work" argument. Like every good propaganda it
| starts with a truism: In this case that even highly
| skilled C/C++ programmers will make mistakes that could
| lead to exploitable memory safety issues. The problem
| comes from exaggerating this to the idea that "all hope
| is lost and nothing can be done". In reality one can
| obviously do a lot of things to improve safety in C/C++.
| And even one short look at CVEs should make it clear that
| there is often huge room for improvements even with
| relatively simple measures. For example, a lot of memory
| safety bugs in C/C++ come from open-coded string or
| buffer manipulation. But it is not exactly rocket science
| to abstract this away behind a safer interface. But once
| this is understood, the obvious conclusion is that
| addressing some of these low-hanging fruits would be far
| more effective in improving safety than wasting a lot of
| time and effort in rewriting in Rust.
| IshKebab wrote:
| > I think a far better and more mature process is to update C
| to modern C and use a model checker such as CBMC to verify
| memory, resource, and integer math safety.
|
| No chance. CBMC is amazing, but have you actually tried
| formally verifying a "real" program?
|
| I agree replacing with a hand-architected Rust version is
| clearly the better solution but also more expensive. I think
| they're going for an RLBox style "improve security
| significantly with little-to-no effort" type product here. That
| doesn't mean you shouldn't do a full manual rewrite if you have
| the resources, but it's better than nothing if you haven't.
| nanolith wrote:
| > No chance. CBMC is amazing, but have you actually tried
| formally verifying a "real" program?
|
| Yes. Every day. It's actually quite easy to do. Write shadow
| methods covering the resources and function contracts of
| called functions, then verify the function. Repeat all of the
| way up and down the stack. It adds about 30% overhead over
| just TDD development.
| PhilipRoman wrote:
| Last time I tried CBMC, it ended up running out of memory
| for relatively small programs, do you encounter any
| resource usage issues with it? I'm learning Frama-C and I
| find it more predictable, although the non-determinism of
| solvers shocked me when I first tried to prove non-trivial
| programs. I guess ideally I would like something even more
| explicit than Frama-C.
| nanolith wrote:
| CBMC works best on functions, not programs. You want to
| isolate an individual function, then provide shadows of
| the functions it calls. The shadows should have
| nondeterministic behavior (cover every possible error
| condition) and otherwise follow the same memory and
| resource rules as the original function. For instance, if
| shadowing a function that reads a buffer, the shadow
| should ensure full buffer access as part of its
| assertions.
|
| The biggest issue you will run into with bounded model
| checking is recursion and looping. In these cases, you
| want to refactor the code to make it easier to formally
| verify outside of the loop. Capture and assert on loop
| variants / invariants, and feed these forward in
| assertions on code.
|
| There's no way I can capture all of this in an HN
| comment, but to get CBMC to work, you need to break down
| your code.
| PhilipRoman wrote:
| Thanks, that was really helpful. Relying on getting
| shadow functions right does seem icky, but I guess the
| improved productivity of CBMC should make up for it.
| Definitely going to give it another chance!
| nanolith wrote:
| You're welcome. I've been meaning to write a blog article
| on the subject, because it is a subtle thing to get
| working.
|
| Think of shadow functions as the specifications that you
| are building. Unlike proof assistants or Frama-C, you
| write specifications in C itself, and they work similarly
| to code. Often, the same contracts you write in these
| specifications can be shared by both the shadow functions
| and the real functions they shadow.
|
| I take a bottom-up approach to model checking. I'll start
| by model checking the lowest level code, then I'll shadow
| this code to model check code that depends on it. In this
| way, I can increase the level of abstraction for model
| checking, focusing just on the side effects and contracts
| of functions I shadow, and move up the stack toward more
| and more general code.
| Apofis wrote:
| This is definitely a pie-in-the-sky DARPA challenge that would
| be great to have around as we migrate away from legacy systems,
| however, even taking your functions/methods in one language and
| giving them to ChatGPT and asking it to translate your method
| to a different language generally doesn't work. Asking ChatGPT
| the initial problem you're trying to solve, works more
| frequently, but still generally doesn't work. You still need to
| do a lot of tinkering and thinking to get even basic things to
| work that it outputs.
| usrusr wrote:
| _If_ you have dormant code, as in running everywhere but not
| getting worked on anywhere, a "translate to shitty rust before
| ever touching again" has a certain appeal. Not the appeal of an
| obviously good idea: chances are the "shitty rust" created
| through translation would be so much worse to work on than C
| with some level of background noise of bugs (that would also be
| present in the "shitty rust" thanks to faithful translation).
| In C, people have an idea about how to deal with the problems.
| In "shitty rust", it's, well, shitty, because rust people are
| not used to that stuff.
|
| But there's a non-zero chance that someone could develop a
| skillset for iteratively cleaning up into something tolerable.
|
| And then there are non-goal things that could grow out of the
| project, e.g. some form of linter feedback "can't translate
| into tolerable rust because of x, y and z". C people could look
| into that, and once the code is translatable into good rust,
| why translate.
|
| _If_ that was an outcome of the project, some people might
| find it easier to describe their solution in runnable C and let
| the "translator/linter" guide them to a non-broken approach.
|
| I'd certainly consider all these positive outcomes quite
| unlikely, but isn't it pretty much the job description of DARPA
| to do the occasional dark horse bet?
| suprjami wrote:
| In my experience (supporting a machine-translated codebase
| which resulted in shitty Java) your theory doesn't play out.
|
| If you give developers a shitty codebase then those
| developers will leave to work somewhere else.
|
| After a few years of working on this codebase we had 88%
| turnover. 1 in 10 developers remembered the original
| project's design philosophy and intention.
|
| It wasn't a good situation.
| TinkersW wrote:
| Good luck with that... Also, shouldn't the target be C++ to Rust? Is
| there really that much pure C still being written?
| surfingdino wrote:
| IoT, embedded systems still use it. There's loads of them.
| riku_iki wrote:
| AGI may find much simpler, more robust/performant and safe
| language.
| deepsun wrote:
| They didn't explain why they've chosen Rust. There are a lot of
| memory-safe languages besides Rust, especially in application-
| level area (not systems-level like Rust).
| woodruffw wrote:
| There are a lot of memory safe languages; there are fewer that
| have (1) marginal runtime requirements, (2) transparent
| interop/FFI with existing C codebases, (3) enable both spatial
| and temporal memory safety without GC, and (4) have significant
| development momentum behind them. Rust doesn't _have_ to be
| unique among these qualifications, but it's currently
| preeminent.
| deepsun wrote:
| Yes, but you assume all their projects need all 4 of these. I
| like Rust, but it's a bad choice for many areas (e.g.
| aforementioned application-level code). I'd expect serious
| decisions to at least take that into account.
| woodruffw wrote:
| I'm not assuming anything of the sort. These are just
| properties that make Rust a nice target for automatic
| translation of C programs; there are myriad factors that
| _guarantee_ that nowhere close to 100% of programs (C,
| application level, or otherwise) will be suitable for
| translation.
| galangalalgol wrote:
| If you have your cross hair on c, then you want a language that
| can do whatever c does. That makes the list of memory safe
| languages a lot shorter.
| oconnor663 wrote:
| Apart from runtime/embedded requirements, there's the big
| question of how you represent what C is doing in other
| languages that don't have interior pointers and pointer
| casting. For example, in C I might have a `struct foo*` that
| aliases the 7th element of a `struct foo[]` array. How do you
| represent that in Java or Python? I don't think you can use
| regular objects or regular arrays/lists from either of those
| languages, because you need _assignments through the pointer_
| (of the whole `struct foo`, not just individual field writes)
| to affect the array. Even worse, in C I might have a `const
| char*` that aliases the same element and expects every write to
| affect its _bytes_. To model all this you'd need some
| Frankenstein, technically-Turing-complete, giant-bytestring-
| that-represents-all-of-memory thing that wouldn't really be
| Java or Python in any meaningful sense, wouldn't be remotely
| readable or maintainable, and wouldn't be able to interoperate
| with any existing libraries.
|
| In Rust you presumably do all of that with raw pointers, which
| leaves you with a big unsafe mess to clean up over time, and I
| imagine a lot of the hard work of this project is trying to
| minimize that mess. But at least the mess that you have is
| recognizably Rust, and incremental cleanup is possible.
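|
| A small sketch of the raw-pointer version of that aliasing
| pattern, in Rust. The struct Foo and the function alias_demo are
| made up for illustration; the fields are single bytes so the
| result doesn't depend on padding or endianness.

```rust
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Foo {
    a: u8,
    b: u8,
}

/// Returns element 6 of the array after a write through an interior
/// pointer, plus the first byte as seen through a byte-level alias.
fn alias_demo() -> (Foo, u8) {
    let mut arr = [Foo { a: 0, b: 0 }; 8];
    let base: *mut Foo = arr.as_mut_ptr();
    let first_byte = unsafe {
        // Like `struct foo *p = &arr[6];` in C (the 7th element).
        let p: *mut Foo = base.add(6);
        // Like `const char *bytes = (const char *)p;` in C.
        let bytes: *const u8 = p as *const u8;
        // Whole-struct assignment through the interior pointer.
        *p = Foo { a: 0xAB, b: 0xCD };
        // The byte alias observes the write.
        *bytes
    };
    // The array itself was updated too.
    (arr[6], first_byte)
}
```

| Both pointers come from the same as_mut_ptr() call, which is what
| keeps this particular pattern defined in Rust; the incremental
| cleanup would be replacing p with &mut arr[6] where nothing else
| aliases it.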
| thibran wrote:
| Porting the Linux kernel to 100% Rust should be the benchmark for
| AGI.
|
| ... and when done, please port SQLite too :)
| 0cf8612b2e1e wrote:
| I am fully in the RIIR koolaid, but SQLite would be near the
| absolute bottom of my prioritization list. Care to explain?
| SQLite is extensively tested, has requirements to run on ~every
| platform, be backwards compatible, and has a relatively small
| blast radius if there is a C derived bug. There is much more
| fertile ground in any number of core system services (network,
| sudo, dns, etc)
| eric-p7 wrote:
| Not a small blast radius. There are an estimated 1 trillion
| active deployed SQLite instances:
| https://news.ycombinator.com/item?id=29461127
| commodoreboxer wrote:
| A lot of people are reading this as a call or demand to translate
| all C and C++ code to Rust, but (despite the catchy project
| name), I don't read the abstract in that way. There are two
| related but separate paragraphs.
|
| 1. C and C++ just aren't safe enough at large. Even with careful
| programming and good tooling, so many vulnerabilities are caused
| by their unsafe by default designs. Therefore, as much code as
| possible should be translated to or written in "safe" languages
| (especially ones that guarantee memory safety).
|
| 2. We are funding and calling for software to translate existing
| C code into Rust.
|
| It's not a consensus to rewrite the world in Rust. It's a
| consensus to migrate to safe languages, which Rust is an example
| of, and a program that targets Rust in such migration.
| akira2501 wrote:
| > or written in "safe" languages
|
| So when those languages have 'unsafe' constructs what are the
| rules going to be around using those? Without a defining set of
| rules to use here you're just going to end up right back where
| you started.
|
| > to migrate to safe languages, which Rust is an example of
|
| Rust has a safe mode. It is _not_ a safe language. To do
| anything interesting you will require unsafe blocks. This will
| not get you very much.
|
| Meanwhile you have tons of garbage collected languages that
| don't even let the programmer touch pointers. Why aren't those
| considered? The reason is performance. And because Rust
| programmers "care" so much about performance you're not ever
| going to solve the fundamental problem with that language.
|
| Do you want performance or safety? You can't have both.
| timeon wrote:
| > To do anything interesting you will require unsafe blocks.
| This will not get you very much.
|
| This is not true.
| akira2501 wrote:
| > This is not true.
|
| Burying unsafe blocks in unevaluated cargo modules does not
| make this true. You're just taking the original problem and
| sweeping it under the rug.
| bigstrat2003 wrote:
| > Rust has a safe mode. It is _not_ a safe language. To do
| anything interesting you will require unsafe blocks. This
| will not get you very much.
|
| 1. There are plenty of interesting programs which don't
| require unsafe.
|
| 2. Even if your program does require unsafe, Rust still
| limits where the unsafety is. This lets you focus your
| scrutiny on the small section of the program which is
| critical for safety guarantees to hold. That is still a win.
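|
| As a rough illustration of point 2, here is a sketch with a
| hypothetical `first_and_last` helper (not from any project in
| this thread). Callers only ever see a safe function; the single
| `unsafe` block is the part that needs scrutiny.

```rust
/// Returns the first and last elements of a slice, or None when it is empty.
/// The one `unsafe` block below is the only place a reviewer has to audit.
pub fn first_and_last(values: &[i32]) -> Option<(i32, i32)> {
    if values.is_empty() {
        return None;
    }
    let last_index = values.len() - 1;
    // SAFETY: the slice is non-empty, so index 0 and `last_index`
    // are both in bounds.
    let pair = unsafe {
        (
            *values.get_unchecked(0),
            *values.get_unchecked(last_index),
        )
    };
    Some(pair)
}
```

| For example, `first_and_last(&[1, 2, 3])` gives `Some((1, 3))`
| and an empty slice gives `None`, with no `unsafe` at the call
| site.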
| 0xbadcafebee wrote:
| Every tool has its own specific quirks. Over many years of using
| a tool, "expertise" is the intimate knowledge of those quirks and
| how to use that tool most effectively. Changing tools requires
| you to gain expertise again. You're going to be less proficient
| in the new tool for a long time, and make a lot of mistakes.
|
| Considering we already know how to make C/C++ programs memory
| safe, it's bizarre that people would ditch all of their
| expertise, and the years and years of perfecting the operation of
| those programs, and throw all that out the window because they
| can't be bothered to use a particular set of functions [that
| enforce memory safety].
|
| If you're going to go to all of the trouble to gain expertise in
| an entirely new tool, plus porting a legacy program to the new
| tool, I think you need a better rationale than "it does memory
| safety now". You should have more to show for your efforts than
| just that, and take advantage of the situation to add more value.
| wffurr wrote:
| But even proficient C and C++ programmers continue to produce
| code with memory safety issues leading to remote code execution
| exploits. This argument doesn't hold up to the actual
| experience of large C and C++ projects.
| 0xbadcafebee wrote:
| They aren't trying to prevent them. It's trivial to prevent
| them if you actually put effort into it; if you don't, it's
| going to be vulnerable. This is true of all security
| concerns.
| woodruffw wrote:
| "You aren't trying hard enough" isn't a serious approach to
| security: if it was, we wouldn't require seatbelts in cars
| or health inspections in restaurants.
|
| (It's also not clear that they _aren't_ trying hard
| enough: Google, Apple, etc. have billions of dollars riding
| on the safety of their products, but still largely fail to
| produce memory-safe C and C++ codebases.)
| jcalvinowens wrote:
| This isn't some "pie in the sky" thing, Immunant has a working C
| to Rust transpiler and it's really interesting:
| https://github.com/immunant/c2rust
| steveklabnik wrote:
| Their work was also previously sponsored by DARPA, though I do
| not know if it was under this program or something else.
| Animats wrote:
| I've tried that thing. The Rust that comes out is terrible. It
| converts C into a set of Rust function calls which explicitly
| emulate C semantics by manipulating raw pointers. It doesn't
| even convert C arrays to a Vec. It's a brute-force
| transliteration, not a translation.
|
| I and someone else ran this on a JPEG 2000 decoder that
| sometimes crashed with a bad memory reference. The Rust version
| crashed with the same bad memory reference. It's bug-
| compatible.
|
| What comes out is totally unreadable and much bigger than the
| original C code. Manual "refactoring" of that output is
| hopeless.
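|
| To make the transliteration/translation distinction concrete,
| here is a hand-written sketch. It is not actual c2rust output;
| both functions and their names are hypothetical. The first keeps
| C's pointer-plus-length shape, the second is what a human would
| write.

```rust
/// Transliterated style: raw pointer plus length, C semantics kept as-is.
///
/// # Safety
/// `data` must point to at least `len` readable `i32` values.
pub unsafe fn sum_transliterated(data: *const i32, len: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i: usize = 0;
    while i < len {
        // Mirrors `total += data[i]` in C; the caller must guarantee bounds.
        total = total.wrapping_add(unsafe { *data.add(i) });
        i += 1;
    }
    total
}

/// Idiomatic style: the slice carries its own length, so no `unsafe`
/// is needed and an out-of-bounds read cannot be expressed.
pub fn sum_idiomatic(data: &[i32]) -> i32 {
    data.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

| Both return the same total for valid input, but only the first
| can be handed a bad pointer or a wrong length.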
| marcosdumay wrote:
| Any automatic translation is bug-compatible with the
| original. Did you expect it to divine some requirements?
|
| It still leaves you with Rust code that you can improve
| piecewise. The only question is if something like it is
| better than FFI calling the C code.
| bornfreddy wrote:
| > Any automatic translation is bug-compatible with the
| original. Did you expect it to divine some requirements?
|
| That would be useless when translating C to Rust. Yes, I
| would expect the tool to point out the flaws in the
| original memory handling and only translate the corrected
| code. This is far from easy, since some information
| (intent) is missing, but a good coder could do it on decent
| codebases. The question is, can an automated tool do it
| too? We'll see.
| jcranmer wrote:
| As I mentioned elsewhere
| (https://news.ycombinator.com/item?id=41113257), that tool is
| pretty much useless unless you have some checkbox that says "no
| C code allowed anywhere". It's not even a feasible starting
| point for refactoring because the code is so far from idiomatic
| Rust.
| jll29 wrote:
| Difficult: most C programs I know would convert to one single
| large "unsafe" block...
|
| One might argue that re-writing from scratch is the safer option;
| and a re-write is also an opportunity to do things differently
| (read: improve the architecture by using what one has learned),
| despite the much-feared "second system" syndrome.
|
| But nothing wrong with spending some research dollars towards
| tooling for "assisted legacy rewrites". DARPA and her sister
| IARPA fund step innovation (high risk, high reward), and this is
| an area where good things can potentially come from.
| Animats wrote:
| It's good to see DARPA pushing on this. It's a hard problem, but
| by no means impossible. Translating to _safe_ Rust, though, is
| going to be really tough. There's a C to Rust translator now,
| but what comes out is horrible Rust, which just rewrites C
| pointer manipulation as unsafe Rust struct manipulation. The
| result is less maintainable than the original.
|
| So what would it take to actually do this right? The two big
| problems are 1) array sizes, and 2) non-affine pointer usage.
| Pointer arithmetic is also hard, but rare. Most pointer
| arithmetic can be expressed as slices.
|
| Every array in C has a size. It's just that the compiler doesn't
| know what it is.
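|
| A small sketch of "pointer arithmetic as slices", using two
| hypothetical helpers: `buf + header_len` becomes a subslice, and
| a two-pointer walk becomes a slice that shrinks from both ends.
| In both cases the size travels with the data.

```rust
/// C idiom: `parse(buf + header_len, len - header_len)`.
/// In Rust the offset becomes a subslice; `get` returns None instead
/// of reading out of bounds when the header is longer than the buffer.
pub fn payload(buf: &[u8], header_len: usize) -> Option<&[u8]> {
    buf.get(header_len..)
}

/// C idiom: two pointers `lo` and `hi` walking toward each other.
/// Here the slice pattern peels one element off each end per iteration.
pub fn is_palindrome(bytes: &[u8]) -> bool {
    let mut rest = bytes;
    while let [first, middle @ .., last] = rest {
        if first != last {
            return false;
        }
        rest = middle;
    }
    // Zero or one element left: nothing more to compare.
    true
}
```

| The hard part for a translator is not writing this, it is
| proving which `len` belongs to which pointer in the original C.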
|
| Where is this being discussed in detail?
| steveklabnik wrote:
| > Where is this being discussed in detail?
|
| In my understanding, this is a call for proposals to do the
| work, there is no detailed discussion yet. That will come when
| there's actual responses to this call.
| Animats wrote:
| Right, there's a call, and a project day with an in-person
| meeting coming up.
| jcranmer wrote:
| I once tried to use c2rust as a starting point for
| rustification of code and... it's not even good at that. The
| code is just too freakishly literal to the original C semantics
| that you can't even take the non-pointery bits and strip off
| the unsafe block and use that as a basis.
|
| (To give you a sense, it translates something like a + 1 to
| a.wrapping_add(1i32), and my recollection is that for (int i =
| 0; i < 10; i++) gets helpfully turned into a while loop instead
| of a for loop).
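|
| A hand-written sketch of the two styles (hypothetical functions,
| not tool output): the `wrapping_add` family preserves wraparound,
| `checked_add` treats overflow as an error, and the literal while
| loop sits next to the range a Rust programmer would write.

```rust
/// Literal translation of C `a + 1` that preserves wraparound on overflow.
pub fn next_wrapping(a: i32) -> i32 {
    a.wrapping_add(1)
}

/// Translation for the case where overflow is a bug: it is reported
/// as None instead of silently wrapping.
pub fn next_checked(a: i32) -> Option<i32> {
    a.checked_add(1)
}

/// C: `for (int i = 0; i < 10; i++) total += i;` as a literal while loop.
pub fn sum_while() -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < 10 {
        total += i;
        i += 1;
    }
    total
}

/// The same loop written as an idiomatic range.
pub fn sum_for() -> i32 {
    (0..10).sum()
}
```

| Choosing between the first two is exactly question (a) below: the
| C source alone does not say which one the author meant.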
|
| In general, the various challenges that all need to be solved
| that aren't solved yet are:
|
| a) when is integer overflow intentional in the original code so
| that you know when to use wrapping_op instead of regular Rust
| operators?
|
| b) how to convert unions into Rust enums
|
| c) when pointers are slices, and what corresponds to the length
| of the slice
|
| d) convert pointers to references, and know when they're
| mutable or const references
|
| e) work out lifetime annotations where necessary
|
| f) know when to add interior mutability to structs
|
| g) wrap things in Mutex/RwLock/etc. for multithreaded access
|
| We're a very long way from having full-application conversion
| workable, and that might be sufficiently difficult that it's
| impossible.
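|
| To illustrate item (b), here is a sketch assuming a hypothetical
| tagged union in C (`struct value { int tag; union { int i; float
| f; } u; }`). When the tag discipline can be recovered, tag and
| union collapse into one enum.

```rust
/// Rust counterpart of the hypothetical C tagged union: the variant
/// itself is the tag, so a mismatched tag/payload cannot be built.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Int(i32),
    Float(f32),
}

/// Reads the payload; `match` forces every variant to be handled,
/// where C would trust the caller to check `tag` first.
pub fn as_f64(v: Value) -> f64 {
    match v {
        Value::Int(i) => f64::from(i),
        Value::Float(f) => f64::from(f),
    }
}
```

| Untagged unions, or unions used for type punning, have no such
| mapping, which is part of why this item is on the unsolved list.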
| Animats wrote:
| That doesn't mention the affine type problem. Rust references
| are restricted to single ownership. If A has a reference to
| B, B can't have a reference to A. Bi-directional references
| are not only a common idiom in C, they're an inherent part of
| C++ objects.
|
| Rust has to use reference counts in such situations. You have
| an Rc wrapped around structs, sometimes a RefCell, and
| .borrow() calls that panic when you have a conflict. C code
| translates badly into that kind of structure.
|
| Static analysis might help find .borrow() and .borrow_mut()
| calls that will panic, or which won't panic. It's very
| similar to finding lock deadlocks of the type where one
| thread locks the same lock twice.
|
| (If static analysis shows that no .borrow() or .borrow_mut()
| for a RefCell will panic, you don't really need the RefCell.
| That's worth pursuing as a way to allow Rust to have back
| references.)
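|
| A minimal sketch of that pattern, with hypothetical `Parent` and
| `Child` types: the back reference is a `Weak`, the cells are
| `RefCell`s, and `try_borrow_mut` shows the conflict that would
| make a plain `.borrow_mut()` panic.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Parent {
    pub name: String,
    pub children: Vec<Rc<RefCell<Child>>>,
}

pub struct Child {
    pub name: String,
    // Weak back reference: a strong one would form an Rc cycle
    // that is never freed.
    pub parent: Weak<RefCell<Parent>>,
}

/// Builds one parent with one child that points back at it.
pub fn make_family() -> (Rc<RefCell<Parent>>, Rc<RefCell<Child>>) {
    let parent = Rc::new(RefCell::new(Parent {
        name: "A".to_string(),
        children: Vec::new(),
    }));
    let child = Rc::new(RefCell::new(Child {
        name: "B".to_string(),
        parent: Rc::downgrade(&parent),
    }));
    parent.borrow_mut().children.push(Rc::clone(&child));
    (parent, child)
}

/// Returns true when a second mutable borrow conflicts with a live one,
/// which is the situation where `.borrow_mut()` would panic.
pub fn second_borrow_conflicts(cell: &RefCell<Parent>) -> bool {
    let _first = cell.borrow_mut();
    cell.try_borrow_mut().is_err()
}
```

| `second_borrow_conflicts` returns `true` for any cell, since the
| first borrow is still alive. A static analysis of the kind
| described above would have to rule out that overlap at every call
| site.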
| jcranmer wrote:
| I'd lump that analysis somewhere in the d-g, because you
| have to remember that &mut is also noalias and work out
| downstream implications of that. It's probably presumptive
| of me to assume a particular workflow for reconstructing
| the ownership model to express in Rust, and dividing that
| into the steps I did isn't the only way to do it.
|
| In any case, it's the difficulty of that reconstruction
| step that leaves me thinking that automated conversion of
| whole-application to Rust is a near-impossibility.
| Conversion of an individual function that works on plain-
| old-data structures is probably doable, if somewhat
| challenging.
|
| An off-the-cuff idea I just had is to implement a semi-
| automated transformation, where the user has to input what
| a final conversion of a struct type should look like
| (including all Cell/Rc/whatever wrappers as needed), and
| the tool can use that to work out the rest of the
| translation. There's probably a lot of ways that can go
| horribly wrong, but it seems more feasible than trying to
| figure out what all of the wrappers need to be.
| clintfred wrote:
| Even if just all the unsafe areas were marked, wouldn't that be
| valuable? At least it would focus review efforts on the parts
| with the most risk?
| sans-seraph wrote:
| I have been aware of this proposed initiative for some time and I
| find it interesting that it is now becoming public. It is a very
| ambitious proposal and I agree that this level of ambition is
| appropriate for DARPA's mission and I wish them well.
|
| As a Rust advocate in this domain I have attempted to temper the
| expectations of those driving this proposal with due respect to
| the feasibility of automatic translation from C to Rust. The
| fundamental obstacle that I foresee remains that C source code
| contains less information than Rust source code. In order to
| translate C code to Rust code that missing information must be
| produced by someone or something. It is easy to prove that it is
| impossible to infallibly generate this missing information for
| the same reason that scaling an image to make it larger cannot
| infallibly produce bits of information that were not captured by
| the original image. Instead we must extrapolate (invent) the
| missing information from the existing source code. To extrapolate
| correctly we must exercise judgement and this is a fallible
| process especially when exercised in large quantities by
| unsupervised language models. I have proposed solutions that I
| believe would go some way towards addressing these problems but I
| will decline to go into detail.
|
| Ultimately I will say that I believe that it is possible for this
| project to achieve a measure of success, although it must be
| undertaken with caution and with measured expectations. At the
| same time it should be emphasized it is also possible that no
| public result will come of this project and so I caution those
| here against reading too much into this at this time. In
| particular I would remind everyone that the government is not a
| singular entity and so I would not interpret this project as a
| blanket denouncement against C or vice versa as a blanket
| blessing of Rust. Each agency will set its own direction and
| timelines for the adoption of memory-safe technologies. For
| example NIST recommends Rust as well as Ada SPARK in addition to
| various hardened dialects of C/C++.
| steveklabnik wrote:
| > As a Rust advocate in this domain I have attempted to temper
| the expectations of those driving this proposal
|
| Thank you!
| pfdietz wrote:
| How does it relate to the CRAM effort at Grammatech?
|
| https://cpp-rust-assisted-migration.gitlab.io/blog/
| simon_void wrote:
| a) if every C program could be translated into an equivalent safe
| Rust program, that would mean that each C program is as safe as
| the safe Rust equivalent. b) since there are C programs that are
| open to memory corruption in a way safe Rust isn't, this
| corruptibility would need to be translated into partially unsafe
| Rust. Congrats, you now have a corruptible Rust program, what's
| the point again?? c) so DARPA must be trying to fix/change what
| the program is doing when switching to Rust. So how to discern
| what behaviour is intended and which is not? Doesn't this run
| directly into the undecidability/uncomputability of the halting
| problem!?!
| Arnavion wrote:
| >Doesn't this run directly into the
| undecidability/uncomputability of the halting problem!?!
|
| The programmer gets to decide. DARPA does not expect the
| translator program to autonomously output a perfect Rust
| program. It just wants a "_high degree_ of automation towards
| translating legacy C to Rust" (from the sam.gov link in the
| submission, emphasis mine).
| PaulHoule wrote:
| Whatever happened to Ada?
| wffurr wrote:
| It languished in government work behind a wall of extremely
| expensive compilers and contractors. Never heard anyone suggest
| RiiA - Rewrite it in Ada.
| nvy wrote:
| GCC contains `gnat` which is a libre Ada compiler.
|
| I think Ada has a lot of technical merit but it's just not
| fashionable the way Rust is, for lots of uninteresting
| reasons.
| PaulHoule wrote:
| I remember Ada getting pushed in a time when there were
| many in the computer industry that were pushing Pascal as
| both a systems and a teaching language. Ada was a lot like
| Pascal which I think caused an immediate violent reaction
| in some people. (e.g. the implementers of every other
| programming language were pissed that BASIC was so
| hegemonic but they never asked "Why?" or if their
| alternatives were really any better)
|
| In the early 1980s, microcomputer implementations such as
| UCSD Pascal were absolutely horrific in terms of
| performance plus missing the features you'd need to do
| actual systems programming work. In the middle of the
| decade you saw Turbo Pascal which could compile programs
| before you aged to death and also extended Pascal
| sufficiently to compete with C. But then you had C, and the
| three-letter agencies were still covering up everything
| they knew about buffer overflows.
| sim7c00 wrote:
| i like the idea but i struggle to see how one can go about doing
| 'safe' disk reads, having 'safe' ways to manage global resources
| in kernel land (page tables, descriptor tables etc) and a lot of
| other stuff. perhaps if those devices also have rust in their
| firmware they can reply safely?? genuinely curious because i went
| back to C from rust in my OS. i could not figure it out (maybe i
| am not a darpa level engineer but i did work at a similar place
| doing similar things).
|
| id be excited if this gets solved. rust is a lot more comfy for
| higher level kernel stuff.
| rpoisel wrote:
| I think we have to take that literally: They only translate C
| code to Rust. Not C++.
| plasticeagle wrote:
| If
|
| 1) Rust contains no memory bugs 2) C can be automatically
| translated to it
|
| Then all memory bugs can be fixed automatically, which is almost
| certainly untrue. This task is very likely completely impossible
| in the general case.
| warkdarrior wrote:
| Since you did not specify that you wish to preserve all
| behaviors of the C code, there are trivial solutions to this
| problem. For example, one could replace all dynamic memory
| allocations with fixed buffers (set at translation time), and
| reject all inputs that do not fit in those buffers.
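|
| A sketch of that trivial solution, assuming a hypothetical
| `FixedBuffer` type and an arbitrary capacity of 16 bytes: no
| dynamic allocation, and anything that does not fit is rejected
| instead of overflowing.

```rust
/// Capacity fixed at "translation time" (the value 16 is an assumption).
pub const CAPACITY: usize = 16;

pub struct FixedBuffer {
    data: [u8; CAPACITY],
    len: usize,
}

impl FixedBuffer {
    /// Copies `input` into the fixed buffer, rejecting anything too
    /// large instead of growing an allocation.
    pub fn from_input(input: &[u8]) -> Result<Self, String> {
        if input.len() > CAPACITY {
            return Err(format!(
                "input of {} bytes exceeds capacity {}",
                input.len(),
                CAPACITY
            ));
        }
        let mut data = [0u8; CAPACITY];
        data[..input.len()].copy_from_slice(input);
        Ok(FixedBuffer { data, len: input.len() })
    }

    /// Returns only the bytes that were actually stored.
    pub fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }
}
```

| This is memory safe but changes behavior: a 17-byte input that
| the C program may have handled is now an error.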
| ksp-atlas wrote:
| Technically, Zig has this functionality built in via translate-c,
| but it's designed for reading by the Zig compiler, not a human
| kernal wrote:
| I'm working on something similar that just wraps the C code in an
| Unsafe block.
| ristos wrote:
| I get the idea of moving to more memory safety, but the whole
| "rewrite everything in Rust" trend feels really misguided,
| because if you're talking about being able to trust code and code
| safety:
|
| - Rust's compiler is 1.8 million lines of recursively compiled
| code, how can you or anyone know that what was written is
| actually trustworthy? Also memory safety is just a very small
| part of being able to actually trust code.
|
| - C compiles down to straightforward assembly, almost like a
| direct translation, so you can at least verify that smaller
| programs that you write in C actually do compile down to assembly
| you expect, and compose those smaller programs into larger ones.
|
| - C has valgrind and ASAN so it's at least possible to write safe
| code with good coding discipline, and plenty of software has been
| able to do this for decades.
|
| - A lot of (almost all) higher level programming languages are
| written in C, which means that those languages just need to make
| sure they get the compiler and GC right, and then those languages
| can be used for general purpose, scripting, "low level" high
| level code like Go or OCaml, etc.
|
| - There are many C compilers and only one Rust compiler, and it's
| unclear whether it'll really be feasible to have more than one
| Rust compiler due to the complexity of the language. So you're
| putting a lot of trust into a small group of people, and even if
| they're the most amazing, most ethical people, surely if a lot of
| critical infra is based on Rust they'll get targeted in some way.
|
| - Something being open source doesn't mean it's been fully
| audited. We've seen all sorts of security vulnerabilities cause a
| world of hurt for a lot of people, and they came from fully open
| source code, often from very small libraries that should be much
| easier to audit than projects with millions of lines of code.
|
| - Similarly, Rust does not translate to straightforward assembly,
| and again would seem to be impossible to do given the complexity
| of the language.
|
| - There was an interesting project I came across called CompCert,
| which aims to have a C compiler that's formally verified (in Coq)
| to translate into the assembly you expect. Something like a
| recursively compiled CompCert C -> OCaml -> Coq -> CompCert would
| be an interesting undertaking, which would make OCaml and Coq
| themselves built on formally verified code, but I'm not sure if
| that'll really work and I suspect it's too complicated.
|
| - I think Rust might be able to solve some of these problems if
| they have a fully formally verified thing, and the formally
| verified thing is itself formally verified, and the compiler was
| verified by that thing, and then you know that you can trust the
| whole thing. Still, the level of complexity and the inability to
| at least manually audit the core of it makes me suspect it's too
| complicated and would still be based on trust of some sort.
|
| - I still think that static analysis and building higher level
| languages on top of C is a better approach, and working on formal
| verification from there, because there are really small C
| compilers like tinycc that are ~50k LOCs, which can be hand
| verified. You can compile chibi-scheme with tinycc, for example,
| which is also about ~50k LOCs of C, and so you get a higher level
| language from about 100k LOCs (tcc and chibi), which is feasible
| for an ordinary but motivated dev to manually audit to know that
| it's producing sound assembly and not something wonky or sketchy.
| Ideally we should be building compilers and larger systems that
| are formally verified, but I think the core of whatever the
| formally verified system is has to be hand verifiable in some way
| in order to be trustworthy, so that you can by induction trust
| whatever gets built up from that, and I think that would need to
| require a straightforward translation into assembly, with ideally
| open source ISA and hardware, and a small enough codebase to be
| manually audited like the tinycc and chibi-scheme example I gave.
|
| - Worst case everyone kind of shrugs it all off and just trusts
| all of these layers of complexity, which can be like C ->
| recursively compiled higher level lang -> coffeescript-like layer
| on top -> framework, which is apparently a thing now, and just
| hope that all of these layers of millions of lines of code of
| complexity don't explode in some weird way, intentionally or
| unintentionally.
|
| - Best case of the worst case is that all of our appliances are
| now "smart" appliances, and then one day they just transform into
| robots that start chasing you around the house, all the while the
| Transformers cartoon theme is playing in the background, which
| would match up nicely with the current trend of everything
| being both terrifying and hilarious in a really bizarre way.
| nickpsecurity wrote:
| I think this is indirectly a great argument for automated test
| generation or equivalence checking. The reason is that these
| translations might change the function of the code. Automated
| testing would show whether or not that happened. It also reveals
| many bugs.
|
| So, they should solve total automated testing first. Maybe in
| parallel. Then, use it for equivalence checks.
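|
| A rough sketch of such an equivalence check, with two hypothetical
| clamp functions standing in for "original behavior" and
| "translated code". The input generator is a plain xorshift32 so
| the run is reproducible without external crates; a real setup
| would call the C original over FFI and use a proper fuzzer.

```rust
/// Stand-in for the behavior of the original C function.
pub fn reference_clamp(x: i32, lo: i32, hi: i32) -> i32 {
    if x < lo {
        lo
    } else if x > hi {
        hi
    } else {
        x
    }
}

/// Stand-in for the translated version under test.
pub fn translated_clamp(x: i32, lo: i32, hi: i32) -> i32 {
    x.max(lo).min(hi)
}

/// Deterministic xorshift32 step; `state` must start non-zero.
pub fn xorshift32(state: &mut u32) -> u32 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    x
}

/// Runs both versions on `cases` generated inputs and returns the
/// first input on which they disagree, if any.
pub fn first_mismatch(cases: usize) -> Option<(i32, i32, i32)> {
    let mut state = 0x9E37_79B9u32;
    for _ in 0..cases {
        let x = xorshift32(&mut state) as i32;
        let a = xorshift32(&mut state) as i32;
        let b = xorshift32(&mut state) as i32;
        // Keep the precondition lo <= hi that both versions assume.
        let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
        if reference_clamp(x, lo, hi) != translated_clamp(x, lo, hi) {
            return Some((x, lo, hi));
        }
    }
    None
}
```

| A `None` result only says no difference was found on the sampled
| inputs; it is evidence of equivalence, not a proof.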
| pizlonator wrote:
| Or you could just use Fil-C.
| luke-stanley wrote:
| Surely this could be better pitched to researchers as just
| another AI benchmark, a bit like ARC Prize? ;) There could be
| some existing C projects that are already public, with tests for
| feedback during development iteration and some holdout tests, and
| some holdout projects too with a leaderboard and prizes. For
| preferences about converted code quality, both automated
| assessment and human preferences could be ranked with Elo? Kaggle
| is made for this sort of thing I think? I'm sure Google Deepmind
| and others have some MCTS agents that could do a great job with a
| bit of effort.
___________________________________________________________________
(page generated 2024-07-30 23:00 UTC)