       [HN Gopher] Translating All C to Rust (TRACTOR)
       ___________________________________________________________________
        
       Translating All C to Rust (TRACTOR)
        
       Author : steveklabnik
       Score  : 201 points
       Date   : 2024-07-30 15:42 UTC (7 hours ago)
        
 (HTM) web link (www.darpa.mil)
 (TXT) w3m dump (www.darpa.mil)
        
       | steveklabnik wrote:
       | See also
       | https://sam.gov/opp/1e45d648886b4e9ca91890285af77eb7/view
        
         | thesuperbigfrog wrote:
         | Direct link to Proposer's Day info [PDF]:
         | https://sam.gov/api/prod/opps/v3/opportunities/resources/fil...
         | 
         | "The purpose of this event is to provide information on the
         | TRACTOR technical goals and challenges, address questions from
         | potential proposers, and provide an opportunity for potential
         | proposers to consider how their research may align with the
         | TRACTOR program objectives."
        
       | PreInternet01 wrote:
       | The _one_ link for those who think that 'Rewrite it All in Rust'
       | will, well, settle _any_ debates:
       | https://github.com/rust-lang/miri/
        
         | calebpeterson wrote:
         | Genuine question:
         | 
         | Would you mind explaining to a dev that doesn't know much
         | (anything) about Rust, how does this settle any debate?
        
           | jerf wrote:
           | I believe it goes something like, "I have constructed a
           | strawman that Rust claims that all code written in it is
           | automatically safe by all conceivable definitions of safe,
           | but look, ha ha, here's something that detects unsafe code in
           | Rust!", and I don't mean "code marked in unsafe blocks".
           | 
           | It's a concatenation of several logical fallacies in a row;
           | equivocation, straw manning, binary thinking about safety,
           | several others. It's hard to pick the main one, but I'd go
           | with the dominant problem being a serious case of binary
           | thinking about what "safety" is. Of course, if the commentor
           | is using anything other than Idris for all their programming,
           | they're probably not actually acting on their own
           | accusations.
        
             | marcosdumay wrote:
             | > Of course, if the commentor is using anything other than
             | Idris
             | 
             | I'm sure the Idris compiler has bugs somewhere too. If the
             | OP actually programs, they are violating their rationale
             | (I'm quite sure assembly or assembled binary aren't ok
             | either).
        
           | mrweiden wrote:
           | From the original post:
           | > It's not enough to rely on bug-finding tools
           | 
           | From the Miri GitHub:
           | > Miri is an Undefined Behavior detection tool for Rust.
        
             | keybored wrote:
             | Darpa is already ahead of you all with the hedging:
             | 
             | > The preferred approach is to use "safe" programming
             | languages
             | 
             | "Safe". Terms and conditions may apply.
        
             | Sharlin wrote:
             | There is no contradiction. The fact that UB-finding tools
             | alone are not _sufficient_ doesn't mean they aren't
             | _useful_ even with a safe(r) language.
             | 
             | In other words, from "safer languages are necessary" it
             | does not follow that "safer languages are sufficient".
        
           | PreInternet01 wrote:
           | Well, the general 'Rewrite All in Rust' consensus is that it
           | solves _all_ general programming problems, _ever_.
           | 
           | Yet, the linked repository shows a huge list of cases in
           | which simple, documented use of Rust can cause Undefined
           | Behavior (a.k.a. 'UB')
           | 
           | Pretty much every argument of Rust advocates against C/C++
           | boils down to either 'but memory safety' or 'but UB'.
           | 
           | Yet there are many convincing counter-arguments that boil
           | down to 'but CompCert' or similar, and, as the linked
           | repository shows, there might be at least _some_ truth in
           | there?
        
             | steveklabnik wrote:
             | No serious person claims that Rust solves every problem
             | ever.
             | 
             | Also, many people cite things like Cargo as a reason to
             | prefer Rust over C and C++, as well as other things. UB is
             | a big part of it, of course, but it isn't the only thing.
        
               | galangalalgol wrote:
               | I selected it for performance reasons myself. The UB
               | protection was a nice benefit that was expected; Cargo
               | wasn't expected and is extremely nice coming from the
               | cmake, conan, vcpkg, and duct tape world I came from.
        
             | leftyspook wrote:
             | > Well, the general 'Rewrite All in Rust' consensus is that
             | it solves all general programming problems, ever.
             | 
             | a) There is no such consensus. The actual consensus is that
             | even if Rust solved all problems, it would not be
             | financially feasible to rewrite pretty much any substantial
             | project.
             | 
             | b) While Rust does solve many problems, it is nowhere close
             | to solving all safety, otherwise there would be no `unsafe`
             | keyword. Alas, fully proving safety in an impure, turing-
             | complete language is mathematically impossible.
             | 
             | c) The only reason you would think that there's some sort
             | of woke Rust lobby, is if you spend way too much time
             | subjecting yourself to opinions of literal sixteen year
             | olds on twitter.
        
             | superb_dev wrote:
             | > Well, the general 'Rewrite All in Rust' consensus is that
             | it solves all general programming problems, ever.
             | 
             | No, that's not the consensus. This is a strawman.
        
             | timeon wrote:
             | > Well, the general 'Rewrite All in Rust' consensus is that
             | it solves all general programming problems, ever.
             | 
             | This is an obvious example of a strawman. Why are you doing
             | this?
        
               | PreInternet01 wrote:
               | Towards general mental health. I'm just a C# wage slave,
               | and I'll admit, when being prompted, that my language,
               | its vendor, its runtime environment, and its general
               | approach are, to put it kindly, _flawed_.
               | 
               | However, as evidenced by the arguments and voting in this
               | thread, Rust proponents will take _no_ criticism,
               | _whatsoever_.
               | 
               | I linked to a GitHub repository that documents many, many
               | instances in which _generally safe_ Rust causes UB.
               | 
               | The same kind of UB that recently hit one of my
               | coworkers, caused a 3-day outage and now (despite all my
               | counseling to the contrary!) will burn them out
               | permanently.
               | 
               | My only request: can you guys please _back off_ just a
               | _little bit_? Programming is already hard enough without
               | the purity wars you're stoking all the time...
        
               | keybored wrote:
               | Stoking language flame wars based on hysterical
               | exaggeration has never promoted mental health.
        
               | fargle wrote:
               | to be fair, from his perspective, it's often the rusty
               | crowd who is stoking the flame wars - this sounds like a
               | reaction to them.
               | 
               | how often do we hear something like "C and C++ are
               | horribly flawed and completely unsafe. it's basically a
               | crime against humankind and gross negligence to use
               | them"?
               | 
               | i get weary of that kind of thing too. i wouldn't
               | approach it by reacting in the same way as the GP
               | comment, but i get it. and it's not really that much of a
               | strawman. it's more exasperation and sarcasm.
               | 
               | personally, i'm very interested in rust. but every time
               | someone at best "overhypes" it or at worst, outright dogs
               | on other languages, it's a negative point toward dealing
               | with the whole rust ecosystem.
        
           | leftyspook wrote:
           | It is a tool for checking that your unsafe code doesn't cause
           | UB. It doesn't really settle anything, but the commenter uses
           | it as a gotcha to say "rust is no better than C, because you
           | still can compile code that contains UB".
        
           | spease wrote:
           | They are claiming that because code in 'unsafe' blocks in
           | Rust can have undefined behavior, that the language is no
           | safer than C.
           | 
           | This does not settle the debate because unsafe is rarely
           | needed for a typical Rust program. In addition, the presence
           | of an unsafe block also alerts the reader that the set of
           | possible errors is greatly increased for that part of the
           | code and more careful auditing is needed.
           | 
           | It's a little like saying traffic lights are useless because
           | emergency responders need to drive through them sometimes, so
           | we should just leave intersections completely unsignaled and
           | expect drivers to do better.
           | 
           | Rust is by default restrictive and requires you to explicitly
           | make it unsafe, C/C++ are by default unsafe and require you to
           | explicitly make them restrictive.
        
         | keybored wrote:
         | You linked an interpreter for some kind of internal compiler
         | representation that the Rust compiler uses.
         | 
         | What on Earth do you mean?
        
           | nequo wrote:
           | It's the old trope that some Rust code uses unsafe blocks so
           | all Rust code is as unsafe as C.
        
             | keybored wrote:
             | Of course. I should have expected the Nirvana Fallacy. :)
        
             | melling wrote:
             | I don't know Rust but even if the Rust is just as unsafe in
             | certain blocks, simply being translated to Rust removes a
             | lot of corporate resistance to adopt the language.
             | 
             | Getting people to adopt a new language can be a lot of
             | work. I remember people claiming they missed header files
             | in Swift so they wanted to stick with Objective C.
        
           | PreInternet01 wrote:
           | > What on Earth do you mean?
           | 
           | That _documented_ use of _safe_ Rust can easily lead to UB,
           | which this infernal 'internal compiler representation'
           | demonstrates.
           | 
           | I'm not even sure what is even remotely confusing about that?
        
             | woodruffw wrote:
             | Miri is a MIR interpreter aimed at _unsafe_ Rust, not safe
             | Rust. Using the fact that it operates on an internal
             | representation is a very weird swipe; almost all static and
             | dynamic analysis tools work on some kind of IR or
             | decomposed program representation.
        
             | commodoreboxer wrote:
             | > Miri is an Undefined Behavior detection tool for Rust. It
             | can run binaries and test suites of cargo projects and
             | detect unsafe code that fails to uphold its safety
             | requirements.
             | 
             | > ... detect unsafe code that fails ...
             | 
             | Show me the documented safe Rust code that causes UB
             | without using any unsafe blocks outside of the standard
             | library.
        
               | steveklabnik wrote:
               | There are some soundness holes in the implementation that
               | can cause this. Just like any project, the compiler can
               | have bugs. They'll be fixed just like any bug.
        
               | commodoreboxer wrote:
               | Yes, in particular some interactions with LLVM have
               | caused some frustrating UB. But those are considered
               | implementation bugs, rather than user bugs, and all the
               | conditions Miri states at the top are relevant primarily
               | in unsafe code, which contradicts the OP's point, which
               | is that there are tons of documented cases of UB in safe
               | Rust. This is not true. There are a few documented cases,
               | and most have been fixed. It's nowhere close to the world
               | of C or C++'s UB minefield.
        
               | steveklabnik wrote:
               | For sure, just making sure to acknowledge this is the
               | case, before someone responded to your post with cve-rs.
               | :)
        
               | PreInternet01 wrote:
               | Ah, a voice of sort-of sanity, at long last.
               | 
               | So, the reason I posted my original reply, is that at one
               | of my $DAYJOBs, we recently had a 3-day outage on some
               | service, related to Rust. Something like using AVX to
               | read, like, up to 7 bytes too many from an array.
               | 
               | Nothing really major -- we have a 10-day backup window,
               | and the damage was limited to 4 days, so we were able to
               | identify and fix all identified cases. But the person-to-
               | Git-blame for this issue happened to be one of my
               | mentees, and... they were blown away by it.
               | 
               | As in: literally heartbroken. Unable to talk about it.
               | "But the compiler said it was okay!", crying. One of my
               | coworkers pointed at MIRI, which correctly warned about
               | the issue-at-hand, at which point I recommended
               | incorporating that tool into the build pipeline, as well
               | as (the usual advice in cases such as this) improving
               | unit tests and focusing on X-1 and X+1 cases that might
               | be problematic.
               | 
               | To this day, I'm _truly_ worried about my mentee. I'm
               | just a C# wagie, and I fully accept that my code, my
               | language, my compiler, and my runtime environment are all
               | shit.
               | 
               | But, as evidenced by my experience and supported by the
               | voting in this thread, it seems that Rust users seem to
               | self-identify with the absolute infallibility of anything
               | related to the language, and react quite violently and
               | self-destructively to any evidence to the contrary.
               | 
               | As a community leader, do you see any room for
               | improvement there? And if not, what would it take to
               | convince you?
        
               | n_plus_1_acc wrote:
               | The Rust community as a whole very much promotes the idea
               | of trusting the Compiler. Which is a very useful thing,
               | especially for folks coming from other languages like C.
               | It's not perfect of course as the compiler has bugs, but
               | I think it's still a good thing to teach.
        
               | steveklabnik wrote:
               | > using AVX
               | 
               | This would require using unsafe code.
               | 
               | > As in: literally heartbroken. Unable to talk about it.
               | 
               | I would hope that this person improves as an engineer,
               | because this isn't particularly professional behavior,
               | from the way you describe it.
               | 
               | > "But the compiler said it was okay!"
               | 
               | Given that you'd have to use unsafe to do this, the
               | compiler _can't_ say it was okay. It sounds like this
               | person may not fully understand Rust either.
               | 
               | > it seems that Rust users seem to self-identify with the
               | absolute infallibility of anything related to the
               | language, and react quite violently and self-
               | destructively to any evidence to the contrary.
               | 
               | I don't see how this generalizes. You had one (apparently
               | junior, given "mentee"?) person make a mistake and
               | respond poorly to feedback. You also barged into this
               | thread and made incorrect statements about Rust, and were
               | downvoted for it. That doesn't mean that Rust users think
               | everything is perfect.
               | 
               | > As a community leader, do you see any room for
               | improvement there?
               | 
               | I _do_ think sometimes enthusiastic people who don't
               | understand things misrepresent the thing they're
               | enthusiastic about, but that's a human problem, not a
               | Rust problem. I do not think there's a way to fix that,
               | no.
        
               | neonsunset wrote:
               | Don't worry, your language and _especially_ the runtime
               | and compiler are great. Particularly so in the last few
               | years. I wouldn't worry about the noise, maybe it
               | concerns C++, but C# is a strict productivity upgrade for
               | general-purpose applications despite _some_ * of the
               | dated bits in the language (but not the runtime).
               | 
               | * like un-unified representation of nullable reference
               | types and structs under generics for example, or just the
               | weight of features over the years, still makes most other
               | alternatives look abysmal in comparison
        
               | keybored wrote:
               | > I'm just a C# wagie, and I fully accept that my code,
               | my language, my compiler, and my runtime environment are
               | all shit.
               | 
               | What is shit about those things for C#? That's the
               | application programming language that seems to get the
               | least flak out of all of them.
               | 
               | If I'm using an alpha or beta compiler, I might suspect a
               | compiler bug from time to time... not really when I'm
               | working in a decades-old, very established language.
        
             | keybored wrote:
             | Indeed. There have been UB bugs in the standard library
             | caused by unsafe blocks.
             | 
             | Those are bugs. They are faults in the code. They need to
             | be fixed. They are not UB-as-a-feature like in C/C++. "Well
             | watch out for those traps every time you use this."
             | 
             | This is like getting mad that a programming language boasts
             | that it produces great binaries and yet the compiler has a
             | test suite to catch bugs in the emitted assembly. That's
             | literally what you are doing.
        
               | Calavar wrote:
               | > Those are bugs. They are faults in the code. They need
               | to be fixed. They are not UB-as-a-feature like in C/C++.
               | 
               | Rust has UB-as-a-feature too. They could have eliminated
               | UB from the language entirely, but they chose not to (for
               | very valid reasons in my opinion).
               | 
               | UB is a set of contracts that you as the author agree to
               | never violate. In return, you get faster code under the
               | assumption that you never actually encounter a UB
               | condition. If you violate those contracts in Rust and
               | actually encounter UB, that's a bug, that's a fault in
               | the code. If you violate those contracts in C++, that's a
               | bug, that's a fault in the code. This is the same in both
               | languages.
               | 
               | It's true that Rust UB can only _arise_ from unsafe
               | blocks, but it is not _limited_ to unsafe blocks. Rust UB
               | has "spooky action at a distance" the same way C++ UB
               | does. In other words, you can write UB free code in Rust,
               | but if any third party code encounters UB (including the
               | standard library), your safe code is now potentially
               | infected by UB as well. This is also the same in both
               | languages.
               | 
               | There are good reasons to favor Rust's flavor of UB over
               | C++'s, but I keep seeing these same incorrect arguments
               | getting repeated everywhere, which is frustrating.
        
               | keybored wrote:
               | > There are good reasons to favor Rust's flavor of UB
               | over C++'s, but I keep seeing these same incorrect
               | arguments getting repeated everywhere, which is
               | frustrating.
               | 
               | Tell me what I wrote that was incorrect. I called them UB
               | bugs in the standard library. If they were trivial bugs
               | that caused some defined-behavior logic bug while used
               | outside of the standard library then it wouldn't rise to
               | the level of being called a UB bug.
        
               | Calavar wrote:
               | > They are not UB-as-a-feature like in C/C++.
               | 
               | That's the part that's incorrect. That, plus the
               | implication that UB is a bug in Rust, but not in C++. As
               | I said, the existence of UB is a feature in both
               | languages and actually encountering UB is a bug in both
               | languages. You can play with the semantics of the word
               | "feature" but I don't think it's possible to find a
               | definition that captures C++ UB and excludes Rust UB
               | without falling into a double standard. Unfortunately
               | double standards on UB are pretty common in conversations
               | about C++ and Rust.
        
               | keybored wrote:
               | You're done editing the comment now?
               | 
               | Do you think UB-as-feature is something that someone
               | would honestly describe C or C++ as? It's a pretty
               | demeaning way of framing things. Indeed it's a tongue-in-
               | cheek remark, a whimsical exaggeration/description of the
               | by-default UB of those languages which was added to the
               | end of the completely factual description of the role
               | that finding UB in the Safe Rust subset of the standard
               | library of Rust serves.
               | 
               | Of course one cannot, from the Rust Side so to speak, use
               | tongue in cheek, off-hand remarks in these discussions;
               | one must painstakingly add footnotes and caveats, list
               | and mention every trivial fact like "you can get UB in
               | unsafe blocks"[1] or else you have a "double standard".
               | 
               | [1] Obligatory footnote: even though all participants in
               | the discussion clearly know this already.
        
               | oconnor663 wrote:
               | > It's true that Rust UB can only arise from unsafe
               | blocks, but it is not limited to unsafe blocks.
               | 
               | This is correct, and it's hard to teach, and I agree that
               | a lot of folks get it wrong. (Here's my attempt:
               | https://jacko.io/safety_and_soundness.html.) But I think
               | this comment is understating how big of a difference this
               | makes:
               | 
               | 1. Rust has a large, powerful safe subset, which includes
               | lots of real-world programs. Unsafe code is an advanced
               | topic, and beginners don't need to learn about it to
               | start getting their work done. Beginners can contribute
               | to big projects without touching the unsafe parts (as you
               | clarified, that means the _module privacy boundaries_
               | that include unsafe code, not just the unsafe blocks),
               | and reviewers don't need to be paranoid about every
               | line.
               | 
               | 2. A lot of real-world unsafe Rust is easy to audit,
               | because you can grep for `unsafe` in a big codebase and
               | zoom right to the parts you need to look at. Again, as
               | you pointed out, those blocks might not be the whole
               | story, and you do need to read what they're doing to see
               | how much code they "infect". But an experienced Rust
               | programmer can audit a well-written codebase in
               | _minutes_. It's not always that smooth of course, but
               | it's a totally different world that that's even possible.
        
             | estebank wrote:
             | > That _documented_ use of _safe_ Rust can easily lead to
             | UB
             | 
             | The only thing that comes to mind that this could be
             | referring to are the open bugs at
             | https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Ais....
             | Are these what you're referring to?
             | 
             | > this infernal 'internal compiler representation'
             | 
             | What makes MIR "infernal"?
             | 
             | > I'm not even sure what is even remotely confusing about
             | that?
             | 
             | You posted a link to a tool that executes pure rust
             | libraries and evaluates memory accesses (both from safe and
             | unsafe rust code) to assert whether they conform to the
             | rust memory model. It sits in the same space as valgrind.
             | You left it open to interpretation with really no other
             | context. We can be excused for not knowing what you were
             | trying to say. I personally still don't.
        
         | the8472 wrote:
         | The trophy cases in miri are about bugs in _unsafe_ code. Yes,
         | you can write UB with unsafe code. This should not be news.
         | 
         | And miri is a blessing. There even is a known case where
         | someone found a bug in C by translating it to rust and then
         | running it through miri.
        
       | mike_hearn wrote:
       | That sounds ... hard. Especially as idiomatic Rust as written by
       | skilled programmers looks nothing like C, and most interesting
       | code is written in C++ anyway.
       | 
       | Isn't it equivalent to statically determining the lifetimes of
       | all allocations in the C program, including those that are
       | implemented using custom allocators or which cross into
       | proprietary libraries? There's been a lot of research into this
       | sort of thing over the years without much success. C/C++ programs
       | can do things like tie allocation lifetimes to what buttons a
       | user clicks, without ref counting or other mechanisms to ensure
       | safety. It's not a good idea, but, they can do it.
       | 
       | The other obvious problem with trying to write such a static
       | analysis is that the programs you're analyzing are by definition
       | buggy and the lifetimes might not make sense (if they did, they
       | wouldn't have memory safety holes and wouldn't need to be
       | replaced). The only research I've seen on this problem of
       | statically detecting what lifetimes should be does assume the
       | code being analyzed is actually correct to begin with. I guess
       | you could try and aim for a program that detects where lifetimes
       | can't be worked out and asks the developer for help though.
        
         | mattgreenrocks wrote:
         | Projects are termed DARPA-hard for a reason.
        
         | woodruffw wrote:
         | It's very hard; DARPA likes to fund hard things[1] :-).
         | 
         | This isn't, however, DARPA's first foray into automatic program
         | translation, or even automatic translation into Rust[2].
         | 
         | [1]:
         | https://www.urbandictionary.com/define.php?term=DARPA%20hard
         | 
         | [2]: https://c2rust.com/
        
           | the_snooze wrote:
           | DARPA is basically a state-sponsored VC that optimizes for
           | completely different things. Instead of looking for 100x
           | financial returns, they want technical advantages for the
           | United States. The "moat" is the hardness of developing and
           | operationalizing those technologies first.
        
             | woodruffw wrote:
             | DARPA's commercialization track record is decidedly mixed,
             | so the VC comparison is unexpectedly apt :-)
             | 
             | (But yes: DARPA's mandate is explicitly to discover and
             | develop the next generation of emerging technologies for
             | military use.)
        
               | pfdietz wrote:
               | Decades ago, as my father explained to me, ARPA (no "D"
               | at that time) was happy if 1% of their projects went all
               | the way through to successful deployment. If they had a
               | higher success rate it would mean they weren't aiming
               | high enough.
        
               | VikingCoder wrote:
               | > DARPA's commercialization track record is decidedly
               | mixed...
               | 
               | If you count by number of attempts, sure.
               | 
               | If you count by impact, it's hard to come up with many
               | things more impactful than the Internet...?
        
               | woodruffw wrote:
               | Yeah, I meant by number. But also: ARPA didn't
               | commercialize the Internet! They explicitly refused to
               | commercialize it; commercialization only happened after
               | an Act of Congress induced interconnections between
               | NSFNET and commercial networks.
        
             | mburns wrote:
             | To be pedantic, In-q-tel is the literal state-sponsored VC.
             | 
             | DARPA is a step closer to traditional research labs but
             | there is obviously some overlap.
             | 
             | https://en.wikipedia.org/wiki/In-Q-Tel
        
               | throwup238 wrote:
               | _> DARPA is a step closer to traditional research labs
               | but there is obviously some overlap._
               | 
               | It's more like the NSF but focused on commercial grantees
               | with project management thrown on top to orchestrate
               | everything.
               | 
               | The really unique part is how much independence each
               | program manager has and the term limits that prevent
               | empire building.
        
           | fsckboy wrote:
           | in this case it seems to me the hard task that DARPA has
           | chosen is to get me to forget how much they spent on pushing
           | Ada.
        
             | woodruffw wrote:
             | I can't find any clear references to DARPA (or ARPA) being
             | involved in Ada's development. It was a DoD program but,
             | well, the DoD is notoriously large and multi-headed.
             | 
             | (But even if DARPA was involved in Ada: I think it's clear,
             | at this point, that Ada has been a resounding success in a
             | _small_ number of domains without successfully breaking
             | into general-purpose adoption. I don't have a particular
             | value judgment associated with that, but from a strategic
             | perspective it makes a _lot_ of sense for DARPA to focus
             | program analysis research on popular general-purpose
             | languages -- there's just more labor and talent
             | available.)
        
             | reaperducer wrote:
             | _in this case it seems to me the hard task that DARPA has
             | chosen is to get me to forget how much they spent on
             | pushing Ada._
             | 
             | You hate jumbo jets, high-speed trains, air traffic
             | control, and satellites?
        
               | 9659 wrote:
               | Do you know what fear is? Getting in an airplane where
               | the flight controls use NPM.
        
               | warkdarrior wrote:
               |     npm ERR! install Couldn't read dependencies
               |     npm ERR! package.json ENOENT, open '/boeing/787-9/flaps-up.json'
               |     npm ERR! package.json This is most likely not a problem with npm itself.
               |     npm ERR! package.json npm can't find a package.json file in your current directory.
        
             | 9659 wrote:
             | ada does not require 'pushing'.
             | 
             | once the maturity of the users advances to a sufficient
             | point, then ada is the only solution.
             | 
             | "ada. used in creating reliable software since 1983"
             | 
             | when i first saw ada, i didn't understand the why. now i
             | understand the why, but ada is effectively gone.
             | 
             | -- old fortran / C / Assembly programmer
        
         | 01HNNWZ0MV43FF wrote:
         | Can't most c++ be machine-lowered to C?
        
           | woodruffw wrote:
           | _Lowering_ is typically easier than _lifting_ (or
           | brightening). When you lower, you can erase higher-level
           | semantics that aren't relevant; when you lift, you generally
           | _want_ to compose lower-level program behaviors into their
           | idiomatic (and typically safer) equivalent.
        
         | childintime wrote:
         | Hard for humans. But it's DARPA, is it hard for AI? Image
         | classification used to be hard also, today cars drive
         | themselves.
         | 
         | I'd say it's good timing.
        
           | mike_hearn wrote:
           | Well, Claude 3.5 can do translation from one language to
           | another in a fairly competent manner if the languages are
           | close enough. I've used it for that task myself with success
           | (Java -> JavaScript).
           | 
           | But, this isn't just about rewriting code from one language
           | to another. It's about reverse engineering complex
           | information out of the code, which may not be immediately
           | visible in it, and then finding a way to make it "safe"
           | according to Rust's type system. Where's the training data
           | for that? It'd be really hard even for skilled humans.
           | 
           | Personally I think the most pragmatic way to make C/C++
           | memory safe quicker is one of two approaches:
           | 
           | 1. Incrementally. Make std::vector[] properly bounds checked
           | (still not done even in chrome!), convert allocations to
           | allocations that know their own size and do bounds checking
           | e.g. https://issues.chromium.org/issues/40285824
           | 
           | 2. Or, go the whole hog and use runtime techniques like
           | garbage collection and runtime bounds checks.
           | 
           | A good example of approach (2) is Managed Sulong, which
           | extends the JVM to execute LLVM bitcode directly whilst
           | exposing a virtualized Linux syscall interface to the
           | C/C++/FORTRAN code. The whole piece of code can be sandboxed with
           | permissions, and memory safety errors are caught at runtime.
           | The compiler tries to optimize out as many bounds checks as
           | possible. The interesting thing about this approach is it
           | doesn't require big changes to the source code (as long as
           | it's already been ported to Linux), which means the work of
           | making something safe can be done by teams independent of the
           | original authors. In practice "rewrite it in Rust" will
           | usually mean a fork, which introduces lots of complicated
           | technical, cultural and economic issues.
           | 
           | Managed Sulong is also a research project and has a bunch of
           | problems to solve, for instance it needs to lose the JITC
           | dependency and go fully AOT compiled (doable, there's no
           | theoretical issue with it and much of the needed infra
           | already exists). And performance/memory usage can always be
           | improved of course, it regresses vs the original C. But those
           | are "just" systems engineering problems, not rewrite-the-
           | world and solve-static-analysis problems.
           | 
           | Disclosure: I do work part time at Oracle Labs which
           | developed Managed Sulong, but I don't work on it.
        
             | TinkersW wrote:
             | std::vector [] has had bounds checking since forever if you
             | set the correct compiler flag. Since they aren't using it
             | this is a choice, presumably they prefer the speed gain.
        
               | mike_hearn wrote:
               | You mean _GLIBCXX_DEBUG? It's got some issues. Linux
               | only, it doesn't always work [1] and it's all or nothing.
               | What's really needed is the ability to selectively opt-
               | out on a per-instantiation level so very hot paths can
               | keep the needed performance whilst all the rest gets
               | opted into safety checks.
               | 
               | Microsoft has this:
               | 
               | https://learn.microsoft.com/en-us/cpp/standard-
               | library/safe-...
               | 
               | but it doesn't seem to actually make std::vector[] safe.
               | 
               | It's frustrating that low hanging fruit like this doesn't
               | get harvested.
               | 
               | [1] "although there are precondition checks for some
               | string operations, e.g. operator[], they will not always
               | be run when using the char and wchar_t specializations
               | (std::string and std::wstring)."
        
               | Calavar wrote:
               | As far as I am aware, the standard doesn't mandate bounds
               | checking for std::vector::operator[] and probably never
               | will for backwards compatibility reasons. Most standard
               | library implementations have opt-out std::vector[] bounds
               | checking in unoptimized builds, but not in optimized
               | builds.
               | 
               | I tried a toy example with GCC [1], Clang [2], and MSVC
               | [3], and none of them emit bounds checks with basic
               | optimization flags.
               | 
               | [1] https://godbolt.org/z/W5e3n5oWM
               | 
               | [2] https://godbolt.org/z/Pe8nPPvEd
               | 
               | [3] https://godbolt.org/z/YTdv3nabn
        
             | Animats wrote:
             | > But, this isn't just about rewriting code from one
             | language to another. It's about reverse engineering complex
             | information out of the code, which may not be immediately
             | visible in it, and then finding a way to make it "safe"
             | according to Rust's type system. Where's the training data
             | for that? It'd be really hard even for skilled humans.
             | 
             | That might not be too bad.
             | 
             | A combination of a formal system and an LLM might work
             | here. Suppose we see a C function
             | 
             |     void somefn(char* buf, int n);
             | 
             | First question: is "buf" a pointer to an array, or a
             | pointer to a single char? That can be answered by looking
             | at what the function does with "buf", and what callers pass
             | to it.
             | 
             | If it's an array, how big is it? We don't have enough info
             | to know that yet. But a reasonable guess, and one that an
             | LLM might make, is that the length of buf is "n".
             | 
             | Following that assumption, it's reasonable to translate
             | this to Rust as
             | 
             |     fn somefn(buf: &[u8])
             | 
             | and, if n is needed within the function, use
             | 
             |     buf.len()
             | 
             | The next step is to validate that guess. The run-time
             | approach is to write all calls to "somefn" with
             | 
             |     assert!(buf.len() == n);
             |     somefn(buf);
             | 
             | Maybe formal methods can prove the assert true, and we can
             | take it out. Or if a SAT solver or a fuzz tester can
             | generate a counterexample, we know that the guess was wrong
             | and this has to be done the hard way, as
             | 
             |     fn somefn(buf: &[u8], n: usize)
             | 
             | implying more subscript checks inside "somefn".
             | 
             | The idea is to recognize common C idioms and do clean
             | translations to Rust for them. This should handle a high
             | percentage of cases.
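             | 
             | The steps above can be sketched in compilable Rust. This is
             | only an illustration under stated assumptions: the function
             | bodies (summing bytes) are stand-ins, since the original C
             | body is not given, and somefn_checked and call_somefn are
             | hypothetical names for the fallback and the call-site shim.

```rust
/// Idiomatic translation: the slice carries its own length, so the
/// separate `n` parameter from the C signature disappears.
fn somefn(buf: &[u8]) -> u32 {
    // Stand-in body (not from the original C): sum the bytes.
    buf.iter().map(|&b| u32::from(b)).sum()
}

/// Fallback for when the guess `n == buf.len()` is refuted: `n` stays
/// a parameter and the access has to be checked explicitly.
fn somefn_checked(buf: &[u8], n: usize) -> Option<u32> {
    // `get(..n)` returns None instead of reading out of bounds.
    let prefix = buf.get(..n)?;
    Some(prefix.iter().map(|&b| u32::from(b)).sum())
}

/// Call-site shim used while the guess is still being validated.
fn call_somefn(buf: &[u8], n: usize) -> u32 {
    assert!(buf.len() == n, "guess violated: n != buf.len()");
    somefn(buf)
}
```

             | If a prover discharges the assert, call_somefn collapses to
             | a plain somefn(buf) call; if a fuzzer trips it, the call
             | site moves to somefn_checked instead.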
        
           | Calavar wrote:
           | > today cars drive themselves
           | 
           | You can attach about a hundred asterisks to that.
           | 
           | If anything, I think the failure of self-driving to hit L5
           | after billions of dollars and millions of man hours invested
           | is probably reflective of how automatic C to Rust translation
           | will go. We'll cruise 90% of the way, but the last 10% will
           | prove insurmountable with current technology.
           | 
           | Think about the number of C programs in the wild that rely on
           | compiler-specific or libc-specific or platform-specific
           | behavior, or even undefined behavior plus the dumb luck of a
           | certain brittle combination of {compiler version} x {libc
           | version} x {linker version} x {build flags} emitting
           | workable machine code. There's a huge chunk of C software
           | where there's not enough context within the source itself (or
           | even source plus build scripts) to understand the behavior.
           | It's not even clear that this is a solvable problem in the
           | abstract.
           | 
           | None of that is to say that DARPA shouldn't fund this.
           | Research isn't always about finding an industrial strength
           | end product; the knowledge and expertise gained along the way
           | is important too.
        
             | psychoslave wrote:
             | Ok, but if something like 90% of small projects can use it
             | as a direct, painless bridge, that can be a huge win.
             | 
             | Even if it can only handle 90% of the transition for any
             | given project, this is still interesting. Unlike cars on the
             | road, most code transition projects out there don't need
             | to be 100% fine to provide some useful value.
        
               | 0cf8612b2e1e wrote:
               | Even if every project can only be 90% done, that's a huge
               | win. Best would be if it could just wrap the C equivalent
               | code into an unsafe block which would be automatically
               | triaged for human review.
               | 
               | Just getting something vaguely Rust shaped which can
               | compile is the first step in overcoming the inertia to
               | leave the program in its current language.
        
             | programd wrote:
             | > > today cars drive themselves
             | 
             | > You can attach about a hundred asterisks to that.
             | 
             | Not in San Francisco. There are about 300 Waymo cars safely
             | driving in one of the most difficult urban environments
             | around (think steep hills, fog, construction, crazy
             | traffic, crazy drivers, crazier pedestrians). Five years
             | ago this was "someday" science-fiction. Frankly I trust
             | them much more than human drivers and envision a future
             | utopia where human drivers are banned from urban centers.
             | 
             | To get back on topic, I don't think automatic programming
             | language translation is nearly as hard, especially since we
             | have a deterministic model of the machines it runs on. I
             | can see a possible approach where AI systems take the
             | assembler code of a C++ program, then translate that into
             | Rust, or anything else. Can they get 100% accuracy and bit-
             | for-bit compatibility on output? I would not bet against
             | it.
        
               | creata wrote:
               | Isn't 100% accuracy (relatively) easy? c2rust already
               | does that, or at least comes close, as far as I know.
               | 
               | Getting identical outputs on safe executions, catching
               | any unsafe behavior (at translation-time or run-time),
               | and producing efficient, maintainable code all at once is
               | a million times harder.
        
               | m0llusk wrote:
               | Opinions about automated driving systems vary. Just from
               | my own experience doing business all around San Francisco
               | I have seen at least a half dozen instances of Waymo
               | vehicles making unsafe maneuvers. Responders have told me
               | and local government officials that Waymo vehicles
               | frequently fail to acknowledge emergency situations or
               | respond to driving instructions. Driving is a social
               | exercise which requires understanding of a number of
               | abstractions.
        
               | saagarjha wrote:
               | San Francisco, for all its challenges, mostly has traffic
               | laws that people follow. This is not true throughout the
               | world.
        
             | D-Coder wrote:
             | In addition to the other replies, this is a one-time
             | project. After everything (or almost everything) has been
             | translated, you're done, you won't be running into new edge
             | cases.
        
             | sqeaky wrote:
             | This is the exact formulation of the argument before
             | computers beat humans at chess, or drew pictures, or
             | represented color correctly, or... Self driving cars will
             | be solved. There is at least one general purpose computer
             | that can solve it already (a human brain), so a
             | purpose-built computer can also be made to solve it.
             | 
             | In 10 (or 2 or 50 or X) years when Chevy, Ford, and others
             | are rolling out cheap self driving this argument stops
             | working. The important thing is that this argument stops
             | working with no change in how hard C to Rust conversion is.
             | 
             | We really should be looking at the specifics of both
             | problems. What makes computer language translation hard?
             | Why is driving hard? One needs to be correct while
             | inferring intent and possibly reformulating code to meet
             | new restrictions. The other needs to be able to make snap
             | judgments and in realtime avoid hitting things even if it
             | just means stopping to prefer safety over motion. One
             | problem can be solved piecewise without significant regard
             | to time and the other solved in realtime as it happens
             | without producing unsafe output.
             | 
             | These problems really aren't analogous.
             | 
             | I think you picked self driving cars just because it is a
             | big and only partially solved problem. One could just as
             | easily pick a big solved problem or a big unstarted problem
             | and formulate equally bad arguments.
             | 
             | I am not saying this problem is easy, just that it seems
             | solvable with sufficient effort.
        
               | mywittyname wrote:
               | > These problems really aren't analogous.
               | 
               | I'd put money on the solutions to said problems looking
               | largely the same though - big ass machine learning
               | models.
               | 
               | My prediction is that a tool like copilot (but
               | specialized to this domain) will do the bulk of source
               | code conversions, with a really smart human coming behind
               | to validate.
        
           | eesmith wrote:
           | As a reminder, DARPA has funded self-driving car research since
           | at least the 1980s with the Autonomous Land Vehicle (ALV)
           | project, plus the DARPA Grand Challenges, and more.
        
         | sam0x17 wrote:
         | speaking of hard, the DOE actually funds a project that has
         | been around for 20+ years now (ROSE) that involves (among other
         | things) doing static analysis on and automatically translating
         | between C/C++/Cuda and even high level languages like Python as
         | well as HPC variants of C/C++. They have a combined AST that
         | supports all of those languages with the same set of node types
         | essentially. Quite cool. I got to work on it when I was an
         | intern at Livermore, summer of 2014.
         | 
         | and it's open source as well!
         | http://rosecompiler.org/ROSE_HTML_Reference/index.html
        
         | jandrese wrote:
         | I have to think the approach will be something like "AI
         | summarizes the features of the program into some kind of
         | technical language, then the AI synthesizes Rust code that
         | covers the same feature set".
         | 
         | It would be most interesting if the approach was not to feed
         | the program the original program but rather the manual for the
         | program. That said it's rare that a manual captures all of the
         | nuances of the program so a view into the source code is
         | probably necessary, at least for getting the ground truth.
        
           | munificent wrote:
           | More like:
           | 
           | "AI more or less sort of summarizes the features of the
           | program into some approximate kind of technical language,
           | then the AI synthesizes something not too far from Rust code
           | that hopefully covers aspirationally the same feature set".
        
         | downrightmike wrote:
         | If the IRS could have more timely funding, all their Cobol
         | would be translated to Java by now
        
           | psunavy03 wrote:
           | COBOL migrations are tar pits of replicating 40+ years of
           | undocumented niche business logic for a given field, edge
           | cases included, that was "commonly understood" by people who
           | are now retired or dead. Don't get your hopes up.
        
         | the8472 wrote:
         | Write tests for your C code. Run c2rust (mechanical
         | translation), including the tests. Let an LLM/MCTS/verifier loop
         | go to town. Verifier here means it passes compiler checks,
         | tests, sanitizers and miri.
         | 
         | Additional training data can be generated by running mrustc or
         | by inlining unsafe code (from std/core/leaf crates) into safe
         | code and running semantics-preserving mechanical refactorings
         | on the code.
         | 
         | This can be closer to AlphaProof than ChatGPT
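         | 
         | The test step of such a verifier loop can be sketched as a
         | differential check: a candidate rewrite is accepted only if it
         | agrees with the mechanically translated reference on every
         | test input. The helper name agrees_on is hypothetical, and a
         | real loop would also run the compiler, sanitizers and miri,
         | which this sketch leaves out.

```rust
/// Returns true only if `candidate` matches `reference` on every input.
/// This covers the "passes tests" part of the verifier, nothing more.
fn agrees_on<I, O, R, C>(reference: R, candidate: C, inputs: &[I]) -> bool
where
    O: PartialEq,
    R: Fn(&I) -> O,
    C: Fn(&I) -> O,
{
    inputs.iter().all(|x| reference(x) == candidate(x))
}
```

         | A rejected candidate goes back to the search step; an accepted
         | one still has to clear the remaining checks.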
        
         | rectang wrote:
         | I have to imagine that in the general case it will be a
         | translation to unsafe Rust, with occasional isolated leaf nodes
         | being translated to safe Rust.
         | 
         | If you think it's hard wrestling with the borrow checker, just
         | imagine how much harder it is to write automatic translation to
         | borrow-checker-approved code that accounts for all the possible
         | program space of C and all its celebrated undefined behavior.
         | A classic problem of writing compilers is that the space of
         | valid programs is much larger than the space of programs which
         | will compile.
         | 
         | A quick web search reveals some other efforts, such as c2rust
         | [1]. I wonder how TRACTOR differs.
         | 
         | [1] https://github.com/immunant/c2rust
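         | 
         | To make the unsafe-versus-safe distinction concrete, here is a
         | hand-written sketch of the two shapes for a made-up leaf
         | function that sums an int array. It is not actual c2rust
         | output; sum_raw and sum_safe are illustrative names only.

```rust
/// Roughly the shape a mechanical translation keeps: the C signature
/// survives, and the raw pointer reads sit in unsafe code.
unsafe fn sum_raw(buf: *const i32, n: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i = 0;
    while i < n {
        // SAFETY: the caller must guarantee `buf` points to at least
        // `n` valid, initialized i32 values.
        total = total.wrapping_add(unsafe { *buf.add(i) });
        i += 1;
    }
    total
}

/// The safe leaf-node rewrite: the slice bundles pointer and length,
/// so no unsafe code and no caller-side contract remain.
fn sum_safe(buf: &[i32]) -> i32 {
    buf.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

         | The hard part is not writing sum_safe but proving that every
         | caller of sum_raw really passes a pointer and length that
         | describe one valid array.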
        
           | Someone wrote:
           | > have to imagine that in the general case it will be a
           | translation to unsafe Rust, with occasional isolated leaf
           | nodes being translated to safe Rust.
           | 
           | That's not what they are aiming for. FTA: _"The goal is to
           | achieve the same quality and style that a skilled Rust
           | developer would produce"_
           | 
           | > just imagine how much harder it is to write automatic
           | translation to borrow-checker-approved code that accounts for
           | all the possible program space of C and all its celebrated
           | undefined behavior
           | 
           | Nitpick: undefined behavior gives the compiler leeway in
           | deciding what a program does, so the more undefined behavior
           | a C program invokes, the easier it is to translate its code
           | to rust.
           | 
           | (Doing that translation in such a way that the behavior
           | remains what gcc, clang or "most C compilers" do may be
           | harder, but I'm not sure of that)
        
             | rectang wrote:
             | > _undefined behavior gives the compiler leeway in deciding
             | what a program does, so the more undefined behavior a C
             | program invokes, the easier it is to translate its code to
             | rust._
             | 
             | That's the kind of language lawyer approach that caused a
             | rebellion in the last decade amongst C programmers against
             | irresponsible compiler optimizations. "Who cares if your
             | program actually works as intended? My optimization is
             | legal according to the standard, it's _your_ program that's
             | written to exploit loopholes".
             | 
             | I don't see any evidence that that's the attitude being
             | taken by TRACTOR -- I sure hope it isn't. But hell, even if
             | the result is unreliable in practice, I suppose that if
             | somebody gets to claim "it works" then the incentives are
             | aligned to produce garbage.
        
               | atiedebee wrote:
               | > "Who cares if your program actually works as intended?
               | My optimization is legal according to the standard, it's
               | your program that's written to exploit loopholes".
               | 
               | If your program invokes undefined behaviour, it's invalid
               | and non-portable. Out of bounds array accesses are UB,
               | yet a program containing them may just happen to work. It
               | won't be portable even between different compiler
               | versions.
               | 
               | The C standard is a two-way contract: the programmer
               | doesn't produce code that invokes undefined behaviour,
               | and the compiler returns a standard-conforming executable.
        
               | rectang wrote:
               | The C standard with its extensive undefined behavior
               | causes programmers and compiler writers to be at odds. In
               | a sane world, "undefined behavior" wouldn't be assumed to
               | mean "the programmer must have meant for me to optimize
               | this whole section of code away". We aren't on the same
               | team, even if I believe that all parties are acting with
               | the best of intentions.
               | 
               | I don't feel that the Rust language situation
               | incentivizes such awful conflict, and it's one of many
               | reasons I now try _really_ hard to avoid C and use Rust
               | instead.
        
               | Asooka wrote:
               | Doing one funny thing on platform A and a different funny
               | thing on platform B when an edge case arises is way
               | better than completely deleting the code on all platforms
               | with no warning.
        
             | derdi wrote:
             | > undefined behavior gives the compiler leeway in deciding
             | what a program does, so the more undefined behavior a C
             | program invokes, the easier it is to translate its code to
             | rust.
             | 
             | You assume that the compiler can determine what behavior is
             | undefined. It can't. C compilers don't just look at some
             | individual line of the program and say "oh, that's
             | undefined, unleash the nasal demons". C compilers look at
             | code, reason that _if_ such-and-such variable has a certain
             | value (say, a null or invalid pointer), then such-and-such
             | operation is undefined (say, dereferencing that variable),
             | and _therefore_ on the next line that variable can be
             | assumed not to have that bad value. Despite all the FUD,
             | this is a very limited power. C compilers don't usually
             | know the actual values in question, all they do is exclude
             | some invalid ones.
        
         | kragen wrote:
         | presumably dan wouldn't have gotten darpa funding if it were
         | obviously feasible, and success wouldn't give him anything
         | publishable academically
        
           | dgacmu wrote:
           | Just to be clear to others, Dan is the darpa PM on this - he
           | convinced darpa internally it was worth funding other people
           | to do the work, so he himself / his research group won't be
           | doing this work. He's on leave from Rice for a few years to
           | be a PM at DARPA's I2O.
           | 
           | And while DARPA doesn't directly care about research
           | publications as an outcome, there's certainly a publishable
           | research component to this, as well as a lot of lower papers-
           | per-$ engineering and validation work. A lot of the contracts
           | they hand out end up going to some kind of contractor prime
           | (BBN, Raytheon, that kind of company) with one or more
           | academic subs. The academic subs publish.
        
             | kragen wrote:
             | thank you for the correction; I didn't realize he was the
             | darpa pm
             | 
             | what you describe is exactly my experience as a darpa
             | performer (on a program which dan is apparently now the pm
             | for!)
        
       | niemandhier wrote:
       | Is this supposed to be automatic? And if so, wouldn't any
       | program that can automatically port C to Rust, by necessity,
       | contain all the functionality to make the C code itself safe?
        
         | gpm wrote:
         | I don't think a reasonable reading of the statement implies
         | "fully automated", at which point the answer to the question is
         | no.
         | 
         | Obviously some C code isn't just "not verifiably correct" but
         | "actually wrong in a memory unsafe way". That code isn't going
         | to be automatically translated without human intervention
         | because, how could it be, there is no correct equivalent code.
         | The tooling is going to have to have an escape hatch where it
         | says "I don't know what this code is _meant_ to do, and I know
         | it isn't meant to do what it does do (violate promises to the
         | compiler), help me human".
         | 
         | On a theoretical level it's not _possible_ for that escape
         | hatch to only be used when undefined behaviour _does_ occur
         | (Rice's theorem). On a practical level it's probably not even
         | desirable to try because obtuse enough code shouldn't just be
         | blindly translated.
         | 
         | So what I imagine the tooling ends up looking like is an
         | interactive tool that does the vast majority of the work for
         | you, but is guided by a human, and ultimately as a result of
         | that human guidance doesn't end up with _exactly_ equivalent
         | code, just code that serves the same purpose.
        
       | nanolith wrote:
       | I'm personally not a fan of "rewrite the world in Rust"
       | mentality, but that being said, if one is planning to port a
       | project to a new language or platform, mechanical translation is
       | a poor means of doing so. Spend the time planning better
       | architecture and designing a better software system, and find a
       | way to replace it piece by piece. Don't build a castle in the
       | sky, because it will never reach the ground. If you've decided to
       | use Rust for this system, that's fine. But, write Rust. Don't try
       | to back-port C into Rust.
       | 
       | I think a far better and more mature process is to update C to
       | modern C and use a model checker such as CBMC to verify memory,
       | resource, and integer math safety. One gets the same safety as a
       | gradual Rust rewrite, but the code base, knowledge base, and
       | developers can be maintained.
        
         | pdimitar wrote:
         | > _I'm personally not a fan of "rewrite the world in Rust"
         | mentality_
         | 
         | There is no such mentality anywhere. There is a ton of software
         | that's much better off left alone in a dynamic language, or a
         | statically typed language with a garbage collector (like
         | Golang). Good engineers understand the idea of using the right
         | tool for the job.
         | 
         | The push is to start reducing those memory safety CVEs because
         | they have been proven to be a real problem, many times over.
         | 
         | > _mechanical translation is a poor means of doing so_
         | 
         | Agreed. If we could automatically and reliably translate C/C++
         | to Rust it would have been done already.
         | 
         | > _Spend the time planning better architecture and designing a
         | better software system, and find a way to replace it piece by
         | piece._
         | 
         | OK, I am just saying that somewhere along that process people
         | might get a bout of confidence and tell themselves "oh, we're
         | doing C much better now, we no longer write memory safety bugs,
         | can't we stop here?" and they absolutely will. Cue another
         | hilarious buffer overflow CVE 6 months later.
         | 
         | > _I think a far better and more mature process is to update C
         | to modern C and use a model checker such as CBMC to verify
         | memory, resource, and integer math safety._
         | 
         | A huge investment. If you are going to do that then you might
         | as well just move to Rust.
         | 
         | > _One gets the same safety as a gradual Rust rewrite_
         | 
         | Maybe, but that sounds fairly uncertain or far from a clear
         | takeaway to me.
        
           | nanolith wrote:
           | > A huge investment. If you are going to do that then you
           | might as well just move to Rust.
           | 
           | People say that, but the people who say this rarely have any
           | practical experience using CBMC. It's very straightforward
           | to use. I could teach a developer to use it reliably, on
           | practical software, in a month.
        
             | pdimitar wrote:
             | I am not denying it, nor am I claiming that "just move to
             | Rust" is a universal escape hatch.
             | 
             | What I am saying is that if it were as simple as "just
             | learn CBMC" then maybe Microsoft and Google would have not
             | published their studies demonstrating that 60% - 75% of all
             | CVEs are memory safety errors like buffer under-/over-
             | flows.
        
               | nanolith wrote:
               | These studies aren't wrong. But, that's _also_ because
               | neither Microsoft nor Google make use of practical formal
               | methods in practice. Both have research teams and pie-in-
               | the-sky projects, not dissimilar to this DARPA project.
               | But, when it comes down to the nitty-gritty development
               | cycle, both companies use decades old software
               | development practices.
        
           | uecker wrote:
           | Rewriting is rarely a good idea in general. Rust proponents
           | like to pretend that it is impossible to avoid safety issues
           | in C while it is automatically given in Rust. But it is not
           | so simple in reality.
        
             | pdimitar wrote:
             | I don't like generalizations... in general. :D
             | (Addressing your "rewrites are rarely a good idea in
             | general" here.)
             | 
             | My experience tells me that if a tech stack supports
             | certain safety guarantees by default that this leads to
             | measurable reduction of those safety problems when you
             | switch to the stack. People love convenient defaults,
             | that's a fact of life.
             | 
             | The apparently inconvenient truth is that most programmers
             | are quite average and you can't rely on them going above
             | and beyond to reduce memory safety errors.
             | 
             | So I don't buy the good old argument of "just hire better C
             | programmers". We still have a ton of buffer overflow CVEs
             | regardless.
             | 
             | And I never "pretended it's impossible to avoid safety
             | issues in C". I'd appreciate it if you didn't lump me in
             | with some imaginary group of "Rust proponents".
             | 
             | What I'm saying is this: _use the right tool for the job_.
             | The C devs have been given _decades_ and yet memory safety
             | CVEs are still prevalent.
             | 
             | What conclusion would you arrive at if you were in my place
             | -- i.e. not coding C for a living for like 18 years now but
             | still witnessing it periodically crapping the bed?
             | 
             | I'm curious about your take on this. Again, what other
             | conclusion would you arrive at?
        
               | uecker wrote:
               | I am complaining about the usual phrases which are part
               | of the Rust marketing, like the "just hire better C
               | programmers did not work" or the "why are there still
               | CVEs" pseudo arguments, etc.
               | 
               | For example, let's look at the "hire better C programmers
               | does not work" argument. Like every good propaganda it
               | starts with a truism: In this case that even highly
               | skilled C/C++ programmers will make mistakes that could
               | lead to exploitable memory safety issues. The problem
               | comes from exaggerating this to the idea that "all hope
               | is lost and nothing can be done". In reality one can
               | obviously do a lot of things to improve safety in C/C++.
               | And even one short look at CVEs should make it clear that
               | there is often huge room for improvements even with
               | relatively simple measures. For example, a lot of memory
               | safety bugs in C/C++ come from open-coded string or
               | buffer manipulation. But it is not exactly rocket science
               | to abstract this away behind a safer interface. But once
               | this is understood, the obvious conclusion is that
               | addressing some of these low-hanging fruits would be far
               | more effective in improving safety than wasting a lot of
               | time and effort in rewriting in Rust.
        
         | IshKebab wrote:
         | > I think a far better and more mature process is to update C
         | to modern C and use a model checker such as CBMC to verify
         | memory, resource, and integer math safety.
         | 
         | No chance. CBMC is amazing, but have you actually tried
         | formally verifying a "real" program?
         | 
         | I agree replacing with a hand-architected Rust version is
         | clearly the better solution but also more expensive. I think
         | they're going for an RLBox style "improve security
         | significantly with little-to-no effort" type product here. That
         | doesn't mean you shouldn't do a full manual rewrite if you have
         | the resources, but it's better than nothing if you haven't.
        
           | nanolith wrote:
           | > No chance. CBMC is amazing, but have you actually tried
           | formally verifying a "real" program?
           | 
           | Yes. Every day. It's actually quite easy to do. Write shadow
           | methods covering the resources and function contracts of
           | called functions, then verify the function. Repeat all of the
           | way up and down the stack. It adds about 30% overhead over
           | just TDD development.
        
             | PhilipRoman wrote:
             | Last time I tried CBMC, it ended up running out of memory
             | for relatively small programs, do you encounter any
             | resource usage issues with it? I'm learning Frama-C and I
             | find it more predictable, although the non-determinism of
             | solvers shocked me when I first tried to prove non-trivial
             | programs. I guess ideally I would like something even more
             | explicit than Frama-C.
        
               | nanolith wrote:
               | CBMC works best on functions, not programs. You want to
               | isolate an individual function, then provide shadows of
               | the functions it calls. The shadows should have
               | nondeterministic behavior (cover every possible error
               | condition) and otherwise follow the same memory and
               | resource rules as the original function. For instance, if
               | shadowing a function that reads a buffer, the shadow
               | should ensure full buffer access as part of its
               | assertions.
               | 
               | The biggest issue you will run into with bounded model
               | checking is recursion and looping. In these cases, you
               | want to refactor the code to make it easier to formally
               | verify outside of the loop. Capture and assert on loop
               | variants / invariants, and feed these forward in
               | assertions on code.
               | 
               | There's no way I can capture all of this in an HN
               | comment, but to get CBMC to work, you need to break down
               | your code.
        
               | PhilipRoman wrote:
               | Thanks, that was really helpful. Relying on getting
               | shadow functions right does seem icky, but I guess the
               | improved productivity of CBMC should make up for it.
               | Definitely going to give it another chance!
        
               | nanolith wrote:
               | You're welcome. I've been meaning to write a blog article
               | on the subject, because it is a subtle thing to get
               | working.
               | 
               | Think of shadow functions as the specifications that you
               | are building. Unlike proof assistants or Frama-C, you
               | write specifications in C itself, and they work similarly
               | to code. Often, the same contracts you write in these
               | specifications can be shared by both the shadow functions
               | and the real functions they shadow.
               | 
               | I take a bottom-up approach to model checking. I'll start
               | by model checking the lowest level code, then I'll shadow
               | this code to model check code that depends on it. In this
               | way, I can increase the level of abstraction for model
               | checking, focusing just on the side effects and contracts
               | of functions I shadow, and move up the stack toward more
               | and more general code.
        
         | Apofis wrote:
         | This is definitely a pie-in-the-sky DARPA challenge that would
         | be great to have around as we migrate away from legacy systems,
         | however, even taking your functions/methods in one language and
         | giving them to ChatGPT and asking it to translate your method
         | to a different language generally doesn't work. Asking ChatGPT
         | the initial problem you're trying to solve, works more
         | frequently, but still generally doesn't work. You still need to
         | do a lot of tinkering and thinking to get even basic things to
         | work that it outputs.
        
         | usrusr wrote:
         | _If_ you have dormant code, as in running everywhere but not
         | getting worked on anywhere, a "translate to shitty rust before
         | ever touching again" has a certain appeal. Not the appeal of an
         | obviously good idea: chances are the "shitty rust" created
         | through translation would be so much worse to work on than C
         | with some level of background noise of bugs (that would also be
         | present in the "shitty rust" thanks to faithful translation).
         | In C, people have an idea about how to deal with the problems.
         | In "shitty rust", it's, well, shitty, because rust people are
         | not used to that stuff.
         | 
         | But there's a non-zero chance that someone could develop a
         | skillset for iteratively cleaning up into something tolerable.
         | 
         | And then there are non-goal things that could grow out of the
         | project, e.g. some form of linter feedback "can't translate
         | into tolerable rust because of x, y and z". C people could look
         | into that, and once the code is translatable into good rust,
         | why translate.
         | 
         | _If_ that was an outcome of the project, some people might
         | find it easier to describe their solution in runnable C and let
         | the "translator/linter" guide them to a non-broken approach.
         | 
         | I'd certainly consider all these positive outcomes quite
         | unlikely, but isn't it pretty much the job description of DARPA
         | to do the occasional dark horse bet?
        
           | suprjami wrote:
           | In my experience (supporting a machine-translated codebase
           | which resulted in shitty Java) your theory doesn't play out.
           | 
           | If you give developers a shitty codebase then those
           | developers will leave to work somewhere else.
           | 
           | After a few years of working on this codebase we had 88%
           | turnover. 1 in 10 developers remembered the original
           | project's design philosophy and intention.
           | 
           | It wasn't a good situation.
        
       | TinkersW wrote:
       | Good luck with that... Also, shouldn't the target be C++ to Rust? Is
       | there really that much pure C still being written?
        
         | surfingdino wrote:
         | IoT, embedded systems still use it. There's loads of them.
        
         | riku_iki wrote:
         | AGI may find much simpler, more robust/performant and safe
         | language.
        
       | deepsun wrote:
       | They didn't explain why they've chosen Rust. There are a lot of
       | memory-safe languages besides Rust, especially in application-
       | level area (not systems-level like Rust).
        
         | woodruffw wrote:
         | There are a lot of memory safe languages; there are fewer that
         | have (1) marginal runtime requirements, (2) transparent
         | interop/FFI with existing C codebases, (3) enable both spatial
         | and temporal memory safety without GC, and (4) have significant
         | development momentum behind them. Rust doesn't _have_ to be
         | unique among these qualifications, but it's currently
         | preeminent.
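
As a rough illustration of point (2) above, here is a minimal sketch of a Rust function that uses the C calling convention and C string types, with no runtime involved. The names `count_spaces` and `call_through_c_abi` are made up for this example, and the call goes through a function pointer rather than a real C library so the snippet stays self-contained.

```rust
use std::ffi::{c_char, c_int, CStr, CString};

/// A Rust function with the C ABI (hypothetical example name).
/// It counts the ASCII spaces in a NUL-terminated C string.
///
/// # Safety
/// `s` must be null or point to a valid NUL-terminated string.
pub unsafe extern "C" fn count_spaces(s: *const c_char) -> c_int {
    if s.is_null() {
        return -1;
    }
    // SAFETY: the caller promises `s` is a valid NUL-terminated string.
    let bytes = unsafe { CStr::from_ptr(s) }.to_bytes();
    bytes.iter().filter(|&&b| b == b' ').count() as c_int
}

/// Calls `count_spaces` through a C-ABI function pointer, the same way
/// C code holding that pointer would call it.
/// Returns None if `text` contains an interior NUL byte.
pub fn call_through_c_abi(text: &str) -> Option<c_int> {
    let c_text = CString::new(text).ok()?;
    let f: unsafe extern "C" fn(*const c_char) -> c_int = count_spaces;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { f(c_text.as_ptr()) })
}
```

Exporting the symbol to an actual C program would additionally need a `no_mangle` attribute and a matching C declaration, which are left out here.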
        
           | deepsun wrote:
           | Yes, but you assume all their projects need all 4 of these. I
           | like Rust, but it's a bad choice for many areas (e.g.
           | aforementioned application-level code). I'd expect serious
           | decisions to at least take that into account.
        
             | woodruffw wrote:
             | I'm not assuming anything of the sort. These are just
             | properties that make Rust a nice target for automatic
             | translation of C programs; there are myriad factors that
             | _guarantee_ that nowhere close to 100% of programs (C,
             | application level, or otherwise) will be suitable for
             | translation.
        
         | galangalalgol wrote:
         | If you have your crosshairs on C, then you want a language that
         | can do whatever C does. That makes the list of memory safe
         | languages a lot shorter.
        
         | oconnor663 wrote:
         | Apart from runtime/embedded requirements, there's the big
         | question of how you represent what C is doing in other
         | languages that don't have interior pointers and pointer
         | casting. For example, in C I might have a `struct foo*` that
         | aliases the 7th element of a `struct foo[]` array. How do you
         | represent that in Java or Python? I don't think you can use
         | regular objects or regular arrays/lists from either of those
         | languages, because you need _assignments through the pointer_
         | (of the whole `struct foo`, not just individual field writes)
         | to affect the array. Even worse, in C I might have a `const
         | char*` that aliases the same element and expects every write to
         | affect its _bytes_. To model all this you'd need some
         | Frankenstein, technically-Turing-complete, giant-bytestring-
         | that-represents-all-of-memory thing that wouldn't really be
         | Java or Python in any meaningful sense, wouldn't be remotely
         | readable or maintainable, and wouldn't be able to interoperate
         | with any existing libraries.
         | 
         | In Rust you presumably do all of that with raw pointers, which
         | leaves you with a big unsafe mess to clean up over time, and I
         | imagine a lot of the hard work of this project is trying to
         | minimize that mess. But at least the mess that you have is
         | recognizably Rust, and incremental cleanup is possible.
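
A minimal sketch of the interior-pointer case described above, assuming a hypothetical two-field `Foo` as a stand-in for `struct foo`. The first function mirrors what a literal translation might do with raw pointers, including the byte-level alias; the second is what an incremental cleanup could turn it into.

```rust
/// Hypothetical stand-in for the `struct foo` in the comment above.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Foo {
    pub a: u32,
    pub b: u32,
}

/// Literal, C-style version: a raw interior pointer to the 7th element,
/// plus a byte pointer aliasing the same element.
/// Returns the element's bytes as seen through the byte pointer.
pub fn overwrite_seventh_raw(arr: &mut [Foo; 10], value: Foo) -> [u8; 8] {
    let mut out = [0u8; 8]; // size_of::<Foo>() is 8 with this layout
    // SAFETY: index 6 is in bounds of the 10-element array, and `out`
    // has exactly as many bytes as one `Foo`.
    unsafe {
        // Like `struct foo *p = &arr[6];`
        let p: *mut Foo = arr.as_mut_ptr().add(6);
        // Like `const unsigned char *bytes = (const unsigned char *)p;`
        let bytes: *const u8 = p.cast::<u8>();
        // Whole-struct assignment through the pointer: `*p = value;`
        p.write(value);
        // The byte view observes the write, as it would in C.
        std::ptr::copy_nonoverlapping(bytes, out.as_mut_ptr(), out.len());
    }
    out
}

/// Cleaned-up version: the same update through a bounds-checked borrow.
pub fn overwrite_seventh_safe(arr: &mut [Foo; 10], value: Foo) {
    let slot: &mut Foo = &mut arr[6];
    *slot = value;
}
```

Both functions change the array in place; only the first needs `unsafe`, which is the part a cleanup pass would try to shrink.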
        
       | thibran wrote:
       | Porting the Linux kernel to 100% Rust should be the benchmark for
       | AGI.
       | 
       | ... and when done, please port SQLite too :)
        
         | 0cf8612b2e1e wrote:
         | I am fully in the RIIR koolaid, but SQLite would be near the
         | absolute bottom of my prioritization list. Care to explain?
         | SQLite is extensively tested, has requirements to run on ~every
         | platform, be backwards compatible, and has a relatively small
         | blast radius if there is a C derived bug. There is much more
         | fertile ground in any number of core system services (network,
         | sudo, dns, etc)
        
           | eric-p7 wrote:
           | Not a small blast radius. There are an estimated 1 trillion
           | active deployed SQLite instances:
           | https://news.ycombinator.com/item?id=29461127
        
       | commodoreboxer wrote:
       | A lot of people are reading this as a call or demand to translate
       | all C and C++ code to Rust, but (despite the catchy project
       | name), I don't read the abstract in that way. There are two
       | related but separate paragraphs.
       | 
       | 1. C and C++ just aren't safe enough at large. Even with careful
       | programming and good tooling, so many vulnerabilities are caused
       | by their unsafe by default designs. Therefore, as much code as
       | possible should be translated to or written in "safe" languages
       | (especially ones that guarantee memory safety).
       | 
       | 2. We are funding and calling for software to translate existing
       | C code into Rust.
       | 
       | It's not a consensus to rewrite the world in Rust. It's a
       | consensus to migrate to safe languages, which Rust is an example
       | of, and a program that targets Rust in such migration.
        
         | akira2501 wrote:
         | > or written in "safe" languages
         | 
         | So when those languages have 'unsafe' constructs what are the
         | rules going to be around using those? Without a defining set of
         | rules to use here you're just going to end up right back where
         | you started.
         | 
         | > to migrate to safe languages, which Rust is an example of
         | 
         | Rust has a safe mode. It is _not_ a safe language. To do
         | anything interesting you will require unsafe blocks. This will
         | not get you very much.
         | 
         | Meanwhile you have tons of garbage collected languages that
         | don't even let the programmer touch pointers. Why aren't those
         | considered? The reason is performance. And because Rust
         | programmers "care" so much about performance you're not ever
         | going to solve the fundamental problem with that language.
         | 
         | Do you want performance or safety? You can't have both.
        
           | timeon wrote:
           | > To do anything interesting you will require unsafe blocks.
           | This will not get you very much.
           | 
           | This is not true.
        
             | akira2501 wrote:
             | > This is not true.
             | 
             | Burying unsafe blocks in unevaluated cargo modules does not
             | make this true. You're just taking the original problem and
             | sweeping it under the rug.
        
           | bigstrat2003 wrote:
           | > Rust has a safe mode. It is _not_ a safe language. To do
           | anything interesting you will require unsafe blocks. This
           | will not get you very much.
           | 
           | 1. There are plenty of interesting programs which don't
           | require unsafe.
           | 
           | 2. Even if your program does require unsafe, Rust still
           | limits where the unsafety is. This lets you focus your
           | scrutiny on the small section of the program which is
           | critical for safety guarantees to hold. That is still a win.
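
To make point 2 concrete, here is a small sketch (the function name `sum_prefix` is invented for illustration) where the only `unsafe` operation sits behind a safe signature. The bounds check before the loop is the one thing a reviewer has to verify.

```rust
/// Sums the first `n` elements of `data`.
/// Returns None if `n` is larger than the slice.
pub fn sum_prefix(data: &[u32], n: usize) -> Option<u64> {
    if n > data.len() {
        return None;
    }
    let mut total = 0u64;
    for i in 0..n {
        // SAFETY: i < n and n <= data.len() was checked above,
        // so the unchecked access stays in bounds.
        total += u64::from(unsafe { *data.get_unchecked(i) });
    }
    Some(total)
}
```

Callers cannot trigger undefined behaviour through this function from safe code, so the audit surface is the body of `sum_prefix` rather than every call site.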
        
       | 0xbadcafebee wrote:
       | Every tool has its own specific quirks. Over many years of using
       | a tool, "expertise" is the intimate knowledge of those quirks and
       | how to use that tool most effectively. Changing tools requires
       | you to gain expertise again. You're going to be less proficient
       | in the new tool for a long time, and make a lot of mistakes.
       | 
       | Considering we already know how to make C/C++ programs memory
       | safe, it's bizarre that people would ditch all of their
       | expertise, and the years and years of perfecting the operation of
       | those programs, and throw all that out the window because they
       | can't be bothered to use a particular set of functions [that
       | enforce memory safety].
       | 
       | If you're going to go to all of the trouble to gain expertise in
       | an entirely new tool, plus porting a legacy program to the new
       | tool, I think you need a better rationale than "it does memory
       | safety now". You should have more to show for your efforts than
       | just that, and take advantage of the situation to add more value.
        
         | wffurr wrote:
         | But even proficient C and C++ programmers continue to produce
         | code with memory safety issues leading to remote code execution
         | exploits. This argument doesn't hold up to the actual
         | experience of large C and C++ projects.
        
           | 0xbadcafebee wrote:
           | They aren't trying to prevent them. It's trivial to prevent
           | them if you actually put effort into it; if you don't, it's
           | going to be vulnerable. This is true of all security
           | concerns.
        
             | woodruffw wrote:
             | "You aren't trying hard enough" isn't a serious approach to
             | security: if it was, we wouldn't require seatbelts in cars
             | or health inspections in restaurants.
             | 
             | (It's also not clear that they _aren't_ trying hard
             | enough: Google, Apple, etc. have billions of dollars riding
             | on the safety of their products, but still largely fail to
             | produce memory-safe C and C++ codebases.)
        
       | jcalvinowens wrote:
       | This isn't some "pie in the sky" thing, Immunant has a working C
       | to Rust transpiler and it's really interesting:
       | https://github.com/immunant/c2rust
        
         | steveklabnik wrote:
         | Their work was also previously sponsored by DARPA, though I do
         | not know if it was under this program or something else.
        
         | Animats wrote:
         | I've tried that thing. The Rust that comes out is terrible. It
         | converts C into a set of Rust function calls which explicitly
         | emulate C semantics by manipulating raw pointers. It doesn't
         | even convert C arrays to a Vec. It's a brute-force
         | transliteration, not a translation.
         | 
         | I and someone else ran this on a JPEG 2000 decoder that
         | sometimes crashed with a bad memory reference. The Rust version
         | crashed with the same bad memory reference. It's bug-
         | compatible.
         | 
         | What comes out is totally unreadable and much bigger than the
         | original C code. Manual "refactoring" of that output is
         | hopeless.
        
           | marcosdumay wrote:
           | Any automatic translation is bug-compatible with the
           | original. Did you expect it to divine some requirements?
           | 
           | It still leaves you with Rust code that you can improve
           | piecewise. The only question is if something like it is
           | better than FFI calling the C code.
        
             | bornfreddy wrote:
             | > Any automatic translation is bug-compatible with the
             | original. Did you expect it to divine some requirements?
             | 
             | That would be useless when translating C to Rust. Yes, I
             | would expect the tool to point out the flaws in the
             | original memory handling and only translate the corrected
             | code. This is far from easy, since some information
             | (intent) is missing, but a good coder could do it on decent
             | codebases. The question is, can an automated tool do it
             | too? We'll see.
        
         | jcranmer wrote:
         | As I mentioned elsewhere
         | (https://news.ycombinator.com/item?id=41113257), that tool is
         | pretty much useless unless you have some checkbox that says "no
         | C code allowed anywhere". It's not even a feasible starting
         | point for refactoring because the code is so far from idiomatic
         | Rust.
        
       | jll29 wrote:
       | Difficult: most C programs I know would convert to one single
       | large "unsafe" block...
       | 
       | One might argue that re-writing from scratch is the safer option;
       | and a re-write is also an opportunity to do things differently
       | (read: improve the architecture by using what one has learned),
       | despite the much-feared "second system" syndrome.
       | 
       | But nothing wrong with spending some research dollars towards
       | tooling for "assisted legacy rewrites". DARPA and her sister
       | IARPA fund step innovation (high risk, high reward), and this is
       | an area where good things can potentially come from.
        
       | Animats wrote:
       | It's good to see DARPA pushing on this. It's a hard problem, but
       | by no means impossible. Translating to _safe_ Rust, though, is
       | going to be really tough. There's a C to Rust translator now,
       | but what comes out is horrible Rust, which just rewrites C
       | pointer manipulation as unsafe Rust struct manipulation. The
       | result is less maintainable than the original.
       | 
       | So what would it take to actually do this right? The two big
       | problems are 1) array sizes, and 2) non-affine pointer usage.
       | Pointer arithmetic is also hard, but rare. Most pointer
       | arithmetic can be expressed as slices.
       | 
       | Every array in C has a size. It's just that the compiler doesn't
       | know what it is.
       | 
       | Where is this being discussed in detail?
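
A small sketch of the array-size point, using a made-up C signature `int sum(const int *p, size_t n)` as the starting point. In the literal version the length travels separately from the pointer; in the slice version the length is part of the type, and `p + off` becomes a sub-slice.

```rust
/// Literal translation of a hypothetical `int sum(const int *p, size_t n)`.
///
/// # Safety
/// `p` must point to at least `n` readable `i32` values.
pub unsafe fn sum_raw(p: *const i32, n: usize) -> i32 {
    let mut total = 0i32;
    let mut i = 0usize;
    while i < n {
        // SAFETY: the caller guarantees that p[0..n] is readable.
        total = total.wrapping_add(unsafe { *p.add(i) });
        i += 1;
    }
    total
}

/// Slice version: the array size the C compiler did not know about
/// is now carried by `xs` itself.
pub fn sum(xs: &[i32]) -> i32 {
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

/// Pointer arithmetic such as `p + off` expressed as a sub-slice.
/// Returns None instead of walking past the end.
pub fn skip(xs: &[i32], off: usize) -> Option<&[i32]> {
    xs.get(off..)
}
```

The hard part a translator faces is not writing `sum`, but proving which `n` belongs to which `p` at every call site.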
        
         | steveklabnik wrote:
         | > Where is this being discussed in detail?
         | 
         | In my understanding, this is a call for proposals to do the
         | work, there is no detailed discussion yet. That will come when
         | there's actual responses to this call.
        
           | Animats wrote:
           | Right, there's a call, and a project day with an in-person
           | meeting coming up.
        
         | jcranmer wrote:
         | I once tried to use c2rust as a starting point for
         | rustification of code and... it's not even good at that. The
         | code is just too freakishly literal to the original C semantics
         | that you can't even take the non-pointery bits and strip off
         | the unsafe block and use that as a basis.
         | 
         | (To give you a sense, it translates something like a + 1 to
         | a.wrapping_add(1i32), and my recollection is that for (int i =
         | 0; i < 10; i++) gets helpfully turned into a while loop instead
         | of a for loop).
         | 
         | In general, the various challenges that all need to be solved
         | that aren't solved yet are:
         | 
         | a) when is integer overflow intentional in the original code so
         | that you know when to use wrapping_op instead of regular Rust
         | operators?
         | 
         | b) how to convert unions into Rust enums
         | 
         | c) when pointers are slices, and what corresponds to the length
         | of the slice
         | 
         | d) convert pointers to references, and know when they're
         | mutable or const references
         | 
         | e) work out lifetime annotations where necessary
         | 
         | f) know when to add interior mutability to structs
         | 
         | g) wrap things in Mutex/RwLock/etc. for multithreaded access
         | 
         | We're a very long way from having full-application conversion
         | workable, and that might be sufficiently difficult that it's
         | impossible.
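
To give a feel for challenge (a) and the loop translation mentioned above, here is a hedged sketch. It is not actual c2rust output; the function names are invented, and the "literal" version only imitates the style described (wrapping ops everywhere, `while` loop with a manual counter).

```rust
/// Literal style: every arithmetic op is wrapping and the C `for` loop
/// has become a `while` loop with a manual counter.
pub fn hash_literal(bytes: &[u8]) -> u32 {
    let mut h: u32 = 0;
    let mut i: usize = 0;
    while i < bytes.len() {
        h = h.wrapping_mul(31).wrapping_add(bytes[i] as u32);
        i = i.wrapping_add(1);
    }
    h
}

/// Idiomatic style: iterate directly, and keep wrapping arithmetic only
/// where wraparound is part of the algorithm (the hash itself).
pub fn hash_idiomatic(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |h, &b| h.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Where overflow would be a bug rather than intent, make it explicit:
/// `checked_add` returns None instead of wrapping.
pub fn next_id(id: i32) -> Option<i32> {
    id.checked_add(1)
}
```

Deciding automatically which of the two treatments a given `+` in the C source deserves is exactly the open question in (a).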
        
           | Animats wrote:
           | That doesn't mention the affine type problem. Rust references
           | are restricted to single ownership. If A has a reference to
           | B, B can't have a reference to A. Bi-directional references
           | are not only a common idiom in C, they're an inherent part of
           | C++ objects.
           | 
           | Rust has to use reference counts in such situations. You have
           | an Rc wrapped around structs, sometimes a RefCell, and
           | .borrow() calls that panic when you have a conflict. C code
           | translates badly into that kind of structure.
           | 
           | Static analysis might help find .borrow() and .borrow_mut()
           | calls that will panic, or which won't panic. It's very
           | similar to finding lock deadlocks of the type where one
           | thread locks the same lock twice.
           | 
           | (If static analysis shows that no .borrow() or .borrow_mut()
           | for a RefCell will panic, you don't really need the RefCell.
           | That's worth pursuing as a way to allow Rust to have back
           | references.)
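A minimal sketch of the structure described above, assuming a hypothetical parent/child pair with a back reference. `Parent`, `Child`, `link`, and `parent_name` are invented for illustration; `Rc`, `Weak`, and `RefCell` are the standard library types.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical parent/child pair with a back reference, the kind of
// shape that is usually written with raw pointers in C.
struct Parent {
    name: String,
    children: RefCell<Vec<Rc<Child>>>,
}

struct Child {
    id: u32,
    // Weak avoids a strong reference cycle that would leak both nodes.
    parent: RefCell<Weak<Parent>>,
}

fn link(parent: &Rc<Parent>, child: &Rc<Child>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

fn parent_name(child: &Child) -> Option<String> {
    child.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let parent = Rc::new(Parent {
        name: "root".to_string(),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Child { id: 1, parent: RefCell::new(Weak::new()) });
    link(&parent, &child);
    println!("child {} -> {:?}", child.id, parent_name(&child));

    // Borrow rules are checked at run time: while a shared borrow is
    // alive, a mutable borrow is refused. `borrow_mut()` would panic
    // here; `try_borrow_mut()` reports the conflict instead.
    let reading = parent.children.borrow();
    assert!(parent.children.try_borrow_mut().is_err());
    drop(reading);
    assert!(parent.children.try_borrow_mut().is_ok());
}
```

The conflict in `main` is the run-time failure that the static analysis suggested above would try to rule out ahead of time.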
        
             | jcranmer wrote:
             | I'd lump that analysis somewhere in the d-g, because you
             | have to remember that &mut is also noalias and work out
             | downstream implications of that. It's probably presumptive
             | of me to assume a particular workflow for reconstructing
             | the ownership model to express in Rust, and dividing that
             | into the steps I did isn't the only way to do it.
             | 
             | In any case, it's the difficulty of that reconstruction
             | step that leaves me thinking that automated conversion of
             | whole-application to Rust is a near-impossibility.
             | Conversion of an individual function that works on plain-
             | old-data structures is probably doable, if somewhat
             | challenging.
             | 
             | An off-the-cuff idea I just had is to implement a semi-
             | automated transformation, where the user has to input what
             | a final conversion of a struct type should look like
             | (including all Cell/Rc/whatever wrappers as needed), and
             | the tool can use that to work out the rest of the
             | translation. There's probably a lot of ways that can go
             | horribly wrong, but it seems more feasible than trying to
             | figure out what all of the wrappers need to be.
        
         | clintfred wrote:
         | Even if just all the unsafe areas were marked, wouldn't that be
         | valuable? At least it would focus review efforts on the parts
         | with the most risk?
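As a rough illustration of that idea, here is a sketch of a mechanically translated function that keeps its raw pointer, with the risky part confined to a marked `unsafe` region behind a safe wrapper. The C signature in the comment and both function names are hypothetical.

```rust
// C: int first_or(const int *p, size_t n, int fallback)
//    { return n ? p[0] : fallback; }
//
// A literal translation keeps the raw pointer. The dereference is the
// only operation the compiler cannot check, and it is visibly marked.
unsafe fn first_or_raw(p: *const i32, n: usize, fallback: i32) -> i32 {
    if n == 0 || p.is_null() {
        fallback
    } else {
        // SAFETY: the caller promises that `p` points to at least `n`
        // readable i32 values, and n > 0 was checked above.
        unsafe { *p }
    }
}

// Safe wrapper: the slice type guarantees pointer and length agree, so
// callers never touch the unsafe function directly.
fn first_or(values: &[i32], fallback: i32) -> i32 {
    // SAFETY: `as_ptr()` and `len()` come from the same valid slice.
    unsafe { first_or_raw(values.as_ptr(), values.len(), fallback) }
}

fn main() {
    println!("{}", first_or(&[4, 5], 9));
    println!("{}", first_or(&[], 9));
}
```

A reviewer can then search for `unsafe` and concentrate on the `SAFETY` comments, although bugs in safe code that feeds bad values into those regions can still matter.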
        
       | sans-seraph wrote:
       | I have been aware of this proposed initiative for some time and I
       | find it interesting that it is now becoming public. It is a very
       | ambitious proposal and I agree that this level of ambition is
       | appropriate for DARPA's mission and I wish them well.
       | 
       | As a Rust advocate in this domain I have attempted to temper the
       | expectations of those driving this proposal with due respect to
       | the feasibility of automatic translation from C to Rust. The
       | fundamental obstacle that I foresee remains that C source code
       | contains less information than Rust source code. In order to
       | translate C code to Rust code that missing information must be
       | produced by someone or something. It is easy to prove that it is
       | impossible to infallibly generate this missing information for
       | the same reason that scaling an image to make it larger cannot
       | infallibly produce bits of information that were not captured by
       | the original image. Instead we must extrapolate (invent) the
       | missing information from the existing source code. To extrapolate
       | correctly we must exercise judgement and this is a fallible
       | process especially when exercised in large quantities by
       | unsupervised language models. I have proposed solutions that I
       | believe would go some way towards addressing these problems but I
       | will decline to go into detail.
       | 
       | Ultimately I will say that I believe that it is possible for this
       | project to achieve a measure of success, although it must be
       | undertaken with caution and with measured expectations. At the
       | same time it should be emphasized it is also possible that no
       | public result will come of this project and so I caution those
       | here against reading too much into this at this time. In
       | particular I would remind everyone that the government is not a
       | singular entity and so I would not interpret this project as a
       | blanket denouncement against C or vice versa as a blanket
       | blessing of Rust. Each agency will set its own direction and
       | timelines for the adoption of memory-safe technologies. For
       | example NIST recommends Rust as well as Ada SPARK in addition to
       | various hardened dialects of C/C++.
        
         | steveklabnik wrote:
         | > As a Rust advocate in this domain I have attempted to temper
         | the expectations of those driving this proposal
         | 
         | Thank you!
        
         | pfdietz wrote:
         | How does it relate to the CRAM effort at Grammatech?
         | 
         | https://cpp-rust-assisted-migration.gitlab.io/blog/
        
       | simon_void wrote:
       | a) if every C program could be translated into an equivalent safe
       | Rust program, that would mean that each C program is as safe as
       | the safe Rust equivalent.
       | 
       | b) since there are C programs that are open to memory corruption
       | in a way safe Rust isn't, this corruptibility would need to be
       | translated into partially unsafe Rust. Congrats, you now have a
       | corruptible Rust program, what's the point again??
       | 
       | c) so DARPA must be trying to fix/change what the program is
       | doing when switching to Rust. So how to discern what behaviour is
       | intended and which is not? Doesn't this run directly into the
       | undecidability/uncomputability of the halting problem!?!
        
         | Arnavion wrote:
         | >Doesn't this run directly into the
         | undecidability/uncomputability of the halting problem!?!
         | 
         | The programmer gets to decide. DARPA does not expect the
         | translator program to autonomously output a perfect Rust
         | program. It just wants a " _high degree_ of automation towards
         | translating legacy C to Rust " (from the sam.gov link in the
         | submission, emphasis mine).
        
       | PaulHoule wrote:
       | Whatever happened to Ada?
        
         | wffurr wrote:
         | It languished in government work behind a wall of extremely
         | expensive compilers and contractors. Never heard anyone suggest
         | RiiA - Rewrite it in Ada.
        
           | nvy wrote:
           | GCC contains `gnat` which is a libre Ada compiler.
           | 
           | I think Ada has a lot of technical merit but it's just not
           | fashionable the way Rust is, for lots of uninteresting
           | reasons.
        
             | PaulHoule wrote:
             | I remember Ada getting pushed in a time when there were
             | many in the computer industry that were pushing Pascal as
             | both a systems and a teaching language. Ada was a lot like
             | Pascal which I think caused an immediate violent reaction
             | in some people. (e.g. the implementers of every other
             | programming language were pissed that BASIC was so
             | hegemonic but they never asked "Why?" or if their
             | alternatives were really any better)
             | 
             | In the early 1980s, microcomputer implementations such as
             | UCSD Pascal were absolutely horrific in terms of
             | performance plus missing the features you'd need to do
             | actual systems programming work. In the middle of the
             | decade you saw Turbo Pascal which could compile programs
             | before you aged to death and also extended Pascal
             | sufficiently to compete with C. But then you had C, and the
             | three-letter agencies were still covering up everything
             | they knew about buffer overflows.
        
       | sim7c00 wrote:
       | i like the idea but i struggle to see how one can go about doing
       | 'safe' disk reads, having 'safe' ways to manage global resources
       | in kernel land (page tables, descriptor tables etc) and a lot of
       | other stuff. perhaps if those devices also have rust in their
       | firmware they can reply safely?? genuinely curious because i went
       | back to C from rust in my OS. i could not figure it out (maybe i
       | am not a darpa level engineer but i did work at a similar place
       | doing similar things).
       | 
       | id be excited if this gets solved. rust is a lot more comfy for
       | higher level kernel stuff.
        
       | rpoisel wrote:
       | I think we have to take that literally: They only translate C
       | code to Rust. Not C++.
        
       | plasticeagle wrote:
       | If
       | 
       | 1) Rust contains no memory bugs
       | 2) C can be automatically translated to it
       | 
       | Then all memory bugs can be fixed automatically, which is almost
       | certainly untrue. This task is very likely completely impossible
       | in the general case.
        
         | warkdarrior wrote:
         | Since you did not specify that you wish to preserve all
         | behaviors of the C code, there are trivial solutions to this
         | problem. For example, one could replace all dynamic memory
         | allocations with fixed buffers (set at translation time), and
         | reject all inputs that do not fit in those buffers.
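A minimal sketch of that trivial solution, assuming a hypothetical `FixedBuf` type and an arbitrary capacity of 64 bytes. It deliberately changes behavior relative to the C original: inputs that do not fit are refused rather than stored.

```rust
// Capacity fixed at translation time; 64 is an arbitrary illustrative value.
const CAPACITY: usize = 64;

struct FixedBuf {
    data: [u8; CAPACITY],
    len: usize,
}

#[derive(Debug, PartialEq)]
struct TooLarge {
    requested: usize,
}

impl FixedBuf {
    fn new() -> Self {
        FixedBuf { data: [0; CAPACITY], len: 0 }
    }

    // Stands in for a C pattern like `buf = malloc(n); memcpy(buf, src, n);`.
    // Inputs that do not fit are rejected instead of overflowing.
    fn store(&mut self, input: &[u8]) -> Result<(), TooLarge> {
        if input.len() > CAPACITY {
            return Err(TooLarge { requested: input.len() });
        }
        self.data[..input.len()].copy_from_slice(input);
        self.len = input.len();
        Ok(())
    }

    fn contents(&self) -> &[u8] {
        &self.data[..self.len]
    }
}

fn main() {
    let mut buf = FixedBuf::new();
    println!("{:?}", buf.store(b"hello"));
    println!("{:?}", buf.store(&[0u8; 100]));
    println!("{:?}", buf.contents());
}
```

This is memory safe by construction, but it is not the same program: anything that relied on larger inputs now gets an error.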
        
       | ksp-atlas wrote:
       | Technically, Zig has this functionality built in via translate-c,
       | but it's designed for reading by the Zig compiler, not a human
        
       | kernal wrote:
       | I'm working on something similar that just wraps the C code in an
       | Unsafe block.
        
       | ristos wrote:
       | I get the idea of moving to more memory safety, but the whole
       | "rewrite everything in Rust" trend feels really misguided,
       | because if you're talking about being able to trust code and code
       | safety:
       | 
       | - Rust's compiler is 1.8 million lines of recursively compiled
       | code, how can you or anyone know that what was written is
       | actually trustworthy? Also memory safety is just a very small
       | part of being able to actually trust code.
       | 
       | - C compiles down to straightforward assembly, almost like a
       | direct translation, so you can at least verify that smaller
       | programs that you write in C actually do compile down to assembly
       | you expect, and compose those smaller programs into larger ones.
       | 
       | - C has valgrind and ASAN so it's at least possible to write safe
       | code with coding discipline, and plenty of software has been
       | able to do this for decades.
       | 
       | - A lot of (almost all) higher level programming languages are
       | written in C, which means that those languages just need to make
       | sure they get the compiler and GC right, and then those languages
       | can be used for general purpose, scripting, "low level" high
       | level code like Go or OCaml, etc.
       | 
       | - There are many C compilers and only one Rust compiler, and it's
       | unclear whether it'll really be feasible to have more than one
       | Rust compiler due to the complexity of the language. So you're
       | putting a lot of trust into a small group of people, and even if
       | they're the most amazing, most ethical people, surely if a lot of
       | critical infra is based on Rust they'll get targeted in some way.
       | 
       | - Something being open source doesn't mean it's been fully
       | audited. We've seen all sorts of security vulnerabilities cause a
       | world of hurt for a lot of people that came from all open source
       | code, and often very small libraries that could actually be much
       | easier to audit than codebases with millions of lines of code.
       | 
       | - Similarly, Rust does not translate to straightforward assembly,
       | and again would seem to be impossible to do given the complexity
       | of the language.
       | 
       | - There was an interesting project I came across called CompCert,
       | which aims to have a C compiler that's formally verified (in Coq)
       | to translate into the assembly you expect. Something like a
       | recursively compiled CompCert C -> OCaml -> Coq -> CompCert would
       | be an interesting undertaking, which would make OCaml and Coq
       | themselves built on formally verified code, but I'm not sure if
       | that'll really work and I suspect it's too complicated.
       | 
       | - I think Rust might be able to solve some of these problems if
       | they have a fully formally verified thing, and the formally
       | verified thing is itself formally verified, and the compiler was
       | verified by that thing, and then you know that you can trust the
       | whole thing. Still, the level of complexity and the inability to
       | at least manually audit the core of it makes me suspect it's too
       | complicated and would still be based on trust of some sort.
       | 
       | - I still think that static analysis and building higher level
       | languages on top of C is a better approach, and working on formal
       | verification from there, because there are really small C
       | compilers like tinycc that are ~50k LOCs, which can be hand
       | verified. You can compile chibi-scheme with tinycc, for example,
       | which is also about ~50k LOCs of C, and so you get a higher level
       | language from about 100k LOCs (tcc and chibi), which is feasible
       | for an ordinary but motivated dev to manually audit to know that
       | it's producing sound assembly and not something wonky or sketchy.
       | Ideally we should be building compilers and larger systems that
       | are formally verified, but I think the core of whatever the
       | formally verified system is has to be hand verifiable in some way
       | in order to be trustworthy, so that you can by induction trust
       | whatever gets built up from that, and I think that would need to
       | require a straightforward translation into assembly, with ideally
       | open source ISA and hardware, and a small enough codebase to be
       | manually audited like the tinycc and chibi-scheme example I gave.
       | 
       | - Worst case everyone kind of shrugs it all off and just trusts
       | all of these layers of complexity, which can be like C ->
       | recursively compiled higher level lang -> coffeescript-like layer
       | on top -> framework, which is apparently a thing now, and just
       | hope that all of these layers of millions of lines of code of
       | complexity don't explode in some weird way, intentionally or
       | unintentionally.
       | 
       | - Best case of the worst case is that all of our appliances are
       | now "smart" appliances, and then one day they just transform into
       | robots that start chasing you around the house, all the while the
       | Transformers cartoon theme is playing in the background,
       | which would match up nicely with the current trend of everything
       | being both terrifying and hilarious in a really bizarre way.
        
       | nickpsecurity wrote:
       | I think this is indirectly a great argument for automated test
       | generation or equivalence checking. The reason is that these
       | translations might change the function of the code. Automated
       | testing would show whether or not that happened. It also reveals
       | many bugs.
       | 
       | So, they should solve total automated testing first. Maybe in
       | parallel. Then, use it for equivalence checks.
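One cheap form of equivalence checking is differential testing: run the original and the translation on the same generated inputs and compare results. The sketch below assumes two hypothetical checksum functions standing in for the C routine and its Rust translation, and uses a small hand-written xorshift generator so that no external crates are needed.

```rust
// Stand-in for the behavior of the original C routine (unsigned, so it wraps).
fn reference_checksum(data: &[u8]) -> u32 {
    let mut acc: u32 = 0;
    let mut i = 0;
    while i < data.len() {
        acc = acc.wrapping_mul(33).wrapping_add(data[i] as u32);
        i += 1;
    }
    acc
}

// Stand-in for the idiomatic translation under test.
fn translated_checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(33).wrapping_add(u32::from(b)))
}

// Tiny xorshift generator; the seed must be nonzero.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

// Returns the first generated input on which the two versions disagree.
fn first_mismatch(rounds: usize, seed: u64) -> Option<Vec<u8>> {
    let mut rng = XorShift(seed);
    for _ in 0..rounds {
        let len = (rng.next() % 64) as usize;
        let input: Vec<u8> = (0..len).map(|_| (rng.next() & 0xff) as u8).collect();
        if reference_checksum(&input) != translated_checksum(&input) {
            return Some(input);
        }
    }
    None
}

fn main() {
    match first_mismatch(10_000, 0x2545_F491_4F6C_DD1D) {
        None => println!("no mismatch found"),
        Some(input) => println!("mismatch on {:?}", input),
    }
}
```

Finding no mismatch is evidence rather than proof of equivalence; in a real setup the reference side would call the original C code through FFI instead of a Rust stand-in.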
        
       | pizlonator wrote:
       | Or you could just use Fil-C.
        
       | luke-stanley wrote:
       | Surely this could be better pitched to researchers as just
       | another AI benchmark, a bit like ARC Prize? ;) There could be
       | some existing C projects that are already public, with tests for
       | feedback during development iteration and some holdout tests, and
       | some holdout projects too with a leaderboard and prizes. For
       | preferences about converted code quality, both automated
       | assessment and human preferences could be ranked with Elo? Kaggle
       | is made for this sort of thing I think? I'm sure Google Deepmind
       | and others have some MCTS agents that could do a great job with a
       | bit of effort.
        
       ___________________________________________________________________
       (page generated 2024-07-30 23:00 UTC)