[HN Gopher] Retrofitting spatial safety to lines of C++
       ___________________________________________________________________
        
       Retrofitting spatial safety to lines of C++
        
       Author : jandeboevrie
       Score  : 76 points
       Date   : 2024-11-15 20:25 UTC (1 day ago)
        
 (HTM) web link (security.googleblog.com)
 (TXT) w3m dump (security.googleblog.com)
        
       | Animats wrote:
       | New buzzword for old thing alert.
        
         | vintagedave wrote:
         | I'll say.
         | 
         | > Attackers regularly exploit spatial memory safety
         | vulnerabilities, which occur when code accesses a memory
         | allocation outside of its intended bounds
         | 
         | Isn't that... 'out of bounds memory access'?
        
           | SAI_Peregrinus wrote:
           | Yes. It's as opposed to temporal memory safety
           | vulnerabilities, like use-after-free or data races.
        
           | moyix wrote:
           | [This is more of a reply to a deleted reply to you, but I
           | don't want my efforts to go to waste]
           | 
           | Spatial memory safety is a reasonably common term in the
           | security / PL field. You can see examples of it being used at
           | least as far back as 2009: https://scholar.google.com/scholar
           | ?hl=en&as_sdt=0%2C33&q=spa...
           | 
           | It's in contrast to temporal memory safety, which deals with
           | object lifetimes (use after free, for example).
           | 
           | Here Google is probably also referencing a 2022 post of
           | theirs with a very similar title, dealing with temporal
           | safety: https://security.googleblog.com/2022/05/retrofitting-
           | tempora...
           | 
           | The terms are also in Wikipedia: https://en.wikipedia.org/wik
           | i/Memory_safety#Classification_o...
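
To make the spatial/temporal distinction above concrete, here is a minimal sketch, not taken from the linked post. The function names are hypothetical, and `std::weak_ptr` is only one illustrative way to model "is the object still alive"; it is not how hardened allocators or the blog post's techniques work.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <vector>

// Spatial check: is index i inside the bounds of the allocation?
// (Hypothetical helper name, for illustration only.)
std::optional<int> read_spatially_checked(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size()) return std::nullopt;  // would have been an out-of-bounds read
    return v[i];
}

// Temporal check: is the object still alive at the time of the access?
// (Hypothetical helper name, for illustration only.)
std::optional<int> read_temporally_checked(const std::weak_ptr<int>& w) {
    if (auto p = w.lock()) return *p;  // object is still alive
    return std::nullopt;               // would have been a use-after-free
}
```

The first function guards *where* the access lands, the second guards *when* it happens; a bounds check alone does nothing for the second case.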
        
         | aseipp wrote:
         | People (both practitioners & researchers) have been using the
         | terms "temporal" and "spatial" to refer to different classes of
         | C++ vulnerabilities for at least 12 years, back when I was
         | actually writing exploits for a job. It is not new at all, and
         | anyone in the field within the past 6-7 years and worth their
         | salt will instantly recognize them.
        
           | tom_ wrote:
           | For whatever it's worth, I've been doing this stupid shit -
           | writing C++, that is - for 25 years, and this is the first
           | time I've heard this term. (This is a data point rather than
           | a complaint. But for a fee, it can become a complaint if you
           | would like.)
        
             | aseipp wrote:
             | I meant security engineers/exploiters actually, but yeah, I
             | can see how most working C++ programmers who aren't
             | security specialists might not be as familiar with it.
        
         | epage wrote:
         | This term is coming up more frequently in the C++ community as
         | they discuss Rust's safety features, so as to add more nuance to
         | the discussion and focus on subsets of the problem to solve.
         | 
         | Note that there are some more heated takes on where these terms
         | are being used. I tried to be as generous as possible in my
         | description.
        
         | pizlonator wrote:
         | Nah, "spatial safety" is a term of art among security folks and
         | among PL folks who work on security.
         | 
         | It's the part of memory safety that's just about bounds. You
         | can also call it "bounds safety" and folks will understand what
         | you mean, but "spatial safety" is the more commonly used
         | jargon.
        
       | andrewstuart wrote:
       | >> spatial safety vulnerabilities represent 40% of in-the-wild
       | memory safety exploits
       | 
       | Rust advocates tend to turn stats like this into "40% of all
       | security issues are memory safety", which sounds very similar but
       | is false.
        
         | kibwen wrote:
         | _> Rust advocates tend to turn stats like this into "40% of all
         | security issues are memory safety", which sounds very similar
         | but is false._
         | 
         | You're right that it's false. Historically it's been a much
         | more damning 70% of vulnerabilities that were rooted in memory-
         | unsafety.
         | 
         | According to the Google Security Blog, in a post linked to from
         | the OP:
         | 
         |  _We'll also share updated data on how the percentage of memory
         | safety vulnerabilities in Android dropped from 76% to 24% over
         | 6 years as development shifted to memory safe languages. [...]
         | The percent of vulnerabilities caused by memory safety issues
         | continues to correlate closely with the development language
         | that's used for new code. Memory safety issues, which accounted
         | for 76% of Android vulnerabilities in 2019, and are currently
         | 24% in 2024, well below the 70% industry norm, and continuing
         | to drop._
         | 
         | https://security.googleblog.com/2024/09/eliminating-memory-s...
        
           | andrewstuart wrote:
           | You're still not getting the point.
           | 
           | OWASP's top ten security vulnerabilities are not memory
           | safety.
        
             | pornel wrote:
             | Because most applications aren't written in C++.
             | 
             | People don't write web apps in C++, because they would have
             | to deal with memory safety issues _in addition to_ all the
             | other issues related to auth, injections, etc.
        
             | jerf wrote:
             | So, maybe you can answer a question I've really had a hard
             | time understanding, that I've posted about before:
             | https://news.ycombinator.com/item?id=39542875
             | 
             | Why are you offended at the idea that languages should be
             | memory safe by default? What code are you writing that you
             | _constantly_ need memory unsafety, constantly available,
             | without being able to write any sort of "unsafe" keyword?
             | Who cares about whether or not it's the #1 problem in OWASP
             | when it's clearly and undeniably been a massive problem for
             | decades? It is sufficient, after all, that it crashes a
             | program or produces incorrect results for it to be a
             | problem worth pursuing, but it is _also_ extremely well
             | known to produce massive security vulnerabilities
             | regardless of what some list says.
             | 
             | Why is this a hill you are willing to die on? What are you
             | getting out of it? Is your programming life going to be
             | easier? Are you better off when debugging something to
             | _not_ be able to just know that it's not a memory safety
             | problem, and thus to still have to consider it?
             | 
             | What actual engineering benefit do those rare few of you
             | who seem to be _crusading_ against memory safety fear
             | disappearing?
             | 
             | When I got into programming in the late 1990s, I was there
             | to catch the last few holdouts of the "everyone should just
             | write in assembler" opinion. I at least understood their
             | arguments around performance and efficiency, and I
             | understood their arguments around "not needing high level
             | languages" even though I disagree with them both then and
             | now. I think on the net they were wrong, but they did have
             | some legitimate benefits to argue on their side, even if
             | they were already outweighed by the costs then and even
             | more so outweighed today.
             | 
             | But I don't get what you folk furious about memory safety
             | are looking for. "Using" memory unsafety is already an
             | invalid program. It's already pretty much automatically a
             | bug, if not worse. You're not losing anything to simply
             | have it, you're not gaining anything except bugs and sharp
             | corners insisting on it. And when you absolutely,
             | positively _need_ it, which I'd call "exceptionally rare
             | but definitely non-zero", it's still there in one form or
             | another of "unsafe". I don't see any benefits at all.
             | 
             | (And let me reiterate and forestall the usual, _memory
             | safety does not mean "Rust"_. Memory safety is _every major
             | language on the market today except C and C++_.)
        
               | addaon wrote:
               | > Why are you offended at the idea that languages should
               | be memory safe by default?
               | 
               | Why are you okay with languages that are not overflow-
               | safe, or unit-safe, or infinite-loop-safe, or safe
               | against bit flips? Memory safety violations are a major
               | chunk of bugs. Writing code to avoid them is about as
               | hard as writing code to avoid other major classes of
               | bugs. In either case, it's fallible. Static analysis and
               | testing then gives confidence that the system is safe, by
               | multiple metrics. Memory safety isn't special enough to
               | demand a different approach here -- quality code requires
               | a coherent approach to quality across multiple bug
               | classes.
        
         | IshKebab wrote:
         | I think you're forgetting about temporal safety (use after
         | free). Presumably that brings it up to the 70% of security
         | issues being related to memory safety, which many studies have
         | shown - remarkably consistently.
        
         | pjmlp wrote:
         | First of all it is 70%, and secondly even if people like to FUD
         | Rust, it is all security advocates that state this, including
         | those of us that would like a better attitude towards safety in
         | C++ world.
         | 
         | We got too many C refugees that spoiled the soup.
        
           | andrewstuart wrote:
           | The top security issues do not relate to memory safety.
           | 
           | Rust advocates like to muddy the water and make it sound like
           | memory safety is the biggest issue in security. It isn't.
        
             | pjmlp wrote:
             | You mean advocates like Microsoft Security Response Center
             | and Google Project Zero?
             | 
             | Or advocates like NSA and FBI?
             | 
             | Security FUD, name calling Rust any time someone raises
             | security issues, is quite impressive.
        
       | dzogchen wrote:
       | "To lines of C++" and "to hundreds of millions of lines of C++"
       | are quite different titles.
        
       | alserio wrote:
       | > We first enabled hardened libc++ in our tests over a year ago.
       | This allowed us to identify and fix hundreds of previously
       | undetected bugs in our code and tests.
       | 
       | That's something
        
       | WalterBright wrote:
       | Dlang added array bounds checking 20 years ago. It's a huge win.
       | As evidenced by the article noting that 40% of the memory safety
       | bugs were spatial.
       | 
       | I used to have all kinds of problems with array overflows. I
       | didn't make them very often, but when I did, they took a long
       | time to track down. They've been gone for 20 years now.
       | 
       | Note that it would be easy to add it to C/C++:
       | 
       | https://www.digitalmars.com/articles/C-biggest-mistake.html
       | 
       | It would be the most _useful_ and cost-effective enhancement
       | ever.
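
A rough C++ sketch of the idea in the comment above: an array reference that carries its length with it, so every index can be checked. This is only an approximation written for this thread; `Slice`, `make_slice` and `sum` are hypothetical names, not the syntax proposed in the linked article and not a standard type.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical "fat pointer": a pointer that travels together with its length.
template <typename T>
struct Slice {
    T* ptr;
    std::size_t len;

    // Every index is checked against the length carried by the slice.
    T& operator[](std::size_t i) const {
        if (i >= len) throw std::out_of_range("Slice index out of bounds");
        return ptr[i];
    }
};

// Build a Slice from a real array before it decays to a bare pointer,
// so the length N is captured automatically.
template <typename T, std::size_t N>
Slice<T> make_slice(T (&arr)[N]) {
    return Slice<T>{arr, N};
}

// The callee needs no separate length parameter that could get out of sync.
int sum(Slice<const int> s) {
    int total = 0;
    for (std::size_t i = 0; i < s.len; ++i) total += s[i];
    return total;
}
```

Given `const int a[] = {1, 2, 3};`, calling `sum(make_slice(a))` walks exactly three elements, and `make_slice(a)[3]` throws instead of reading past the end.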
        
         | dataflow wrote:
         | They have it already, it's called std::span.
        
           | pjmlp wrote:
           | No they didn't, if you care about security, gsl::span is the
           | answer.
        
             | dataflow wrote:
             | > No they didn't, if you care about security, gsl::span is
             | the answer.
             | 
             | https://godbolt.org/z/Pda9Me45P ?
        
               | pjmlp wrote:
               | Unless you use .at() it isn't portable to assume code
               | safety.
        
               | dataflow wrote:
               | So what? Just pass the command-line flag to enable the
               | code safety in your toolchain. The same way you pass it
               | to enable optimizations in your toolchain.
        
               | coffeeaddict1 wrote:
               | > The same way you pass it to enable optimizations in
               | your toolchain.
               | 
               | No, it's not the same. I never enable optimisations by
               | manually passing in flags to the compiler. It's always a
               | `cmake -DCMAKE_BUILD_TYPE=...`. There is no such easily
               | accessible equivalent for bounds checking.
        
               | almostgotcaught wrote:
               | > I never enable optimisations by manually passing in
               | flags to the compiler
               | 
               | Lol then you don't use your compiler/toolchain correctly.
               | How is that anyone's problem but yours?
        
               | coffeeaddict1 wrote:
               | How exactly am I not using my toolchain correctly? What's
               | the "correct" way?
        
               | dataflow wrote:
               | Have you tried https://discourse.cmake.org/t/strictly-
               | appending-to-cmake-la...
        
               | coffeeaddict1 wrote:
               | What flag can I pass to CMAKE_CXX_FLAGS to enable bounds
               | checking on _all_ platforms regardless of the compiler
               | used? I can do that for optimisations with
               | `CMAKE_BUILD_TYPE`.
        
               | dataflow wrote:
               | (How) do you have no control over the environment
               | variables you call CMake with?
        
               | coffeeaddict1 wrote:
               | I don't quite get what you mean. Of course, I could get
               | CMake to pass a specific compiler flag at the
               | configuration stage, but that misses the point. What I'm
               | saying is that there is a super-easy way to configure
               | CMake to enable optimisations (CMAKE_BUILD_TYPE=Release),
               | but this cannot be said for bounds checking. Note that my
               | own code _does_ have bounds checking enabled for Clang,
               | GCC and MSVC. What I'm arguing is that setting up the
               | latter is significantly more effort than enabling
               | optimisations. I'm _not_ arguing that it isn't possible
               | or that one shouldn't do that.
        
               | dataflow wrote:
               | > I don't quite get what you mean. [...] What I'm saying
               | is that there is a super-easy way to configure CMake to
               | enable optimisations (CMAKE_BUILD_TYPE=Release), but this
               | cannot be said for bounds checking
               | 
               | Maybe _I'm_ not getting what you mean.
               | 
               | You are saying you already run
               | 
               |     cmake ...
               | 
               | So I am saying you can just change that to
               | 
               |     CXXFLAGS="-Dblah" cmake -U CMAKE_CXX_FLAGS ...
               | 
               | That _genuinely_ seems pretty darn easy to me.
               | 
               | In any case, any beef you have is clearly with CMake
               | here. You'd have the same issue(s) with any other flag,
               | for any language, if you use CMake.
        
               | pjmlp wrote:
               | Which command line option from ISO C++23?
               | 
               | It is not in the standard; it is neither portable nor
               | guaranteed to exist.
        
               | WalterBright wrote:
               | We had this debate early on with D. The resolution was
               | checking was on by default. In order to get array bounds
               | turned off, you had to throw a switch _and_ it only
               | happened for code marked @system.
               | 
               | This turned out to be the right move.
        
               | dataflow wrote:
               | That wasn't something I was even debating here. People
               | derailed this whole discussion.
               | 
               | All I was saying here was that the fix for your
               | "C's biggest mistake" (your T arr[..] proposal) is
               | already in C++ and you can get it _today_: it's called
               | std::span, and it was explicitly designed to let you get
               | bounds-checking, with just a different syntax. It needs a
               | compiler flag, and so do optimizations. You already pass
               | one, so pass the other too, and get what you wanted.
               | 
               | That was all I was saying. But this being HN, everyone
               | insisted on derailing this into an argument about whether
               | safe-by-default is better than fast-by-default, when that
               | had nothing to do with my point, and when I was certainly
               | not trying to argue one is better than the other.
        
             | coffeeaddict1 wrote:
             | It's quite obvious to me that the C++ folks running the
             | committee didn't care about safety much. How can they
             | standardise `std::span` knowing it's unsafe?
             | 
             | They care now (well they pretend at least) because Rust is
             | going to take significant market share in domains where C++
             | is still king.
        
               | pjmlp wrote:
               | Rust isn't the reason, rather governments are now serious
               | about security, just like in any other industry.
        
               | coffeeaddict1 wrote:
               | Well yes and no. Rust is the reason because it's a real
               | memory-safe alternative for system programming. If it
               | didn't exist, governments would give C and C++ a "pass"
               | for being memory unsafe.
        
               | 3836293648 wrote:
               | They didn't just standardise it when it was unsafe. They
               | got a proposal for a safe span and demanded that safety
               | be removed before they'd accept it
        
           | coffeeaddict1 wrote:
           | std::span is not bounds checked by default.
        
             | dataflow wrote:
             | Optimizations aren't enabled by default either, and yet
             | everyone passes a flag to optimize, and nobody argues C++
             | sucks just because you need to pass a flag to enable
             | optimizations. Is it so hard to pass another flag to enable
             | bounds checking?
        
               | saagarjha wrote:
               | Yes.
        
               | dataflow wrote:
               | Eh? How/why?
        
               | saagarjha wrote:
               | Because turning on optimizations makes your code faster
               | and turning on bounds checks makes your code slower.
               | Hence, one gets used far more than the other.
        
               | dataflow wrote:
               | > Because turning on optimizations makes your code faster
               | and turning on bounds checks makes your code slower.
               | Hence, one gets used far more than the other.
               | 
               | The question was "is it so hard to pass a command line
               | flag". You said "yes" when you clearly don't see any
               | difficulty with actually passing the flag. Instead you're
               | apparently answering a totally different question: "why
               | do people lack the motivation to do this." Which had
               | nothing to do with the point you replied to.
               | 
               | It's not like opt-out vs. opt-in somehow changes the
               | performance characteristics. People who want maximum
               | performance will turn it off. People who want safety will
               | turn it on.
        
               | saagarjha wrote:
               | Is it so hard to shoot someone? It's just pressing the
               | trigger. When you say it's hard to kill people you're
               | really just answering a different question, one that is
               | about the psychological or legal or moral cost of doing
               | so. Maybe your overly literal interpretation is not the
               | one people actually want.
        
               | dataflow wrote:
               | > Is it so hard to shoot someone? It's just pressing the
               | trigger. When you say it's hard to kill people you're
               | really just answering a different question, one that is
               | about the psychological or legal or moral cost of doing
               | so. Maybe your overly literal interpretation is not the
               | one people actually want.
               | 
               | You don't feel you're missing the point of the
               | discussion?
               | 
               | The whole discussion started with: "_if_ you want bounds
               | checking in your own code". Notice the "if". That's the
               | premise.... it _by definition_ assumes you've already
               | accepted the performance impact of getting the safety you
               | want, and thus it's not a problem for you.
               | 
               | The only remaining question at this point is, how hard is
               | it to get you that safety. Asking you "is it so much
               | harder to pass -foo like the -bar you already pass" and
               | expecting you to address the physical difficulty of
               | adding a flag isn't taking an "overly literal" reading of
               | the question, it's literally asking _the most obvious and
               | only remaining question_.
               | 
               | If you want to go back to the premise and argue about the
               | psychological hurdle of taking a performance loss, that's
               | fine and all, but then you're completely changing the
               | topic of the thread you replied to.
               | 
               | P.S. comparing passing an extra command-line flag to
               | shooting someone is a rather insane comparison. Honestly,
               | all this is really making me regret trying to share a tip
               | to help people make their code safer.
        
               | saagarjha wrote:
               | You're regretting it because you keep telling people to
               | use an interface that explicitly was designed to not
               | provide bounds checking and claiming that this is the
               | solution to make their code safer, while in reality you
               | have to look up some nonportable flag to enable it for
               | your STL if it even offers the functionality at all. Maybe
               | people would be a lot more reasonable if you didn't post
               | intentional bait in the first place.
        
               | dataflow wrote:
               | > You're regretting it because
               | 
               | No, I'm regretting it because having to spend hours
               | replying to comments that ignore the premise is a
               | complete waste of my time.
               | 
               | > you keep telling people to use an interface that
               | explicitly was designed to not provide bounds checking
               | 
               | As a matter of fact it was _very intentionally and
               | specifically designed_ to allow bounds-checking to be
               | configured at build time: _"As an example, in the
               | current reference implementation, violating a range-check
               | results by default in a call to terminate() but can also
               | be configured via build-time mechanisms to continue
               | execution (albeit with undefined behavior from that point
               | on)."_ [1]
               | 
               | Calling that "explicitly designed not to provide bounds
               | checking" is quite a deceptively misleading way to paint
               | it. It's not an accident that you can enable bounds-
               | checking, it's very much _by design_ and _intended_ that
               | you do so. They just didn't happen to standardize the
               | flag name, just like they never standardized the
               | optimization flag names.
               | 
               | > and claiming that this is the solution to make their
               | code safer, while in reality you have to look up some
               | nonportable flag to enable it for your STL if it even offers
               | the functionality at all.
               | 
               | Like I said, this is _literally the same as optimization
               | flags_. Everybody passes them and nobody bashes C++ for
               | it. You're making a big deal out of something incredibly
               | tiny just to win an internet argument on the wrong
               | thread.
               | 
               | [1] https://www.open-
               | std.org/jtc1/sc22/wg21/docs/papers/2018/p01...
        
               | saagarjha wrote:
               | You're the one misunderstanding here. The reference
               | implementation that they provided (which I actually
               | believe is gsl::span) allows configuration. The design
               | for the standard, as you have mentioned elsewhere in this
               | discussion, does not provide bounds checking. I am making
               | a big deal out of this because it is a problem that
               | affects real codebases, not something hypothetical that
               | you can wave away with your idea of how things work. The
               | fact is that people who care about security ship non-
               | bounds-checked spans because this is not the default
               | option.
        
               | orf wrote:
               | Safety by default, opt-in to unsafety.
               | 
               | It's not hard to grok.
        
               | dataflow wrote:
               | > Safety by default, opt-in to unsafety. It's not hard to
               | grok.
               | 
               | Nobody was ever saying that unsafe-by-default is somehow
               | better. That just wasn't the question being asked.
        
               | orf wrote:
               | > The question was "is it so hard to pass a command line
               | flag"
               | 
               | Can your position not be summed up as "unsafe by default
               | doesn't matter, because changing the default is easy"?
               | 
               | If so, there's an obvious flaw in that thinking.
        
               | dataflow wrote:
               | > Can your position not be summed up as "unsafe by
               | default doesn't matter, because changing the default is
               | easy"?
               | 
               | No.
               | 
               | >> Nobody was ever saying that unsafe-by-default is
               | somehow better.
        
         | lpapez wrote:
         | Thanks for sharing, I enjoy reading your posts in regards to
         | how ahead of its time Dlang was in adopting these improvements.
         | 
         | I wanted to ask: did you ever consider what was missing from
         | Dlang to achieve widespread adoption? Clearly it was not
         | features, so I'm wondering what that would be from your
         | perspective.
        
           | WalterBright wrote:
           | The marketing department was what was missing. I've always
           | had that problem. Borland was brilliant at marketing an
           | inferior compiler. Philippe Kahn is an amazing businessman.
           | (He's also a very fun person to talk to.)
           | 
           | For example, Borland at one point decided to include the
           | source code to some of its runtime library for free. At a
           | compiler roundup in the magazine, this was hailed as a great
           | advance forward by the reviewer. Meanwhile, Datalight C was
           | also in the roundup, and had always included 100% of the
           | runtime library source code. No mention was made of this.
        
           | dataflow wrote:
           | > what was missing from Dlang to achieve widespread adoption
           | 
           | This: https://godbolt.org/z/s49qzPn81
        
       | omoikane wrote:
       | > Hardening libc++ resulted in an average 0.30% performance
       | impact
       | 
       | Maybe what really happened is that compiler technology has
       | improved such that they are able to remove most redundant checks,
       | such that it only costs 0.30% today. I can imagine things going
       | the opposite direction 20 years ago, as in "we removed some
       | bounds checks and gained X% of performance".
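
One way to picture "removing redundant checks" is sketched below, with the hoisting written out by hand. Both functions are hypothetical examples, and whether a given compiler actually turns the first form into the second is an assumption that depends on the optimizer, not something the 0.30% figure in the post guarantees.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Naive form: one bounds check per element access.
long sum_prefix_checked_each(const std::vector<int>& v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v.at(i);  // at() throws std::out_of_range when i >= v.size()
    }
    return total;
}

// Hoisted form: a single check covers every index the loop will touch.
long sum_prefix_checked_once(const std::vector<int>& v, std::size_t n) {
    if (n > v.size()) throw std::out_of_range("prefix longer than vector");
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v[i];  // in bounds because i < n <= v.size()
    }
    return total;
}
```

Both versions reject the same inputs and return the same sums; the second simply pays for one comparison instead of `n`.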
        
         | panstromek wrote:
         | Probably yes, and branch prediction improved a lot since then,
         | too. Bounds checks are easily predictable.
        
           | Gibbon1 wrote:
           | Bounds checking feels to me like low hanging fruit for a
           | processor designer. A low cost operation that can run in
           | parallel or tossed away as the stream from the instruction
           | decoder gets optimized and scheduled.
           | 
           | Meanwhile the guys on the standards committee think of fixed
           | width RISC instructions being executed by jungle logic and
           | the ALU.
        
           | adgjlsfhk1 wrote:
           | the hard part about bounds checks is you need very specific
           | semantics for bounds errors to prevent them from preventing
           | vectorization. specifically, you don't want to promise that
           | they are thrown when they are executed
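
A hedged sketch of the semantics point above, using a gather loop. The precise version must stop at the first bad index, which puts an early exit inside the loop; the relaxed version only promises that an error is reported eventually. Function names are hypothetical, and whether a compiler vectorizes the second loop is an assumption, not a guarantee.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Precise semantics: throws at the first bad index, so every iteration
// carries a possible early exit.
std::vector<int> gather_precise(const std::vector<int>& data,
                                const std::vector<std::size_t>& idx) {
    std::vector<int> out;
    out.reserve(idx.size());
    for (std::size_t i : idx) out.push_back(data.at(i));
    return out;
}

// Relaxed semantics: the loop body has no exit. A bad index is redirected
// to slot 0, remembered in a flag, and reported once after the loop.
std::vector<int> gather_deferred(const std::vector<int>& data,
                                 const std::vector<std::size_t>& idx) {
    if (data.empty()) {
        // No valid slot to redirect to, so handle this case up front.
        if (!idx.empty()) throw std::out_of_range("gather: index out of bounds");
        return {};
    }
    std::vector<int> out(idx.size());
    bool bad = false;
    for (std::size_t k = 0; k < idx.size(); ++k) {
        const bool in_range = idx[k] < data.size();
        bad |= !in_range;
        out[k] = data[in_range ? idx[k] : 0];  // the read itself is always in bounds
    }
    if (bad) throw std::out_of_range("gather: index out of bounds");
    return out;
}
```

Both functions reject the same inputs and never read out of bounds; they differ only in when the error surfaces, which is the freedom the comment says an optimizer needs.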
        
             | almostgotcaught wrote:
             | > preventing vectorization
             | 
             | No one that wants to emit vectorized code is relying on
             | auto-vectorization to emit that code.
        
         | masklinn wrote:
         | Bounds checks are trivially predictable though, I would hope
         | code density was the issue rather than branch prediction.
         | 
         | And as others note, bounds checking was the norm before the
         | STL.
        
         | cma wrote:
         | Unfortunately for many use cases like gamedev, debug builds
         | need to be fast too. So hopefully more of the improvement is
         | from branch prediction.
        
       | titzer wrote:
       | > We've begun by enabling hardened libc++, which adds bounds
       | checking to standard C++ data structures, eliminating a
       | significant class of spatial safety bugs.
       | 
       | Well, it's 2024 and I remember arguing this 20+ years ago. Programs
       | have bugs that bounds checking catches. And making it a language
       | built-in exposes it to compiler optimizations specifically
       | targeting bounds checks, eliminating many and bringing the
       | dynamic cost down immensely. Just turning them on in libraries
       | doesn't necessarily expose all the compiler optimizations, but
       | it's a start. Safety checks should really be built into the
       | language.
        
         | pjmlp wrote:
         | Before C++98, this used to be pretty much table stakes in C++
         | compiler frameworks, e.g. Turbo Vision, AppToolbox, OWL,
         | MFC,....
         | 
         | I still don't get why the standard library went the other way,
         | other than starting the tradition of standardised wrong
         | defaults.
        
           | IshKebab wrote:
           | The C++ standards committee is still under the illusion
           | people can read, understand and remember the entire spec, and
           | write code without making mistakes. All these bugs are the
           | fault of the people making mistakes, not C++.
        
             | pjmlp wrote:
             | Spot on.
        
             | tialaramex wrote:
             | I don't think there's any such illusion. You do not see,
             | for example, WG21 members who are confident that they
             | understand the entire C++ language (on the contrary they'll
             | often accept corrections about the language from other
             | committee members) and it's not infrequent that a committee
             | member will agree with the statement that C++ is too large
             | and sprawling for any individual to attain such
             | comprehensive understanding. [Today I would guess maybe
             | Sean Baxter, who wrote his own compiler, has the best
             | individual understanding and I believe Sean is not a member
             | of the committee]
             | 
             | Instead WG21 has very clearly (but without ever admitting
             | it and that's important) taken the path of maintaining a
             | legacy language. Even as debate carried on about whether in
             | future C++ could end up like COBOL, the committee has acted
             | exactly as though it is for some years now. Compatibility
             | is King, no price is too high for compatibility, everything
             | must be sacrificed to make that happen and that's how you
             | end up like COBOL.
             | 
             | Three important opportunities to divert and pick other ways
             | forward should be highlighted here. P1863 "ABI: Now or
             | Never" by Titus Winters in 2020; P2137 "Goals and
             | priorities for C++" also in 2020 but with a long list of
             | authors and P1818 "Epochs" from 2019 by Vittorio Romeo.
             | 
             | In all these cases WG21 chose the "hope the problem goes
             | away" path, preferring not only not to address the critical
             | problem highlighted and take a new route forward, but to
             | specifically ignore the problem and press on anyway.
             | 
             | "Hope the problem goes away" is also, quietly, the
             | preferred strategy by WG21 for the safety problem.
             | 
             | There's a reason (albeit a terrible one) to prefer the C++
             | ISO document's language over the approach of Rust. These
             | are both general purpose languages (I might also write
             | separately in this thread about a non-general purpose
             | language which Google should use more, if I have time) and
             | so must wrestle with Rice's Theorem. Rust's solution is to
             | require the compiler to be conservative. This is _very
             | difficult_ and indeed there are known bugs in the code
             | doing this conservative check in the Rust official
             | compiler. But C++ has a much easier (but IMO fatal) path,
             | it says that's the job of the _programmer_ and when the
             | _programmer_ writes C++ software which is nonsense as a
             | result that's their fault, not the compiler's fault for
             | failing to reject the program.
             | 
             | It would be _extremely difficult_ to explain how a
             | "standards conforming" Rust compiler can correctly accept
             | all the programs Rust's actual compiler accepts and reject
             | all those it rejects without essentially having a black box
             | where the compiler implementation sits. We can explain the
             | _purpose_ of such rules without, but their detailed
             | behaviour not so much.
             | 
             | Take borrow checking. All the easy scoped borrows (which is
             | all that worked in Rust say eight years ago) can be
             | explained without too much trouble, but today a lot fancier
             | (but to a human obviously correct) borrowing will compile,
             | because the checker is smarter - now, how do you express,
             | not in Rust source code but in the English language, all
             | the checks to be performed, and neither miss things out nor
             | unknowingly accept programs a real Rust compiler will
             | reject ?
             | 
             | C++ just needn't do that. In effect the ISO document says:
             | "Don't do borrows that last longer than the thing borrowed;
             | if you do, that's not C++ but your compiler won't notice, so
             | the result is arbitrary nonsense."
        
               | kibwen wrote:
               | _> how do you express, not in Rust source code but in the
               | English language, all the checks to be performed, and
               | neither miss things out nor unknowingly accept programs a
               | real Rust compiler will reject ?_
               | 
               | I think this is unintentionally stuck in the mindset of
               | "the purpose of a language specification document is to
               | enable armchair language lawyers to flame each other on
               | Usenet about whether or not such-and-such degenerate edge
               | case is technically valid". But a specification doesn't
               | need to be written in English, it can be written as a
               | formal proof, and indeed I would expect a theoretical
               | Rust spec to specify the behavior of the borrow checker
               | as just such a proof. Rust's borrow checking may no
               | longer be as simple as the lexically-scoped model that
               | existed as of Rust 1.0, but it's not like the extensions
               | that have been added since then are ad-hoc; they're all
               | still designed to result in a model that is provably
               | sound.
        
               | almostgotcaught wrote:
               | > But a specification doesn't need to be written in
               | English, it can be written as a formal proof, and indeed
               | I would expect a theoretical Rust spec to specify the
               | behavior of the borrow checker as just such a proof.
               | 
               | Did you miss the part where the person you're responding
               | to mentioned Rice's theorem? Do you know Rice's theorem
               | and hence understand what they're implying?
        
               | Ygg2 wrote:
               | Rice's theorem doesn't say anything about humans.
        
               | almostgotcaught wrote:
               | No clue what this means
        
               | tialaramex wrote:
               | Some people believe that the Church-Turing intuition
               | doesn't tell us anything about humans, that what humans
               | are doing isn't computation but something more powerful.
               | In my experience their lack of evidence for this belief
               | just makes them believe it even harder, and they often
               | write whole books which are in effect the argument from
               | incredulity but expanded to book form.
        
               | carbotaniuman wrote:
               | There is no proof that humans are just glorified Turing
               | machines and even as a nonreligious person, I find such a
               | statement to be as lacking in evidence as those that
               | claim humanity has some soul or similar that cannot be
               | replicated.
               | 
               | The actual logic of gggp's statement also doesn't make
               | any sense. We as humans also under and overestimate the
               | soundness of programs.
               | 
               | Sometimes, a perfectly fine solution is massaged to
               | better adhere to best practices because we can't convince
               | ourselves that it's correct. Rust requires that we
               | convince the compiler, and then we know it's correct via
               | the compiler's proofs, instead of requiring us to do the
               | proof all the time.
        
               | Ygg2 wrote:
               | Humans are not Turing machines. I'm not talking about how
               | we work on a fundamental level.
               | 
               | I'm saying we don't obey the axioms of the Turing machine
               | model. So neither Rice's theorem nor Gödel's theorem
               | applies to unsafe code written by humans.
               | 
               | Even if the borrow checker is limited by Rice's theorem,
               | you can create provably safe abstractions, provably
               | unsound abstractions, or potentially unsound
               | abstractions, which humans can reject or accept.
        
               | kibwen wrote:
               | Rice's theorem isn't relevant here. The goal is not to
               | create a system that produces no false positives, it's
               | perfectly fine to do a conservative syntactic analysis
               | that allows false positives but disallows false
               | negatives, and it's then possible to produce a formal
               | proof that this analysis is sound. It is this formal
               | proof that I would expect to be included in a
               | specification in lieu of English prose.
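               | 
               | A toy illustration of that kind of conservative analysis,
               | assuming a made-up Access record that stands in for a
               | whole program: the checker accepts only what it can prove
               | is in bounds, so it may reject safe programs (false
               | positives) but never accepts an unsafe one.

```cpp
#include <cstddef>

// Hypothetical toy "program": a single array access. The index is either
// a compile-time constant or unknown until run time.
struct Access {
    std::size_t array_len;
    bool index_is_constant;
    std::size_t index;  // meaningful only when index_is_constant is true
};

// Conservative check: accept only accesses that are provably in bounds.
// Anything the checker cannot prove is rejected, even if it would turn
// out to be fine at run time.
bool accepts(const Access& a) {
    return a.index_is_constant && a.index < a.array_len;
}
```

               | An access whose index is only known at run time is
               | rejected even when every real input would be in range.
               | That rejection is the false positive the analysis
               | tolerates in exchange for soundness.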
        
         | flohofwoe wrote:
         | Yeah. FWIW, we shipped PC games since the early 2000s written
         | in C++ where the C++ stdlib was banned (for various reasons,
         | not just memory safety), and our custom container classes were
         | bounds checked via custom asserts which stayed in the code for
         | the shipped game (and the rest of the code also peppered with
         | asserts).
         | 
         | ...and then you _still_ had to argue with some circles of the
         | C++ community why the game and engine code doesn't use the
         | stdlib. It's crazy that it takes _decades_ to convince some
         | people that a bad idea is simply a bad idea.
        
           | pjmlp wrote:
           | Which is kind of ironic, given how performance minded the
           | game industry is, and then we have those circles with such
           | attitude.
        
       | dataflow wrote:
       | PSA: Perhaps this is stating the obvious, but if you want bounds
       | checking in your own code, start replacing T* with std::span<T>
       | or std::span<T>::iterator whenever the target is an array.
        
         | jpc0 wrote:
         | std::span is not bounds checked.
         | 
         | gsl::span is
        
           | dataflow wrote:
           | > std::span is not bounds checked. gsl::span is
           | 
           | https://godbolt.org/z/Pda9Me45P ?
        
             | debugnik wrote:
             | You've compiled with _LIBCPP_HARDENING_MODE_FAST, which
             | still adds some extra checks not required by the
             | standard.[1] You can also tell it's nonstandard because it
             | doesn't really throw out_of_range, it just traps.
             | 
             | > Fast mode, which contains a set of security-critical
             | checks that can be done with relatively little overhead in
             | constant time and are intended to be used in production.
             | 
             | > Using std::span as an example, setting the hardening mode
             | to fast will always enable the valid-element-access checks
             | when accessing elements via a std::span object, but whether
             | dereferencing a std::span iterator does the equivalent
             | check depends on the ABI configuration.
             | 
             | 1: https://libcxx.llvm.org/Hardening.html
        
               | dataflow wrote:
               | > You've compiled with _LIBCPP_HARDENING_MODE_FAST, which
               | still adds some extra checks not required by the
               | standard.
               | 
               | The standard doesn't require any checks to begin with.
               | 
               | It also doesn't require optimizations.
        
               | debugnik wrote:
               | It does, on explicitly bounds-checked accessors like .at,
               | which span is gaining for C++26.
               | 
               | But you originally implied using span was sufficient, you
               | didn't mention LLVM's libc++ hardening. (You even
               | mentioned iterators which, I just quoted, might not be
               | bounds-checked on fast mode either.)
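               | 
               | For readers unfamiliar with the explicitly bounds-checked
               | accessors mentioned here, a small sketch using
               | std::vector::at, which the standard requires to throw
               | std::out_of_range on a bad index. try_get is a made-up
               | helper name used only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true and writes the element to `out` if `i` is in range.
// std::vector::at must throw std::out_of_range for a bad index, while
// operator[] carries no such requirement in the standard.
bool try_get(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;  // `out` is left untouched
    }
}
```

               | Unlike a hardening-mode trap, the failure here is an
               | ordinary exception that the caller can handle.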
        
               | dataflow wrote:
               | > It does, on explicitly bounds-checked accessors like
               | .at, which span is gaining for C++26.
               | 
               | When I said "the standard doesn't require this" I clearly
               | was not referring to C++26, which does not even exist
               | yet. In any case, I'm not sure what the point of this
               | pedantry is. I'm pretty sure the point was clear.
               | 
               | > But you originally implied using span was sufficient,
               | you didn't mention LLVM's libc++ hardening.
               | 
               | Because this isn't LLVM-specific, every major STL has
               | bounds checking. You just gotta enable it for your
               | toolchain. Sorry I didn't list every single flag, I
               | guess?
               | 
               | > (You even mentioned iterators which, I just quoted,
               | might not be bounds-checked on fast mode either.)
               | 
               | Which is why I had _LIBCPP_ABI_BOUNDED_ITERATORS, right?
               | I'm not on HN to write comprehensive documentation for
               | every toolchain, I'm just writing a quick tip for people
               | to look into.
               | 
               | All this pedantic quibbling over "this isn't required by
               | the _standard_ by default" is just pointless arguing for
               | the sake of arguing on the internet. For all the
               | performance freaks who really care about this: _no_
               | language I know of guarantees optimizations _in the
               | standard_, so if you're relying on optimized
               | performance, you're already doing nonstandard stuff.
               | 
               | And practically _every_ major compiled language you love
               | or hate has a way to enable or disable bounds checking,
               | letting you violate their "standard" one way or another.
               | D itself has -boundscheck, C++ has toolchain-specific
               | flags, Go has -gcflags=-B, etc...
        
               | debugnik wrote:
               | So your first answer to being told your initial
               | suggestion is insufficient for bounds-checking was to
               | share a godbolt link without elaborating on where the
               | checking was actually coming from; and when I elaborate,
               | for other readers' sake, not yours, on your solution and
               | other comparable ones, you get defensive and repeatedly
               | call me a pedant. Ok, but you know, these discussions are
               | for everyone else to read and maybe learn something too,
               | not just us.
               | 
               | As for the bounds-checked accessors, I mentioned them
               | because they already exist in current C++ for other
               | collections, they're coming to the one you suggested
               | using, and I thought them relevant to a discussion about
               | C++ lacking spatial safety.
        
               | carbotaniuman wrote:
               | I've used vendor-specific C++ compilers with no bounds
               | checking and a barely conforming stdlib, so by your logic
               | C++ has zero bounds checking... Defaults matter!
        
               | dataflow wrote:
               | > I've used vendor-specific C++ compilers with no bounds
               | checking and a barely conforming stdlib, so by your logic
               | C++ has zero bounds checking...
               | 
               | I literally said exactly that: "The standard doesn't
               | require any checks to begin with."
               | 
               | > Defaults matter!
               | 
               | Sigh... nobody claimed otherwise. You're really missing
               | the point of the thread.
               | 
               |  _All_ I did was give people a tip on how to improve
               | their code security. The exact sentence I wrote was:
               | 
               | >> "If you want bounds checking in your own code, start
               | replacing T* with std::span<T> or std::span<T>::iterator
               | whenever the target is an array."
               | 
               | "BUT DEFAULTS MATTER!!!", you rebut! Well OK, then I
               | guess keep your raw pointers in and _don't_ migrate your
               | code? Sorry I tried to help!
        
               | carbotaniuman wrote:
               | Cool, let me know how to improve the code security on my
               | vendor compiler then, I'll be waiting.
        
               | dataflow wrote:
               | > Cool, let me know how to improve the code security on
               | my vendor compiler then, I'll be waiting.
               | 
               | Switch to std::span and add 1 line to
               | std::span::operator[] to check your bounds...
        
               | carbotaniuman wrote:
               | I don't think std::span is bounds checked. Try again.
        
               | dataflow wrote:
               | > I don't think std::span is bounds checked. Try again.
               | 
               | That's why I said _add 1 line to std::span::operator[]_
               | to check your bounds.
               | 
               | I'm telling you to modify the STL header. It's a text
               | file. Add 1 line to make it bounds-checked.
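               | 
               | A sketch of what such a one-line check looks like,
               | written as a separate hypothetical checked_view type
               | rather than an edit to any real header. Calling
               | std::abort() is an assumption here; a vendor toolchain
               | might prefer a trap instruction or its own handler.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a span-like type; this is not std::span.
template <typename T>
class checked_view {
public:
    checked_view(T* data, std::size_t size) : data_(data), size_(size) {}

    T& operator[](std::size_t i) const {
        if (i >= size_) std::abort();  // the one added line: stop on a bad index
        return data_[i];
    }

    std::size_t size() const { return size_; }

private:
    T* data_;           // non-owning pointer to the first element
    std::size_t size_;  // number of elements reachable through data_
};
```

               | The view does not own its elements, so it can wrap an
               | existing array or buffer without copying.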
        
       | DLoupe wrote:
       | > The safety checks have uncovered over 1,000 bugs
       | 
       | In most implementations of the standard library, safety checks
       | can be enabled with a simple #define. In some, it's the default
       | behavior in DEBUG mode. I wonder what this library improves on
       | that and why these bugs have not been discovered before.
        
         | dataflow wrote:
         | It's a great question (_LIBCPP_DEBUG was already a thing in
         | libc++), and AFAIK the answer is supposedly "it used to be too
         | costly to enable these in production with libc++, and it no
         | longer is." I have no first-hand insight as to how accurate
         | this perception is.
        
           | alpire wrote:
           | That's exactly right. We've had extra hardening enabled in
           | tests, and that does catch many issues. But tests can't
           | exercise every potential out-of-bounds issue, which is why
           | enabling it in prod enabled us to find & fix additional
           | issues.
        
         | pjmlp wrote:
         | Being actually enforced, even in release.
         | 
         | Most folks don't use those #defines, and many still haven't
         | learned about them.
        
         | saagarjha wrote:
         | They turned those on and 1. checked that the software using it
         | didn't break and 2. made sure it didn't tank performance.
         | 
         | Source: I worked on this apparently
        
       | vblanco wrote:
       | Game developers have been doing this since forever, it's one of
       | their main reasons to avoid the STL.
       | 
       | EASTL has this as a feature by default, and Unreal Engine's
       | container library has bounds checks enabled in most games. The
       | performance cost of those bounds checks in practice is well worth
       | the reduction of bugs even on performance sensitive code.
        
         | pjmlp wrote:
         | Which is yet another reason to assert (pun intended) how far
         | from reality the anti-bounds check folks are, when even the
         | game industry takes them seriously.
        
       | TinkersW wrote:
       | I wonder if Google really never had this turned on before? Like
       | this has been available in the C++ standard library for
       | decades (normally as a debug feature to catch errors in
       | development, but some implementations such as MS support it in
       | release also).
       | 
       | Might explain why they claimed 70% of exploits were memory
       | related..
        
         | alpire wrote:
         | The hardening mode we enabled was added to libc++ quite recently.
         | It was proposed in 2022:
         | https://discourse.llvm.org/t/rfc-c-buffer-hardening/65734. It was
         | designed to run in prod, so it's quite fast. Previous debug modes
         | I've seen came with much higher costs, and therefore weren't
         | (usually) enabled in prod.
        
       ___________________________________________________________________
       (page generated 2024-11-16 23:02 UTC)