[HN Gopher] Some Were Meant For C (2017) [pdf]
       ___________________________________________________________________
        
       Some Were Meant For C (2017) [pdf]
        
       Author : fractalb
       Score  : 64 points
       Date   : 2021-03-01 06:20 UTC (16 hours ago)
        
 (HTM) web link (www.cs.kent.ac.uk)
 (TXT) w3m dump (www.cs.kent.ac.uk)
        
       | AlbertoGP wrote:
       | Relevant to recent discussions, even if it was published in 2017.
       | 
       | It is considerably more elaborate than other publications I've seen
       | mentioned in those discussions.
       | 
       | I'll quote section 6.2, "What is Safety Anyway?":
       | 
       | > _I have learned to enjoy provoking indignant incredulity by
       | claiming that C can be implemented safely. It usually transpires
       | that the audience have so strongly associated "safe" with "not
       | like C" that certain knots need careful unpicking._
       | 
       | > _In fact, the very "unsafety" of C is based on an unfortunate
       | conflation of the language itself with how it is implemented._
       | 
       | > _Working from first principles, it is not hard to imagine a
       | safe C. As Krishnamurthi and Felleisen [1999] elaborated, safety
       | is about catching errors immediately and cleanly rather than
       | gradually and corruptingly. Ungar et al. [2005] echoed this by
       | defining a safety property as "the behavior of any program,
       | correct or not, can be easily understood in terms of the source-
       | level language semantics"--that is, with a clean error report,
       | not the arbitrary continuation of execution after the point of
       | the error._
        
       | jxy wrote:
       | > A final interesting property of this code is that its
       | behaviour is undefined according to the C language standard. The
       | reason is that it calls memcpy() across a range of memory
       | comprising multiple distinct C objects, copying them all into
       | memory-mapped storage in a single operation.
       | 
       | What's wrong with memcpy here? As long as dst and src are both
       | non-zero and the ranges of memory are not overlapping, the
       | behavior of memcpy is well defined.
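
One possible reading of the quoted claim, sketched in C: memcpy() is only defined while the source and destination ranges each stay inside a single object. The struct and function names below are hypothetical and are not taken from the paper. Copying a whole struct is fine because the struct is one object; stretching one copy from a variable into a separately declared neighbour is not, even if the two happen to sit next to each other in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: both fields live inside ONE object. */
struct pair {
    int first;
    int second;
};

/* Defined: the copy covers exactly one object of type struct pair. */
void copy_pair(struct pair *dst, const struct pair *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *src);
}

/* Defined: two separately declared ints are copied one object at a time.
 * Calling memcpy(dst, a, 2 * sizeof(int)) instead would read past the end
 * of *a, which the standard leaves undefined even if *b happens to be
 * placed right after it. */
void copy_two_ints(int dst[2], const int *a, const int *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    memcpy(&dst[0], a, sizeof *a);
    memcpy(&dst[1], b, sizeof *b);
}
```

Non-null and non-overlapping arguments are therefore necessary but, on this reading, not sufficient.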
        
       | thesuperbigfrog wrote:
       | C and C++ code _CAN_ be secure, _but most of it is not_. It is
       | too easy to write or update C / C++ code so that it is no longer
       | secure or has unexpected and unsafe results.
       | 
       | https://blog.regehr.org/archives/213 gives great insights into
       | how undefined behavior in C and C++ can be difficult to reason
       | about and cause problems.
       | 
       | The lack of bulletproof memory safety and the easy-to-stray-into
       | undefined behavior of C and C++ make it easy to write code whose
       | behavior is difficult to fully grasp, especially when optimizing
       | compilers are used. The C / C++ code runs really fast, but there
       | are hidden dangers lurking.
       | 
       | I don't doubt that C and C++ will be with us for a long time to
       | come, but the growing use of Rust, Zig, Ada, and others shows that
       | better alternatives exist and that they will replace the use of C
       | and C++ for many domains and use cases.
       | 
       | Edit: Downvotes? Did you read my whole comment? I am saying that
       | C / C++ are not secure for real-world use cases.
        
         | icandoit wrote:
         | If you want to do something about it look into UBSan. Turn
         | vague concerns into bugs, and then into commits :).
         | 
         | https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
         | 
         | UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior
         | detector. UBSan modifies the program at compile-time to catch
         | various kinds of undefined behavior during program execution,
         | for example:
         | 
         | - Using misaligned or null pointer
         | 
         | - Signed integer overflow
         | 
         | - Conversion to, from, or between floating-point types which
         | would overflow the destination
         | 
         | GCC has similar features.
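
As a small illustration of the signed-overflow item in the list above, here is a sketch with one addition that can never overflow and one that can. The function names are hypothetical. Building with -fsanitize=undefined (accepted by both Clang and GCC) should make the unchecked version report the overflow at run time instead of silently continuing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds a and b without ever evaluating an overflowing signed expression.
 * Returns true and stores the sum on success, false if it would not fit. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;           /* would overflow: report it instead */
    *sum = a + b;
    return true;
}

/* Unchecked version: a + b is undefined when the result does not fit in
 * an int. This is the kind of operation UBSan instruments. */
int unchecked_add(int a, int b)
{
    return a + b;
}
```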
        
         | bawolff wrote:
         | > C and C++ code CAN be secure
         | 
         | Anything can be secure (and conversely anything can be
         | insecure). The theoretical potential doesn't matter because
         | real life is never the theoretical best case. What matters is
         | the overall risk (is likelihood * how bad < benefit?)
        
           | thesuperbigfrog wrote:
           | >> real life is never the theoretical best case
           | 
           | Exactly; real life C and C++ code that does real work tends
           | to be insecure.
        
         | coliveira wrote:
         | Undefined behavior is not the problem that people make it out
         | to be. First of all, undefined behavior is well understood by
         | compilers, in fact it is exactly exploited by compilers to make
         | code run faster. The only thing you need to solve UB is to ask
         | compilers to stop exploiting it (which normally can be done by
         | reducing optimization). And of course you can rewrite your code
         | to stop relying on UB. Despite all the complaints, I have never
         | seen code suffering from UB that couldn't be fixed.
        
           | MaxBarraclough wrote:
           | > Undefined behavior is not the problem that people make it
           | to be
           | 
           | No, undefined behaviour is a serious problem, especially
           | regarding security.
           | 
           | There's a long history of serious security problems in C and
           | C++ codebases due to unintended invocation of undefined
           | behaviour. These issues continue to arise even in well-
           | resourced C/C++ projects with highly skilled developers, such
           | as the Linux kernel and Chromium.
        
           | thesuperbigfrog wrote:
           | >> undefined behavior is well understood by compilers
           | 
           | The security implications of undefined behavior are poorly
           | understood by software developers:
           | https://arc.aiaa.org/doi/pdf/10.2514/1.I010699
        
           | citrin_ru wrote:
           | 1. Fixing/avoiding UBs requires discipline (and time) not all
           |    programmers have.
           | 2. Many programmers pointed to UB in their code would argue that
           |    it is not a problem at all - especially if the code works for
           |    them and users don't complain about bugs caused by UB.
           | 
           | It would be interesting to run the following experiment:
           | 
           | * find and patch UBs in multiple open source projects which are
           |   relatively popular (at least used not only by the author)
           | * send pull requests with the UB fixes and see what fraction
           |   will be closed as "won't fix".
           | 
           | It requires a lot of time, but may show that many developers
           | don't care about UB.
        
         | jerf wrote:
         | This is one of those rare cases where I can say "It's 2021, and
         | we know that's not true now." C and C++ can not be secure at
         | scale without unreasonable amounts of effort. It can't be
         | secure at any non-trivial scale through sheer discipline alone.
         | 
         | 40 years ago, the case could be made. But C and C++ are not new
         | languages, and the fact that just barely shy of no one can
         | demonstrate the existence of secure C or C++ code bases without
         | staggering levels of effort put into that process is data now,
         | not just anecdote.
         | 
         | (And let me emphasize the _effort_ as my yardstick. Writing
         | _truly_ secure code is arguably something nobody has ever done
         | at scale in any language... but C and C++ are certainly unique
         | in the sheer level of _effort_ that has to be poured into them to
         | even _match_ what a number of other languages come with out of
         | the box, let alone exceed them. If you aren't using some very
         | high quality and fairly expensive tools like Coverity on a
         | routine basis, you aren't even close.)
        
           | thesuperbigfrog wrote:
           | >> C and C++ can not be secure at scale without unreasonable
           | amounts of effort. It can't be secure at any non-trivial
           | scale through sheer discipline alone.
           | 
           | Agreed. That is why C and C++ should be replaced with more
           | reliable alternatives.
        
           | pjmlp wrote:
           | Not even 40 years ago, because exactly in 1981, C.A.R. Hoare
           | stated in his Turing Award lecture,
           | 
           | "Many years later we asked our customers whether they wished
           | us to provide an option to switch off these checks in the
           | interests of efficiency on production runs. Unanimously, they
           | urged us not to--they already knew how frequently subscript
           | errors occur on production runs where failure to detect them
           | could be disastrous. I note with fear and horror that even in
           | 1980, language designers and users have not learned this
           | lesson. In any respectable branch of engineering, failure to
           | observe such elementary precautions would have long been
           | against the law."
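
The subscript checks Hoare describes can be approximated by hand in C. Below is a minimal sketch, assuming a hypothetical "span" type that carries its length alongside the pointer; nothing like this is built into the language, and the names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checked array: a pointer that carries its length. */
struct int_span {
    int   *data;
    size_t len;
};

/* Reads element i; returns false instead of touching memory out of range. */
bool span_get(struct int_span s, size_t i, int *out)
{
    assert(out != NULL);
    if (i >= s.len)
        return false;
    *out = s.data[i];
    return true;
}

/* Writes element i under the same check. */
bool span_set(struct int_span s, size_t i, int value)
{
    if (i >= s.len)
        return false;
    s.data[i] = value;
    return true;
}
```

The cost is one comparison per access, which is the trade-off the customers in the quote chose to keep.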
        
         | kzhukov wrote:
         | Security is not a feature of the language or tool. Neither Rust
         | nor C++ are fully secured even though the former could find
         | more memory safety problems at compile time (but not all of
         | them).
         | 
         | Security is the process. It contains continuous risk
         | assessment, penetration testing, fuzzing and using various
         | other tools throughout the product development to eliminate
         | attack vectors. Only then you could build a secured product.
         | Just rewriting everything in Rust won't make it.
        
           | thesuperbigfrog wrote:
           | >> Security is the process.
           | 
           | Yes, but do the programming languages / tools we choose make
           | the security process easier?
           | 
           | There is no perfect solution, but some programming languages
           | / tools are better than others for preventing unexpected
           | behavior that can lead to insecure programs.
        
           | jjav wrote:
           | Very concisely and correctly put, agreed.
           | 
           | I sometimes work on the infosec side of the house. It's easy
           | to point at vulnerabilities due to endless memory access
           | problems in C code and fixate on that. And it's true, so it
           | feels satisfying.
           | 
           | Just rewrite everything in not-C and this is fixed! And
           | that's true as well. But remember here the word "this" refers
           | to the memory access bugs. It doesn't refer to
           | vulnerabilities in that sentence.
           | 
           | There are plenty of systems without a single line of C/C++ code
           | and we have no shortage of ways to break in anyway. So in one sense,
           | everything changed. But wearing the black hat, nothing really
           | changed since I compromised the system anyway. A new language
           | without all the engineering process parent post describes,
           | won't magically get there.
           | 
           | For an existing system "just rewrite everything" will
           | guarantee way more bugs for years to come simply because the
           | old system has been battle-tested and reinforced for years.
           | In the long haul the rewrite will converge to a better state
           | but that haul is long (and assumes budget will remain in
           | place long enough, which might not happen so you end up with
           | a half-baked rewrite).
           | 
           | For new systems starting from scratch that may in any way
           | ever be run in a security relevant context, sure, starting
           | with not-C is a good idea today.
           | 
           | Unfortunately Rust seems to be the only alternative for use
           | cases where C was actually needed (if it could be written in
           | Java or Go or Python or ... then it didn't really need to be
           | in C in the first place). And sadly Rust is a fairly
           | user-hostile language, so my guess is plenty of new projects will
           | start with C for a long time due to lack of friendly
           | alternatives. And tooling.
        
       | RcouF1uZ4gsC wrote:
       | > Most obviously in C, we note that a malloc() implementation
       | is usually written in C--or rather, in a subset of C that lacks
       | malloc() since malloc() is mandated by the C standard.
       | 
       | Not always. Google's tcmalloc is actually written in C++.
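
To make the quoted point concrete, here is a minimal sketch of an allocator written in C that never calls malloc(): a bump allocator over a static buffer. The names and the 4096-byte size are assumptions for illustration, it cannot free individual blocks, it is not thread-safe, and it is not how tcmalloc or any production malloc works. Strictly speaking, storing non-character types in a declared char array also runs into the standard's effective-type rules, which may be part of why the paper speaks of a subset of C.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096u

/* Backing storage: a static buffer aligned for any ordinary object type. */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Returns size bytes from the arena, or NULL when it is exhausted.
 * Each request is rounded up so the next block stays suitably aligned. */
void *arena_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    assert((align & (align - 1)) == 0);          /* power of two */

    if (size == 0 || size > ARENA_SIZE)
        return NULL;
    size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded > ARENA_SIZE - arena_used)
        return NULL;

    void *p = &arena[arena_used];
    arena_used += rounded;
    return p;
}

/* Releases everything at once; individual blocks cannot be freed. */
void arena_reset(void)
{
    arena_used = 0;
}
```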
       | 
       | I actually would have agreed with a lot of this in 2017. However,
       | in the past several years, I think Rust has been a game changer.
       | It can interface with the C ABI. It can access the same low-level
       | abstract machine that C can. It doesn't have garbage collection
       | or virtual machines. And it provides memory safety out of the
       | box. In addition, features such as strong types and pattern
       | matching help prevent logic bugs as well (like the compiler
       | checking that you did not forget one arm of an Enum).
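
For comparison, C has a weaker analogue of that exhaustiveness check: GCC and Clang warn under -Wswitch (part of -Wall) when a switch over an enum type misses an enumerator and has no default label. A sketch with a hypothetical enum:

```c
#include <assert.h>

enum shape_kind { SHAPE_CIRCLE, SHAPE_SQUARE, SHAPE_TRIANGLE };

/* No default label on purpose: if a new enumerator is added and not handled
 * here, -Wswitch flags the missing case at compile time. */
int corner_count(enum shape_kind kind)
{
    switch (kind) {
    case SHAPE_CIRCLE:   return 0;
    case SHAPE_SQUARE:   return 4;
    case SHAPE_TRIANGLE: return 3;
    }
    /* Reached only if kind holds a value outside the enum. Unlike Rust,
     * C does not rule that out, so it is handled explicitly. */
    assert(!"invalid shape_kind");
    return -1;
}
```

It is only a warning, and it disappears as soon as someone adds a default label, so it is a convention rather than a guarantee.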
       | 
       | This is borne out now that a lot of security-facing software is
       | starting to do at least part of their internals in Rust, where
       | they had been in C before.
       | 
       | I think Rust is and will be even more in the future a game
       | changer in how we write the foundation programs and libraries
       | that the computing world is built on.
        
       | dleslie wrote:
       | All I really want added to C is Zig's comptime.
        
       | ciarcode wrote:
       | Can someone explain to me why we use C for writing code for the
       | electronic control unit of a vehicle motor if it is so unsafe? It
       | is true that ECUs are programmed with code generated through
       | model-based design, but some parts may be programmed manually.
       | Maybe this is why they use only a subset of C (MISRA C).
        
       | zokier wrote:
       | Is the second example (the auxv stuff in section 5.1) invoking
       | undefined behavior here: "at_null->a_type == AT_NULL"? As far
       | as I understand, in C you generally cannot pull a valid pointer
       | out of thin air like the author is doing there. Isn't that
       | the whole idea behind "pointer provenance"?
        
       | xianwen wrote:
       | But are there any guides or books that teach people to write safe
       | C code? Is writing safe C code possible?
        
         | sigjuice wrote:
         | https://wiki.sei.cmu.edu/confluence/display/c
        
       | jvanderbot wrote:
       | I've entered a mid-life zen w.r.t. languages. Most important is
       | the people and irreplaceable knowledge they have in their heads,
       | about the tricks, methods, and environments they've worked in.
       | Languages correlate with that, and so are a semi-useful indicator
       | of past skills. You want a person to write the bootloader for
       | your Mars helicopter? You hire a C programmer most likely, not a
       | C# programmer, but who knows? You want a person to bootstrap your
       | image processing pipelines for your scientists? Maybe Python?
       | Maybe? This area is much more loose.
       | 
       | If a valuable tool or method is expressed in (or even only
       | expressible in) a particular language, then so be it. Often,
       | there are many more choices than people believe, and what is
       | right for the person, so long as it serves the organization or
       | need appropriately, is fine by me.
       | 
       | A language is a tool. Most languages do pretty much the same
       | thing. Most languages' ecosystems, application adoption, and
       | developers are much more important than the languages' seatbelts
       | and headlights.
        
         | qznc wrote:
         | I disagree that languages do "pretty much the same thing".
         | However, I do agree that language choice is overrated. Other
         | factors, e.g. which one you know better, weigh heavier.
        
           | jvanderbot wrote:
           | OK, over-simplification.
           | 
           | Perhaps: Language in-class variance is small (C# ~= Java,
           | C++, C, Rust are close-ish). Cross-class variance is big (JS
           | vs Rust). Therefore, recruiting a programmer from the same
           | problem / language class is more important than the
           | particular language.
        
         | derekp7 wrote:
         | What is also important is the ecosystem that comes along with
         | the language. Modern languages have a wide ranging list of
         | libraries and plugins for them which make certain programming
         | tasks easier. But that also means you have an ever shifting
         | stack that is required to support them. This became apparent to
         | me when trying to get some infrastructure software to play nice
         | with some of the older systems we need to keep around. There
         | were some nice Python based solutions that I had to reject
         | because the dependencies didn't exist for some older RHEL
         | installations we have (yes, we still need to keep a handful
         | of RHEL 4.x systems around because management doesn't want to
         | tell customers "no we won't support you unless you upgrade to
         | our latest product that works on newer OS releases"). So for
         | backup and management solutions, I have to stick with tools
         | that can be easily compiled on the older environments.
         | 
         | Another example, one of our architects at work is a strong
         | supporter of Apple, and wanted me to look into Swift. Well at
         | the time you could get Swift for Ubuntu, but couldn't for any
         | version of RHEL (that has finally changed now though). So
         | again, writing my code in plain C was more of a win.
        
       | 0xdeadfeed wrote:
       | NASA just sent a rover to Mars using software written in C.
       | Meanwhile some Rust fanatics are busy telling everyone how it
       | doesn't work.
        
       | dang wrote:
       | If curious, past threads:
       | 
       |  _Some Were Meant for C (2017) [pdf]_ -
       | https://news.ycombinator.com/item?id=19736214 - April 2019 (176
       | comments)
       | 
       |  _Some Were Meant for C: The Endurance of an Unmanageable
       | Language [pdf]_ - https://news.ycombinator.com/item?id=15179188 -
       | Sept 2017 (240 comments)
        
       | coliveira wrote:
       | I believe the main issue at play here is a paradox in software
       | engineering. The paradox is this: safe languages are more useful
       | on complex projects than in simple projects, but complex projects
       | suffer more from performance degradation and system integration
       | issues when these languages are used. Put from the other side, C
       | is perfectly fine to write short pieces of code, but despite its
       | problems it may be the only reasonable language to write large
       | pieces of system software (I'm including C++ here as a "kind of"
       | C, just like Objective-C).
        
         | vmchale wrote:
         | > safe languages are more useful on complex projects than in
         | simple projects, but complex projects suffer more from
         | performance degradation
         | 
         | I don't think that's true. C is often slower than C++ because
         | of how inlining works, plus in some domains (e.g. compilers) it's
         | best to just use a GC language from the get-go.
        
           | iainmerrick wrote:
           | _C is often slower than C++ because of how inlining works_
           | 
           | Hang on, you're cutting a few corners there! It's easy to use
           | inline functions in C too.
           | 
           | Are you thinking of C++ templates, and the fact that e.g.
           | std::sort is faster than qsort because it directly calls an
           | inlined comparison function rather than a function pointer?
           | 
           | That's true, although it's possible to achieve similar
           | performance levels in C via hand-rolled data structures or
           | macro hackery. I'll grant you that the efficient C++ code is
           | more idiomatic and likely safer. On the other hand, idiomatic
           | C code is likely smaller when compiled, which can be
           | important for performance too.
           | 
           | I don't believe "C is often slower than C++" is true in
           | general.
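
The qsort point can be sketched in C as follows. The first function sorts through the standard qsort(), which calls the comparator through a function pointer; the second is a hand-rolled sort specialised for int, where the comparison is written inline. The function names are hypothetical, and insertion sort is used only to keep the example short, not because it is fast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator for qsort(): invoked through a function pointer each time. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow of x - y */
}

void sort_with_qsort(int *v, size_t n)
{
    assert(v != NULL);
    qsort(v, n, sizeof *v, cmp_int);
}

/* Hand-rolled insertion sort specialised for int: the comparison is part
 * of the loop, so no indirect call is needed. Quadratic; illustration only. */
void sort_ints_inline(int *v, size_t n)
{
    assert(v != NULL);
    for (size_t i = 1; i < n; i++) {
        int key = v[i];
        size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}
```

Whether the indirect call matters in practice depends on the compiler and on whether it can see the comparator at the call site.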
        
             | jstimpfle wrote:
             | std::sort is the poster child that is supposed to
             | demonstrate the power of C++ templates. When in fact it is
             | awkward to use (as an infrequent user of C++, why can I
             | never seem to remember how to wrap / make the comparison
             | object?) and more importantly, sort performance is most
             | often completely irrelevant to the performance of a
             | program.
             | 
             | And when it's not irrelevant, it's almost 100% certain that
             | std::sort is not the right thing to use. Where it matters,
             | it's probably possible to examine the context a little more
             | closely and come up with a custom sort that runs in O(n) or
             | at least faster than std::sort.
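
One example of a context-specific sort that runs in linear time: when the keys are known to be single bytes, a counting sort does the job in O(n + 256) with no comparisons at all. This is a sketch under that assumption, with a hypothetical function name; it is not a general replacement for a comparison sort.

```c
#include <assert.h>
#include <stddef.h>

/* Sorts n byte-valued keys in place by counting how often each value occurs
 * and then writing the values back in increasing order. */
void counting_sort_u8(unsigned char *v, size_t n)
{
    size_t counts[256] = { 0 };
    assert(v != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        counts[v[i]]++;

    size_t out = 0;
    for (size_t value = 0; value < 256; value++)
        for (size_t c = 0; c < counts[value]; c++)
            v[out++] = (unsigned char)value;
}
```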
        
               | cozzyd wrote:
               | Yeah, the real poster-child for C++ performance should be
               | something like Eigen
        
       | bachmeier wrote:
       | I remember when this paper came out, I tweeted at the author
       | about D's "Better C" mode since he used that very term. It really
       | is a better C, because C is almost a subset of D. You get nice
       | features like array bounds checking, but no runtime, no garbage
       | collector, etc. It's a good choice for those that prefer to stick
       | with C but wish there was a 2021 upgrade.
       | 
       | https://dlang.org/spec/betterc.html
        
         | pharke wrote:
         | What's the developer experience like in D? I keep looking at it
         | from time to time but haven't taken the time to learn it (even
         | though I've spent time learning a lot of new languages) I was
         | never sure if there was a big enough community around D to make
         | it worthwhile but it seems to have a lot of the features I
         | want.
        
           | bachmeier wrote:
           | Just for clarity, to keep on the topic of the article: If you
           | compile a D program with the -betterC flag, you give up many
           | features so that your program runs with only the C runtime.
           | It's great for a C programmer not wanting to learn a new
           | language or for being able to make incremental changes to a C
           | codebase while adding things like metaprogramming. If you're
           | satisfied with the experience writing C, you'll probably also
           | be satisfied writing D and compiling with the -betterC flag.
           | As an example, here's a -betterC hello world:
           | 
           | https://run.dlang.io/is/TKOBgA
           | 
           | Once you move on to other goals (using the whole language, as
           | is usually the case) there are numerous complaints. Some
           | don't think the VS Code plugin is good enough and that sort
           | of thing. Some argue that Dub, the package manager, is not
           | good enough for their needs. I suppose every language has
           | people who try it and don't like it.
           | 
           | It doesn't take much to try it. You can use the online D
           | editor and read the official tutorial. If you like it, you
           | can dig in further to see if it has the ecosystem you need.
           | 
           | https://run.dlang.io/
           | 
           | http://ddili.org/ders/d.en/index.html
        
         | ducktective wrote:
         | Who are the designers/institution behind D and why have they
         | not promoted it as much as newer alternatives?
        
           | bachmeier wrote:
           | https://en.wikipedia.org/wiki/Walter_Bright
           | 
           | https://en.wikipedia.org/wiki/Andrei_Alexandrescu
           | 
           | They've promoted it (holding annual conferences and such) but
           | they don't have Mozilla or Google behind them, so resources
           | are limited.
           | 
           | Edit: And here are some blog posts by Walter Bright about D
           | as a Better C.
           | 
           | https://dlang.org/blog/2017/08/23/d-as-a-better-c/
           | 
           | https://dlang.org/blog/2018/02/07/vanquish-forever-these-
           | bug...
           | 
           | https://dlang.org/blog/2018/06/11/dasbetterc-converting-
           | make...
        
           | voldacar wrote:
           | Walter Bright (original creator of D and the D compiler)
           | posts here pretty frequently.
        
       | ncmncm wrote:
       | The article mentions C++ three times, all in contexts equating it
       | with C. But C++, as it is coded today, is a very different
       | language from C, and does not suffer from the problems that make
       | C an extremely poor choice for starting any new project that
       | might matter.
       | 
       | All the article's arguments for C apply substantially more so to
       | C++. Thus, the article leaves us with no objectively plausible
       | reason ever to code in C, except where artificial constraints
       | mandate it, or where merit doesn't matter. (I leave to the reader
       | to decide where Linux and BSD kernels fit in that.)
       | 
       | There is especially no excuse for systemd to be coded in C.
        
         | coliveira wrote:
         | I disagree. First of all, most C code can be run as C++, so
         | everything bad you can say about C applies to C++ as well.
         | Second, C++ introduces its own problems that also make
         | programming unsafe compared to other managed languages and even
         | compared to C.
        
           | qznc wrote:
           | You would constrain yourself to a subset of C++ which is not
           | C. For example, forbid all raw pointers and only use smart
           | pointers. Now your memory is freed automatically via RAII and
           | use-after-free errors are much harder to create.
        
             | aromatic_dev wrote:
             | Though if one only uses smart pointers they will have to
             | accept the performance hit that that brings -- in high
             | performance situations, this may be unacceptable.
        
             | coliveira wrote:
             | As you said, this is a subset of C++. But everyone debates
             | what the right subset should be. In reality C++ is
             | incapable of solving the problem, it can only propose good
             | practices.
        
         | sweeneyrod wrote:
         | C++ is about 17 languages
        
         | qznc wrote:
         | C++ still has a lot of rope to hang yourself. For example,
         | there is the stuff Rust's borrow checker prevents. Or that
         | members are initialized by declaration order and a wrong
         | initializer list can lead to undefined behavior. Or all those
         | implicit conversion rules (ok, the annoying tendency to convert
         | to int is inherited from C).
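
The inherited conversion rules mentioned above are easy to demonstrate in plain C. A sketch with two hypothetical helper functions: the first shows integer promotion of small types, the second shows what happens when a signed and an unsigned int meet in a comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Both operands are promoted to int before the addition, so the sum of two
 * unsigned char values is not reduced modulo 256. */
int sum_bytes(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Mixed signed/unsigned comparison: the int is converted to unsigned, so a
 * negative value becomes a very large one before the comparison happens. */
bool less_than_mixed(int a, unsigned int b)
{
    return a < b;
}
```

With these definitions, sum_bytes(200, 100) yields 300 rather than 44, and less_than_mixed(-1, 1u) is false. Compilers can warn about the second case with -Wsign-compare.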
         | 
         | The most ugly aspect is that C++ has a lock-in effect like
         | Whatsapp. As long as your codebase is in C, you can easily
         | interface with nearly all other languages. Once important parts
         | are in C++ though, only C++ can reasonably interface with it.
        
           | ncmncm wrote:
           | If "lock-in" is something wrong with C++, it is equally so
           | with Rust. But in fact anything you want callable from C, in
           | either C++ or Rust, is easy to keep that way.
           | 
           | To the other point, you can write bad code in any language,
           | Rust included. But you don't have to. A language can help by
           | making good code easier to write than bad code. C fails so
           | frequently by making good code much harder to write, instead.
        
       | coliveira wrote:
       | The author raises an important issue here: many people are led
       | to believe that using C is inherently unsafe. That's not true,
       | and many of the most secure systems in the world were written in
       | C. The other direction also doesn't work: software written in
       | languages like Java can be effectively unsafe.
        
         | pjmlp wrote:
         | One of the most secure OSes is ClearPath MCP, with zero lines of
         | C in its kernel, but rather NEWP.
         | 
         | Azure Sphere, Solaris and latest versions of iOS all rely on
         | some variation of hardware memory tagging to tame C exploits.
        
         | bitwize wrote:
         | Once again.
         | 
         | We know from 40 years of discovering memory-related
         | vulnerabilities in even the most carefully written, rigorously
         | tested C programs that writing safe C is intractable for real,
         | human software engineers. So yes, C IS INHERENTLY UNSAFE. If
         | you claim otherwise you clearly haven't been paying attention
         | to what's going on.
        
       | FpUser wrote:
       | >"...I use because I'm stuck with it; I use it for positive
       | reasons. "
       | 
       | I absolutely agree with that statement. When I do firmware for
       | small MCUs I feel big fat zero need for any other language.
        
       ___________________________________________________________________
       (page generated 2021-03-01 23:01 UTC)