[HN Gopher] Some Were Meant For C (2017) [pdf]
___________________________________________________________________
Some Were Meant For C (2017) [pdf]
Author : fractalb
Score : 64 points
Date : 2021-03-01 06:20 UTC (16 hours ago)
(HTM) web link (www.cs.kent.ac.uk)
(TXT) w3m dump (www.cs.kent.ac.uk)
| AlbertoGP wrote:
| Relevant to recent discussions, even if it was published in 2017.
|
| It is considerably more elaborate than other publications I've seen
| mentioned in those discussions.
|
| I'll quote section 6.2, "What is Safety Anyway?":
|
| > _I have learned to enjoy provoking indignant incredulity by
| claiming that C can be implemented safely. It usually transpires
| that the audience have so strongly associated "safe" with "not
| like C" that certain knots need careful unpicking._
|
| > _In fact, the very "unsafety" of C is based on an unfortunate
| conflation of the language itself with how it is implemented._
|
| > _Working from first principles, it is not hard to imagine a
| safe C. As Krishnamurthi and Felleisen [1999] elaborated, safety
| is about catching errors immediately and cleanly rather than
| gradually and corruptingly. Ungar et al. [2005] echoed this by
| defining a safety property as "the behavior of any program,
| correct or not, can be easily understood in terms of the source-
| level language semantics"--that is, with a clean error report,
| not the arbitrary continuation of execution after the point of
| the error._
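To make the quoted definition a little more concrete, here is a minimal sketch in C of failing "immediately and cleanly" on an out-of-range index. The helper names `checked_get` and `get_or_die` are made up for illustration and do not come from the paper; the sketch shows the idea at the library level, not how a safe C implementation would enforce it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read arr[idx] only if idx is in range.
 * Returns true and stores the element in *out, or false if idx is invalid. */
bool checked_get(const int *arr, size_t len, size_t idx, int *out)
{
    assert(arr != NULL && out != NULL);
    if (idx >= len)
        return false;
    *out = arr[idx];
    return true;
}

/* Hypothetical wrapper: stop with a clear report at the point of the error
 * instead of reading past the end of the array and carrying on. */
int get_or_die(const int *arr, size_t len, size_t idx)
{
    int value;
    if (!checked_get(arr, len, idx, &value)) {
        fprintf(stderr, "index %zu out of range (length %zu)\n", idx, len);
        abort();
    }
    return value;
}
```

With a three-element array, `get_or_die(data, 3, 3)` would print the report and abort, whereas a plain `data[3]` has undefined behavior and may appear to work.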
| jxy wrote:
| > A final interesting property of this code is that its behaviour
| is undefined according to the C language standard. The
| reason is that it calls memcpy() across a range of memory
| comprising multiple distinct C objects, copying them all into
| memory-mapped storage in a single operation.
|
| What's wrong with memcpy here? As long as dst and src are both
| non-zero and the ranges of memory are not overlapping, the
| behavior of memcpy is well defined.
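One possible reading of the paper's point, sketched below: the C standard describes memcpy() in terms of the objects its arguments point into, and accessing memory beyond the end of an object is undefined. Non-null and non-overlapping ranges are therefore necessary but may not be sufficient. The struct, variable, and function names are hypothetical, and this is an illustration of the rule, not the code from the paper.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record: both fields live inside ONE object. */
struct pair {
    int first;
    int second;
};

/* Well-defined: the source and destination ranges each lie within a
 * single object of type struct pair. */
void copy_pair(struct pair *dst, const struct pair *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *src);
}

/* Two separately declared objects. Even if a linker happens to place
 * them back to back, the standard does not treat them as one object. */
int first_obj = 1;
int second_obj = 2;

/* Well-defined alternative to one big copy: one memcpy per object.
 * The single-call form, memcpy(dst, &first_obj, 2 * sizeof(int)), would
 * read past the end of first_obj, which is what makes it undefined. */
void copy_separately(int dst[2])
{
    assert(dst != NULL);
    memcpy(&dst[0], &first_obj, sizeof first_obj);
    memcpy(&dst[1], &second_obj, sizeof second_obj);
}
```

In practice the single-call form usually works on mainstream compilers, which is part of why the paper treats it as a gap between the standard and real systems code.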
| thesuperbigfrog wrote:
| C and C++ code _CAN_ be secure, _but most of it is not_. It is
| too easy to write or update C / C++ code so that it is no longer
| secure or has unexpected and unsafe results.
|
| https://blog.regehr.org/archives/213 gives great insights into
| how undefined behavior in C and C++ can be difficult to reason
| about and cause problems.
|
| The lack of bulletproof memory safety in C and C++, and how easy it
| is to stray into undefined behavior, make it easy to write code
| whose behavior is difficult to fully grasp, especially when
| optimizing compilers are used. The C / C++ code runs really fast,
| but there are hidden dangers lurking.
|
| I don't doubt that C and C++ will be with us for a long time to
| come, but the growing use of Rust, Zig, Ada, and others shows that
| better alternatives exist and that they will replace the use of C
| and C++ for many domains and use cases.
|
| Edit: Downvotes? Did you read my whole comment? I am saying that
| C / C++ are not secure for real-world use cases.
| icandoit wrote:
| If you want to do something about it look into UBSan. Turn
| vague concerns into bugs, and then into commits :).
|
| https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
|
| UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior
| detector. UBSan modifies the program at compile-time to catch
| various kinds of undefined behavior during program execution,
| for example:
|
| - Using misaligned or null pointer
|
| - Signed integer overflow
|
| - Conversion to, from, or between floating-point types which
| would overflow the destination
|
| GCC has similar features.
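As a rough illustration of the signed-overflow case: evaluating `INT_MAX + 1` in a program built with `-fsanitize=undefined` is the kind of thing UBSan reports at run time. The sketch below shows one way to avoid the overflow in the first place. `checked_add` is a hypothetical helper name, not part of UBSan or the standard library.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: add two ints without ever evaluating an
 * overflowing expression. Returns false if the mathematical sum does
 * not fit in an int; otherwise stores the sum in *sum and returns true. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b; /* safe: the range check above rules out overflow */
    return true;
}
```

The range check only uses subtractions that cannot overflow themselves, so the function stays within defined behavior for every pair of inputs.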
| bawolff wrote:
| > C and C++ code CAN be secure
|
| Anything can be secure (and conversely anything can be
| insecure). The theoretical potential doesn't matter because
| real life is never the theoretical best case. What matters is
| the overall risk (is likelihood * how bad < benefit?)
| thesuperbigfrog wrote:
| >> real life is never the theoretical best case
|
| Exactly; real life C and C++ code that does real work tends
| to be insecure.
| coliveira wrote:
| Undefined behavior is not the problem that people make it to
| be. First of all, undefined behavior is well understood by
| compilers, in fact it is exactly exploited by compilers to make
| code run faster. The only thing you need to solve UB is to ask
| compilers to stop exploiting it (which normally can be done by
| reducing optimization). And of course you can rewrite your code
| to stop relying on UB. Despite all the complaints, I have never
| seen code suffering from UB that couldn't be fixed.
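A small sketch of what "rewrite your code to stop relying on UB" can look like, using type punning as the example (my choice; the comment does not name one). Reading a float through a `uint32_t` pointer violates the aliasing rules, while copying the bytes with memcpy() does not. `float_bits` is a hypothetical helper, and the bit pattern mentioned afterwards assumes IEEE 754 single precision. The "ask the compiler to stop exploiting it" route corresponds to flags such as `-fno-strict-aliasing` and `-fwrapv` in GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Undefined version (strict aliasing violation), shown only as a comment:
 *     return *(uint32_t *)&f;
 * Defined rewrite: copy the object representation instead of
 * reinterpreting the pointer. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f); /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform, `float_bits(1.0f)` yields `0x3f800000`. Optimizing compilers typically turn the memcpy() into a plain register move, so the defined version need not cost anything.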
| MaxBarraclough wrote:
| > Undefined behavior is not the problem that people make it
| to be
|
| No, undefined behaviour is a serious problem, especially
| regarding security.
|
| There's a long history of serious security problems in C and
| C++ codebases due to unintended invocation of undefined
| behaviour. These issues continue to arise even in well-
| resourced C/C++ projects with highly skilled developers, such
| as the Linux kernel and Chromium.
| thesuperbigfrog wrote:
| >> undefined behavior is well understood by compilers
|
| The security implications of undefined behavior are poorly
| understood by software developers:
| https://arc.aiaa.org/doi/pdf/10.2514/1.I010699
| citrin_ru wrote:
| 1. Fixing/avoiding UB requires discipline (and time) that not all
| programmers have.
| 2. Many programmers, when pointed to UB in their code, would argue
| that it is not a problem at all, especially if the code works for
| them and users don't complain about bugs caused by UB.
|
| It would be interesting to run the following experiment:
| * find and patch UB in multiple open source projects which are
| relatively popular (at least used not only by their author)
| * send pull requests with the UB fixes and see what fraction will
| be closed as "won't fix".
|
| It requires a lot of time, but may show that many developers
| don't care about UB.
| jerf wrote:
| This is one of those rare cases where I can say "It's 2021, and
| we know that's not true now." C and C++ can not be secure at
| scale without unreasonable amounts of effort. It can't be
| secure at any non-trivial scale through sheer discipline alone.
|
| 40 years ago, the case could be made. But C and C++ are not new
| languages, and the fact that just barely shy of no one can
| demonstrate the existence of secure C or C++ code bases without
| staggering levels of effort put into that process is data now,
| not just anecdote.
|
| (And let me emphasize the _effort_ as my yardstick. Writing
| _truly_ secure code is arguably something nobody has ever done
| at scale in any language... but C and C++ are certainly unique
| in the sheer level of _effort_ that must be poured into them to
| even _match_ what a number of other languages come with out of
| the box, let alone exceed them. If you aren't using some very
| high quality and fairly expensive tools like Coverity on a
| routine basis, you aren't even close.)
| thesuperbigfrog wrote:
| >> C and C++ can not be secure at scale without unreasonable
| amounts of effort. It can't be secure at any non-trivial
| scale through sheer discipline alone.
|
| Agreed. That is why C and C++ should be replaced with more
| reliable alternatives.
| pjmlp wrote:
| Not even 40 years ago, because exactly in 1981, C.A.R. Hoare
| stated in his Turing Award speech,
|
| "Many years later we asked our customers whether they wished
| us to provide an option to switch off these checks in the
| interests of efficiency on production runs. Unanimously, they
| urged us not to--they already knew how frequently subscript
| errors occur on production runs where failure to detect them
| could be disastrous. I note with fear and horror that even in
| 1980, language designers and users have not learned this
| lesson. In any respectable branch of engineering, failure to
| observe such elementary precautions would have long been
| against the law."
| kzhukov wrote:
| Security is not a feature of the language or tool. Neither Rust
| nor C++ is fully secure, even though the former can find
| more memory safety problems at compile time (but not all of
| them).
|
| Security is the process. It contains continuous risk
| assessment, penetration testing, fuzzing and using various
| other tools throughout the product development to eliminate
| attack vectors. Only then can you build a secure product.
| Just rewriting everything in Rust won't make it.
| thesuperbigfrog wrote:
| >> Security is the process.
|
| Yes, but do the programming languages / tools we choose make
| the security process easier?
|
| There is no perfect solution, but some programming languages
| / tools are better than others for preventing unexpected
| behavior that can lead to insecure programs.
| jjav wrote:
| Very concisely and correctly put, agreed.
|
| I sometimes work on the infosec side of the house. It's easy
| to point at vulnerabilities due to endless memory access
| problems in C code and fixate on that. And it's true, so it
| feels satisfying.
|
| Just rewrite everything in not-C and this is fixed! And
| that's true as well. But remember here the word "this" refers
| to the memory access bugs. It doesn't refer to
| vulnerabilities in that sentence.
|
| Plenty of systems contain not a single line of C/C++ code, and we
| have no shortage of ways to break in anyway. So in one sense,
| everything changed. But wearing the black hat, nothing really
| changed since I compromised the system anyway. A new language
| without all the engineering process the parent post describes
| won't magically get there.
|
| For an existing system "just rewrite everything" will
| guarantee way more bugs for years to come simply because the
| old system has been battle-tested and reinforced for years.
| In the long haul the rewrite will converge to a better state
| but that haul is long (and assumes budget will remain in
| place long enough, which might not happen so you end up with
| a half-baked rewrite).
|
| For new systems starting from scratch that may in any way
| ever be run in a security relevant context, sure, starting
| with not-C is a good idea today.
|
| Unfortunately Rust seems to be the only alternative for use
| cases where C was actually needed (if it could be written in
| Java or Go or Python or ... then it didn't really need to be
| in C in the first place). And sadly Rust is a fairly
| user-hostile language, so my guess is plenty of new projects will
| start with C for a long time due to lack of friendly
| alternatives. And tooling.
| RcouF1uZ4gsC wrote:
| > Most obviously in C, we note that a malloc() implementation
| is usually written in C--or rather, in a subset of C that lacks
| malloc() since malloc() is mandated by the C standard.
|
| Not always. Google's tcmalloc is actually written in C++.
|
| I actually would have agreed with a lot of this in 2017. However,
| in the past several years, I think Rust has been a game changer.
| It can interface with the C ABI. It can access the same low-level
| abstract machine that C can. It doesn't have garbage collection
| or virtual machines. And it provides memory safety out of the
| box. In addition, features such as strong types and pattern
| matching help prevent logic bugs as well (like the compiler
| checking that you did not forget one arm of an Enum).
|
| This is borne out now that a lot of security facing software is
| starting to do at least part of their internals in Rust, where
| they had been in C before.
|
| I think Rust is and will be even more in the future a game
| changer in how we write the foundation programs and libraries
| that the computing world is built on.
| dleslie wrote:
| All I really want added to C is Zig's comptime.
| ciarcode wrote:
| Can someone explain to me why we use C for writing code for the
| electronic control unit of a vehicle motor if it is so unsafe? It
| is true that ECUs are programmed with code generated through
| model-based design, but some parts can be programmed manually.
| Maybe this is why they use only a subset of C (MISRA C).
| zokier wrote:
| Is the second example (the auxv stuff in section 5.1) invoking
| undefined behavior in here "at_null->a_type == AT_NULL"? As far
| as I understand, in C you generally cannot pull a valid
| pointer out of thin air like the author is doing there. Isn't that
| the whole idea behind "pointer provenance"?
| xianwen wrote:
| Are there any guides or books that teach people to write safe C
| code? Is writing safe C code possible?
| sigjuice wrote:
| https://wiki.sei.cmu.edu/confluence/display/c
| jvanderbot wrote:
| I've entered a mid-life zen w.r.t. languages. Most important is
| the people and irreplaceable knowledge they have in their heads,
| about the tricks, methods, and environments they've worked in.
| Languages correlate with that, and so are a semi-useful indicator
| of past skills. You want a person to write the bootloader for
| your Mars helicopter? You hire a C programmer most likely, not a
| C# programmer, but who knows? You want a person to bootstrap your
| image processing pipelines for your scientists? Maybe Python?
| Maybe? This area is much more loose.
|
| If a valuable tool or method is expressed in (or even only
| expressible in) a particular language, then so be it. Often,
| there are many more choices than people believe, and what is
| right for the person, so long as it serves the organization or
| need appropriately, is fine by me.
|
| A language is a tool. Most languages do pretty much the same
| thing. Most languages' ecosystems, application adoption, and
| developers are much more important than the languages' seatbelts
| and headlights.
| qznc wrote:
| I disagree that languages do "pretty much the same thing".
| However, I do agree that language choice is overrated. Other
| factors, e.g. which one you know better, weigh heavier.
| jvanderbot wrote:
| OK, over-simplification.
|
| Perhaps: Language in-class variance is small (C# ~= Java,
| C++, C, Rust are close-ish). Cross-class variance is big (JS
| vs Rust). Therefore, recruiting a programmer from the same
| problem / language class is more important than the
| particular language.
| derekp7 wrote:
| What is also important is the ecosystem that comes along with
| the language. Modern languages have a wide ranging list of
| libraries and plugins for them which make certain programming
| tasks easier. But that also means you have an ever shifting
| stack that is required to support them. This became apparent to
| me when trying to get some infrastructure software to play nice
| with some of the older systems we need to keep around. There
| were some nice Python based solutions that I had to reject
| because the dependencies didn't exist for some older RHEL
| installations we have (yes, we still need to keep a handful
| of RHEL 4.x systems around because management doesn't want to
| tell customers "no we won't support you unless you upgrade to
| our latest product that works on newer OS releases"). So for
| backup and management solutions, I have to stick with tools
| that can be easily compiled on the older environments.
|
| Another example, one of our architects at work is a strong
| supporter of Apple, and wanted me to look into Swift. Well at
| the time you could get Swift for Ubuntu, but couldn't for any
| version of RHEL (that has finally changed now though). So
| again, writing my code in plain C was more of a win.
| 0xdeadfeed wrote:
| NASA just sent a rover to Mars using software written in C.
| Meanwhile some Rust fanatics are busy telling everyone how it
| doesn't work.
| dang wrote:
| If curious, past threads:
|
| _Some Were Meant for C (2017) [pdf]_ -
| https://news.ycombinator.com/item?id=19736214 - April 2019 (176
| comments)
|
| _Some Were Meant for C: The Endurance of an Unmanageable
| Language [pdf]_ - https://news.ycombinator.com/item?id=15179188 -
| Sept 2017 (240 comments)
| coliveira wrote:
| I believe the main issue at play here is a paradox in software
| engineering. The paradox is this: safe languages are more useful
| on complex projects than in simple projects, but complex projects
| suffer more from performance degradation and system integration
| issues when these languages are used. Put from the other side, C
| is perfectly fine to write short pieces of code, but despite its
| problems it may be the only reasonable language to write large
| pieces of system software (I'm including C++ here as a "kind of"
| C, just like Objective-C).
| vmchale wrote:
| > safe languages are more useful on complex projects than in
| simple projects, but complex projects suffer more from
| performance degradation
|
| I don't think that's true. C is often slower than C++ because
| of how inlining works, plus in some domains (e.g. compilers) it's
| best to just use a GC language from the get-go.
| iainmerrick wrote:
| _C is often slower than C++ because of how inlining works_
|
| Hang on, you're cutting a few corners there! It's easy to use
| inline functions in C too.
|
| Are you thinking of C++ templates, and the fact that e.g.
| std::sort is faster than qsort because it directly calls an
| inlined comparison function rather than a function pointer?
|
| That's true, although it's possible to achieve similar
| performance levels in C via hand-rolled data structures or
| macro hackery. I'll grant you that the efficient C++ code is
| more idiomatic and likely safer. On the other hand, idiomatic
| C code is likely smaller when compiled, which can be
| important for performance too.
|
| I don't believe "C is often slower than C++" is true in
| general.
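For readers who have not seen the qsort point spelled out, here is a minimal sketch of the two styles in C. `sort_generic` goes through qsort() and a comparison function pointer, while `sort_specialised` is a hand-rolled routine where the comparison is written directly in the loop, so a compiler has nothing to call indirectly. All function names are hypothetical, and insertion sort is used only because it is short; this is not a claim about which version is faster on any given input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparator for qsort(): reached through a function pointer on every
 * comparison unless the compiler manages to inline qsort itself. */
int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);
}

/* Generic route: type-erased elements plus a callback. */
void sort_generic(int *values, size_t count)
{
    if (count == 0)
        return;
    assert(values != NULL);
    qsort(values, count, sizeof values[0], compare_ints);
}

/* Hand-rolled route: an insertion sort specialised for int, with the
 * comparison written inline (similar in spirit to what a C++ template
 * instantiation gives you for free). */
void sort_specialised(int *values, size_t count)
{
    assert(values != NULL || count == 0);
    for (size_t i = 1; i < count; i++) {
        int key = values[i];
        size_t j = i;
        while (j > 0 && values[j - 1] > key) {
            values[j] = values[j - 1];
            j--;
        }
        values[j] = key;
    }
}
```

The "macro hackery" variant mentioned above would generate something like `sort_specialised` for each element type from a single macro definition.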
| jstimpfle wrote:
| std::sort is the poster child that is supposed to
| demonstrate the power of C++ templates. When in fact it is
| awkward to use (as an infrequent user of C++, why can I
| never seem to remember how to wrap / make the comparison
| object?) and more importantly, sort performance is most
| often completely irrelevant to the performance of a
| program.
|
| And when it's not irrelevant, it's almost 100% certain that
| std::sort is not the right thing to use. Where it matters,
| it's probably possible to examine the context a little more
| closely and come up with a custom sort that runs in O(n) or
| at least faster than std::sort.
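One example of the kind of context-specific sort the comment may have in mind is a counting sort, which avoids comparisons entirely when the keys come from a small known range. The sketch below sorts bytes in O(n + 256) time; the function name `counting_sort_bytes` is hypothetical and the choice of byte keys is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sort an array of bytes in place by counting how often each of the
 * 256 possible values occurs, then writing the values back in order. */
void counting_sort_bytes(uint8_t *values, size_t count)
{
    size_t histogram[256] = {0};

    assert(values != NULL || count == 0);

    /* Pass 1: count occurrences of every byte value. */
    for (size_t i = 0; i < count; i++)
        histogram[values[i]]++;

    /* Pass 2: emit each value as many times as it was seen. */
    size_t out = 0;
    for (size_t v = 0; v < 256; v++)
        for (size_t n = 0; n < histogram[v]; n++)
            values[out++] = (uint8_t)v;
}
```

This only works because the key range is tiny and known in advance, which is exactly the sort of knowledge a general-purpose comparison sort cannot use.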
| cozzyd wrote:
| Yeah, the real poster-child for C++ performance should be
| something like Eigen
| bachmeier wrote:
| I remember when this paper came out, I tweeted at the author
| about D's "Better C" mode since he used that very term. It really
| is a better C, because C is almost a subset of D. You get nice
| features like array bounds checking, but no runtime, no garbage
| collector, etc. It's a good choice for those that prefer to stick
| with C but wish there was a 2021 upgrade.
|
| https://dlang.org/spec/betterc.html
| pharke wrote:
| What's the developer experience like in D? I keep looking at it
| from time to time but haven't taken the time to learn it (even
| though I've spent time learning a lot of new languages) I was
| never sure if there was a big enough community around D to make
| it worthwhile but it seems to have a lot of the features I
| want.
| bachmeier wrote:
| Just for clarity, to keep on the topic of the article: If you
| compile a D program with the -betterC flag, you give up many
| features so that your program runs with only the C runtime.
| It's great for a C programmer not wanting to learn a new
| language or for being able to make incremental changes to a C
| codebase while adding things like metaprogramming. If you're
| satisfied with the experience writing C, you'll probably also
| be satisfied writing D and compiling with the -betterC flag.
| As an example, here's a -betterC hello world:
|
| https://run.dlang.io/is/TKOBgA
|
| Once you move on to other goals (using the whole language, as
| is usually the case) there are numerous complaints. Some
| don't think the VS Code plugin is good enough and that sort
| of thing. Some argue that Dub, the package manager, is not
| good enough for their needs. I suppose like every language
| has people that try it and don't like it.
|
| It doesn't take much to try it. You can use the online D
| editor and read the official tutorial. If you like it, you
| can dig in further to see if it has the ecosystem you need.
|
| https://run.dlang.io/
|
| http://ddili.org/ders/d.en/index.html
| ducktective wrote:
| Who are the designers/institution behind D and why have they
| not promoted it as much as newer alternatives?
| bachmeier wrote:
| https://en.wikipedia.org/wiki/Walter_Bright
|
| https://en.wikipedia.org/wiki/Andrei_Alexandrescu
|
| They've promoted it (holding annual conferences and such) but
| they don't have Mozilla or Google behind them, so resources
| are limited.
|
| Edit: And here are some blog posts by Walter Bright about D
| as a Better C.
|
| https://dlang.org/blog/2017/08/23/d-as-a-better-c/
|
| https://dlang.org/blog/2018/02/07/vanquish-forever-these-
| bug...
|
| https://dlang.org/blog/2018/06/11/dasbetterc-converting-
| make...
| voldacar wrote:
| Walter Bright (original creator of D and the D compiler)
| posts here pretty frequently.
| ncmncm wrote:
| The article mentions C++ three times, all in contexts equating it
| with C. But C++, as it is coded today, is a very different
| language from C, and does not suffer from the problems that make
| C an extremely poor choice for starting any new project that
| might matter.
|
| All the article's arguments for C apply substantially more so to
| C++. Thus, the article leaves us with no objectively plausible
| reason ever to code in C, except where artificial constraints
| mandate it, or where merit doesn't matter. (I leave to the reader
| to decide where Linux and BSD kernels fit in that.)
|
| There is especially no excuse for systemd to be coded in C.
| coliveira wrote:
| I disagree. First of all, most C code can be run as C++, so
| everything bad you can say about C applies to C++ as well.
| Second, C++ introduces its own problems that also make
| programming unsafe compared to other managed languages and even
| compared to C.
| qznc wrote:
| You would constrain yourself to a subset of C++ which is not
| C. For example, forbid all raw pointers and only use smart
| pointers. Now your memory is freed automatically via RAII and
| use-after-free errors are much harder to create.
| aromatic_dev wrote:
| Though if one only uses smart pointers they will have to
| accept the performance hit that that brings -- in high
| performance situations, this may be unacceptable.
| coliveira wrote:
| As you said, this is a subset of C++. But everyone debates
| what the right subset should be. In reality C++ is
| incapable of solving the problem, it can only propose good
| practices.
| sweeneyrod wrote:
| C++ is about 17 languages
| qznc wrote:
| C++ still has a lot of rope to hang yourself. For example,
| there is the stuff Rust's borrow checker prevents. Or that
| members are initialized by declaration order and a wrong
| initializer list can lead to undefined behavior. Or all those
| implicit conversion rules (ok, the annoying tendency to convert
| to int is inherited from C).
|
| The most ugly aspect is that C++ has a lock-in effect like
| WhatsApp. As long as your codebase is in C, you can easily
| interface with nearly all other languages. Once important parts
| are in C++ though, only C++ can reasonably interface with it.
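The "convert to int" tendency mentioned above can be seen in plain C with a short sketch. Operands narrower than int are promoted before arithmetic, so the sum of two unsigned char values is computed as an int and only wraps if it is converted back. The function names are hypothetical, and the wrapped value assumes 8-bit chars.

```c
#include <assert.h>
#include <limits.h>

/* Both operands are promoted to int before the addition, so the result
 * does not wrap at the width of unsigned char. */
int sum_as_int(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting the int result back to unsigned char reduces it modulo
 * UCHAR_MAX + 1. */
unsigned char sum_as_uchar(unsigned char a, unsigned char b)
{
    assert(UCHAR_MAX == 255); /* assumption: 8-bit char */
    return (unsigned char)(a + b);
}
```

For inputs 200 and 100, `sum_as_int` gives 300 while `sum_as_uchar` gives 44, even though the arithmetic expression in both functions is identical.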
| ncmncm wrote:
| If "lock-in" is something wrong with C++, it is equally so
| with Rust. But in fact anything you want callable from C, in
| either C++ or Rust, is easy to keep that way.
|
| To the other point, you can write bad code in any language,
| Rust included. But you don't have to. A language can help by
| making good code easier to write than bad code. C fails so
| frequently by making good code much harder to write, instead.
| coliveira wrote:
| The author raises an important issue here: many people are led
| to believe that using C is inherently unsafe. That's not true,
| and many of the most secure systems in the world were written in
| C. The other direction also doesn't work: software written in
| languages like Java can be effectively unsafe.
| pjmlp wrote:
| One of the most secure OSes is ClearPath MCP, with zero lines of C
| in its kernel; it uses NEWP instead.
|
| Azure Sphere, Solaris and latest versions of iOS all rely on
| some variation of hardware memory tagging to tame C exploits.
| bitwize wrote:
| Once again.
|
| We know from 40 years of discovering memory-related
| vulnerabilities in even the most carefully written, rigorously
| tested C programs that writing safe C is intractable for real,
| human software engineers. So yes, C IS INHERENTLY UNSAFE. If
| you claim otherwise you clearly haven't been paying attention
| to what's going on.
| FpUser wrote:
| >"...I use because I'm stuck with it; I use it for positive
| reasons."
|
| I absolutely agree with that statement. When I do firmware for
| small MCUs I feel a big fat zero need for any other language.
___________________________________________________________________
(page generated 2021-03-01 23:01 UTC)