[HN Gopher] Null References: The Billion Dollar Mistake (2009) [...
___________________________________________________________________
Null References: The Billion Dollar Mistake (2009) [video]
Author : colinprince
Score : 28 points
Date : 2023-02-28 18:03 UTC (4 hours ago)
(HTM) web link (www.infoq.com)
(TXT) w3m dump (www.infoq.com)
| [deleted]
| brundolf wrote:
| (2009)
| dang wrote:
| Added. Thanks!
| WalterBright wrote:
| Nah. The billion dollar mistake is actually C arrays decaying to
| pointers, enabling buffer overflows, the #1 cause of bugs and
| malware injection in shipped C programs.
|
| https://www.digitalmars.com/articles/C-biggest-mistake.html
|
| It's simple to fix this in C, too.
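One way to keep the length attached to the data is a "fat pointer": a pointer and a length passed together. The following is a minimal sketch of that idea in standard C99, not the syntax proposed in the linked article; `int_slice`, `SLICE_OF`, `slice_get`, and `slice_set` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: the length travels with the pointer. */
typedef struct {
    int   *ptr;
    size_t len;
} int_slice;

/* Build a slice from a real array while its length is still known.
   Only valid on arrays, not on pointers. */
#define SLICE_OF(arr) ((int_slice){ (arr), sizeof (arr) / sizeof (arr)[0] })

/* Checked read: returns false instead of reading out of bounds. */
bool slice_get(int_slice s, size_t i, int *out)
{
    if (i >= s.len)
        return false;
    *out = s.ptr[i];
    return true;
}

/* Checked write: returns false instead of writing out of bounds. */
bool slice_set(int_slice s, size_t i, int value)
{
    if (i >= s.len)
        return false;
    s.ptr[i] = value;
    return true;
}
```

A caller would write `int_slice s = SLICE_OF(a);` once and pass `s` instead of the separate `a, SIZE` pair, so the callee can no longer be handed a mismatched length.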
| marcodiego wrote:
| Genuinely curious: why not suggest that to WG14?
|
| (WG14 is the ISO working group that maintains the C
| specification.)
| Arch-TK wrote:
| And what would it achieve to "fix" this (without re-designing
| the rest of the language)?
|
| Every piece of code such as:
|
|     int a[SIZE];
|     foo(a, SIZE);
|
| Would have to be rewritten to:
|
|     int a[SIZE];
|     foo(&a[0], SIZE);
|
| And this additional noise would just make C harder to write
| for no reason. Rather than making C harder to write, just
| pick a different programming language.
| WalterBright wrote:
| The article I referenced says how this is fixed. It is not
| harder to write at all.
| Arch-TK wrote:
| Nothing is decaying. It is an implicit conversion. To say
| something "decays" implies something else is lost, the array is
| still there.
| WalterBright wrote:
| Something is lost - the array length. "Decays" is the correct
| word.
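The loss of the length can be observed directly with `sizeof`. This is a small sketch; the function names are made up for illustration, and the array size of 16 is arbitrary.

```c
#include <stddef.h>

/* The parameter is a plain pointer, so sizeof reports the pointer's size,
   not the size of whatever array the caller passed. */
size_t bytes_seen_by_callee(const int *p)
{
    return sizeof p;
}

/* Returns the array's size as seen in the scope that declares it, and
   stores what the callee sees in *callee_bytes. */
size_t bytes_seen_by_caller(size_t *callee_bytes)
{
    int a[16] = {0};
    *callee_bytes = bytes_seen_by_callee(a); /* `a` converts to &a[0] here */
    return sizeof a;                         /* 16 * sizeof(int) */
}
```

The array itself is untouched by the call, but the callee has no way to recover the number 16 from the pointer it receives.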
| Arch-TK wrote:
| * * *
| slaymaker1907 wrote:
| I don't think it's quite so simple to fix since you still need
| to decide when to actually check the size vs. when it is safe
| to omit. While branch predictors are great, you can end up
| having so many potential branches that it starts to lose
| effectiveness. In practice, I rarely see anything besides
| strings that lack the size variable, it's just that the bounds
| may not be checked.
|
| Null terminated strings were a horrible mistake though and
| really should have been fat pointers.
| nicoburns wrote:
| They're probably BOTH quite literally billion-dollar mistakes.
| WalterBright wrote:
| I remember the bad old DOS days where a null pointer write
| would scramble DOS. I absolutely hated that, and it cost me a
| _lot_ of extra work. Enter protected mode programming. It was
| a miracle! Null pointer writes now meant a seg fault with a
| traceback, and voila! A few minutes of fix and I'm on my
| way. I immediately switched all my dev work to a protected
| mode system, and ported to real mode only as a last step.
|
| Buffer overflows are the primary entry point for malware. Seg
| faults are not. Hence the former are _far_ more costly.
| unxdfa wrote:
| I can confirm I have fucked up on numerous occasions which
| resulted in NPEs which caused business processes to fail
| spectacularly and write off millions of dollars instantly.
| Fortunately _in every case_ it was possible to recover,
| replay or repair the data so the only real cost was a little
| bit of reputation and time and money. But that adds up across
| tens of thousands of engineers, probably to billions by now
| globally.
|
| I learned a lot from this and discovered that the real issue
| is that a process can fail spectacularly and do any damage at
| all. There are so many other concerns other than NPEs which
| need to be considered.
| xjay wrote:
| Not to forget that classic increment of a signed integer
| waiting to overflow and trigger an exception on some critical
| (unpatched) hard disk drive controller out there...
|
|     ++countdown;
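In C, overflowing a signed integer is undefined behavior, so the check has to happen before the increment. A minimal sketch, with `checked_increment` as a hypothetical helper name:

```c
#include <limits.h>
#include <stdbool.h>

/* Increment *counter only when that cannot overflow.
   Returns false and leaves *counter unchanged at INT_MAX. */
bool checked_increment(int *counter)
{
    if (*counter == INT_MAX)
        return false;
    ++*counter;
    return true;
}
```

What the caller does on `false` (saturate, reset, report) is a design decision this sketch leaves open.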
| live_video wrote:
| the choice of von neumann architecture over harvard
| architecture enabled this
| dpkirchner wrote:
| We could go further: the billion dollar mistake was allowing
| values intended to be used as data to be executed (pre-NX bit).
| Zero-terminated C strings is up there as well.
| renox wrote:
| > Zero-terminated C strings is up there as well.
|
| I disagree here. Remember that at the time some length-prefixed
| string implementations used only one or two bytes for the
| length, which can create lots of issues that zero-terminated
| strings don't have.
|
| Of course nowadays zero terminated strings don't make sense
| anymore.
| GuB-42 wrote:
| The code vs data distinction is a hardware thing, this is not
| about C. More precisely, it is the difference between a
| Harvard and a Von Neumann architecture. A Harvard
| architecture has completely separate paths between
| instructions and data: different buses, different memories. A
| Von Neumann architecture has common instruction and data
| paths, and therefore, naturally, data is executable. You can
| write C code for both.
|
| Modern PC-style hardware is kind of a hybrid, acts like
| Harvard with regard to cache, and like Von Neumann with
| regard to RAM. Furthermore, it has a fancy MMU that allows
| for things like the NX bit. All C compilers/linkers I am
| aware of know the difference between code and data and are
| able to put each one in the appropriate section, what is done
| after that is the OS/hardware responsibility.
|
| As for zero-terminated strings, I also think it is mostly a
| mistake, though it does have a few advantages. You can still
| work with size+pointer though, using mem- instead of the str-
| functions, and "%.*s" in printf(), not ideal though.
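The size+pointer style with `mem*` functions and `"%.*s"` might look like the sketch below. `str_view`, `sv_equal`, and `sv_format` are hypothetical names; only `memcmp` and `snprintf` come from the standard library. The sketch assumes `ptr` is never NULL and that `len` fits in an `int`.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pointer+length string view; the bytes it refers to
   need not be NUL-terminated. */
typedef struct {
    const char *ptr;
    size_t      len;
} str_view;

/* Compare two views with memcmp instead of strcmp. */
bool sv_equal(str_view a, str_view b)
{
    return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}

/* Format a view into buf using "%.*s"; the precision argument
   must be passed as an int. Returns what snprintf returns. */
int sv_format(char *buf, size_t bufsize, str_view s)
{
    return snprintf(buf, bufsize, "[%.*s]", (int)s.len, s.ptr);
}
```

A view such as `{ text + 6, 5 }` can then name a substring of a larger buffer without copying it or writing a terminator into it.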
| Mindless2112 wrote:
| Absolutely. Null pointers don't even make the list of things
| that I worry about when writing C code.
| LanceH wrote:
| Which would all be a rounding error if C gets to take credit for
| everything produced using it.
| asguy wrote:
| This is what C/C++ haters, anti x86/amd64 snobs, and Rust
| elitists always forget: what works, works. Sure, it could be
| a local maximum, hopefully there will be better, but who
| knows?
|
| To quote Sean Connery in The Rock:
|
| Losers always whine about their best. Winners go home and
| fuck the prom queen!
|
| https://m.youtube.com/watch?v=gXDSxgDUv-c
| Oxidation wrote:
| It doesn't even have to be a local maximum, just higher up
| some hillside will do (maybe it's even on a different
| hillside to the one you're on at the moment).
| pjmlp wrote:
| Some of us are old enough to be coding when C was only
| relevant for university departments privileged to have UNIX
| boxes.
|
| So we know there are other ways, we used systems with zero
| lines of C in them.
|
| The prom queen came naked offering herself to everyone and
| the party was done for the other folks.
| asguy wrote:
| That excuse sounds like the guy in highschool who would
| have treated the girl so much better, but she was into
| assholes.
|
| If you are old enough to remember those days, then you
| remember COBOL, Algol, Fortran, Pascal, BASIC, Ada,
| Oberon, Lisp/Scheme, Forth, O'Caml etc. They're all great
| languages, some still have their uses. There's a reason
| all of the major operating systems have cores written in
| C/C++. It's entirely because they're pragmatic and
| "work", and not some conspiracy.
|
| Edit: although now that I write it, what if C/C++ was
| planted on earth by an alien intelligence in order to
| slow down the development of the human race.
| pjmlp wrote:
| The power of free beer with source tapes is very mighty.
|
| Thankfully governments have finally started paying
| attention regarding software liability.
| WalterBright wrote:
| > Thankfully governments have finally started paying
| attention regarding software liability.
|
| And we'll see a great slowdown in the software industry.
| pjmlp wrote:
| When people buy damaged goods, they ask for a refund,
| they don't expect to close and reopen the box and have
| the product reappear in perfect shape.
|
| The industry has miseducated them, and now it is finally
| happening, software products aren't a special snowflake.
|
| Digital stores with returns, consulting contracts with
| warranty clauses with fixes at the expense of provider,
| and naturally cyber security bills.
|
| Move fast and break things only works due to lack of
| liability.
| renox wrote:
| While I agree with you, it's still infuriating that these
| (serious) flaws in C apparently can't be fixed/evolved inside
| the language and that you have to use a different language
| instead.
|
| That's throwing the baby out with the bathwater :-(
| munchler wrote:
| That's a problem that's pretty much limited to C. Null pointers
| have infected many other languages as well.
| mjevans wrote:
| I think golang's slices are a better solution than the linked
| article.
|
| https://go.dev/ref/spec#Slice_types
|
| Slices can still be nil (null), but it isn't an unsafe memory
| access operation, just another type of potentially useful or
| potentially errant invocation to handle.
| kristoff_it wrote:
| Go's slices are more akin to ArrayLists / Vectors in other
| languages, since they also manage the underlying buffer.
| josephg wrote:
| Zig and Rust also support array slices. But they can't be
| null, because that's - as the article says - a mistake.
|
| https://ziglang.org/documentation/master/#Slices
|
| https://doc.rust-lang.org/book/ch04-03-slices.html
| WalterBright wrote:
| D slices can be null, but they're not a mistake, as the
| runtime will not let you read/write a 0 length array.
| pjmlp wrote:
| Other languages avoid the mistake by preventing direct
| access unless either there is a null check or they are
| declared as non nullable.
| convolvatron wrote:
| I think this is actually an artifact of the 'check the single
| return value for errors' that requires that the return domain
| contain at least one error code point.
|
| How else you might structure errors without changing much of
| the rest of C is left as an exercise.
| amenghra wrote:
| just require the programmer to check errno after every
| function call? /s
| Gibbon1 wrote:
| Strange but true you can create functions where you pass a
| pointer to an error variable.
|
|     error_t oops;
|     int x = foo(10, &oops);
|     if(oops) goto whoops;
|
|     int x = foo(10, 0); // yolo!
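A callee that supports both call styles has to tolerate a null error pointer. The comment does not show `foo`, so the sketch below substitutes a hypothetical `hundred_over` function and a hypothetical `err_code` type, assuming 0 means success.

```c
#include <stddef.h>

typedef int err_code;   /* hypothetical; 0 means success (assumption) */
enum { ERR_OK = 0, ERR_DIV_BY_ZERO = 1 };

/* Divide 100 by d. Failure is reported through *err when err is
   non-NULL; passing NULL means the caller chose not to look. */
int hundred_over(int d, err_code *err)
{
    if (err)
        *err = ERR_OK;
    if (d == 0) {
        if (err)
            *err = ERR_DIV_BY_ZERO;
        return 0;       /* placeholder value; meaningless on error */
    }
    return 100 / d;
}
```

The return domain stays free of error codes, at the price that nothing forces the caller to pass or inspect the error variable.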
| convolvatron wrote:
| omg. I forgot about that. the one thing even worse than
| overloading the return domain.
| revskill wrote:
| The compiler should check for null reference before dereferencing
| here.
| josephg wrote:
| Why? So the program can crash? It'll usually crash anyway when
| you read from the first memory page.
| Arch-TK wrote:
| At the cost of basically all performance or incredible compiler
| complexity.
|
| A better solution: Implement a different language with a better
| type system instead. Or pick one of the hundreds that already
| exist and can represent the concept of a tagged union without
| having to implement it manually.
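For reference, a tagged union implemented manually in C looks roughly like this. It is a sketch with hypothetical names (`int_result`, `parse_digit`); nothing in the language stops a caller from reading the wrong member, which is the point being made about doing it by hand.

```c
/* Tag that says which union member is currently valid. */
typedef enum { RES_OK, RES_ERR } res_tag;

typedef struct {
    res_tag tag;
    union {
        int         value;  /* valid only when tag == RES_OK  */
        const char *error;  /* valid only when tag == RES_ERR */
    } as;
} int_result;

/* Convert one decimal digit character to its numeric value. */
int_result parse_digit(char c)
{
    int_result r;
    if (c >= '0' && c <= '9') {
        r.tag = RES_OK;
        r.as.value = c - '0';
    } else {
        r.tag = RES_ERR;
        r.as.error = "not a digit";
    }
    return r;
}
```

Callers are expected to test `r.tag` before touching `r.as`, but that discipline is purely a convention here, whereas languages with built-in sum types have the compiler enforce it.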
| pjmlp wrote:
| As proven by languages like Eiffel or Kotlin, it is quite
| alright.
| kaba0 wrote:
| I am absolutely in favor of languages that exclude nulls from
| their type systems, but your first point is a very simplistic
| take.
|
| Null pointer checks are _very, very_ cheap, no additional
| memory fetch since the pointer value is needed either way,
| easy to predict the slow path _and_ if we are talking about
| languages that can deoptimize, then it is literally free
| (checked by hardware either way) -> a null value will cause
| a segfault, which will deoptimize the code on the slow path
| and continue from there. So it is not a problem from a
| performance point of view.
| pjmlp wrote:
| Besides the approach taken by some FP languages, others like
| Eiffel already fixed it in the 1990s, while having nullable
| types, by making it a compiler error to access reference types
| without checking for null, unless the types are declared as
| non-nullable.
|
| So it just took some time to get mainstream.
| mkoubaa wrote:
| I have a hard time taking this seriously. The alternative to null
| pointers isn't no null pointers, it's programmers hand rolling
| their own conventions around optional pointers, and it would have
| been a nightmare (at least in C)
| petilon wrote:
| If null references are a mistake, isn't initializing an array
| index variable to -1 a mistake too?
| itronitron wrote:
| Yes, I suppose null references are only a problem if you are
| unable to write a conditional statement.
| josephg wrote:
| The problem with null references is you _have to_ write those
| conditional statements everywhere, otherwise your program
| might crash. You generally know as the programmer which
| pointers you expect to be nullable and which should always be
| an object. But the compiler has no idea, so it can't help
| make sure you have null checks in all the places you need
| them.
| dang wrote:
| Related:
|
| _Tony Hoare's Null References: The Billion Dollar Mistake_ -
| https://news.ycombinator.com/item?id=30719472 - March 2022 (13
| comments)
|
| _Null References: The Billion Dollar Mistake_ -
| https://news.ycombinator.com/item?id=22019627 - Jan 2020 (150
| comments)
|
| _Null References: The Billion Dollar Mistake - Tony Hoare (2009)
| [video]_ - https://news.ycombinator.com/item?id=11798518 - May
| 2016 (79 comments)
|
| _Tony Hoare / Historically Bad Ideas: "Null References: The
| Billion Dollar Mistake"_ -
| https://news.ycombinator.com/item?id=473158 - Feb 2009 (2
| comments)
| vippy wrote:
| Big facts. Write Scala, stay frosty.
| dang wrote:
| Could you please stop posting unsubstantive comments? You've
| unfortunately done that repeatedly, and we're trying for
| something else here.
|
| Fortunately you've also posted good comments, so this should be
| easy to fix.
|
| If you wouldn't mind reviewing
| https://news.ycombinator.com/newsguidelines.html and taking the
| intended spirit of the site more to heart, we'd be grateful.
___________________________________________________________________
(page generated 2023-02-28 23:01 UTC)