[HN Gopher] Essential C [pdf]
___________________________________________________________________
Essential C [pdf]
Author : graderjs
Score : 87 points
Date : 2022-03-05 10:08 UTC (1 days ago)
(HTM) web link (cslibrary.stanford.edu)
(TXT) w3m dump (cslibrary.stanford.edu)
| sylware wrote:
| Could you tell the linux kernel devs to stop pouring tons of
| extensions and non C89 (with benign bits of c99/c11) code
| everywhere?
|
| For instance in the whole network stack, some "agents of the
| planned obsolescence" did not use constant expressions in
| switch/case statements or initializers. Also tell those who are
| using that toxic c11 "_Generic" keyword to use explicit code
| instead. I think, the main issue for C now is the standard body
| trying to obsolete stable in time C code and forcing upon
| devs/integrators the usage of the only few compilers which are
| able to follow fast enough their tantrums (latest gcc/latest
| clang). Results: all attempts to provide "working" alternative C
| compilers are beyond reasonable. This is becoming so acute, it is
| more likely to be a worldwide scam than anything else.
|
| Yep, open source software is not enough anymore, it needs to be
| lean, simple but able to do reasonably "the job", and ofc very
| stable in time hence C89 only with benign bits of c99/c11, or
| "standard assembly".
| properparity wrote:
| Standard C is not suitable for writing kernels anymore due to
| all the ridiculous hostile interpretations of the standard
| (mainly type-based aliasing rules).
|
| I mean you can't even write an allocator in standard C.
| shadowofneptune wrote:
| Then in this case the portability offered by C is false.
| Could be an equally valid approach to place the required
| behavior into separate modules written with an assembler.
| woodruffw wrote:
| > I mean you can't even write an allocator in standard C.
|
| I might be missing something, but I don't believe this is
| true? The type aliasing rules have an explicit carve-out for
| `char *` aliases of other types, which is intended to solve
| this exact issue.
|
| Similarly, C99 and C11 both allow aliasing through a union of
| pointers without violating the strict aliasing rule.
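A minimal sketch of the character-type carve-out mentioned above, assuming a hypothetical helper named `copy_bytes` (not from the thread). It reads and writes arbitrary objects one byte at a time through `unsigned char` lvalues, which the aliasing rules allow for any object type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes from src to dst.
 * Every access goes through an unsigned char lvalue, which may
 * alias an object of any type. The regions must not overlap. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}
```

Calling `copy_bytes(&b, &a, sizeof a)` on two `double` objects leaves `b` holding the same value as `a`. Whether this carve-out is enough to write a whole allocator is the point the thread disagrees on; the sketch only shows the byte-access part.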
| monocasa wrote:
| Both sides are true IMO. The kernel isn't capable of being
| written totally in standards compliant C, _and_ it'd
| probably be better off if that fact wasn't used as carte
| blanche to write non standards compliant code where it
| doesn't provide a tangible benefit.
| jstimpfle wrote:
| Why can't you write an allocator???
| monocasa wrote:
| Casting pointers to different object types and then
| dereferencing them is undefined behavior.
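As an illustration of the cast-and-dereference problem described above, here is a small sketch using a hypothetical function `load_u32` (the name is an assumption). It reads a 32-bit value out of a byte buffer with `memcpy` instead of casting the pointer, which avoids both the aliasing violation and any alignment requirement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint32_t stored at an arbitrary
 * byte address, in the host's native byte order.
 * The form to avoid would be: *(const uint32_t *)bytes */
uint32_t load_u32(const unsigned char *bytes)
{
    uint32_t value;

    assert(bytes != NULL);
    memcpy(&value, bytes, sizeof value); /* copies the object representation */
    return value;
}
```

Compilers commonly turn a fixed-size `memcpy` like this into a single load, though that is an optimisation and not something the standard promises.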
| rurban wrote:
| well, alternative compilers can excel in not providing broken
| optimizations based on UB. -Oboring is a thing in the kernel.
| chibicc e.g
| ChuckMcM wrote:
| I have often wondered if a C89 standard with no UB would be
| the definitive go to language for systems. I mean you COULD
| define things if you chose to, right?
| Ar-Curunir wrote:
| How do you eliminate ub without memory safety?!
| isomel wrote:
| For example, as mentioned in another comment, by making
| all reads and writes `volatile`. That way, dangling pointers
| and out of bounds accesses are "defined" to be memory corruption
| or a crash, and not the compiler optimizing the code in a
| way that the programmer did not anticipate.
| znpy wrote:
| Wouldn't that bring a performance penalty though?
| MaxBarraclough wrote:
| _Some_ kinds of UB can easily be handled that way. GCC
| offers a flag to guarantee wrapping on signed arithmetic
| overflow, for instance. [0] It wouldn't be easy to nicely
| handle _every_ instance of UB though. The current state-of-
| the-art for that kind of thing would be something like
| Valgrind, an extremely 'intrusive' runtime system.
|
| This is because of the particulars of the C language. For
| instance, the way it handles arrays and pointer arithmetic
| makes it difficult to detect every instance of out-of-bounds
| access.
|
| [0] _-fwrapv_ , see
| https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
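The flag above changes what signed overflow means for the whole translation unit. For code that cannot rely on it, one portable option is to test before adding. The sketch below assumes a hypothetical helper `checked_add`; it is one way to sidestep this particular UB, not what the commenter proposed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store a + b in *out and return true,
 * or return false (leaving *out untouched) when the sum would
 * not fit in an int. The overflowing addition is never performed. */
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);

    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;

    *out = a + b;
    return true;
}
```

`checked_add(INT_MAX, 1, &r)` returns false, while `checked_add(1, 2, &r)` returns true and stores 3.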
| ChuckMcM wrote:
| I think we're talking about different things, but maybe
| not.
|
| Using out of bounds references in arrays and pointer
| arithmetic is _unsafe_ but it can be _defined._ The
| definition could be: "array accesses using indexes that are
| outside the range of 0 - the declared array size will
| access memory in the data segment. The compiler will
| treat that access as if the addressed memory had the type
| specified for the array." The behavior is defined, it is
| unsafe, but it is defined. Not like "the compiler may or
| may not choose to optimize out loads and stores, or re-
| order them." which especially on embedded systems
| _creates_ bugs where the things like "read the status
| register THEN read the data register (which clears the
| status register when read)" those get re-ordered and
| suddenly your loop never exits because your status bit is
| never set. That kind of UB needs to die in a fire IMHO.
| Thiez wrote:
| > which especially on embedded systems creates bugs where
| the things like "read the status register THEN read the
| data register (which clears the status register when
| read)" those get re-ordered and suddenly your loop never
| exits because your status bit is never set.
|
| Why are the embedded systems programmers not using
| `volatile`, which exists for this exact reason?
| compiler-guy wrote:
| "volatile" is not a synchronization primitive and doesn't
| forbid reordering amongst access to different memory
| locations.
|
| All it does is forbid reordering or removing accesses to
| a particular memory location.
|
| Historically, many compilers implemented it as a hard
| memory barrier, but that isn't how the standard defines
| it.
| Thiez wrote:
| Volatile operations are considered a side effect and as
| such may not be reordered with other volatile operations.
| How is that not literally sufficient for the operations
| that GGP described?
| compiler-guy wrote:
| I spoke perhaps a little strongly. But that said:
|
| The compiler is free to reorder accesses to multiple
| volatile variables if they happen prior to the same
| sequence point. So (roughly) expressions involving two
| volatile variables do not have those accesses sequenced.
|
| But you are right that the usage described earlier is OK,
| as long as both variables are marked volatile, and the
| accesses straddle a sequence point.
|
| The more common mistake with volatiles is to use them for
| multithreading primitives.
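The status-then-data pattern discussed in this subthread can be sketched as follows. The two "registers" are ordinary `volatile` globals so the example runs on a host machine; the names `status_reg`, `data_reg`, `STATUS_READY` and `try_read_byte` are all hypothetical. Each volatile read sits in its own statement, so the two accesses are sequenced one after the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_READY 0x01u

/* Stand-ins for memory-mapped device registers (assumption:
 * real hardware would expose these at fixed addresses). */
volatile uint8_t status_reg;
volatile uint8_t data_reg;

/* Returns 1 and stores the byte in *out when the device is ready,
 * otherwise returns 0 without touching the data register. */
int try_read_byte(uint8_t *out)
{
    assert(out != NULL);

    uint8_t status = status_reg;       /* first volatile access */
    if ((status & STATUS_READY) == 0u)
        return 0;

    *out = data_reg;                   /* second volatile access, in a later statement */
    return 1;
}
```

On real hardware the data read would clear the status bit; the host stand-in does not model that. As noted above, none of this makes `volatile` suitable for communication between threads.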
| pjmlp wrote:
| Out of bounds is only defined for length + 1, for
| anything else anything goes.
| MaxBarraclough wrote:
| That doesn't sound right. C (and C++) permit you to do
| pointer arithmetic to derive a pointer value pointing to
| one element beyond the final element of an array. This
| special treatment doesn't extend to dereferencing though
| - if you dereference that pointer, that's UB.
|
| If you do pointer arithmetic to derive a pointer value
| pointing 2 or more elements beyond the final element of
| an array, that's undefined behaviour, even if you never
| dereference that pointer.
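A short sketch of the one-past-the-end rule described above, using a hypothetical function `sum_ints`. The pointer `end` is formed one element past the last element and is only ever compared against, never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum the first count elements of values. */
long sum_ints(const int *values, size_t count)
{
    assert(values != NULL);

    const int *end = values + count;   /* one past the last element: valid to form */
    long total = 0;

    for (const int *p = values; p != end; ++p)
        total += *p;                   /* p is always strictly before end here */

    return total;
}
```

Writing `values + count + 1` would already be undefined, even if the result were never used to access memory.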
| pjmlp wrote:
| Sure, I just didn't want to spend too much time
| explaining all details, my failure.
| MaxBarraclough wrote:
| > "[...] The compiler will treat that access as if the
| addressed memory had the type specified for the array."
| The behavior is defined, it is unsafe, but it is defined.
|
| I believe performance (compiler optimisation) is the
| reason the language isn't defined that way. Permitting
| the compiler to assume that the runtime error will never
| arise, opens the door to all sorts of optimisations. (At
| least, that's the idea.)
|
| C permits the 'union trick' to (roughly speaking) access
| the bit-pattern of a value as another type, which is to
| say an escape-hatch is offered in the language.
|
| Similarly the strict aliasing rule is surprising to
| people who are new to C, but the C standard committee
| seem to be committed to keeping it, presumably for
| performance reasons.
|
| > Not like "the compiler may or may not choose to
| optimize out loads and stores, or re-order them." which
| especially on embedded systems creates bugs
|
| Right, but it's defined that way to enable compiler
| optimizations, not to spite the programmer. As others
| have mentioned, C has features like _volatile_
| specifically to address this kind of thing. If the C
| standard required memory fences to be inserted
| _everywhere_ , performance would be ruined.
|
| > That kind of UB needs to die in a fire IMHO.
|
| C cannot easily be made into a safe language, and I think
| the committee is doing the right thing in declining to
| try to make C into something it isn't. On the plus side,
| there are plenty of other languages around, many of them
| with compelling advantages over C. Ada, Rust, and Zig,
| for instance.
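The 'union trick' mentioned above can be sketched like this. The union `float_word` and the function `sign_bit` are hypothetical names, and the example assumes a 32-bit `float` in IEEE 754 format, where the sign is the top bit.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* Hypothetical union used to view a float's bits as an integer. */
union float_word {
    float    f;
    uint32_t u;
};

/* Returns 1 when the sign bit of value is set, else 0
 * (assumes IEEE 754 single precision). */
int sign_bit(float value)
{
    union float_word w;

    w.f = value;               /* write one member ... */
    return (int)(w.u >> 31);   /* ... read the bits back through the other */
}
```

This is permitted in C; C++ treats reading the inactive member differently, so the sketch should not be carried over unchanged.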
| tialaramex wrote:
| It's downright _perverse_ to insist that you should be
| able to assume all accesses are volatile reads / stores
| just because that was momentarily convenient.
|
| C lacks the intrinsics you'd actually want for this
| (explicit load/ store of various machine sizes) but it
| does provide the "volatile" storage qualifier which is
| what you should be using to do what you apparently wanted
| in C today. It makes no sense to demand everybody else
| writes some sort of "not-volatile" qualifier in front of
| every variable to tell the compiler that actually this is
| just a variable and it's OK to optimise.
| eqvinox wrote:
| > "Could you tell the linux kernel devs to stop [...]"
|
| As you seem to care about this, have you made this argument on
| LKML yourself? What Linux kernel related work are you active on
| that drives your position?
| froh wrote:
| you did follow the decision to explicitly move to C11, did you?
|
| https://lwn.net/SubscriberLink/885941/01fdc39df2ecc25f/
| https://news.ycombinator.com/item?id=30459634
| unwind wrote:
| How is _Generic toxic?
| voldacar wrote:
| it's ugly and doesn't feel like a first class citizen of the
| language
| CyberDildonics wrote:
| This is a vague and emotional argument to a technical
| question.
| kevin_thibedeau wrote:
| It has some annoying misfeatures. Closely related types
| aren't treated as the same and you can't put multiple related
| types in a _Generic association list. You're forced to resort
| to explicit casts or call the type specific function directly
| which subverts the intended magic of using _Generic in the
| first place.
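One reading of the complaint above is that every distinct type needs its own association, even when the types are closely related. The sketch below uses a hypothetical macro `TYPE_NAME` and a demo function `type_name_demo`; the assertions record which association each expression selects.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro: map an expression's type to a label.
 * char, signed char and unsigned char are three distinct types,
 * so each one needs a separate entry. */
#define TYPE_NAME(x) _Generic((x),        \
    char:          "char",                \
    signed char:   "signed char",         \
    unsigned char: "unsigned char",       \
    int:           "int",                 \
    long:          "long",                \
    default:       "other")

void type_name_demo(void)
{
    char c = 'x';
    signed char sc = 1;
    unsigned char uc = 1;

    assert(strcmp(TYPE_NAME(c), "char") == 0);
    assert(strcmp(TYPE_NAME(sc), "signed char") == 0);
    assert(strcmp(TYPE_NAME(uc), "unsigned char") == 0);
    assert(strcmp(TYPE_NAME('x'), "int") == 0);   /* character constants have type int in C */
    assert(strcmp(TYPE_NAME(1L), "long") == 0);
    assert(strcmp(TYPE_NAME(1.0), "other") == 0); /* double falls through to default */
}
```

Listing two associations whose types are compatible (for example a typedef and the type it names) is a constraint violation, which is part of why such lists get awkward to write portably.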
| eqvinox wrote:
| Some of this was fixed by DR 481
| [http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2396.htm#dr...],
| can you give some specific cases/examples at issue?
| [deleted]
| emmelaich wrote:
| > _char ASCII character -- at least 8 bits. Pronounced "car"._
|
| Never heard it pronounced _car_!
| huqedato wrote:
| Thank you! Most useful C book I ever got.
___________________________________________________________________
(page generated 2022-03-06 23:00 UTC)