[HN Gopher] Lesser known tricks, quirks and features of C
___________________________________________________________________
Lesser known tricks, quirks and features of C
Author : jandeboevrie
Score : 218 points
Date : 2023-02-19 07:27 UTC (15 hours ago)
(HTM) web link (blog.joren.ga)
(TXT) w3m dump (blog.joren.ga)
| vocram wrote:
| Yet another proof that C is simple but not easy.
| SAI_Peregrinus wrote:
| The simplest languages tend to be the most difficult.
| Brainfuck, Binary Lambda Calculus, Unlambda, and other "Turing
| tarpits" are all extremely difficult to use for anything even
| mildly complex.
| elcritch wrote:
| Forth. It's easy for a small cute script. But quickly becomes
| a PITA.
| mtlmtlmtlmtl wrote:
| I always felt like unlambda and other SKI-calculus-esque
| esolangs (iota comes to mind) could have some kinda strange
| use case in some kind of generalised genetic programming. It
| should be possible to create a binary notation for SKI
| calculus where arbitrary bitstrings will be valid, and so one
| could randomly mutate and recombine arbitrary programs.
| Though I've never delved deeper into genetic algorithms and
| evolutionary programming, my sense is that genetic algorithms
| tend to be restricted to parameterised algorithms where the
| "genes" determine the various parameters. Which can be great
| for optimisation problems.
|
| It's one of those weird ideas I've had kicking about for
| years but never did anything about, and yet I keep coming
| back to it.
| alcover wrote:
| This must be an avenue for very exciting explorations. I'm
| quite ignorant about this stuff but have some questions :
| > It should be possible to create a binary notation for SKI
| calculus where arbitrary bitstrings will be valid
|
| What if it's not ? How will your genetic petri dish spot
| and eliminate invalid programs ?
|
| > one could randomly mutate and recombine arbitrary programs
|
| What if non-halting programs get generated ?
|
| In this vein I've seen magnificent images of 1D cellular
| automatons that use the surrounding pattern to decide on
| the local rule for next gen.
| mtlmtlmtlmtl wrote:
| I assume that an invalid program would not compile/parse,
| and so would die and fail to reproduce. The issue is more
| that if the space of invalid programs is too large
| compared to the space of valid ones, generating valid
| offspring by combining two programs would be too rare and
| the population would die off.
|
| Though if the space is small enough I imagine you could
| get past that. It's a bit of a gnarly point, hard to tell
| how this would turn out without trying I suppose.
|
| As for the halting problem there's of course no clever
| solution there other than limiting CPU time. So I guess
| pick a reasonable limit that makes sense for whatever
| you're trying to do.
| kevin_thibedeau wrote:
| > The 0 width field tells that the following bit fields should be
| set on the next atomic entity (char).
|
| This isn't correct since int can't be less than 16-bits. Fields
| are placed on the nearest natural alignment for the target
| platform, which might not support unaligned access.
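| A minimal sketch of what the zero-width field does, assuming a
| typical GCC/Clang target. The struct and function names are made up
| for illustration, and the exact layout is implementation-defined.
| The unnamed `: 0` member closes the current storage unit, so `b`
| starts in a fresh one:

```c
#include <stddef.h>

/* Both fields may share one storage unit. */
struct packed {
    unsigned int a : 4;
    unsigned int b : 4;
};

/* The unnamed zero-width field forbids packing `b` next to `a`. */
struct split {
    unsigned int a : 4;
    unsigned int   : 0;
    unsigned int b : 4;
};

size_t packed_size(void) { return sizeof(struct packed); }
size_t split_size(void)  { return sizeof(struct split); }
```

| On common x86-64 compilers the two sizes are typically 4 and 8
| bytes, but the standard does not guarantee either figure.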
| [deleted]
| Jorengarenar wrote:
| I think I'll use another example. Thanks!
| kevin_thibedeau wrote:
| You can just expand your example to use 16-bit values or
| switch to uint8_t. Bitfields with signed integers are also a
| minefield so it's best to never attempt it.
| titzer wrote:
| C is fundamentally confused, because it offers (near) machine-
| level specifications but then leaves just enough wiggle room
| for compilers to "optimize" (through alignment and such) while
| ruining the precision of a specification. You end up not
| getting exactly what you want at the machine level. It's
| infuriating.
|
| The bitfield stuff in C would be fantastic if it weren't
| fundamentally broken. E.g. some Microsoft compilers in the past
| interpreted bit fields as signed... _always_. In V8 we had a
| work around with templates to avoid bitfields altogether. Fail.
| [deleted]
| rerdavies wrote:
| int isn't a bitfield.
| milgra wrote:
| Very nice collection. My favorite C feature is actually a
| gcc/clang feature : the __INCLUDE_LEVEL__ predefined macro. It
| made me code&maintain my C projects exactly twice as fast as
| before because file count dropped to half :
| https://github.com/milgra/headerlessc .
| zwieback wrote:
| Be interesting to see when these features showed up. I learned C
| from the K&R book back in the day and it doesn't mention most of
| these.
|
| Designated initializer is something I'll try to remember, seems
| handy.
| AceJohnny2 wrote:
| Yeah the K&R, while being a masterpiece of clarity and
| conciseness, is severely outdated in many important ways.
|
| I wish there was some effort to create a modern version while
| preserving the clarity and conciseness of Kernighan and
| Ritchie.
|
| Designated initializers in particular are extremely useful. I
| once halted a factory line for days because of a mistake they
| would have avoided.
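| A short sketch of why they help, using hypothetical names
| (`motor_config`, `color_names`) that do not come from any real
| codebase. Each value is tied to a member or index by name, so
| reordering the struct or the enum cannot silently swap values:

```c
/* All names below are hypothetical, chosen only for illustration. */
struct motor_config {
    int max_rpm;
    int ramp_ms;
    int direction;   /* 0 = forward */
};

/* Members are named, so their order here does not matter;
   members left out (direction) are zero-initialized. */
static const struct motor_config default_config = {
    .ramp_ms = 250,
    .max_rpm = 3000,
};

enum color { RED, GREEN, BLUE, COLOR_COUNT };

/* Array designators tie each string to its enum value. */
static const char *const color_names[COLOR_COUNT] = {
    [BLUE]  = "blue",
    [RED]   = "red",
    [GREEN] = "green",
};
```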
| suprjami wrote:
| Designated initialisers were added in C99
| ikran03 wrote:
| Too young to know about anything in there, but these look so
| interesting. Can't wait to show off '%n' in my next uni project
| LegionMammal978 wrote:
| > volatile type qualifier
|
| > This qualifier tells the compiler that a variable may be
| accessed by other means than the current code (e.g. by code run
| in another thread or it's MMIO device), thus to not optimize away
| reads and writes to this resource.
|
| It's dangerous to mention cross-thread data access as a use case
| for volatile. In standard C, modifying any non-atomic value on
| one thread, while accessing it on another thread without
| synchronization, is always UB. Volatile variables do not get any
| exemption from this rule. In practice, the symptoms of such a
| data race include the modification not being visible on the other
| thread, or the modified value getting torn between its old and
| new states.
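| For comparison, a minimal sketch of the C11 alternative, assuming
| <stdatomic.h> is available. The names `payload`, `ready`,
| `producer_publish` and `consumer_try_read` are hypothetical. The
| release store and acquire load on the atomic flag are what make the
| plain `payload` write visible to the other thread:

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* plain data, published via the flag */
static atomic_bool ready;    /* zero-initialized, i.e. false */

/* Called by the writing thread. */
void producer_publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Called by the reading thread; returns false if nothing is published yet. */
bool consumer_try_read(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```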
| clnq wrote:
| Do we have something like this for C++ (parts not shared with C)?
| dantle wrote:
| Nice article. Saw a few things I wish I'd known about.
|
| 1. %n in printf would be handy when writing CLIs dealing w/
| multiple lines or precise counts of backspaces.
|
| 2. Using enums as a form of static_assert() is a great idea
| (triggering a div by zero compiler error).
| jcelerier wrote:
| using enums as a form of static_assert is very bad when C
| nowadays literally has static_assert (_Static_assert:
| https://gcc.godbolt.org/z/bfv6rKdKM)
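| A small side-by-side sketch, assuming a C11 compiler. The struct
| `packet_header` and the enum constant are hypothetical, and the
| size of 8 assumes no unusual padding on the target:

```c
#include <assert.h>   /* provides the static_assert spelling in C11 */
#include <stdint.h>

struct packet_header {
    uint16_t type;
    uint16_t length;
    uint32_t sequence;
};

/* C11: a readable diagnostic if the condition is false. */
static_assert(sizeof(struct packet_header) == 8,
              "packet_header must be 8 bytes");

/* Pre-C11 trick: a false condition makes this 1 / 0, which is not
   a valid constant expression, so compilation fails. */
enum { HEADER_SIZE_CHECK = 1 / (sizeof(struct packet_header) == 8) };
```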
| tom_ wrote:
| The enum idea is interesting. I've previously used an extern
| with a conditional size of either 1 (valid) or -1 (invalid).
| This requires no additional boilerplate, and is #define-able
| into a static assert when built with a recent enough compiler.
| Something like this, from memory:
|
|     #define STATIC_ASSERT(COND) extern char static_assert_cond_[(COND)?1:-1] /* C99 or earlier */
|     #define STATIC_ASSERT(COND) _Static_assert(COND, #COND) /* C11 or later */
|
| As both are declarations, I don't think you'll end up in a
| situation where one is valid and the other isn't - but I could
| be wrong, and I suspect it would rarely matter in practice
| anyway.
| torstenvl wrote:
| %n is an extremely poor fit for CLI manipulation or
| tokenization for backspacing.
|
| %n is for _bytes_ , not user-perceived characters.
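| A minimal sketch of the difference, assuming a C library that has
| not disabled %n and a UTF-8 encoded input. The helper name
| `bytes_written_for` is made up for illustration:

```c
#include <stdio.h>

/* Returns the byte count that %n reports after formatting `word`.
   Hypothetical helper; assumes `word` fits in the local buffer. */
int bytes_written_for(const char *word)
{
    char buf[64];
    int count = -1;
    snprintf(buf, sizeof buf, "%s%n|", word, &count);
    return count;
}
```

| With UTF-8 input, a single accented letter such as "\xC3\xA9"
| counts as 2, even though a terminal shows one character.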
| nstbayless wrote:
| Here's another one. Handy "syntax" that makes it possible to
| iterate an unsigned type from N-1 to 0. (Normally this is
| tricky.)
|
| for (unsigned int i = N; i --> 0;) printf("%u\n", i);
|
| This --> construction also works in JavaScript and so on.
| titzer wrote:
| AFAICT this would parse as "(i--) > 0", there's no "-->"
| operator.
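| Written with the conventional spacing, the same loop can be
| sketched as below. The function name and the output buffer are
| hypothetical. The test reads the old value of `i` and the decrement
| happens before the body runs, so the body sees n-1 down to 0:

```c
#include <stddef.h>

/* Stores n-1, n-2, ..., 0 into out and returns how many were stored.
   Hypothetical helper; `out` must have room for n elements. */
size_t visit_descending(size_t n, size_t *out)
{
    size_t count = 0;
    for (size_t i = n; i-- > 0;)   /* same tokens as "i --> 0" */
        out[count++] = i;
    return count;
}
```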
| texaslonghorn5 wrote:
| https://stackoverflow.com/questions/1642028/what-is-the-
| oper...
| skribanto wrote:
| how would you iterate over every possible value of an unsigned
| int?
| mtklein wrote:
| Usually I use a do-while loop,
|
|     unsigned char x = 0;
|     do {
|         printf("%d\n", x);
|     } while (++x);
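| A sketch that counts the iterations of that pattern (the function
| name is hypothetical). The loop ends only when `++x` wraps back
| to 0, so every value is visited exactly once:

```c
#include <limits.h>

/* Counts how many times the do-while body runs for unsigned char. */
unsigned int count_all_uchar_values(void)
{
    unsigned int iterations = 0;
    unsigned char x = 0;
    do {
        iterations++;       /* body sees each value of x once */
    } while (++x);          /* wraps from UCHAR_MAX to 0, ending the loop */
    return iterations;
}
```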
| [deleted]
| mtklein wrote:
| It's worth noting that this does also work on signed types, so
| it can be a kind of handy idiom to see
|
|     while (N --> 0) { ... }
|
| and know it will execute N times no matter the details of the
| type of N.
| Jorengarenar wrote:
| I was hesitant to put it on the list, but fine, you convinced
| me
| stonegray wrote:
| If you're gonna test the i--, shouldn't it fall through on zero
| anyway?
|
|     for (unsigned int i = N; i--;){}
|     unsigned int i = N; while(i--){ ... }
|
| Also I think I'm missing the tricky part. Couldn't this be a
| bog-standard for loop?
|
|     for (unsigned int i = N - 1; i > 0; i--){ ... }
|
| The "downto" pseudooperator definitely scores some points for
| coolness and aesthetics, but there's no immediately obvious use
| case for me.
| Jorengarenar wrote:
| The former executes loop when `i` is 0.
|
| And we cannot change the comparison to `>=` in the latter,
| because an unsigned value is always greater than or equal to 0,
| thus we would get an infinite loop.
| Miserlou57 wrote:
| "Quirks and features"
| aranchelk wrote:
| Where's the DougScore?
| int_19h wrote:
| One non-obvious thing about named function types is that they can
| also be used to declare (but not define) functions:
|
|     typedef void func(int);
|     func f;
|     void f(int x) {}
|
| I don't think I've ever seen a practical use for this in C,
| though. In C++, where this also works, and extends to member
| functions, this can be very occasionally useful in conjunction
| with decltype to assert that a function has signature identical
| to some other function - e.g. when you're intercepting and
| detouring some shared library calls:
|
|     int foo();
|     decltype(foo) bar;
|
| I suppose with typeof() in C23 this might also become more
| interesting.
| mtklein wrote:
| I have found this pretty handy for declaring a bunch of
| functions of all the same type, e.g. steps in a direct-threaded
| interpreter.
|
|     typedef void Step(whatever...);
|     Step add,sub,mul,div, load,store, etc...;
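| A compilable sketch of the same idea, with hypothetical names
| (`BinOp`, `op_table`, `apply_op`) and a much simpler signature than
| a real interpreter step would have:

```c
#include <stddef.h>

typedef int BinOp(int, int);    /* a function type, not a pointer type */

BinOp op_add, op_sub, op_mul;   /* declares three functions at once */

/* Definitions still have to spell out the full signature. */
int op_add(int a, int b) { return a + b; }
int op_sub(int a, int b) { return a - b; }
int op_mul(int a, int b) { return a * b; }

/* A dispatch table of pointers to the declared functions. */
static BinOp *const op_table[] = { op_add, op_sub, op_mul };

int apply_op(size_t index, int a, int b)
{
    return op_table[index](a, b);
}
```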
| localplume wrote:
| I remember once upon a time I thought C was fairly simple, so I
| decided to write a program to generate ASTs from C programs. I
| was very wrong and it was kind of a nightmare. There are so many
| weird little quirks or lesser-used features that I never saw in
| the wild even in large production codebases; I feel like you
| really don't _need_ a lot of these features. I can't imagine
| doing proper compiler work, especially for something like C++.
| Nice article.
| HybridCurve wrote:
| > I remember once upon a time I thought C was fairly simple, so
| I decided to write a program to generate ASTs from C programs.
|
| Oh man, I think we all have been this young and naive at some
| point.
|
| I have spent time working with compilers for this purpose
| (having realized I did _not_ want to attempt parsing source and
| generating the AST) and decided it is much easier to let them
| do the work. That being said, it can still be more than a
| handful (both GCC and Clang have their eccentricities) and
| depending on how you are using it you still might be in over
| your head.
|
| When you start a project like this and end up failing because
| you simply do not have the depth of knowledge or time to see it
| to completion it often feels a bit demoralizing from the loss
| of investment. Truthfully though, having started many such
| ventures (emulators for 6502 and 80386 to name a few), you get
| all the benefit of experience from working on difficult
| problems without the misery of debugging and model checking
| until everything is more/less perfect. It's great fun,
| you learn a lot, and you should never avoid trying simply
| because it might be too much to handle.
| 6451937099 wrote:
| [dead]
| AceJohnny2 wrote:
| Compound Literals in C are great. They're no surprise to anyone
| coming from more sophisticated languages, but I've never seen
| them used in the C codebases I've worked on.
|
| What with C also allowing structures as return values, another
| rarely-used feature, they're really useful for allowing a richer
| API than the historical `int foo(...)` that so many people are
| used to seeing.
|
| C has so much legacy that it's really hard for even decades-old
| (C99!) features to impose themselves. Or perhaps that's MSVC's
| lagging support that's to blame :p
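| A minimal sketch of both features together, assuming C99 or later.
| All names (`parsed`, `parse_digit`, `manhattan_length`) are
| hypothetical. The function returns a struct built from a compound
| literal instead of an `int` status plus an out parameter:

```c
struct point  { int x, y; };
struct parsed { int ok; int value; };

/* Returns a struct by value; unnamed members are zero-initialized. */
struct parsed parse_digit(char c)
{
    if (c >= '0' && c <= '9')
        return (struct parsed){ .ok = 1, .value = c - '0' };
    return (struct parsed){ .ok = 0 };
}

int manhattan_length(const struct point *p)
{
    int ax = p->x < 0 ? -p->x : p->x;
    int ay = p->y < 0 ? -p->y : p->y;
    return ax + ay;
}

int demo_length(void)
{
    /* The compound literal is an lvalue, so its address can be taken;
       it lives until the end of the enclosing block. */
    return manhattan_length(&(struct point){ .x = 3, .y = -4 });
}
```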
| Joker_vD wrote:
| Never quite understood why compound literals are lvalues, but
| fine, whatever, I guess, it's so that you can write "&(struct
| Foo){};" instead of "struct Foo tmp; &tmp;"... which, on a
| tangential note, reminds me about Go: the proposals to make
| things like &5 and &true legal in Go were rejected because "the
| implied semantics would be unclear" even though &structFoo{} is
| legal and apparently has obvious semantics.
| cataphract wrote:
| It's useful when a function has an out or in/out struct
| parameter whose value at the end you're not interested in. Or
| in functions where the struct is an input parameter, but they
| return it as a return value too, which you can then assign to a
| pointer variable or immediately pass to another function.
|
| Note that the struct values thus created have longer lifetimes
| than temporary C++ objects created directly inside the argument
| list of a function call.
| leni536 wrote:
| In C compound literals have a relatively long lifetime compared
| to C++ temporaries. With these lifetime rules it makes sense
| that they are lvalues, although I like C++ rvalues (especially
| prvalues) more.
|
| https://cigix.me/c17#6.5.2.5.p5
|
| > If the compound literal occurs outside the body of a
| function, the object has static storage duration; otherwise, it
| has automatic storage duration associated with the enclosing
| block.
| 6451937099 wrote:
| [dead]
| ufo wrote:
| Fun fact about %n:
|
| Mazda cars used to have a bug where they used printf(str) instead
| of printf("%s", str) and their media system would crash if you
| tried to play the "99% Invisible" podcast in them. All because
| the "% In" was parsed as a "%n" with some extra modifiers.
| https://99percentinvisible.org/episode/the-roman-mars-mazda-...
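| The safe pattern can be sketched as follows. `copy_title` is a
| hypothetical helper, not the actual Mazda code, which is not
| public. The untrusted text is passed as an argument and never used
| as the format string:

```c
#include <stdio.h>
#include <stddef.h>

/* Copies `title` into `out`; returns the length snprintf reports. */
int copy_title(char *out, size_t out_size, const char *title)
{
    /* "%s" is the format; any '%' inside title is copied literally. */
    return snprintf(out, out_size, "%s", title);
}
```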
| rerdavies wrote:
| Fun fact about %n:
|
| The %n functionality also makes printf accidentally Turing-
| complete even with a well-formed set of arguments. A game of
| tic-tac-toe written in the format string is a winner of the
| 27th IOCCC.
|
| - sez wiki.
|
| A not so fun fact:
|
| Because the %n format is inherently insecure, it's disabled by
| default.
|
| - MSVC reference.
| gdprrrr wrote:
| [dead]
| [deleted]
| tom_ wrote:
| "format not a string literal" is one warning I always upgrade
| to an error. Dear reader: you should do this, too!
| wrigby wrote:
| Thanks! This prompted me to look up the flag to enable this.
| For GCC it's: -Werror=format-security
| Gigachad wrote:
| Why are these not compiler errors by default? Opting in to
| such important safety features seems like broken design.
| zabzonk wrote:
| the c training course at a popular uk training company (the
| instruction set) had duff's device on something like page 5 of
| their c course - expunging it was one of the first things i did
| when i joined them. there were many others.
| zabzonk wrote:
| i don't ask this too often - but what is wrong with this
| comment?
| gallier2 wrote:
| Cool. Two of the tricks shown are from my contribution in
| stackoverflow.
| neverrroot wrote:
| It's cool to have these, it's fun to use them for fun. But please
| don't use them in production code. Also don't assume most of them
| will be known by other developers.
| suprjami wrote:
| Professional C developers definitely should be using at least
| designated init and FAM, standard features both added in C99
| and currently 24 years old.
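| For readers who have not met FAM (flexible array member), a
| minimal sketch assuming C99 and a hosted environment. The struct
| `message` and the helper `message_new` are hypothetical. The header
| and its variable-length payload share one allocation:

```c
#include <stdlib.h>
#include <string.h>

struct message {
    size_t length;
    char   text[];    /* flexible array member: must be last */
};

/* Allocates header and text together; the caller frees with free(). */
struct message *message_new(const char *s)
{
    size_t len = strlen(s);
    struct message *m = malloc(sizeof *m + len + 1);
    if (m == NULL)
        return NULL;
    m->length = len;
    memcpy(m->text, s, len + 1);   /* includes the terminating '\0' */
    return m;
}
```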
| Jorengarenar wrote:
| >Also don't assume most of them will be known by other
| developers.
|
| Given the title of the article, one ought to assume the
| opposite ;)
| Gigachad wrote:
| Please don't use C at all in production if you can help it.
| foobiekr wrote:
| Most of these are pretty familiar if old enough but this is a
| wonderful list.
|
| I didn't know C23 was getting rid of trigraphs. That's probably a
| good thing and easy to clean up if needed.
| int_19h wrote:
| The bit about "register" is old enough that I don't think it's
| meaningful anymore.
|
| The stock verbiage about how modern compilers ignore "register"
| because they can do better, but it may be useful on simpler
| ones, was already around in this exact form 20 years ago.
| And one curious thing is that even back then, such statements
| would never list specific compilers where "register" still did
| something useful.
|
| So far as I can tell, "register" was in actual use back when
| many C compilers were still single-pass, or at least didn't
| have a full-fledged AST, and thus their ability to do things
| like escape analysis was limited. With that in mind, "register"
| was basically a promise to such a compiler to not take the
| address of a local in the function body (this is the only
| standard way in which it affects C semantics!). But we haven't
| had such compilers for a very long time now, even when
| targeting embedded - the compilers themselves run on full-power
| hardware, so there's no reason for them to take shortcuts.
| camel-cdr wrote:
| I think register is closer to const, as in: it's a hint to
| the programmer not the compiler.
|
| So if you want to make absolutely sure that a variable can
| always be in a register then you should consider adding the
| register specifier to stop other programmers from taking the
| address of that variable.
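| A small sketch of that effect, with a hypothetical function name.
| The commented-out line is the one a conforming compiler has to
| reject, which is the only guarantee `register` still gives:

```c
/* Sums 1..n; `total` can never have its address taken. */
int sum_to(int n)
{
    register int total = 0;
    /* int *p = &total;    constraint violation: address of a
                            register variable is not allowed */
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```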
| tastysandwich wrote:
| "Expert C Programming: Deep C Secrets" is a really good book to
| learn a lot of C tricks and quirks, plus some history. I read it
| a few years ago and loved it.
|
| I was a grad when I read it and remember annoying my older
| coworkers for a few weeks with little gotchas I picked up. "hey
| what do you think THIS example prints?" "Stop sending me these!"
| somewhereoutth wrote:
| I'm a bit better at English than c, and in the spirit of language
| peculiarities, this jumped out at me:
|
| > It's possible, because C cares less than more about whitespace
|
| Idiomatically we'd say 'couldn't care less'. I guess we should be
| glad it wasn't the diabolical and illogical 'could care less'
| Jedd wrote:
| I don't believe those are functionally / semantically
| equivalent - couldn't care less does imply a min() value of
| care.
|
| In contrast, the author is suggesting a comparative only.
|
| And, on careful re-reading, I suspect the author is having a
| play on syntax & semantics here -- the context of the quote is:
|
| > You may ask, since when C has such operator and the answer
| is: since never. --> is not an operator, but two separate
| operators -- and > written in a way they look like one. It's
| possible, because C cares less than more about whitespace.
|
| Given that '--' is decrement (kind of 'lessen') and > is
| greater than (kind of 'more'). Perhaps I am reading too much
| into that.
|
| (I feel 'couldn't care less' is perhaps more common in northern
| America than elsewhere, and while TFA has a Gabon TLD, appears
| to be resident in Poland, so automatically receives a lot of
| leeway in their use of idiomatic English.)
___________________________________________________________________
(page generated 2023-02-19 23:00 UTC)