[HN Gopher] Maximal min() and max()
       ___________________________________________________________________
        
       Maximal min() and max()
        
       Author : immibis
       Score  : 109 points
       Date   : 2024-08-07 16:26 UTC (1 days ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | baggy_trough wrote:
       | 47MB of code from a min/max macro is simply hilarious!
        
         | dfox wrote:
         | The effect of the preprocessor in C is quite significant. The
         | old DOS Turbo C showed the number of lines compiled and the
         | speed in the compilation progress window in the IDE. IIRC a
         | straight
         | 
         |     #include <stdio.h>
         |     #include <stdlib.h>
         |     int main(int argc, char **argv) {
         |         puts("hello, world");
         |         return EXIT_SUCCESS;
         |     }
         | 
         | came to something ridiculous on the order of 100k lines.
        
       | jonathrg wrote:
       | That series of macros is a nice demonstration of the incredible
       | effort it takes to attempt the most basic generic programming in
       | C. Perhaps it would be more productive to just accept the
       | limitations of the language and define a version of `min` and
       | `max` for each type.
        
         | lifthrasiir wrote:
         | That will make `typedef` much less useful, though. C is the
         | real problem; not the attempt to do generic programming in C.
         | (And `_Generic` won't solve this problem anyway. Only a GCC
         | extension of statement expressions will.)
        
         | asplake wrote:
         | Also a demonstration of how much work other compilers do for
         | you behind the scenes.
        
           | Groxx wrote:
           | Ehhh... not much more so than how much `npm install` hides.
           | These macros are part of a gigantic tower of absurdity that
           | costs many, _many_ times more than they need to, but we use
           | them because dev time is more valuable than CPU time.
        
           | jonathrg wrote:
           | A compiler for a language with generic types will definitely
           | do less work to implement generic min/max than the C
           | preprocessor has to do in order to perform all the text
           | substitutions generated by these macros.
        
             | asplake wrote:
             | Oh totally! It has the information to do it both properly
             | and efficiently.
        
           | usr1106 wrote:
           | Or a demonstration about the poor state of programming
           | languages in 2024.
           | 
           | Of course there are others than C. But not many suitable for
           | writing a kernel. And no obvious choice for what Linux could
           | do today. Of course they are starting with Rust, but nobody
           | can predict when that will make the last C macro unnecessary.
           | Unless your prediction is never...
        
             | pjmlp wrote:
             | Plenty of kernels have been written in other languages,
             | naturally UNIX/POSIX folks can hardly think of something
             | else.
             | 
             | Even Linus accepting Rust is kind of interesting, because
             | his C++ rants also apply to plenty of Rust code bases.
        
         | peheje wrote:
         | Why don't they define a min/max for each primitive type? How
         | many are there, 15?
        
           | kzrdude wrote:
           | Even use a macro to define all typed min max macros
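A minimal sketch of the "macro that defines all the typed versions" idea from the comments above. The names `DEFINE_MINMAX`, `min_int`, `max_ulong` and so on are hypothetical, chosen for illustration; they are not kernel APIs.

```c
#include <assert.h>

/* Hypothetical helper: stamps out one min and one max function per type. */
#define DEFINE_MINMAX(suffix, type)                      \
    static inline type min_##suffix(type a, type b)      \
    { return a < b ? a : b; }                            \
    static inline type max_##suffix(type a, type b)      \
    { return a > b ? a : b; }

DEFINE_MINMAX(int, int)
DEFINE_MINMAX(uint, unsigned int)
DEFINE_MINMAX(long, long)
DEFINE_MINMAX(ulong, unsigned long)
DEFINE_MINMAX(double, double)

/* Small self-check showing how the generated functions are called. */
void check_typed_minmax(void)
{
    assert(min_int(-1, 1) == -1);
    assert(max_ulong(2UL, 7UL) == 7UL);
}
```

Note that arguments are still converted implicitly at the call site, so `min_uint(-1, 1)` compiles; this approach alone does not give the mixed-type checking the kernel macro is after.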
        
           | wahern wrote:
           | The reason the current macro is so complex is because it
           | supports _mixed_ types while avoiding (failing on) integer
           | promotion bugs. A version supporting arguments of all the
           | same type would be just as trivial as in C++ (albeit relying
           | on GCC extensions like statement expressions).
        
         | colonwqbang wrote:
         | Genericity is not really the issue as I see it. The basic K&R
         | macro using ?: is fully generic. The problem is that C has
         | implicit conversion between essentially any numeric types. The
         | main point of the kernel macro was to prevent such conversions.
         | 
         | I think implicit type conversion is a mistake, perhaps one of
         | the few true design flaws in C. Languages like haskell and rust
         | went with explicit conversions which is probably a better idea
         | overall, even if it does increase the code verbosity a bit. C++
         | instead doubled down and added many more ways for implicit
         | conversions to happen.
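To make the implicit-conversion point concrete, here is a small sketch using the plain ternary macro. `MIN` and `naive_min_is_surprising` are illustrative names, not taken from the kernel.

```c
#include <assert.h>

/* The classic ternary macro: accepts any arithmetic types. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Returns 1 when the macro picks the unsigned operand for mixed
 * signedness. The int -1 is converted to unsigned int before the
 * comparison, so it becomes UINT_MAX and compares greater than 1u. */
int naive_min_is_surprising(void)
{
    int      s = -1;
    unsigned u = 1u;
    return MIN(s, u) == 1u;
}
```

The comparison is well defined by the usual arithmetic conversions, which is exactly why it goes unnoticed unless the compiler is asked to warn about sign comparisons.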
        
       | wakamoleguy wrote:
       | Why are these defined as macros at all? A function call would
       | come with overhead, of course, but wouldn't compilers be able to
       | inline that anyways?
        
         | lifthrasiir wrote:
         | C doesn't have any type-generic function declaration. So any
         | viable solution had to be at least partially powered by a macro
         | and resulting (one-directional) type inference.
        
           | kevin_thibedeau wrote:
           | C11 does with _Generic but it is limited in ways that make
           | min/max implementations flaky.
        
             | pyth0 wrote:
             | Even with _Generic you can't declare generic functions.
             | You'd still need a macro call that uses _Generic to
             | dispatch different implementations depending on the
             | parameter types.
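A sketch of the macro-plus-`_Generic` dispatch described above, assuming C11. The helper names `min_i`, `min_l`, `min_d` and the macro `generic_min` are made up for this example, and the dispatch looks only at the first argument's type.

```c
#include <assert.h>

static inline int    min_i(int a, int b)       { return a < b ? a : b; }
static inline long   min_l(long a, long b)     { return a < b ? a : b; }
static inline double min_d(double a, double b) { return a < b ? a : b; }

/* Selects a function from the type of (a); a type that is not listed
 * is a compile-time error because there is no default association. */
#define generic_min(a, b) _Generic((a), \
        int:    min_i,                  \
        long:   min_l,                  \
        double: min_d)((a), (b))
```

Because the result is a function call, it is not a constant expression, which is one of the limitations raised in this subthread. Covering mixed types would need nested selections on both arguments.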
        
               | dwattttt wrote:
               | Generic dispatching all the type combinations (or warning
               | or erroring) wouldn't be a problem in a project the scale
               | of the Linux kernel.
               | 
               | I'm unfamiliar with the incantations needed to try to
               | preserve constant expressions though, that might be too
               | much for them.
        
         | ufo wrote:
         | Another issue is that some of these macros are intended to be
         | used in a constant context, such as array dimensions.
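A small sketch of the constant-context requirement. `SIMPLE_MAX`, `HDR_LEN`, `BODY_LEN` and `scratch` are hypothetical names for this example.

```c
#include <assert.h>

/* Plain ternary form: with constant arguments the result is an
 * integer constant expression. */
#define SIMPLE_MAX(a, b) ((a) > (b) ? (a) : (b))

#define HDR_LEN  16
#define BODY_LEN 48

/* File-scope array: the dimension must be a constant expression. */
static char scratch[SIMPLE_MAX(HDR_LEN, BODY_LEN)];

unsigned long scratch_size(void)
{
    return (unsigned long)sizeof(scratch);
}
```

A function call, or a macro built on GCC statement expressions, would be rejected in that array declaration, so a single min/max that works both here and with run-time values has to take extra care.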
        
           | LeifCarrotson wrote:
           | *constant
        
       | account42 wrote:
       | What C programmers will do to avoid using even a little bit of
       | C++.
        
         | pjmlp wrote:
         | Ever since CFront was born in the same UNIX building at Bell
         | Labs.
        
         | Sharlin wrote:
         | To be fair, it's not like you can just add a "little bit of
         | C++" to the kernel.
        
           | jonathrg wrote:
           | C codebases can often be updated with minor changes to
           | compile with a C++ compiler, after which C++ features can be
           | gradually introduced. Is the Linux kernel different in this
           | regard?
        
             | creeble wrote:
             | Read the LWN comments.
        
               | Groxx wrote:
               | From a moderate skim, I'm not seeing much in there that
               | really addresses this beyond FUD and over-simplification.
               | 
               | Which, I mean... it's the Internet. That's kinda
               | expected. But if there's something specific you're
               | seeing, could you link to it? I'm curious as well what
               | the "stick to C" crowd's reasons are. "C plus this one
               | feature of C++" seems rather defensible at a glance
               | (ignoring the social difficulty in choosing _which_
               | feature), but I'm sure it's much more complicated than
               | that in practice.
        
               | dzaima wrote:
               | Even if a project can choose some specific feature(s) to
               | switch to C++ for, it'd severely reduce the barrier for
               | adding reliance on more features; why did feature X get a
               | pass, but not this other one? And C++ has a lot of such
               | potential features that are harmless and easy to justify
               | at their best, but can become headaches when used more
               | broadly, requiring everyone and everything to deal with
               | them.
        
               | Groxx wrote:
               | The social aspects are _very_ large and it wouldn't
               | surprise me at all if that was by far the main reason...
               | but that's much less of an issue in a kernel-like context
               | where there are already oodles of rules beyond "write
               | valid C code". They can and do impose significant
               | limitations on the languages they use, successfully, for
               | decades. Seems like they'd be able to do that with C++
               | too.
        
               | jonathrg wrote:
               | There is a comment purporting to show a difference
               | between compilation speeds in C and C++ which uses
               | iostream for the C++ example, which completely misses the
               | point. Sure C++ has a lot of warts, but in terms of
               | compilation speed it should be completely reasonable to
               | use a few simple template functions.
        
               | Groxx wrote:
               | Yeah, that's the main thing I saw and it's just plain
               | completely wrong. The fact that C++ can _more easily_
               | bloat into large build times, and people frequently make
               | larger-build-time projects in it, implies absolutely
               | nothing about its behavior in replacing simple macros
               | like max/min.
               | 
               | There's ample evidence that it'll build just as fast as C
               | there, so it's not an issue in this context, and that
               | both can build quickly with care. That's kinda the point
               | of C++: you can write plain C code plus [this one thing]
               | and you basically don't pay for the rest, and it
               | generally achieves that.
        
               | jonathrg wrote:
               | They do not address the question at all. Please write an
               | answer yourself.
        
               | wahern wrote:
               | The specific macro discussed in the article is using
               | various GCC builtins to safely support mixed integer
               | types; i.e. failing on unsafe type promotions or
               | coercion, but otherwise working automagically. C++
               | std::min, by contrast, requires all the values to be the
               | same type. AFAIU, to accomplish the same semantics in C++
               | would require either template metaprogramming, or doing
               | something similar to what the current version is doing
               | with macros and GCC builtins. A C++ solution might
               | ultimately be cleaner, but I don't think there's anything
               | in the standard C++ library that is a drop-in
               | replacement.
        
             | jcelerier wrote:
             | $ cd /usr/src/linux
             | $ rg ' class;' vmlinux.h
             | 8541:struct class;
             | 13515:   long unsigned int class;
             | 16416: unsigned int class;
             | 16917: u8 class;
             | 17351: u8 class;
        
               | jenadine wrote:
               | Yes, with minor changes. A few variables need to be
               | renamed, a few casts need to be added. Some churn for sure
               | on such a big codebase, but doable nevertheless. GCC did
               | it, other projects did it, I don't see why the kernel
               | can't.
        
             | samatman wrote:
             | Yes.
             | 
             | https://harmful.cat-v.org/software/c++/linus
        
         | cozzyd wrote:
         | Imagine what the equivalent templated C++ code will expand to!
        
           | OskarS wrote:
           | It will be MUCH less, and much less complex than this.
        
         | alerighi wrote:
         | It would be far easier to add the missing features to GCC as
         | builtins than to change the language, which would involve
         | rewriting a ton of code (C and C++ are not 100% compatible, and
         | the incompatibilities may introduce bugs that are difficult to
         | spot). The Linux kernel already uses a ton of GCC extensions
         | (including the min/max macros suggested in the article) that
         | are not compatible with other compilers anyway. I don't see a
         | reason for it to be: GCC is the compiler of the GNU project, the
         | kernel is unlikely to be compiled with MSVC, and I don't see
         | much reason to use clang either.
        
           | gpderetta wrote:
           | > The Linux kernel already uses a ton of GCC extensions
           | 
           | You'll be happy to know that GCC happily compiles C++ then!
           | No need to switch to MSVC.
           | 
           | In fact GCC itself successfully switched to C++ from C a few
           | years ago.
        
         | microtherion wrote:
         | To paraphrase Henry Spencer, "Those who dislike C++ are
         | condemned to reinvent it, poorly".
        
       | up2isomorphism wrote:
       | Every time such a thing happens, it becomes a language
       | suggestion opportunity. But for those who suggest another
       | language, are you going to replace a 50M line code base because
       | you want to have a fancy min / max? I don't think this is a
       | responsible suggestion.
        
         | orf wrote:
         | That's a strawman argument: nobody seriously considers a
         | different language because of a "fancy min/max".
         | 
         | However, hygienic macros that fix entire classes of issues
         | (including this) are a more compelling argument.
         | 
         | Not to say that it makes the case, only that the strawman you
         | wrote is not good.
        
           | tuveson wrote:
           | How would hygienic macros fix this? It seems like the macros
           | were working fine, but the amount of generated code increased
           | compile time significantly. Wouldn't a more sophisticated
           | macro system still generate a bunch of extra code and result
           | in slow compile times? It even seems like the solution was to
           | fall back to "dumb" macros when feasible for compile-time
           | performance reasons.
        
             | tetha wrote:
             | > Wouldn't a more sophisticated macro system still generate
             | a bunch of extra code and result in slow compile times?
             | 
             | Not necessarily.
             | 
             | The preprocessor code here picks up the original source,
             | and blows up the initial code (which is about "min3(long_a,
             | long_b, long_c)") to 47 MB of code. no fancy stuff, just
             | 47MB of C-Code on the disk. That's a lot of code the
             | compiler then has to parse and handle.
             | 
             | If hygienic macros are a first-class citizen in the
             | compiler, the compiler parses the original macro code once
             | and then just modifies it in-memory. There is no reason to
             | write 47MB of code somewhere and read it back, this would
             | just happen as an AST modification in memory.
             | 
             | But that is also a much smaller reason. First-class macros
             | allow the compiler to reason based off of the types and
             | structure of the macro inputs. You don't have to guess if
             | something is constant, bounded, unbounded and such. Strong
             | types can enforce this safely and macros and optimization
             | can use these strong guarantees. And sufficiently strong
             | type information can open doors for far, far more powerful
             | optimizations overall.
             | 
             | Just for the record - I'm fully aware why the kernel is
             | where it is, and why it will stay there, but compiler and
             | language theory have moved far beyond that.
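A toy illustration of how nested function-like macros multiply source text, which is the mechanism behind the blow-up discussed above. This is not the kernel macro: `MAX`, `STR`, `XSTR` and `expansion_len` are illustrative names, and the macro here mentions each argument only twice.

```c
#include <assert.h>
#include <string.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Two-level stringification so the argument is fully expanded first. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Length of the expanded source text at nesting depth 1, 2 and 3. */
unsigned long expansion_len(int depth)
{
    switch (depth) {
    case 1:  return strlen(XSTR(MAX(a, b)));
    case 2:  return strlen(XSTR(MAX(MAX(a, b), c)));
    case 3:  return strlen(XSTR(MAX(MAX(MAX(a, b), c), d)));
    default: return 0;
    }
}
```

Each extra level more than doubles the text, because the inner expansion is pasted in once per mention of the argument. A macro that mentions its arguments more often grows correspondingly faster.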
        
         | bluedino wrote:
         | I've seen things like Boost included to only use a single
         | function.
        
       | Borg3 wrote:
       | This also shows how programmers sometimes use far too fancy stuff
       | for what they do. If you are doing some intense computation,
       | split it, or do it and later slap sanity check inside. It will be
       | even more readable. If you just play with bunch of vars, sure..
       | min/max macros can be easier to read.
        
       | wood_spirit wrote:
       | Could not the main compilers get involved and add builtins so an
       | ifdef makes the common path on the main compilers like gcc use
       | builtins and the slowdown only hurts those using the less
       | mainstream compilers? If it takes off the other compilers would
       | quickly add the feature.
        
         | sapiogram wrote:
         | What you're suggesting is basically a worse version of adding
         | built-in min() and max() to the C spec. Which, to be fair,
         | would be quite nice, but I guess the working group didn't want
         | it in the standard.
        
       | layer8 wrote:
       | This reminds me of how ~25 years ago I wrote a little C macro
       | library to perform safe (non-overflowing, and/or saturating)
       | integer arithmetic and comparisons, however the small
       | application I wanted to use it in had the compiler crash after
       | 2-3 hours due to insufficient RAM+swap when compiling a single
       | source file using those macros. It turned out the macro
       | expansions made the translation unit grow to GB size, which must
       | have been 4-5 orders of magnitude over its original size.
        
         | sapiogram wrote:
         | How did that end up happening? Did you define addition
         | recursively or something?
        
           | layer8 wrote:
           | The macros automatically derived the signedness and the
           | minimum and maximum value of the integral types involved, in
           | a way that didn't make platform-specific assumptions like
           | two's complement or no padding bits. Cases like comparing
           | signed long to unsigned long also needed extra logic, due to
           | there being no larger type that encompasses both. I don't
           | remember the details, but it did have a significant number of
           | nested macro invocations.
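For a flavour of what "deriving the signedness" of a type in a macro can look like, here is a minimal sketch. It is not layer8's library; `IS_SIGNED_TYPE` and `UNSIGNED_MAX_OF` are hypothetical names, and only the unsigned maximum is derived because that case is simple and portable.

```c
#include <assert.h>
#include <limits.h>

/* 1 if T is a signed integer type: (T)-1 stays negative only when
 * T is signed; for unsigned T it wraps to the largest value. */
#define IS_SIGNED_TYPE(T) ((T)-1 < (T)0)

/* Largest value of an unsigned integer type T. Conversion of -1 to
 * an unsigned type is defined as modular, so this needs no
 * assumption about two's complement. */
#define UNSIGNED_MAX_OF(T) ((T)-1)
```

Deriving the limits of signed types without representation assumptions takes considerably more machinery, which is consistent with the nesting depth described in the comment.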
        
       | usefulcat wrote:
       | > some of the changes to the macros made some developers
       | (including Bergmann) nervous
       | 
       | These macros are now so complex that they're reluctant to touch
       | them. Seems like there is a clear need for some thorough tests
       | here? This is exactly the sort of thing that is eminently
       | testable.
        
         | Vecr wrote:
         | Maybe. Most C code of non-trivial length is UB or at least
         | implementation defined, you really don't want to "tickle"
         | something in a low-level highly used macro.
        
           | usefulcat wrote:
           | Exactly why those macros ought to have tests.
        
             | Vecr wrote:
             | You're exactly right, sorry. I meant that even with tests
             | it's probably too subtle to do the standard test driven
             | design method of programming, you have to be way more
             | careful.
        
       | Joker_vD wrote:
       | (void) (&_x == &_y);
       | 
       | Is this... a check for type-compatibility? I don't think it
       | actually produces a compilation error if the types are incompatible.
        
         | cmovq wrote:
         | It produces a warning for incompatible pointer types.
        
           | Joker_vD wrote:
           | Okay, it's better than nothing.
        
             | lokar wrote:
             | Which you can make an error
        
             | mananaysiempre wrote:
             | FWIW, (&_x - &_y) will give you an error instead.
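A sketch of a min macro built around the pointer-comparison line quoted above. It assumes the GCC/Clang extensions `__typeof__` and statement expressions; `checked_min` and `smaller_of` are illustrative names.

```c
#include <assert.h>

/* Relies on GCC/Clang extensions: __typeof__ and statement expressions. */
#define checked_min(x, y) ({                 \
        __typeof__(x) _x = (x);              \
        __typeof__(y) _y = (y);              \
        /* Comparing the two addresses makes the compiler diagnose   \
         * operands whose types differ; the result is discarded. */  \
        (void) (&_x == &_y);                 \
        _x < _y ? _x : _y; })

int smaller_of(int a, int b)
{
    return checked_min(a, b);
}
```

With matching types this compiles quietly. With, say, an `int` and an `unsigned long`, the comparison of distinct pointer types is what triggers the diagnostic discussed in this subthread. Each argument is also evaluated only once, unlike the plain ternary macro.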
        
       | kibwen wrote:
       | The most pleasant C codebases that I have read essentially banned
       | macros outside of includes, conditional compilation, and named
       | constants. Please just stop trying to use macros to metaprogram
       | in C.
        
       ___________________________________________________________________
       (page generated 2024-08-08 23:00 UTC)