[HN Gopher] Maximal min() and max()
___________________________________________________________________
Maximal min() and max()
Author : immibis
Score : 109 points
Date : 2024-08-07 16:26 UTC (1 days ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| baggy_trough wrote:
| 47MB of code from a min/max macro is simply hilarious!
| dfox wrote:
| The effects of the preprocessor in C are quite significant. Old DOS
| Turbo C showed the number of lines compiled and the speed in the
| compilation progress window in the IDE. IIRC a straight
|
|     #include <stdio.h>
|     #include <stdlib.h>
|     int main(int argc, char **argv) {
|         puts("hello, world");
|         return EXIT_SUCCESS;
|     }
|
| came to something ridiculous on the order of 100k lines.
| jonathrg wrote:
| That series of macros is a nice demonstration of the incredible
| effort it takes to attempt the most basic generic programming in
| C. Perhaps it would be more productive to just accept the
| limitations of the language and define a version of `min` and
| `max` for each type.
| lifthrasiir wrote:
| That will make `typedef` much less useful, though. C is the
| real problem; not the attempt to do generic programming in C.
| (And `_Generic` won't solve this problem anyway. Only a GCC
| extension of statement expressions will.)
| asplake wrote:
| Also a demonstration of how much work other compilers do for
| you behind the scenes.
| Groxx wrote:
| Ehhh... not much more so than how much `npm install` hides.
| These macros are part of a gigantic tower of absurdity that
| costs many, _many_ times more than they need to, but we use
| them because dev time is more valuable than CPU time.
| jonathrg wrote:
| A compiler for a language with generic types will definitely
| do less work to implement generic min/max than the C
| preprocessor has to do in order to perform all the text
| substitutions generated by these macros.
| asplake wrote:
| Oh totally! It has the information to do it both properly
| and efficiently.
| usr1106 wrote:
| Or a demonstration about the poor state of programming
| languages in 2024.
|
| Of course there are others than C. But not many suitable for
| writing a kernel. And no obvious choice for what Linux could
| do today. Of course they are starting with Rust, but nobody
| can predict when that will make the last C macro unnecessary.
| Unless your prediction is never...
| pjmlp wrote:
| Plenty of kernels have been written in other languages,
| naturally UNIX/POSIX folks can hardly think of something
| else.
|
| Even Linus accepting Rust is kind of interesting, because
| his C++ rants also apply to plenty of Rust code bases.
| peheje wrote:
| Why don't they define a min/max for each primitive type? How
| many are there, 15?
| kzrdude wrote:
| Even use a macro to define all typed min max macros
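|
| A minimal sketch of that idea, using an X-macro to stamp out one
| min/max pair per type. The type list and the min_<suffix> /
| max_<suffix> naming are assumptions made for illustration, not
| anything taken from the kernel:

```c
#include <assert.h>

/* Hypothetical type list: one X(suffix, type) entry per supported type. */
#define MINMAX_TYPES(X)       \
    X(int, int)               \
    X(uint, unsigned int)     \
    X(long, long)             \
    X(ulong, unsigned long)   \
    X(double, double)

/* Expands to a pair of functions such as min_int() and max_int(). */
#define DEFINE_MINMAX(suffix, type)                       \
    static inline type min_##suffix(type a, type b)       \
    { return a < b ? a : b; }                             \
    static inline type max_##suffix(type a, type b)       \
    { return a > b ? a : b; }

/* Generate all ten functions from the list above. */
MINMAX_TYPES(DEFINE_MINMAX)

/* Small self-check exercising a few of the generated functions. */
static inline void minmax_selfcheck(void)
{
    assert(min_int(3, -7) == -7);
    assert(max_ulong(2UL, 9UL) == 9UL);
    assert(max_double(1.5, 0.25) == 1.5);
}
```

| The caller has to pick the function by name, so any conversion is
| at least visible at the call site, but arguments of another type
| are still converted silently unless warnings such as -Wconversion
| are enabled.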
| wahern wrote:
| The reason the current macro is so complex is because it
| supports _mixed_ types while avoiding (failing on) integer
| promotion bugs. A version supporting arguments of all the
| same type would be just as trivial as in C++ (albeit relying
| on GCC extensions like statement expressions).
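|
| For reference, a sketch of the simple variant, assuming GNU C
| (statement expressions and __typeof__, as accepted by GCC and
| Clang). The name min_once is made up here, and this version does
| nothing to reject mixed types; it only avoids evaluating an
| argument twice:

```c
#include <assert.h>

/* GNU C only: copy each argument into a temporary, then compare. */
#define min_once(a, b) (__extension__ ({   \
        __typeof__(a) _a = (a);            \
        __typeof__(b) _b = (b);            \
        _a < _b ? _a : _b; }))

static int calls;                           /* counts next_value() calls */
static inline int next_value(void) { return ++calls; }

/* Shows that an argument with side effects is evaluated exactly once. */
static inline void min_once_selfcheck(void)
{
    calls = 0;
    int m = min_once(next_value(), 10);
    assert(m == 1);
    assert(calls == 1);
}
```

| With the classic ?: macro, next_value() would have been called
| twice in that check.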
| colonwqbang wrote:
| Genericity is not really the issue as I see it. The basic K&R
| macro using ?: is fully generic. The problem is that C has
| implicit conversion between essentially any numeric types. The
| main point of the kernel macro was to prevent such conversions.
|
| I think implicit type conversion is a mistake, perhaps one of
| the few true design flaws in C. Languages like haskell and rust
| went with explicit conversions which is probably a better idea
| overall, even if it does increase the code verbosity a bit. C++
| instead doubled down and added many more ways for implicit
| conversions to happen.
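|
| A small illustration of the conversion problem, using the classic
| macro under the made-up name KR_MIN. The usual arithmetic
| conversions turn -1 into a huge unsigned value before the
| comparison happens, so the "minimum" of -1 and 1u comes out as 1:

```c
#include <assert.h>   /* static_assert (C11) */

/* The classic, fully generic macro. */
#define KR_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Same signedness on both sides: behaves as expected. */
static_assert(KR_MIN(-1, 1) == -1, "signed vs signed");

/* Mixed signedness: -1 is converted to unsigned int (UINT_MAX),
   so the comparison is false and the result is 1u. */
static_assert(KR_MIN(-1, 1u) == 1u, "signed vs unsigned picks 1u");
```

| Compilers typically only warn about this under options such as
| -Wsign-compare, which is why a macro that refuses such mixes can
| be attractive.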
| wakamoleguy wrote:
| Why are these defined as macros at all? A function call would
| come with overhead, of course, but wouldn't compilers be able to
| inline that anyways?
| lifthrasiir wrote:
| C doesn't have any type-generic function declaration. So any
| viable solution had to be at least partially powered by a macro
| and resulting (one-directional) type inference.
| kevin_thibedeau wrote:
| C11 does with _Generic but it is limited in ways that make
| min/max implementations flaky.
| pyth0 wrote:
| Even with _Generic you can't declare generic functions.
| You'd still need a macro call that uses _Generic to
| dispatch different implementations depending on the
| parameter types.
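|
| A sketch of what such a dispatching macro could look like in C11.
| The helper names (min_i, min_l, min_d) and the choice to dispatch
| on the type of (a) + (b) are assumptions for illustration; the
| controlling expression of _Generic is not evaluated, only its type
| is used:

```c
#include <assert.h>

/* One ordinary function per supported type. */
static inline int    min_i(int a, int b)       { return a < b ? a : b; }
static inline long   min_l(long a, long b)     { return a < b ? a : b; }
static inline double min_d(double a, double b) { return a < b ? a : b; }

/* Select an implementation from the type the two arguments would
   have after the usual arithmetic conversions. Result types that
   are not listed (for example unsigned int) fail to compile. */
#define generic_min(a, b)                     \
    _Generic((a) + (b),                       \
             int: min_i,                      \
             long: min_l,                     \
             double: min_d)((a), (b))

static inline void generic_min_selfcheck(void)
{
    assert(generic_min(3, 7) == 3);        /* int    -> min_i */
    assert(generic_min(2L, -5) == -5L);    /* long   -> min_l */
    assert(generic_min(1.5, 2) == 1.5);    /* double -> min_d */
}
```

| The macro is still required: _Generic only chooses between
| functions that already exist, it does not declare a generic one.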
| dwattttt wrote:
| Generic dispatching all the type combinations (or warning
| or erroring) wouldn't be a problem in a project the scale
| of the Linux kernel.
|
| I'm unfamiliar with the incantations needed to try to
| preserve constant expressions though, that might be too
| much for them.
| ufo wrote:
| Another issue is that some of these macros are intended to be
| used in a constant context, such as array dimensions.
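|
| To make the constraint concrete: a plain ?: macro yields an integer
| constant expression when its arguments are constants, so it can
| size a file-scope array, while a statement-expression macro is
| rejected by GCC outside of a function body. The names below
| (CONST_MAX, HDR_LEN, TRAILER_LEN, scratch) are hypothetical:

```c
#include <assert.h>   /* static_assert (C11) */

/* Simple form: usable wherever a constant expression is required. */
#define CONST_MAX(a, b) ((a) > (b) ? (a) : (b))

enum { HDR_LEN = 24, TRAILER_LEN = 40 };   /* made-up sizes */

/* A file-scope array dimension must be an integer constant expression. */
static unsigned char scratch[CONST_MAX(HDR_LEN, TRAILER_LEN)];

/* Checked at compile time: the buffer takes the larger of the two. */
static_assert(sizeof(scratch) == 40, "scratch is sized by the larger constant");
```

| The price of the simple form is that each argument is evaluated
| more than once, which is harmless for constants but not for
| expressions with side effects.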
| LeifCarrotson wrote:
| *constant
| account42 wrote:
| What C programmers will do to avoid using even a little bit of
| C++.
| pjmlp wrote:
| Ever since CFront was born in the same UNIX building at Bell
| Labs.
| Sharlin wrote:
| To be fair, it's not like you can just add a "little bit of
| C++" to the kernel.
| jonathrg wrote:
| C codebases can often be updated with minor changes to
| compile with a C++ compiler, after which C++ features can be
| gradually introduced. Is the Linux kernel different in this
| regard?
| creeble wrote:
| Read the LWN comments.
| Groxx wrote:
| From a moderate skim, I'm not seeing much in there that
| really addresses this beyond FUD and over-simplification.
|
| Which, I mean... it's the Internet. That's kinda
| expected. But if there's something specific you're
| seeing, could you link to it? I'm curious as well what
| the "stick to C" crowd's reasons are. "C plus this one
| feature of C++" seems rather defensible at a glance
| (ignoring the social difficulty in choosing _which_
| feature), but I'm sure it's much more complicated than
| that in practice.
| dzaima wrote:
| Even if a project can choose some specific feature(s) to
| switch to C++ for, it'd severely reduce the barrier for
| adding reliance on more features; why did feature X get a
| pass, but not this other one? And C++ has a lot of such
| potential features that are harmless and easy to justify
| at their best, but can become headaches when used more
| broadly, requiring everyone and everything to deal with
| them.
| Groxx wrote:
| The social aspects are _very_ large and it wouldn't
| surprise me at all if that was by far the main reason...
| but that's much less of an issue in a kernel-like context
| where there are already oodles of rules beyond "write
| valid C code". They can and do impose significant
| limitations on the languages they use, successfully, for
| decades. Seems like they'd be able to do that with C++
| too.
| jonathrg wrote:
| There is a comment purporting to show a difference
| between compilation speeds in C and C++ which uses
| iostream for the C++ example, which completely misses the
| point. Sure C++ has a lot of warts, but in terms of
| compilation speed it should be completely reasonable to
| use a few simple template functions.
| Groxx wrote:
| Yeah, that's the main thing I saw and it's just plain
| completely wrong. The fact that C++ can _more easily_
| bloat into large build times, and people frequently make
| larger-build-time projects in it, implies absolutely
| nothing about its behavior in replacing simple macros
| like max/min.
|
| There's ample evidence that it'll build just as fast as C
| there, so it's not an issue in this context, and that
| both can build quickly with care. That's kinda the point
| of C++: you can write plain C code plus [this one thing]
| and you basically don't pay for the rest, and it
| generally achieves that.
| jonathrg wrote:
| They do not address the question at all. Please write an
| answer yourself.
| wahern wrote:
| The specific macro discussed in the article is using
| various GCC builtins to safely support mixed integer
| types; i.e. failing on unsafe type promotions or
| coercion, but otherwise working automagically. C++
| std::min, by contrast, requires all the values to be the
| same type. AFAIU, to accomplish the same semantics in C++
| would require either template metaprogramming, or doing
| something similar to what the current version is doing
| with macros and GCC builtins. A C++ solution might
| ultimately be cleaner, but I don't think there's anything
| in the standard C++ library that is a drop-in
| replacement.
| jcelerier wrote:
| $ cd /usr/src/linux
| $ rg ' class;'
| vmlinux.h
| 8541:struct class;
| 13515: long unsigned int class;
| 16416: unsigned int class;
| 16917: u8 class;
| 17351: u8 class;
| jenadine wrote:
| Yes, with minor changes. A few variables need to be
| renamed, a few casts need to be added. Some churn for sure
| on such a big codebase, but doable nevertheless. GCC did
| it, other projects did it, I don't see why the kernel
| can't.
| samatman wrote:
| Yes.
|
| https://harmful.cat-v.org/software/c++/linus
| cozzyd wrote:
| Imagine what the equivalent templated C++ code will expand to!
| OskarS wrote:
| It will be MUCH less, and much less complex than this.
| alerighi wrote:
| It would be far easier to add the missing features to GCC as
| builtins than to change the language, which would involve
| rewriting a ton of code (C and C++ are not 100% compatible, so
| even switching from C to C++ may introduce bugs that are
| difficult to spot because of the incompatibilities). The Linux
| kernel already uses a ton of GCC extensions (even the min/max
| macros suggested in the article) that are not compatible with
| other compilers anyway (and I don't see a reason for them to be,
| since GCC is the compiler of the GNU project; nobody is likely to
| compile the Linux kernel with MSVC, and I don't see much reason
| to use clang either).
| gpderetta wrote:
| > The Linux kernel already uses a ton of GCC extensions
|
| You'll be happy to know that GCC happily compiles C++ then!
| No need to switch to MSVC.
|
| In fact GCC itself successfully switched to C++ from C a few
| years ago.
| microtherion wrote:
| To paraphrase Henry Spencer, "Those who dislike C++ are
| condemned to reinvent it, poorly".
| up2isomorphism wrote:
| Every time something like this happens, it becomes an opportunity
| to suggest another language. But for those who suggest another
| language: are you going to replace a 50M-line code base because
| you want to have a fancy min/max? I don't think this is a
| responsible suggestion.
| orf wrote:
| That's a strawman argument: nobody seriously considers a
| different language because of a "fancy min/max".
|
| However, hygienic macros that fix entire classes of issues
| (including this) are a more compelling argument.
|
| Not to say that it makes the case, only that the strawman you
| wrote is not good.
| tuveson wrote:
| How would hygienic macros fix this? It seems like the macros
| were working fine, but the amount of generated code increased
| compile time significantly. Wouldn't a more sophisticated
| macro system still generate a bunch of extra code and result
| in slow compile times? It even seems like the solution was to
| fall back to "dumb" macros when feasible for compile-time
| performance reasons.
| tetha wrote:
| > Wouldn't a more sophisticated macro system still generate
| a bunch of extra code and result in slow compile times?
|
| Not necessarily.
|
| The preprocessor code here picks up the original source,
| and blows up the initial code (which is about "min3(long_a,
| long_b, long_c)") to 47 MB of code. no fancy stuff, just
| 47MB of C-Code on the disk. That's a lot of code the
| compiler then has to parse and handle.
|
| If hygienic macros are a first-class citizen in the
| compiler, the compiler parses the original macro code once
| and then just modifies it in-memory. There is no reason to
| write 47MB of code somewhere and read it back, this would
| just happen as an AST modification in memory.
|
| But that is also a much smaller reason. First-class macros
| allow the compiler to reason based off of the types and
| structure of the macro inputs. You don't have to guess if
| something is constant, bounded, unbounded and such. Strong
| types can enforce this safely and macros and optimization
| can use these strong guarantees. And sufficiently strong
| type information can open doors for far, far more powerful
| optimizations overall.
|
| Just for the record - I'm fully aware why the kernel is
| where it is, and why it will stay there, but compiler and
| language theory have improved a lot since then.
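|
| The textual blow-up can be observed on a small scale with nothing
| but the preprocessor. This sketch stringifies the expansion of a
| two-argument and a three-argument minimum built from the classic
| ?: macro (the names KR_MIN, KR_MIN3, min_text and min3_text are
| made up here; the kernel macros are far larger). Because every
| argument appears twice in the expansion, each level of nesting
| more than doubles the text:

```c
#include <assert.h>   /* static_assert (C11) */

#define STR(x)  #x
#define XSTR(x) STR(x)   /* expand x fully first, then stringify it */

#define KR_MIN(a, b)     ((a) < (b) ? (a) : (b))
#define KR_MIN3(a, b, c) KR_MIN(a, KR_MIN(b, c))

/* The expanded source text, captured as string literals. */
static const char min_text[]  = XSTR(KR_MIN(x, y));
static const char min3_text[] = XSTR(KR_MIN3(x, y, z));

/* One extra level of nesting more than doubles the expansion. */
static_assert(sizeof(min3_text) > 2 * sizeof(min_text),
              "nested expansion grows faster than linearly");
```

| A compiler that expands macros on a parsed tree could share the
| repeated subtrees instead of re-reading the same text, which is
| the difference described above.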
| bluedino wrote:
| I've seen things like Boost included to only use a single
| function.
| Borg3 wrote:
| This also shows how programmers sometimes use far too fancy stuff
| for what they do. If you are doing some intense computation,
| split it, or do it and later slap sanity check inside. It will be
| even more readable. If you just play with a bunch of vars, sure..
| min/max macros can be easier to read.
| wood_spirit wrote:
| Could not the main compilers get involved and add builtins so an
| ifdef makes the common path on the main compilers like gcc use
| builtins and the slowdown only hurts those using the less
| mainstream compilers? If it takes off the other compilers would
| quickly add the feature.
| sapiogram wrote:
| What you're suggesting is basically a worse version of adding
| built-in min() and max() to the C spec. Which, to be fair,
| would be quite nice, but I guess the working group didn't want
| it in the standard.
| layer8 wrote:
| This reminds me of how ~25 years ago I wrote a little C macro
| library to perform safe (non-overflowing, and/or saturating)
| integer arithmetics and comparisons, however the small
| application I wanted to use it in had the compiler crash after
| 2-3 hours due to insufficient RAM+swap when compiling a single
| source file using those macros. It turned out the macro
| expansions made the translation unit grow to GB size, which must
| have been 4-5 orders of magnitude over its original size.
| sapiogram wrote:
| How did that end up happening? Did you define addition
| recursively or something?
| layer8 wrote:
| The macros automatically derived the signedness and the
| minimum and maximum value of the integral types involved, in
| a way that didn't make platform-specific assumptions like
| two's complement or no padding bits. Cases like comparing
| signed long to unsigned long also needed extra logic, due to
| there being no larger type that encompasses both. I don't
| remember the details, but it did have a significant number of
| nested macro invocations.
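|
| Not the library described above, just a hedged sketch of the kind
| of building blocks involved: a signedness test that works for any
| integer type, the maximum of an unsigned type, and a signed long
| vs unsigned long comparison that gives the mathematically correct
| answer. All names are invented for this example:

```c
#include <assert.h>   /* static_assert (C11) */

/* Nonzero when integer type T is signed: (T)-1 is negative only then. */
#define TYPE_IS_SIGNED(T) ((T)-1 < (T)0)

/* Maximum of an unsigned type T: converting -1 wraps to the top value. */
#define UTYPE_MAX(T) ((T)-1)

static_assert(TYPE_IS_SIGNED(int), "int is signed");
static_assert(!TYPE_IS_SIGNED(unsigned long), "unsigned long is not");

/* Correct a < b for a signed/unsigned pair: a negative a is always
   smaller; otherwise a fits in unsigned long and can be compared. */
static inline int long_less_than_ulong(long a, unsigned long b)
{
    return a < 0 || (unsigned long)a < b;
}
```

| Deriving the limits of signed types without assuming two's
| complement takes considerably more machinery, which is presumably
| where much of the nesting came from.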
| usefulcat wrote:
| > some of the changes to the macros made some developers
| (including Bergmann) nervous
|
| These macros are now so complex that they're reluctant to touch
| them. Seems like there is a clear need for some thorough tests
| here? This is exactly the sort of thing that is eminently
| testable.
| Vecr wrote:
| Maybe. Most C code of non-trivial length is UB or at least
| implementation defined, you really don't want to "tickle"
| something in a low-level highly used macro.
| usefulcat wrote:
| Exactly why those macros ought to have tests.
| Vecr wrote:
| You're exactly right, sorry. I meant that even with tests
| it's probably too subtle to do the standard test driven
| design method of programming, you have to be way more
| careful.
| Joker_vD wrote:
| (void) (&_x == &_y);
|
| Is this... a check for type-compatibility? I don't think it actually
| produces a compilation error if the types are incompatible.
| cmovq wrote:
| It produces a warning for incompatible pointer types.
| Joker_vD wrote:
| Okay, it's better than nothing.
| lokar wrote:
| Which you can make an error
| mananaysiempre wrote:
| FWIW, (&_x - &_y) will give you an error instead.
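|
| A sketch of how that line fits into a complete macro, assuming
| GNU C (statement expressions and __typeof__). The name checked_min
| is made up; the pointer comparison is discarded at run time and
| exists only so the compiler can complain about mismatched types:

```c
#include <assert.h>

/* GNU C only. Comparing the addresses of the two temporaries makes
   the compiler diagnose arguments whose types differ. */
#define checked_min(x, y) (__extension__ ({   \
        __typeof__(x) _x = (x);               \
        __typeof__(y) _y = (y);               \
        (void)(&_x == &_y);                   \
        _x < _y ? _x : _y; }))

static inline void checked_min_selfcheck(void)
{
    long a = 3, b = 9;
    assert(checked_min(a, b) == 3);   /* same types: no diagnostic */
    /* checked_min(a, 1u) still compiles, but GCC is expected to warn
       "comparison of distinct pointer types lacks a cast". */
}
```

| Building with -Werror turns that warning into a hard failure, as
| noted above.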
| kibwen wrote:
| The most pleasant C codebases that I have read essentially banned
| macros outside of includes, conditional compilation, and named
| constants. Please just stop trying to use macros to metaprogram
| in C.
___________________________________________________________________
(page generated 2024-08-08 23:00 UTC)