[HN Gopher] Macros on Steroids: How pure C can benefit from meta...
       ___________________________________________________________________
        
       Macros on Steroids: How pure C can benefit from metaprogramming
        
       Author : todsacerdoti
       Score  : 69 points
       Date   : 2021-07-22 14:41 UTC (8 hours ago)
        
 (HTM) web link (hirrolot.github.io)
 (TXT) w3m dump (hirrolot.github.io)
        
       | gsmecher wrote:
       | There are good reasons to reach for macro libraries. They're
       | rare, but they're out there. I make heavy use of
       | boost::preprocessor in a C project and it was fundamentally
       | enabling. I have occasionally cursed at it but never regretted
       | it.
       | 
       | It is basically impossible to provide a "kind" development
       | environment using deep macro trickery. The error messages you get
       | from this kind of library are a major handicap. This failing is
       | intrinsic to the landscape and not something you can really fix.
       | 
       | Historically, macro processors in major compilers were
       | inconsistent, limited, and arbitrary, making heavy macro use
       | frustrating, limited, non-portable. I think this is mostly
       | historical now.
       | 
       | IMO people should prefer boost::preprocessor for purposes like
       | this (even in pure C), since it is battle-tested and more
       | polished than "little" metaprogramming libraries will likely ever
       | be. (This is advice to possible users, and not a knock on the
       | author. Props for exploring the space and building something
       | useful.)
        
         | ska wrote:
         | > It is basically impossible to provide a "kind" development
         | environment using deep macro trickery.
         | 
         | Unless it's in a language actually designed for meta-
         | programming. Obviously if you are already in a C project this
         | might not be practical, but if you need a lot of it you may
         | just be using the wrong tool for the job.
        
           | gsmecher wrote:
           | I was specifically commenting on cpp-style macros. Apologies
           | if this was unclear.
        
         | nicoburns wrote:
         | If you need these kinds of advanced metaprogramming
         | capabilities then I feel like that's a sign that maybe plain C
         | isn't suitable for your project and you'd be better off switching
         | to Zig, Rust, D or C++. These all provide powerful compile-time
         | execution capabilities and the ability to expose C interfaces.
        
           | lhorie wrote:
           | Even those may not be powerful enough. I have a use case
           | where I was trying to read specification files to generate
           | unicode tables. But using something like zig's @embedFile to
           | get a handle to a []u8 in a comptime block doesn't work
           | because I don't want the file to be embedded into the binary;
           | I want something more akin to turing complete code
           | generation. And I don't need it to run it on every compile,
           | as the entire flow involves downloading said files off the
           | internet, but only when a new version of the spec is
           | released. I ended up just using code generation.
        
             | samatman wrote:
             | Zig is a work in progress, and I bet the community is
             | interested in supporting this sort of use case.
             | 
             | Might be nice to bring it to their attention, if you
             | haven't already, and if you have the time and inclination.
        
           | Hirrolot wrote:
           | Only if you can switch to something like Zig, Rust, etc. But
           | a tremendous number of code bases are already written in
           | plain C. Using preprocessor metaprogramming allows you to
           | invoke macros right in the same files in which you write
           | ordinary C code, without the need for integration with other
           | languages.
        
             | nicoburns wrote:
             | > Using preprocessor metaprogramming allows you to invoke
             | macros right in the same files in which you write ordinary
             | C code
             | 
             | C++ effectively lets you do this too. And Zig will let you
             | import and compile C files seamlessly. I suppose you have
             | to switch compilers, but that's not so different to adding
             | the additional pre-processor compiler.
        
               | Hirrolot wrote:
               | C++ can, fair. Still there is a huge number of developers
               | sticking to plain C, even if they can use C++ (FFmpeg,
               | VLC, Linux, etc.), why do they do that? I think an answer
               | to this question would be pretty much the answer for the
               | initial question. There might be cultural, historical,
               | and other reasons for it.
               | 
               | The thing with Zig is that, if I understand it correctly,
               | Zig can't be interleaved with other C code in the same
               | file, which forces you to separate things, losing
               | convenience. On the other hand, to use Datatype99, you
               | can just #include <datatype99.h> and you're ready to go.
        
             | gsmecher wrote:
             | To date, C's most proximate (management-friendly,
             | technically similar, programmer-skill-adjacent) competition
             | has been c++. (I'm trying to lay the groundwork for the
             | next statement: I'm not claiming this is good, and excited
             | to see Rust and friends as serious contenders. Stay with
             | me.)
             | 
             | Even in c++20, there are things you can do with macros you
             | can't do in the language. This may change when reflection
             | lands in c++2x, but I won't be able to use it for years and
             | widespread adoption won't be possible for another decade.
        
               | spaetzleesser wrote:
               | When I look at a lot of things that are done with
               | reflection in C# I personally would much prefer the
               | preprocessor and macros. Things that could be done in a
               | type safe manner with a macro in a few lines become way
               | more complex when using reflection.
        
               | touisteur wrote:
               | I'm a huge fan of java _because_ of reflection. It can be
               | so empowering and awesome. But lately I've come to
               | prefer codegen+JIT when possible. So, discovery through
               | introspection, and to call/read/modify stuff I'll
               | generate the code.
        
               | mumblemumble wrote:
               | Could I beg of you to give an example?
               | 
               | Back when I was working in C#, there were quite a few
               | times when I'd encounter things being done with
               | reflection that I wish had been done with less code and in
               | a more type-safe manner by just... not using reflection. I
               | always just assumed a lot of that was just bad habits
               | left over from the .NET 1.0 days, back before generics.
               | So I'm curious if there's much overlap there? If it's
               | only a subset, I don't know C as well as I'd like, so it
               | would be interesting to see something that the C
               | preprocessor handles better.
        
           | Zababa wrote:
           | If you want ADTs and can afford the (slight) performance hit
           | and the higher resource consumption, you can even look into
           | OCaml.
        
         | Hirrolot wrote:
         | > IMO people should prefer boost::preprocessor for purposes
         | like this (even in pure C), since it is battle-tested and more
         | polished than "little" metaprogramming libraries will likely
         | ever be.
         | 
         | In fact, I used to program in boost::preprocessor before,
         | trying to implement ADTs [1]. At the end of the game, I just
         | realised I didn't have enough patience and whatever else to
         | continue using it. I would accidentally find that some macro got
         | blocked, then spend half a day debugging it, only to eventually
         | come up with ugly workarounds.
         | 
         | Then I made Metalang99. It has many neat features but really
         | the very motivation for it was to get rid of macro recursion
         | blocking. For example, if you call FOO, then FOO calls BAR, BAR
         | calls JAR, and JAR calls FOO, everything should work as
         | expected. This was a complete nightmare with
         | boost::preprocessor.
         | 
         | So in conclusion, I agree that boost::preprocessor is more
         | polished and can work on ancient compilers too. However, since
         | nowadays the situation with preprocessors is better, I would
         | endeavour to use more modern tools which offer greater
         | convenience and flexibility.
         | 
         | [1]: https://github.com/Hirrolot/poica
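
The recursion blocking described above can be reproduced with three plain macros. This is a minimal sketch, not code from Metalang99 or boost::preprocessor; the helper names STR, expanded_foo, recursion_was_blocked and check_blocking are assumptions made for illustration. Once FOO is being expanded, the preprocessor will not expand FOO again inside that expansion, so the cycle stops with the name left as a plain identifier.

```c
#include <assert.h>
#include <string.h>

/* Two-step stringification: STR expands its argument first, then quotes it. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* A cycle of function-like macros, as in the comment: FOO -> BAR -> JAR -> FOO. */
#define FOO(x) BAR(x)
#define BAR(x) JAR(x)
#define JAR(x) FOO(x)

/* Returns the text that FOO(1) expands to. While FOO is being expanded,
 * a nested FOO is not expanded again, so it survives as an identifier. */
const char *expanded_foo(void) {
    return STR(FOO(1));
}

/* Returns 1 if the cycle was cut short, i.e. the name FOO is still present. */
int recursion_was_blocked(void) {
    return strstr(expanded_foo(), "FOO") != NULL;
}

/* Aborts if the preprocessor did not block the recursive call. */
void check_blocking(void) {
    assert(recursion_was_blocked());
}
```

Libraries that offer recursion have to work around this rule, which is why the workaround strategy is a central design point for them.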
        
           | gsmecher wrote:
           | You're absolutely correct, boost::preprocessor has an
           | obligation towards backwards compatibility that independent
           | efforts do not. This obligation is a double-edged sword: when
           | I pick a library, I want it to be modern; 5 years later, I
           | shouldn't be forced to refactor my code to match a library
           | update. There's a good parallel here with boost::python vs.
           | pybind11; there's always room for more and newer entrants,
           | but most of them will not overcome the first-mover advantage.
           | 
           | On the boost front, I should have noted the difference
           | between VMD [1] and boost::preprocessor [2]. I think VMD [1]
           | trades some backwards compatibility for coherence and
           | consistency and is a better starting point for new code.
           | 
           | [1]: https://www.boost.org/doc/libs/1_70_0/libs/vmd/doc/html/
           | inde...
           | 
           | [2]: https://www.boost.org/doc/libs/1_76_0/libs/preprocessor/
           | doc/...
        
       | IncRnd wrote:
       | You could create this as a frontend macro preprocessor that
       | implements new language constructs. Maybe it could be called
       | cfront. I just hope that the usefulness of that doesn't lead to
       | bloated and complex codebases and binaries as features are added
       | to the preprocessor.
        
         | Hirrolot wrote:
         | There are several reasons not to resort to third-party code
         | generators [1]. This includes the burden of maintenance (think
         | about the build process) and seamless integration of native
         | macros with other code.
         | 
         | Do you have some code to show bloated binaries with
         | Datatype99/Interface99? The last time I checked the generated
         | assembly code, it was nearly of the same size as hand-written
         | code [2].
         | 
         | [1]: https://github.com/Hirrolot/metalang99#q-why-not-third-
         | party...
         | 
         | [2]: https://godbolt.org/z/ns6Ma7csd
        
           | Banana699 wrote:
           | Original comment was satirizing C++, it started as a
           | (relatively) simple code generator that consumed a custom OOP
           | variant of C and produced pure C. It grew into... well C++.
        
           | IncRnd wrote:
           | Thank you. I wasn't talking about what you mentioned. Cfront
           | was the preprocessor that implemented c++ for the first
           | decade of its existence.
        
       | [deleted]
        
       | enhray wrote:
       | The C preprocessor is definitely ill-suited for metaprogramming,
       | but it was never the C way, was it?
       | 
       | The traditional way to do any kind of meaningful metaprogramming
       | in C is just a printf() to a .h or .c file which then is included
       | to your build.
       | 
       | There are a lot of projects doing that. Bison is supposed to be
       | used this way, other projects are doing build configuration like
       | that - by emitting a header with a ton of #define's, and there's
       | a ton of languages which use C as a compilation target - and you
       | can see what they are doing and get inspiration from that.
       | 
       | In my opinion it's an extremely powerful model, much better than
       | anything you can do with the preprocessor.
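
The "printf() to a .h file" model described above might look like the following. This is only a sketch under assumptions: the struct setting type, the emit_config_header function, and the GENERATED_CONFIG_H guard name are hypothetical and not taken from Bison or any project mentioned in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical build setting; the names and values are made up. */
struct setting {
    const char *name;
    int value;
};

/* Writes a C header with one #define per setting to `out`.
 * Settings with a missing or empty name are skipped.
 * Returns the number of #define lines written. */
int emit_config_header(FILE *out, const struct setting *settings, int count) {
    assert(out != NULL && settings != NULL);
    int written = 0;
    fputs("/* Generated file, do not edit by hand. */\n", out);
    fputs("#ifndef GENERATED_CONFIG_H\n#define GENERATED_CONFIG_H\n\n", out);
    for (int i = 0; i < count; i++) {
        if (settings[i].name == NULL || strlen(settings[i].name) == 0) {
            continue;
        }
        fprintf(out, "#define %s %d\n", settings[i].name, settings[i].value);
        written++;
    }
    fputs("\n#endif /* GENERATED_CONFIG_H */\n", out);
    return written;
}
```

A small generator program would call this with a file opened for writing as an extra build step, and the rest of the code base would then #include the result like any other header.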
        
         | Hirrolot wrote:
         | If we are talking about such abstractions as Datatype99, this
         | model is less convenient than native macros [1]. You have to
         | write code in separate files, IDE support is lacking, the build
         | procedure becomes more complicated, etc.
         | 
         | [1]: https://github.com/Hirrolot/metalang99#q-why-not-third-
         | party...
        
           | enhray wrote:
           | Native macros were never supposed to be used that way. If
           | anything goes wrong, you still have to deal with that, and no
           | IDE will save you from having to invoke your compiler with
           | "preprocess-only" flag to see what you're dealing with. Been
           | there, done that, don't want to do that ever again.
           | 
           | Compared to that, debugging generated code is a breeze.
           | 
           | Also, there's no "third-party" generators - everything just
           | lives in your own source tree. If I ever need to go meta,
           | it's just a printf away; I can even commit the generated
           | files to my VCS and be able to see what had changed in them
           | between commits in a simple and understandable diff.
           | 
           | Regarding the integration, I'll take setting up an additional
           | build phase (once) over having to debug C macros any day.
        
             | Hirrolot wrote:
             | > Native macros were never supposed to be used that way. If
             | anything goes wrong, you still have to deal with that, and
             | no IDE will save you from having to invoke your compiler
             | with "preprocess-only" flag to see what you're dealing
             | with. Been there, done that, don't want to do that ever
             | again.
             | 
             | ~95% of errors from Datatype99 can be observed from the
             | console, I hardly ever run my compiler with -E. What I mean
             | by IDE support is that you invoke macros in the same files
             | in which you write ordinary C code, you can't do that with
             | printf. Imagine that you write your tagged unions
             | (datatype(...)) inside separate files, it's clearly less
             | convenient than embedded definitions.
             | 
             | > Regarding the integration, I'll take setting up an
             | additional build phase (once) over having to debug C macros
             | any day.
             | 
             | I can't remember the time when I debugged already written
             | and tested macros from Datatype99/Interface99, to be
             | honest.
        
       | Koshkin wrote:
       | I have never had a problem generating C code using C itself. (A
       | fancy way to do this would be to have an ASP-like preprocessor,
       | i.e. a simple program that processes <% ... %> code blocks; this
       | is essentially "inverse quoting," whereby everything that is
       | _outside_ the brackets becomes the contents of a string literal.)
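
The "inverse quoting" idea can be sketched in a few dozen lines. This is an assumption-laden illustration, not an existing tool: the names emit_literal and inverse_quote are hypothetical, and only backslash, double quote and newline are escaped, so other control characters in the template would need extra handling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Emits `len` bytes of template text as one fputs() call whose argument
 * is a C string literal. Empty runs produce no output. */
static void emit_literal(FILE *out, const char *text, size_t len) {
    if (len == 0) {
        return;
    }
    fputs("fputs(\"", out);
    for (size_t i = 0; i < len; i++) {
        switch (text[i]) {
        case '\\': fputs("\\\\", out); break;
        case '"':  fputs("\\\"", out); break;
        case '\n': fputs("\\n", out);  break;
        default:   fputc(text[i], out); break;
        }
    }
    fputs("\", stdout);\n", out);
}

/* Turns a template into C statements: text outside <% ... %> becomes
 * string literals, text inside is copied through as code.
 * Returns 0 on success, -1 if a "<%" has no matching "%>". */
int inverse_quote(const char *tmpl, FILE *out) {
    assert(tmpl != NULL && out != NULL);
    const char *p = tmpl;
    for (;;) {
        const char *open = strstr(p, "<%");
        if (open == NULL) {
            emit_literal(out, p, strlen(p));
            return 0;
        }
        emit_literal(out, p, (size_t)(open - p));
        const char *close = strstr(open + 2, "%>");
        if (close == NULL) {
            return -1;
        }
        fwrite(open + 2, 1, (size_t)(close - (open + 2)), out);
        fputc('\n', out);
        p = close + 2;
    }
}
```

The emitted statements would be wrapped in a main() and compiled, and running that program prints the final generated C source.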
        
       | jjnoakes wrote:
       | Can anyone with some real experience with this comment on the
       | debug visibility? In gdb or lldb or otherwise? How easy is it to
       | see what is going on, partially expand macros, get at
       | intermediate values, or evaluate things at the debug prompt? At
       | first glance it seems like it'd be quite obfuscated but perhaps I
       | will be pleasantly surprised by some experience reports.
        
         | Hirrolot wrote:
         | If you're asking about debugging macros, I've put a huge effort
         | into this [1]. There are several macros that help you test,
         | debug, and report errors at compile-time. For example, you can
         | make calls to ML99_abort and see the expansion with -E. Usually
         | I debug macros in this way.
         | 
         | If you're asking about debugging code generated by macros,
         | Interface99 has no problems with it since the generated code is
         | trivial, pretty much as if you wrote it by hand. Datatype99 can
         | introduce a little inconvenience into it as it generates a
         | single-step for-loop for each variable binding [2], but this
         | should not be a big problem either.
         | 
         | [1]: https://hirrolot.gitbook.io/metalang99/testing-debugging-
         | and...
         | 
         | [2]: https://hirrolot.github.io/posts/compiling-
         | algebraic-data-ty...
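
The "single-step for-loop" trick mentioned above is a common way for a macro to introduce a variable that is visible in the statement following it. The sketch below shows the general technique only; BIND, sum_of_squares and count_passes are hypothetical names, and this is not Datatype99's actual macro.

```c
#include <assert.h>

/* Introduces `name` for exactly one execution of the statement that follows.
 * The outer loop owns a run-once flag and the inner loop declares the
 * binding. Both clear the flag, so `break` or `continue` in the body
 * still leaves after a single pass. */
#define BIND(type, name, init)                                  \
    for (int name##_once = 1; name##_once; name##_once = 0)     \
        for (type name = (init); name##_once; name##_once = 0)

/* Two bindings stacked in front of one block, as a macro library might do. */
int sum_of_squares(int a, int b) {
    int result = 0;
    BIND(int, a2, a * a)
    BIND(int, b2, b * b) {
        result = a2 + b2;
    }
    return result;
}

/* Counts how often the body runs, even when it exits with `break`. */
int count_passes(void) {
    int passes = 0;
    BIND(int, x, 7) {
        passes += (x == 7);
        break;
    }
    return passes;
}
```

In a debugger each binding shows up as a loop that is entered and left once, which is the small stepping inconvenience the comment refers to.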
        
       | xvilka wrote:
       | There's also libCello[1] which provides useful abstractions.
       | Though, at this point I think choosing the Zig[2] programming
       | language, which lets you write compile-time logic in the same
       | language as the runtime, is preferable.
       | 
       | [1] http://libcello.org/
       | 
       | [2] https://ziglang.org
        
         | Hirrolot wrote:
         | Zig is an interesting alternative to C.
         | 
         | libcello has quite a different philosophy from Metalang99 and the
         | accompanying libraries like Datatype99. The thing is that the
         | former comes with its own runtime environment, but the latter
         | ones don't even require the standard library. I believe the
         | philosophy of C is closer to Metalang99/Datatype99/Interface99.
        
       | WalterBright wrote:
       | My not-so-humble opinion is that if you're using the C
       | preprocessor to do metaprogramming, you have outgrown C and need
       | a more powerful language.
        
         | d_tr wrote:
         | I humbly agree unless, of course, there is no other solution,
         | e.g. lack of compilers on your platform.
        
           | WalterBright wrote:
           | Even so, I used to do a lot of metaprogramming with the C
           | preprocessor back in the day. Over time, I grew disillusioned
           | with it and gradually removed it from my C code. It just
           | wasn't worth it.
           | 
           | P.S. I also did metaprogramming in C by writing C programs to
           | generate C code, such as converting an array of structs to
           | multiple arrays, one for each field (for efficiency). This
           | worked out fairly cleanly, but with D's ability to run
           | functions at compile time, I was able to put this back into
           | the main program.
        
             | touisteur wrote:
             | I'm curious about your opinion of AdaCore CCG, a SPARK-to-C
             | compiler, for targets that don't have an Ada compiler yet.
        
               | WalterBright wrote:
               | I do not know enough about them to have a worthwhile
               | opinion.
        
               | touisteur wrote:
               | OK sorry :-) thanks
        
         | wudangmonk wrote:
         | I agree that you shouldn't use the preprocessor for
         | metaprogramming, it sucks. Switching to C++ is not a solution
         | either since its template system is garbage for metaprogramming
         | as well.
         | 
         | Starting small with a small utility such as this one or
         | creating one of your own is the best approach in my opinion.
        
         | Hirrolot wrote:
         | See this comment branch:
         | https://news.ycombinator.com/item?id=27921095.
        
       | lhorie wrote:
       | Something about macros that I don't see discussed much is the
       | trade-off between improved clarity of intent vs learning curve
       | for contributors.
       | 
       | I don't have a ton of experience w/ C, but the large projects
       | I've seen appear to have a tendency to avoid heavy macro usage
       | (e.g. using `void *` for hashmap implementations instead of
       | reaching for macros, for example). I see macros appear more
       | frequently among those that prioritize machiavellian cleverness,
       | often under the pretext of squeezing performance: code written by
       | (g|x)ooglers, for example.
       | 
       | I'm curious what more experienced C developers think of jumping
       | into projects with heavy usage of macro-based DSLs, especially
       | the syntax-bending variety being suggested here. Does it get in
       | the way of contributing? Or of reasoning about performance? Etc.
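
The trade-off raised above can be seen in miniature with a swap helper instead of a hashmap, which keeps the example short. Both versions are sketches with hypothetical names (swap_generic, DEFINE_SWAP, swap_int, swap_double): the `void *` version is one ordinary function that erases the element type, while the macro version stamps out a typed function per element type.

```c
#include <assert.h>
#include <stddef.h>

/* Style 1: a single generic function. The element type is erased behind
 * void *, so the caller must pass the right size and gets no type checking. */
void swap_generic(void *a, void *b, size_t size) {
    assert(a != NULL && b != NULL);
    unsigned char *pa = a;
    unsigned char *pb = b;
    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}

/* Style 2: a macro that defines one typed function per element type.
 * Mismatched argument types are now diagnosed by the compiler, but a
 * reader has to look through the macro to find the function. */
#define DEFINE_SWAP(T, name)            \
    void name(T *a, T *b) {             \
        T tmp = *a;                     \
        *a = *b;                        \
        *b = tmp;                       \
    }

DEFINE_SWAP(int, swap_int)
DEFINE_SWAP(double, swap_double)
```

The first style is easy to step through and to find with a text search; the second gives type safety at the cost of one more layer for a new contributor to learn.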
        
         | nomel wrote:
         | > vs learning curve for contributors
         | 
         | Early on in my programming career, I implemented all sorts of
         | beautiful magic, but I was alone. Later, I realized that
         | beautiful magic was keeping me alone. Now, much later in my
         | career, I program mostly like a novice. I don't write clever
         | code, I unroll powerful one liners into boring for loops, and I
         | use all sorts of temporary variables to make intent perfectly
         | clear, because I want and need as much help as I can get.
         | Confusing contributors for some wanky obsession with
         | "terseness" or "elegance" is a good indication of either a
         | great team, or someone that works alone.
        
         | bumbada wrote:
         | I have seen complex macros a lot and talked with the people
         | that made them.
         | 
         | Most of them just were not conscious of the consequences of
         | their behavior. E.g., this behavior is typical of someone who
         | programs for a living but does not debug her own code, because
         | someone else in the company does. So she is isolated from the
         | consequences of her actions.
         | 
         | Making them understand is as easy as having the debugging people
         | program and the programming people debug for a while.
         | 
         | Nothing makes people understand like suffering through the bad
         | source code they have created themselves.
         | 
         | Then you give them solutions for their misery and they learn
         | pretty fast.
         | 
         | Experienced C programmers tend to avoid the trap of using heavy
         | macros because they have learned. But at the same time they
         | look for alternatives that make them more productive with some
         | kind of real metaprogramming.
        
         | Hirrolot wrote:
         | I'm currently using Datatype99 & Interface99 at work.
         | 
         | What can I say? Everything goes well, nothing critical has come up
         | so far, and my coworkers are able to understand the code.
         | 
         | Regarding macro-based eDSLs, I would not resort to them for
         | needs other than abstractions to be considered as a part of the
         | language (ADTs, interfaces, etc.), since they are like
         | languages on their own; when you try to understand/contribute
         | to eDSL-based code, you have to understand its syntax and
         | semantics, aside from how it actually solves a particular
         | problem in the problem domain.
         | 
         | In fact, as of now, I don't see much need for something like
         | Metalang99 except for Datatype99 & Interface99 (and probably my
         | feature request to libmprompt [1]).
         | 
         | [1]: https://github.com/koka-lang/libmprompt/issues/8
        
           | lhorie wrote:
           | That's interesting. The reason I'm asking is that I'm
           | interested in the open source adoption angle. For example,
           | say Linux went ham on DSLs, would people unwillingly learn it
           | for the sake of being able to be part of the large community,
           | would they embrace it as if it's the next best thing after
           | sliced bread, or is that just never going to happen because
           | of some strong conceptual aversion to the paradigm? Or say a
           | not-so-well-known C project wants to gain adoption/popularity,
           | would the choice of using DSLs be a showstopping
           | consideration?
        
             | Hirrolot wrote:
             | I'd wager no, as long as you remain sane and do not pollute
             | everything with fancy macro eDSLs. Specifically Datatype99
             | & Interface99 can be good for newcomers because they make
             | code more concise, less intimidating and intricate.
        
             | Zababa wrote:
             | I'd say it can be the opposite. If the DSL is addressing a
             | real and common problem in C codebases, it's better for
             | everyone to use it than for everyone to create their own
             | special DSL.
        
       | bumbada wrote:
       | >Have you ever envisioned the daily C preprocessor as a tool for
       | some decent metaprogramming?
       | 
       | Never had such a nightmare.
       | 
       | For metaprogramming C you should use real macros like Lisp. We
       | have been doing that for a long time.
       | 
       | Using the C preprocessor is a terrible idea. With the preprocessor
       | you could create code, but a professional environment requires
       | things like being able to go backwards, not just from Macro
       | source code to executable but from the executable to source code.
       | 
       | What do I mean with that?
       | 
       | In a professional environment, when something happens, for
       | example your program goes too slow for a customer you need to
       | understand what is happening as fast as possible. C MACROS and
       | preprocessor are evil for that.
       | 
       | The c preprocessor replaces something by something else and the
       | process is completely opaque. If you have multiple layers of
       | macros, code becomes impossible to follow, isolate and
       | understand with a debugger or a profiler.
       | 
       | All our code has C MACROS of any type forbidden, only permitted
       | in external libraries. Our build process detects C Macros and
       | stops compilation if it finds them.
       | 
       | Usually the way things work someone creates an easy C macro to
       | automate some small thing, then a month later someone else
       | creates another macro that uses the macro in a two layer system,
       | then someone else creates another macro over the macros and
       | leaves the old macros there.
       | 
       | That makes code extremely hard to understand, isolate,
       | modularize, trace or debug. C programmers could have the
       | temptation not to learn different tools for metaprogramming. We
       | don't let you do that, if you want to do metaprogramming you are
       | forced to learn the proper tools for the job.
       | 
       | That has made our codebase extremely robust. We use real
       | metaprogramming with our own tools, not hacks, and that gives us
       | a tremendous competitive advantage because problems take 1/10th
       | or 1/100th the time to be solved.
        
         | Hirrolot wrote:
         | > The c preprocessor replaces something by something else and
         | the process is completely opaque. If you have multiple layers
         | of macros, code becomes impossible to follow, isolate and
         | understand with a debugger or a profiler.
         | 
         | It's not true for Datatype99 & Interface99. Their code
         | generation semantics are completely transparent to a user of
         | these macros [1] [2].
         | 
         | > That has made our codebase extremely robust. We use real
         | metaprogramming with our own tools, not hacks, and that gives
         | us a tremendous competitive advantage because problems take
         | 1/10th or 1/100th the time to be solved.
         | 
         | If you use third-party tools and you're okay with that, I'm not
         | saying you should stop using them, I'm saying that there is
         | another solution with advantages over third-party tools [3].
         | 
         | If native macros haven't worked for your codebase, it doesn't
         | mean they don't work for others. I would not say that the
         | preprocessor is a thing to always avoid -- there are many
         | examples why it is helpful, and even more helpful than any kind
         | of third-party tools you can come up with.
         | 
         | > Usually the way things work someone creates an easy C macro to
         | automate some small thing, then a month later someone else
         | creates another macro that uses the macro in a two layer
         | system, then someone else creates another macro over the macros
         | and leaves the old macros there.
         | 
         | How are third-party tools different from native macros in this
         | case?
         | 
         | [1]: https://github.com/Hirrolot/datatype99#semantics
         | 
         | [2]: https://github.com/Hirrolot/interface99#semantics
         | 
         | [3]: https://github.com/Hirrolot/metalang99#q-why-not-third-party...
        
       ___________________________________________________________________
       (page generated 2021-07-22 23:01 UTC)