[HN Gopher] Type-Safe Printf for C
___________________________________________________________________
Type-Safe Printf for C
Author : tinkersleep
Score : 48 points
Date : 2021-12-12 10:47 UTC (2 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| Animats wrote:
| GCC has had printf checking for, what, 20 years?
| WalterBright wrote:
| D supports calling C functions directly, including printf. When
| we added printf format checking against the arguments, many bugs
| were exposed and fixed. It was a big win.
| kazinator wrote:
| GCC has this also; this work is different because it removes
| the type errors. Instead of having an error like "%d expects a
| parameter of type int, not char *", it just lets %d print the
| string anyway. It's more like format in Lisp, say:
|
| [1]> (format t "~05,'0d" 5)
| 00005
| NIL
| [2]> (format t "~05,'0d" "abc")
| 00abc
| NIL
| AlexanderDhoore wrote:
| How is this safer than enabling all warnings in GCC or clang?
| 'Type-safe' in this context does not mean that you get more
| compile errors, but that the format specifier does not need to
| specify the argument type, but just defines the print format. In
| fact, format strings with this library will have less compile-
| time checking (namely none) than with modern compilers for
| standard printf. This approach is still safer.
|
| EDIT: The answer is at the bottom, apparently. Maybe put that up
| higher?
| tinkersleep wrote:
| Ok, thanks for the hint. I put the most important information
| into the intro: you just don't need the 'll', 'l', 'z' modifiers
| for specifying sizeof(operand), as the compiler does that via
| _Generic.
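
To illustrate the point about length modifiers, here is a minimal sketch of how _Generic can pick the right conversion from the argument type. The names FMT_DEC and fmt_* are hypothetical and are not taken from the library; this only shows the general technique, assuming a C11 compiler.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers (not the library's API): one formatter per
 * integer type, each using the matching standard length modifier. */
static int fmt_int(char *buf, size_t n, int v)
{ return snprintf(buf, n, "%d", v); }
static int fmt_long(char *buf, size_t n, long v)
{ return snprintf(buf, n, "%ld", v); }
static int fmt_llong(char *buf, size_t n, long long v)
{ return snprintf(buf, n, "%lld", v); }
static int fmt_uint(char *buf, size_t n, unsigned v)
{ return snprintf(buf, n, "%u", v); }
static int fmt_ulong(char *buf, size_t n, unsigned long v)
{ return snprintf(buf, n, "%lu", v); }
static int fmt_ullong(char *buf, size_t n, unsigned long long v)
{ return snprintf(buf, n, "%llu", v); }

/* The compiler selects the formatter from the static type of x, so the
 * caller never writes 'l', 'll' or 'z'. Types not listed here (short,
 * char, pointers, ...) are rejected at compile time. */
#define FMT_DEC(buf, n, x) _Generic((x), \
        int: fmt_int,                    \
        long: fmt_long,                  \
        long long: fmt_llong,            \
        unsigned: fmt_uint,              \
        unsigned long: fmt_ulong,        \
        unsigned long long: fmt_ullong)((buf), (n), (x))
```

On common platforms size_t is one of the unsigned types listed, so `FMT_DEC(buf, sizeof buf, sizeof(int))` compiles unchanged on LP64 and LLP64 targets.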
| eps wrote:
| Also put an example in the first pageful. I almost lost hope
| while scrolling through the wall of format spec when I
| finally saw the first example.
| edflsafoiewq wrote:
| > There is absolutely no chance to give a wrong format
| specifier and access the stack (like printf does via stdarg.h)
| in undefined ways. This is particularly true for multi-arch
| development where with printf you need to be careful about
| length specifiers, and you might not get a warning on your
| machine, but the next person will and it will crash there. I
| usually need to compile for a few times on multiple
| architectures to get the integer length correct, e.g., %u vs
| %lu vs. %llu vs. %zu.
| kevin_thibedeau wrote:
| This is really annoying on architectures like ARM32 where
| size_t is closely related to unsigned int but uint32_t is
| _long_ unsigned int and gets flagged as a different type. It
| becomes a real problem when using a stripped down printf like
| the one in newlib that doesn't support %zu.
| tinkersleep wrote:
| Exactly! Or on Windows 64-bit, where 'long' is 32-bit and
| 'size_t' is 'unsigned long long'.
| GeorgeTirebiter wrote:
| The Real Problem (tm) is: specifiers like 'char' and
| 'int' etc should not be allowed; they 'should be' things
| like c8 or i16 or u64 --- that is, specify the #of bits
| for that datatype in the type specifier. This is what
| sys/stdint.h is trying to fix.
|
| What maybe 'should' happen in C2x is: 'int' is defined as
| i16, 'long' as i32, 'long long' as i64 etc and then see
| which programs break. Because it's perfectly OK to have
| 16-bit 'ints' on a 64-bit arch. (size_t is what you use
| to deal with architecture-specific chunks). And then
| _remove_ all this 'int' etc crap from C. (Obv, some
| 'compat switch' would need to exist, but you get the
| idea.)
| kevin_thibedeau wrote:
| No that should not happen. Integer types that adapt to
| the platform word size enhance portability. Nobody wants
| a 32-bit default int on an 8-bit platform and using
| uint8_t or uint16_t can introduce performance regressions
| on wider platforms. The traditional integer types are
| perfectly suited for scenarios where the exact width
| doesn't matter and you know the guaranteed minimum is
| good enough.
| arka2147483647 wrote:
| I would argue that most code nowadays implicitly assumes
| that int is 32 bits long, and won't work correctly on an
| 8-bit platform anyway. If 'platform size conforming' ints
| are used, they probably should be opt-in, instead of
| opt-out.
| kevin_thibedeau wrote:
| For Windows the reason they couldn't switch to LP64 is
| because they screwed up the type system with LONG and
| allowed it to be incorporated into OS structs. That
| prevents long from being 64-bit for the sake of
| rationality.
| thebruce87m wrote:
| Can't you just use the inttypes.h along with stdint fixed
| width types to avoid the multiple compiles?
|
| This stackoverflow answer gives an example:
| https://stackoverflow.com/questions/7597025/difference-
| betwe...
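
A minimal sketch of the inttypes.h approach mentioned above, assuming a hosted C99 (or later) implementation whose printf family supports %zu. The function name format_sizes is hypothetical and only serves the illustration.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats fixed-width integers portably.
 * PRIu32 and PRId64 expand to whatever conversion the platform needs
 * ("u", "lu", "lld", ...), so the same source is correct on ARM32,
 * LP64 Linux and 64-bit Windows. */
static int format_sizes(char *buf, size_t n,
                        uint32_t a, int64_t b, size_t c)
{
    return snprintf(buf, n, "%" PRIu32 " %" PRId64 " %zu", a, b, c);
}
```

The cost is readability: the format string is split into pieces, which is part of what the _Generic approach in the linked project tries to avoid.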
| jhallenworld wrote:
| So I also have a custom printf, but there is a limitation: if you
| ask gcc to check it with "__attribute__((__format__ (__printf__",
| then you are forced into using gcc's idea of what the printf
| format string syntax is.
|
| How can I have strict type checking, but a user defined format
| string?
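
For context, this is roughly what the attribute looks like on a custom wrapper. It is a sketch assuming GCC or Clang; log_to_buf and PRINTF_LIKE are hypothetical names. The checking only helps when the wrapper accepts standard printf syntax, which is exactly the limitation described above.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Expands to the GCC/Clang format attribute, or to nothing elsewhere. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Hypothetical wrapper: argument 3 is the format string and the
 * variadic arguments start at position 4. With -Wall (which enables
 * -Wformat), a call like log_to_buf(b, n, "%d", "text") is diagnosed. */
static int log_to_buf(char *buf, size_t n, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

static int log_to_buf(char *buf, size_t n, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int written = vsnprintf(buf, n, fmt, ap);
    va_end(ap);
    return written;
}
```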
| kevin_thibedeau wrote:
| Write your own linter.
| kazinator wrote:
| It's a big mistake that the format language looks like that of
| printf.
|
| If you use this in a big code base, there will still be the old
| printf all over the place.
|
| Now you have to think: is this custom logging function here based
| on the safe printf from github, or is it vsprintf under the hood?
| 37ef_ced3 wrote:
| If you want a modern C, use Go.
|
| Unless you need maximum performance (SIMD, GPUs, etc.) you should
| use a developer-efficient, productive language.
|
| Well-written Go executes almost as fast as C, and you will be
| more productive as a programmer.
| baybal2 wrote:
| Beware of Go. Google may use it to do the "Embrace Extend
| Extinguish" move. It might be type safe, but not ideologically
| safe.
|
| This is on top of Go being an unstable, immature language.
| 37ef_ced3 wrote:
| Go is very stable, and 12 years old.
| nikki93 wrote:
| Have you compared the performance and generated binary size of
| C vs. Go on WebAssembly?
| 37ef_ced3 wrote:
| For WebAssembly, use the TinyGo Go compiler:
|
| https://tinygo.org/
| vladharbuz wrote:
| Has anyone benchmarked Go's garbage collector lately? I like a
| lot of stuff about Go, but a lot of my work is in video games
| and real time audio, and I am extremely hesitant to use a
| garbage collected language for those things.
| nikki93 wrote:
| I've been working on a Go -> C++ compiler pretty much mainly
| for this use case, that skips the GC and concurrency stuff:
| https://www.reddit.com/r/golang/comments/r2795t/i_wrote_a_si...
| It includes a demo video of a game I'm making with it and
| a built-in scene editor that uses reflection etc.
|
| Repo for compiler itself: https://github.com/nikki93/gx (no
| README.md etc. yet, will be getting to that when I next have
| a chance (it's a side project)). It just takes around 1500
| lines of Go thanks to the parser and typechecker in the
| standard library.
|
| Go's perf was definitely non-trivially bad for me on
| WebAssembly.
| remexre wrote:
| WebAssembly is notably a pathological case for _any_ stack-
| scanning GC, since the stack isn't addressable.
| pphysch wrote:
| > I know I can "do things to maybe cause the GC to run
| less" or such, but then that immediately starts to detract
| from the goal of having a language where I can focus on
| just the gameplay code.
|
| Did you try implementing pooling (e.g. sync.Pool) for game
| objects/entities/components/etc? How did that go perf-wise?
| nikki93 wrote:
| I think the main thing is it starts to become a
| distraction from just writing the gameplay code. I don't
| have to implement the pooling stuff now that I have this
| language. But yeah if I did go further with the game in
| vanilla Go I might have to try the pool approach. Having
| worked on game engines with GC language runtimes (using
| Lua etc.) before, you always ultimately hit a perf
| ceiling due to lack of memory control and wish you could
| move out of it, but the runtimes don't give you a way to
| do that incrementally.
| einpoklum wrote:
| Go is not a "modern C". It may or may not be a swell language,
| but it differs fundamentally from C:
|
| 1. Go is a garbage-collected language, C is not.
|
| 2. Go is a single-company-managed language, while C is managed
| by an international standards committee within ISO. You might
| not care about this difference, but it's quite significant
| w.r.t. how future language developments happen.
|
| 3. C types are intensional, Go types are extensional
| ("structural typing").
|
| These fundamental differences are not cases of one language
| being superior, or further advanced, than the other - they're
| about going in different directions.
| 37ef_ced3 wrote:
| I have been writing C for decades, but now I almost
| exclusively use Go.
|
| What I mean is that if you like C99, you will probably like
| Go. Go can be understood as a modernization of C that doesn't
| abandon C's simplicity but adds many useful facilities that C
| lacks.
|
| Go obviously derives from C. It's a very C-like language. It
| makes sense to view Go as an enhanced C that makes slightly
| different trade-offs and that is applicable to a slightly
| different set of purposes.
| tinkersleep wrote:
| Probably the 1e6th approach, but anyway, I also wanted to play
| with this myself: here's a _Generic and macro based approach to
| get printf type-safe in C. It needs C11, and uses some gcc
| extensions.
| kzrdude wrote:
| Do you have a usage example? One early in the readme maybe.
| Seeing is believing.
| marcodiego wrote:
| A few times, I got reasonably far implementing a generic,
| type-safe, variadic, macro-based "print" for C using
| _Generic.
|
| I copied some examples of how to implement variadic macros, and
| expanded on that for C basic types. It mostly worked, you'll
| always have difficulty for corner cases like separating
| pointers and arrays, but it worked well for the basic C types.
|
| I gave up for a few reasons:
| - I wanted a form to register new types, so it could work for
|   user-defined types;
| - the C pre-processor knows nothing about lists that can be
|   expanded multiple times;
| - variadic C macros are ugly hacks.
|
| Maybe one day I'll get back to it and publish it.
|
| The interesting part is that _Generic combined with macros
| allows some very interesting tools for implementing primitive
| forms of polymorphism. Actually, if the C pre-processor
| supported lists, it would be possible to implement RTTI in C.
| bumblebritches5 wrote:
| > - I wanted a form to register new types, so it could work
| for user-defined types;
|
| > - the C pre-processor knows nothing about lists that can be
| expanded multiple times;
|
| I'm actually working on both features as Clang extensions.
|
| #repeat, a preprocessor directive to loop, can be combined
| with _Pragma(push_macro/pop_macro) to create lists by
| redefining a macro.
|
| and currently #increment, though I think I want to expand on
| this so that other macros can be redefined more easily to
| create lists via push/pop macro.
|
| The reason push_macro/pop_macro pragmas can't work, is the
| macro has to be undefined and redefined, and the value then
| pushed onto a stack in the compiler.
|
| and you can't redefine a macro in the body of another macro
| directly.
|
| so I've been thinking about maybe a
| _Pragma(redefine_macro(MacroToRedefine,
| NewValueForRedefinedMacro))
|
| but I don't want it to be limited to the _Pragma area of the
| compiler, I want it to be eventually standardized.
|
| I've been talking to a friend at WG14 who suggested making it
| a "Preprocessor Expression", like `__has_c_attribute` and
| `defined()`.
|
| So that's the area I've been working on recently for the
| Increment/Redefine PE.
| bumblebritches5 wrote:
| As for _Pragma(redefine_macro()) I don't want it to be a
| pragma, is the problem.
|
| I want it to be either a compile-time operator like sizeof,
| or a Preprocessor Expression so it can be used correctly.
|
| and it would eclipse #increment pretty easily;
| __redefine_macro(MacroNameToRedefine,
| ReplacementExpression)
|
| if ReplacementExpression is a macro identifier it would be
| expanded first, so like `MacroToRedefine + 1` should work,
| I see no reason it shouldn't work.
|
| maybe it would be ugly, but I think it would work.
|
| ----
|
| My motivation is compile time registration for codecs, test
| suites, test cases being registered to suites, etc.
| marcodiego wrote:
| I spent a long time thinking about this. My conclusion is
| that the simplest way to achieve this, at least in GCC, is
| to create a #copy directive that allows a macro, together
| with its stack, to be copied to another. GCC already allows
| stack expansion with push and pop but it can only be
| expanded once; the #copy directive would fix that.
|
| If you get anything close to that working, that would be a
| godsend. It is the last remaining piece of the puzzle for
| me to implement complete RTTI in C. It would certainly help
| to minimize glib boilerplate code too.
|
| I'd really like it to be part of C2x, but I think it is too
| late now. If it is implemented by either GCC or Clang, the
| other would certainly add it too, since it is too useful.
| So getting it to work in either of these would be good
| enough for me.
|
| How can I track/follow your progress?
| tinkersleep wrote:
| There is __VA_OPT__ in C++2a, which handles recursion
| termination in macro expansion. This will probably be in
| future C, too, right?
|
| And if there was also __EVAL__ to force the macro
| preprocessor into another evaluation level, you could write
| recursive macros quite easily, e.g., to wrap every argument
| into a function call:
|
| #define EACH(f,x,...) \
|     f(x) __VA_OPT__(, __EVAL__(EACH(f, __VA_ARGS__)))
|
| This would make the macro magic for this library trivial:
| you could process lists recursively.
|
| Edit: added missing paren
| tinkersleep wrote:
| > - I wanted a form to register new types, so it could work
| for user-defined types;
|
| Yes, I had the same urge. You can easily fall into the trap
| of too many features on the list. I settled on keeping user
| types out: you can always write a stringify() and pass that
| to the printf. Not the same, I know. But a more finite
| project.
|
| > - the C pre-processor knows nothing about lists that can be
| expanded multiple times;
|
| Yeah, that's a hack. Look at the 'VA_EXP()' macros in
| include/va_print/base.h. Ugly. Incomprehensible.
|
| > - variadic C macros are ugly hacks.
|
| Absolutely. But I think there is no other way in C.
|
| > Actually, if the C pre-processor supported lists, it would
| be possible to implement RTTI in C.
|
| I couldn't resist putting in '%t', which prints the C type of
| the argument...
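
As a rough idea of how a type-printing feature like this could be built, here is a small _Generic sketch. It is an assumption about the general technique and not the library's actual '%t' implementation; TYPE_NAME is a hypothetical macro.

```c
/* Hypothetical macro: maps the static type of an expression to a
 * string. The controlling expression undergoes lvalue conversion, so
 * top-level qualifiers are dropped (a 'const int' variable reports
 * "int"). Types that are not listed fall through to "other". */
#define TYPE_NAME(x) _Generic((x),                  \
        char: "char",                               \
        signed char: "signed char",                 \
        unsigned char: "unsigned char",             \
        short: "short",                             \
        unsigned short: "unsigned short",           \
        int: "int",                                 \
        unsigned int: "unsigned int",               \
        long: "long",                               \
        unsigned long: "unsigned long",             \
        long long: "long long",                     \
        unsigned long long: "unsigned long long",   \
        float: "float",                             \
        double: "double",                           \
        long double: "long double",                 \
        char *: "char *",                           \
        const char *: "const char *",               \
        default: "other")
```

Note that a character constant such as 'a' has type int in C, so `TYPE_NAME('a')` yields "int" and not "char".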
___________________________________________________________________
(page generated 2021-12-14 23:01 UTC)