[HN Gopher] My personal C coding style as of late 2023
___________________________________________________________________
My personal C coding style as of late 2023
Author : zdw
Score : 491 points
Date : 2023-10-09 00:30 UTC (22 hours ago)
(HTM) web link (nullprogram.com)
(TXT) w3m dump (nullprogram.com)
| al_be_back wrote:
| >> typedef all structures. I used to shy away from it
|
| it seems this may be more an issue with being shy, than a coding
| issue :P
|
| in my view, language is a Style, so coding in C is c-style
| coding.
|
| when i started coding in python (small projects initially), my
| code cried-out Java Java (typing, packaging, naming, oop), and
| took nearly as long to write - I laughed my head off when I
| realized, just because I could doesn't mean I should.
| antiquark wrote:
| > #define sizeof(x) (size)sizeof(x)
|
| Technically, it's illegal to #define over a language keyword.
| oh_sigh wrote:
| Also, `#define countof(a) (sizeof(a) / sizeof(*(a)))` is unsafe
| since the arg is evaluated twice.
| david2ndaccount wrote:
| Arguments to sizeof aren't actually evaluated (except for
| variably-modified-types, AKA VLAs, but don't use those).
| uecker wrote:
| Please use those. They are useful, make code clearer,
| improve bounds checking, ...
|
| Don't let attackers influence the size of a buffer (neither
| for VLAs nor for heap allocations).
| f33d5173 wrote:
| The arg is expanded twice but evaluated zero times, since
| sizeof gives the size of its argument type without executing
| anything.
| Izkata wrote:
| countof(foo()) looks like foo() is only called once, but
| would actually be called twice. That's what GP is talking
| about, it's evaluated twice after the expansion when the
| code is actually running, not during the expansion.
| tpush wrote:
| Again no, foo isn't called at all since countof() only
| uses its argument in a sizeof expression.
| [deleted]
| uecker wrote:
| It is not evaluated for regular arrays. It is evaluated
| for arrays with variable size, so you need to be a bit
| careful. But this rarely turns out to be a problem.
|
| The general rule for sizeof is to apply it only to
| variable names or directly to typenames.
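|
| A minimal sketch of the point made in this subthread, assuming
| ordinary fixed-size arrays (no VLAs): the operand of sizeof is
| not evaluated, so a call inside it never runs. The helper names
| next_id, calls_made_by_sizeof and table_length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define countof(a) (sizeof(a) / sizeof(*(a)))

int calls = 0;

/* Hypothetical helper with a visible side effect. */
int next_id(void)
{
    return ++calls;
}

/* Returns how many times next_id() ran while its size was taken. */
int calls_made_by_sizeof(void)
{
    calls = 0;
    size_t n = sizeof(next_id());   /* operand is not evaluated */
    (void)n;
    return calls;
}

/* countof applied to a fixed-size array is a compile-time constant. */
size_t table_length(void)
{
    static const int table[] = {2, 3, 5, 7, 11};
    assert(countof(table) == 5);
    return countof(table);
}
```

| Calling calls_made_by_sizeof() from main should report zero
| calls; with a variable-length array operand the rules differ.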
| badsectoracula wrote:
| This is a very common macro to get static array lengths and
| i'm not sure there is any other way to do the same thing
| (i.e. given a static array, get back the number of items in
| it).
| chii wrote:
| really? then how do you #define things like types like `int`
| and `char`?
| antiquark wrote:
| For example, it would be illegal to do the following:
|
| > #define int long
|
| Because you're replacing the int keyword with something else.
|
| The standard says:
|
| > 17.6.4.3.1 [macro.names] paragraph 2: A translation unit
| shall not #define or #undef names lexically identical to
| keywords, to the identifiers listed in Table 3, or to the
| attribute-tokens described in 7.6.
| jenadine wrote:
| That's C++. But I couldn't find the same restriction in C.
| In fact it seems that C allows it as long as you don't
| include any of the standard C header.
|
| > 7.1.2 "Standard headers" §5 [...] The program shall not
| have any macros with names lexically identical to keywords
| currently defined prior to the inclusion of the header or
| when any macro defined in the header is expanded.
| Joker_vD wrote:
| And so you still can re-define keywords, but only _after_
| you've included all the standard headers you want. Which
| makes sense: the meaning of including a standard header
| is entirely standard-mandated (they are not even required
| to be actual files) so making anything that could
| potentially mess with the implementation's implementation
| of standard headers UB is reasonable.
| antiquark wrote:
| Searching around gave me this from stackoverflow:
|
| In C standard, it is not allowed.
|
| > C11, § 6.4.1 Keywords
|
| > [...]
|
| > The above tokens (case sensitive) are reserved (in
| translation phases 7 and 8) for use as keywords, and
| shall not be used otherwise.
|
| https://stackoverflow.com/questions/12286691/keywords-
| redefi...
| tom_ wrote:
| You don't.
| wannacboatmovie wrote:
| If you're dead set on doing this, the correct way would be to
| name the macro in all caps e.g. #define SIZEOF(x) as C is case
| sensitive. It is somewhat self-documenting to the next guy that
| SIZEOF() != sizeof().
| loeg wrote:
| Any name is fine, as long as it isn't literally "sizeof".
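|
| A rough sketch of that suggestion, assuming the article's `size`
| is a typedef for ptrdiff_t (an assumption here): give the signed
| variant its own name so the keyword is never redefined. SIZEOF,
| COUNTOF and sum_reversed are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>

typedef ptrdiff_t size;   /* signed size type, assumed from the quoted macro */

/* Distinct names, so the sizeof keyword itself is left alone. */
#define SIZEOF(x)  ((size)sizeof(x))
#define COUNTOF(a) (SIZEOF(a) / SIZEOF(*(a)))

/* Walk an array backwards; a signed index makes i >= 0 a usable test. */
int sum_reversed(void)
{
    int values[] = {1, 2, 3, 4};
    int sum = 0;
    assert(COUNTOF(values) == 4);
    for (size i = COUNTOF(values) - 1; i >= 0; i--) {
        sum += values[i];
    }
    return sum;
}
```

| With an unsigned size_t index the same loop would never
| terminate, which is one argument for the signed result.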
| mianos wrote:
| Considering '#defines' are handled in a textual preprocessing pass
| by the C pre-processor, they don't know much at all about the C
| language. You can define out int, long, struct or anything.
|
| I have seen many people redefine 'for' and 'while'. These
| people often argue that it is an improvement.
|     cpp
|     #define sizeof(x) (size)sizeof(x)
|     sizeof(UU)
|     ^D
|     # 1 "<stdin>"
|     # 1 "<built-in>"
|     # 1 "<command-line>"
|     # 31 "<command-line>"
|     # 1 "/usr/include/stdc-predef.h" 1 3 4
|     # 32 "<command-line>" 2
|     # 1 "<stdin>"
|     (size)sizeof(UU)
| orthoxerox wrote:
| > I have seen many people redefine 'for' and 'while'.
| #define while if
| MBCook wrote:
| A lot of this makes sense to me.
|
| I've started writing a bare metal OS for Arm64. It's very early
| but I've done some similar things. I'm using pascal strings, I've
| also renamed the types (though I'm using "int8" style, not "i8").
|
| I quickly decided that I never intend to port real software to
| it, so I really don't have to conform to standard C library
| functions or conventions. That's given me more freedom to play
| around. C is old enough to have a lot of baggage from when every
| byte was precious, even in function names.
|
| It's nice to get away from that. Much like the contents of this
| post, that plus other small renames just ended up feeling like a
| nice cleanup.
| eddd-ddde wrote:
| i never thought about that, saving bytes even in symbols
| Eggsellence wrote:
| I think he is suggesting the opposite - use more verbose
| names for clarity.
| trealira wrote:
| Yeah, old C compilers would only look at the first 6
| characters of a name, and the rest were insignificant. That's
| how you get names like "strcpy" and "malloc" instead of
| something like "string_copy" or "mem_allocate" (I still think
| "memory_allocate" would be long enough to be annoying to
| type).
| nxobject wrote:
| IIRC this mirrored the behavior of MACRO-11, DEC's first-
| class PDP-11 assembler.
| lifthrasiir wrote:
| One of the last vestiges of this fact AFAIK was libjpeg, which
| had a macro NEED_SHORT_EXTERNAL_NAMES that shortens all
| public identifiers to have unique 6-letter-long prefixes.
| Libjpeg-turbo nowadays has removed them though [1].
|
| [1] https://github.com/libjpeg-turbo/libjpeg-
| turbo/commit/52ded8...
| vdqtp3 wrote:
| > I never intend to port real software to it, so I really don't
| have to conform to standard C library functions or conventions.
|
| So you're just building it as just a hobby, won't be big and
| professional like gnu?
| 1f60c wrote:
| Context: Linus Torvalds' announcement of Linux to
| comp.os.minix: https://www.cs.cmu.edu/~awb/linux.history.html
| uxp8u61q wrote:
| Honest question: why C? I've been writing C++ firmware for SoCs,
| supposedly an area where C is supposed to be king, and C++ was
| just as good, except it came with all the batteries included. So,
| why C?
| habibur wrote:
| In my case :
|
| - I prefer functions over classes.
|
| - no mangling of exported names, the binary is re-usable as
| API.
|
| - in the long term, the C source code is more re-usable in
| other projects than C++ ones.
|
| and more like this.
| rtz121 wrote:
| > - I prefer functions over classes.
|
| I also prefer apples over brooms.
| kovacs_x wrote:
| sure... and what about time spent debugging yet another
| unallocated/untimely freed/off by 1 pointer mistake?
|
| ps. you know you can write C++ code that is functional and
| use free functions primarily instead of putting everything in
| classes?
| habibur wrote:
| you can run valgrind and it will just point out to you "you
| freed the memory on this line and then tried to reuse it
| 10 lines below here, fix it." -- every time.
|
| And once you fix it, you have built a light weight library
| that you can use from any other language.
|
| Also this pointer manipulation is what gives C its power.
| uxp8u61q wrote:
| > I prefer functions over classes.
|
| Then use functions?
|
| > no mangling of exported names, the binary is re-usable as
| API.
|
| Not once in my life have I seen someone use a binary as an
| API.
|
| > in the long term, the C source code is more re-usable in
| other projects than C++ ones.
|
| I don't even know what you mean by that.
| lelanthran wrote:
| > I don't even know what you mean by that.
|
| He means that if you write your library in C it is callable
| from Python, Java, C++, Ruby, PHP, Perl and more.
| uxp8u61q wrote:
| You can also do that with C++.
| lelanthran wrote:
| No, you cannot.
|
| Because C++ without exceptions is not C++, and those
| languages cannot catch exceptions, nor call overloaded
| functions, nor delete or create objects using new and
| delete, nor refer to fields with classes, nor call
| methods on objects.
| uxp8u61q wrote:
| > Because C++ without exceptions is not C++
|
| Write exception-free external APIs. It's not that hard.
|
| > call overloaded functions
|
| Write an API that doesn't overload functions?
|
| > nor delete or create objects using new and delete
|
| Write an API around that. You'd need to do it in C
| anyway.
|
| > nor refer to fields with classes
|
| What?
|
| > call methods on objects
|
| If you can call a function, you can call a method.
|
| Your complaint is basically "you cannot write C++ in
| Python". Duh.
| lelanthran wrote:
| > Your complaint is basically "you cannot write C++ in
| Python". Duh.
|
| Maybe it is a complaint, but it's still a fact of life:
| your code is not reusable without wrapping it in C.
|
| The reality is that C++ is not as reusable as C is: ever
| wonder why there are so few reused libraries written in
| C++, while there are so many in C?
|
| Look on your system now - it's filled with C libraries,
| while the C++ libraries are probably a rounding error.
| lelanthran wrote:
| Because I can read someone else's C code without needing to
| look up almost all the quirks they used, while I cannot read
| someone's C++ code without having Google handy.
|
| Simplicity beats complexity almost every time.
| sebastianz wrote:
| > it came with all the batteries included
|
| Sometimes you do not want, or need, all the batteries.
| uxp8u61q wrote:
| Smart pointers alone make the switch worth it. Why wouldn't
| you want that?
| harpiee wrote:
| Smart pointers for what?
|
| I don't use any dynamically allocated memory in my firmware
| projects.
|
| I do sometimes use statically allocated pools for things
| like packet buffers and allocate chunks out of them, but
| their lifetimes are not scope based, so automatic call of
| constructors/destructors would not be of any help.
| uxp8u61q wrote:
| What do you want me to say? If nothing in your code owns
| a resource and needs to dispose of it when it's done,
| then obviously, you don't need smart pointers. I'm not
| here to evangelize, I'm trying to understand why someone
| would straight out refuse to use a language that offered
| more options if needed.
|
| I'm also wondering if you've heard about std::span, given
| your use case. I would be surprised if you weren't
| rebuilding much of that functionality.
| lelanthran wrote:
| Because it comes with everything else, notably it includes
| all the footguns of C and then adds orders of magnitude
| more.
|
| Unless you're a solo developer, this is not a win.
|
| It's no accident that C++ is the only language in which its
| proponents have to self police their own teams to only use
| a subset of the language.
| uxp8u61q wrote:
| Sometimes I feel like I'm taking crazy pills when I go on
| HN. What footguns are there in C++ that aren't there in
| C?
| lelanthran wrote:
| Everything around the rules of destruction in derived
| classes from a base class with/without a virtual
| destructor.
|
| How about capture of values in a lambda within a loop?
| How to prevent a template expansion from killing your
| build process? When are move semantics sufficient for the
| std container classes and when are they not? What's the
| order of construction of multiple objects at file scope?
| When should a copy constructor be written so that the
| default shallow copy is prevented?
|
| All of those are footguns, because if you do them wrong
| the program has runtime bugs without any compiler
| warnings.
|
| They are all absent from C.
|
| I'm typing on a phone, so won't go into details, but if
| you want to make such an insane claim, bear in mind that
| Scott Meyers himself said that C++ is too complex for him.
|
| You are not disagreeing with me when you make that claim,
| you're disagreeing with one of the world's foremost
| experts on C++.
| uxp8u61q wrote:
| > I'm typing on a phone
|
| But you have an uncontrollable urge to write here?
| Someone's holding a gun to your head?
|
| > You are not disagreeing with me when you make that
| claim, you're disagreeing with one of the world's
| foremost experts on C++.
|
| I guess that settles everything then. Never mind that
| you're misquoting him.
|
| Look, if you hadn't written the last two paragraphs, I'd
| have replied to your points, but they strongly indicate
| it would fall on deaf ears. You're clearly more
| interested in entertaining the peanut gallery than in
| actual discussion.
| lelanthran wrote:
| > I guess that settles everything then. Never mind that
| you're misquoting him.
|
| I'm not misquoting him - you are free to provide a link
| to the context in which he said what he said.
|
| The world's foremost expert in C++, author of dozens of
| books on C++, disagrees with you. I'm merely agreeing
| with _him_.
|
| > Look, if you hadn't written the last two paragraphs,
| I'd have replied to your points, but they strongly
| indicate it would fall on deaf ears. You're clearly
| more interested in entertaining the peanut gallery than
| in actual discussion.
|
| The fact that you entered a thread about C practices,
| then got all salty when you tried to go with the "but why
| not use C++?" argument, then devolved into personal
| attacks is ... well "classy" is not the word I'd use.
|
| EDIT: You _can't_ respond to those points - those are
| all well-known footguns that are present in C++ but not
| in C. What were you going to respond with? "No, C++
| doesn't have those!"?
| ForkMeOnTinder wrote:
| > To beginners it might seem like "wasting memory" by using a
| 32-bit boolean
|
| Maybe I'm a beginner then. He lists a few cases where it's not
| worse than sticking to 8-bit bools, but no cases where it's
| actually an improvement. It still wastes memory sometimes, e.g.
| if you have adjacent booleans in a struct, or boolean variables
| in a function that spill out of registers onto the stack. Sure
| it's only a few bytes here and there, but why pessimize? What do
| you gain from using a larger size?
| Quekid5 wrote:
| Computer architecture is optimized for 32+ bit _aligned_ access
| to most things. The gain is (usually, but not always!)
| performance.
| gavinhoward wrote:
| I'm afraid you are only slightly correct.
|
| Architectures are generally optimized for aligned access (or
| disallow unaligned access), but what counts as "aligned" is
| different for each type.
|
| A char type that is used for a bool can be accessed on any
| byte boundary because the alignment of a char is 1. The
| alignment of a 32-bit value is 4.
|
| However, architectures are generally more optimized for
| 32-bit operations _in registers_. If you're dealing with a
| char in a register, the compiler will generally treat it as a
| 32-bit value, clearing the top bits. (This is one of those
| places where C's UB can bite you.)
|
| However, there are architectures where 32-bit access _is_
| optimized.
| tedunangst wrote:
| What's an example of a function where a boolean variable spills
| to the stack and the 3 bytes are important?
| gavinhoward wrote:
| And to add to that, if you use an actual bool type, sanitizers
| will warn you if they are ever any value other than false (0)
| or true (1).
| defrost wrote:
| It depends _entirely_ on the architectures | CPUs, that said
| the obvious case from past experience is numeric processing
| jobs where (say) you flow data into "per cycle" structs that
| lead with some conditionals and fill out with (say) 512 | 1024
| | 2048 sample points for that cycle (32 or 64 bit ints or
| floats) .. the 'meat' of the per cycle job.
|
| My specific bugbear here was a junior who _insisted_ on "saving
| space" by packing the structs and using a single 8 bit byte for
| the conditionals.
|
| Their 'improved' code ground down throughput on Intel chips by a
| factor of 10 or so and generated BUS ERRORs on SPARC RISC
| architectures.
|
| By packing the header of the structs they misaligned the array
| of data values such that the Intel chips were silently fetching
| two 32 bit words (say) to get half a word from each to splice
| together to form a 32 bit data value (that was passed
| straddling a word boundary) to pipe into the ALU and then do
| something similar to repack on the other end - SPARCs quite
| sensibly were throwing a fit at non-aligned data.
|
| Point being - _sometimes_ it makes sense to fit data to the
| architecture and not pack data to "save" space (this is all
| for throughput piped calculations not long term file storage in
| any case)
| bandrami wrote:
| At one point a loooooong time ago we said "let's give every
| struct its own page" as a joke but... holy crap, it was so
| much faster.
| xvedejas wrote:
| This is the use case for `uint_fast8_t` (part of the C99
| standard); it should use whatever width of unsigned integer
| is enough to store a byte, but fastest for the platform. You
| always know that the type can be serialized as 8 bits, but it
| might be larger in memory. So long as you don't assume too
| much about your struct sizes across platforms, it should be a
| good choice for this. Although, if alignment is an issue, it
| might be a bit more complicated depending on platform.
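|
| A small sketch of that idea, under the assumption that the
| in-memory struct and the serialized form are kept separate. The
| cycle_flags type and the two functions are hypothetical names;
| uint_fast8_t may be wider than 8 bits, so values are masked
| when written out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form: the implementation picks the width (at least 8 bits). */
typedef struct {
    uint_fast8_t ready;
    uint_fast8_t channel;
} cycle_flags;

/* Wire form: exactly two bytes, independent of sizeof(uint_fast8_t). */
size_t flags_serialize(cycle_flags f, unsigned char out[2])
{
    out[0] = (unsigned char)(f.ready & 0xFF);
    out[1] = (unsigned char)(f.channel & 0xFF);
    return 2;
}

/* Rebuild the in-memory form from the two wire bytes. */
cycle_flags flags_deserialize(const unsigned char in[2])
{
    cycle_flags f = { in[0], in[1] };
    assert(sizeof(uint_fast8_t) >= 1);
    return f;
}
```

| The struct size can differ between platforms, which is why the
| sketch never writes the struct itself to the output buffer.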
| stefan_ wrote:
| 10 years ago when ATmegas were still around and your 32 bit
| variable was generating 3 instructions for addition I would
| say "right on" but now everything is a 32 bit Cortex-M and
| please stop polluting your code with this nonsense
| seabird wrote:
| "Everything is 32-bit Cortex-M" isn't true and it's not
| even close.
| paulddraper wrote:
| IDK it seems semantically right
| xvedejas wrote:
| Please understand, I am still in a position where I am
| writing new code for a platform which only has one
| compiler, a proprietary fork of GCC from nearly 20 years
| ago. I assume other C programmers might have similar
| situations.
| kstrauser wrote:
| > a proprietary fork of GCC
|
| A what now?
| CapsAdmin wrote:
| I think it's not a GPL violation if you keep the fork
| non-public.
|
| Though I'm not entirely sure when something is considered
| private or public. You can obviously make changes to a
| GPL repo, compile it and run the executable yourself and
| just never release the source code.
|
| But what happens when you start sharing the executable
| with your friends, or confine it to a company?
|
| "I made this GCC fork with some awesome features. You can
| contact me at joe@gmail.com if you're intere$ted ;)"
| bdw5204 wrote:
| My understanding is that the GPL only requires the source
| code to be made available _on request_ for at least 3
| years (or as long as you support the software, if more
| than 3 years). If you want to require people who want the
| source to write to you via the Post Office and pay
| shipping+handling+cost of a disc to receive the source
| code, I believe this is permitted by the GPL as long as
| you don't profit off of the cost.
|
| Of course, for almost all practical cases, the source
| code for a GPLed program is made available as a download
| off the Internet because the mail order disc route seems
| really archaic these days and probably would be removed
| altogether in a GPL version 4 if some prominent company
| used this loophole to evade the spirit of the GPL. Either
| that or somebody would jump through your hoops to get the
| source and just stick it on a public GitHub repo. If you
| then DMCA that repo, you'd be in violation of the GPL.
|
| If you share a GPLed executable with your friends or
| with other people at a company, then they'd presumably be
| able to request the source code. But if you run a Cloud
| GCC service with your fork, you could get away with
| keeping your source code proprietary because GCC isn't
| under AGPL.
| rags2riches wrote:
| My understanding is the violation happens when you share
| the binary without the license and sources, or
| information on how to request the sources.
| rowyourboat wrote:
| All the GPL says on source code access is that you need
| to make the source code available to whoever you
| distributed your program to. If the program never leaves
| a closed circle of people, neither does the source code.
| Cerium wrote:
| For example, Microchip XC16 [1]. It is GCC with changes
| to support their PIC processors. Some of the changes
| introduce bugs, for example (at least as of v1.31) the
| linker would copy the input linker script to a temporary
| location while handling includes or other pre-processor
| macros in the linker script. Of course if you happen to
| run two instances at exactly the same time one of them
| fails.
|
| As far as the licensing part goes they give you the
| source code, but last time I tried I could not get it to
| compile. Kind of lame and sketchy in my opinion.
|
| [1] https://www.microchip.com/en-us/tools-
| resources/develop/mpla...
| comex wrote:
| But if you don't use packed attributes, then the compiler
| will still add padding as necessary to avoid misalignment,
| while not wasting space when that's not necessary.
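|
| To illustrate (a sketch only, with a made-up record layout):
| offsetof shows where the compiler places the sample array with
| and without packing. __attribute__((packed)) is a GCC/Clang
| extension, and the offsets noted assume a typical ABI where
| int32_t has 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cycle record: one flag byte, then 32-bit samples. */
struct cycle {
    uint8_t flag;
    int32_t samples[4];
};

/* Same fields, packed (GCC/Clang extension). */
struct __attribute__((packed)) cycle_packed {
    uint8_t flag;
    int32_t samples[4];
};

/* Nonzero if an offset falls on an int32_t boundary. */
int samples_aligned(size_t offset)
{
    return offset % _Alignof(int32_t) == 0;
}

/* Offset of samples when the compiler is free to insert padding. */
size_t natural_offset(void)
{
    size_t off = offsetof(struct cycle, samples);
    assert(samples_aligned(off));
    return off;
}

/* Offset of samples once padding has been removed by packing. */
size_t packed_offset(void)
{
    return offsetof(struct cycle_packed, samples);
}
```

| In the packed variant the samples start right after the flag
| byte, which is the misalignment described further up the thread.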
| defrost wrote:
| The key part (for myself) of ForkMeOnTinder's comment was:
|
| > Maybe I'm a beginner then. He lists a few cases where
| it's not worse than sticking to 8-bit bools, but no cases
| where it's actually an improvement. It still wastes memory
| sometimes
|
| The key part of my response is _sometimes_ "wasting
| memory" (to gain alignment) is a good thing.
|
| If someone, a beginner, is concerned about perceived wasted
| memory then of course they will use "packed".
|
| As for the guts of your comment, I agree with your
| sentiment but would exercise caution about _expecting_ a
| compiler to do what you expect in practice - _especially_
| for cross architectural projects that are intended to be
| robust for a decade and more - code will be put through
| multiple compilers across multiple architectures and
| potentially many many flags will be applied that may
| conflict in unforeseen ways with each other.
|
| In general I supported the notion of sanity check routines
| that double check assumptions at runtime, if you want data
| aligned, require data to be big endian or little endian etc
| then have some runtime sanity checks that can verify this
| for specific executables on the target platform.
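|
| One possible shape for such checks, as a sketch rather than a
| recommendation: the function names are hypothetical, and the
| pointer-to-integer alignment test relies on the usual flat
| address model, which the C standard does not strictly promise.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Nonzero on a little-endian host, determined at run time. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the lowest-addressed byte */
    return first == 1;
}

/* Nonzero if p is aligned to `align` bytes (align must be a power of two). */
int is_aligned(const void *p, uintptr_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical startup check; call once before processing data. */
void check_platform_assumptions(const int32_t *samples)
{
    assert(sizeof(int32_t) == 4);
    assert(is_aligned(samples, _Alignof(int32_t)));
}
```

| Note that assert disappears under NDEBUG, so checks that must
| run in release builds would need an explicit error path instead.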
| junon wrote:
| If you have three chars next to each other in a struct,
| there's a good chance they'll take 4 bytes of memory due to
| padding. Three 32-bit bools are guaranteed to take at least 12,
| if not 16.
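|
| The sizes can simply be measured. A sketch, assuming b32 is a
| typedef for int32_t as in the article; the struct names are made
| up, and exact results depend on the ABI and on which fields
| surround the flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t b32;   /* 32-bit boolean, named after the article's b32 */

struct flags_char { char a, b, c; };
struct flags_b32  { b32  a, b, c; };

/* Size of three char-sized flags, including any trailing padding. */
size_t size_of_char_flags(void)
{
    return sizeof(struct flags_char);
}

/* Size of three 32-bit flags; never less than 3 * sizeof(b32). */
size_t size_of_b32_flags(void)
{
    assert(sizeof(b32) == 4);
    return sizeof(struct flags_b32);
}
```

| Printing both values on the target platform settles the question
| for that platform; the standard only fixes the lower bounds.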
| eyegor wrote:
| Most of the time an easy optimization is to pad fields of your
| struct to a 32 bit boundary. Almost any compiler will do this
| for you (look up "struct alignment / padding"). If the compiler
| is going to do this anyway, might as well use the memory
| yourself instead of letting it be empty space. If it doesn't
| happen, you leave performance on the table, so doing this
| raises the chance that your struct/fields will be aligned.
|
| Nuance is that each field should be at an address divisible by
| the field's size or wordline size, not some magic 32 constant.
| The entire struct should also be padded to a multiple of the
| largest field's size. In practice this usually means 32 bit
| alignment.
|
| Ref http://www.catb.org/esr/structure-packing/
| [deleted]
| bjourne wrote:
| I like it a lot. Especially the part about ditching const
| qualifiers. They clutter function declarations, don't make the
| intent any clearer, and almost never improve performance. Restrict,
| on the other hand, I've found makes the compilers emit better
| code in many cases.
|
| But I don't like using 1 and 0 instead of booleans. Many standard
| C functions (fclose for example), return 0 on success. Better to
| be explicit here.
| dundarious wrote:
| I use an exitint typedef to signify "0 is success, non-0 is
| failure" and boolint equivalent to his b32. Not typesafe of
| course, so it's just info for fallible humans.
| pylua wrote:
| I like using the const keyword, and believe it serves a real
| purpose with readability. I feel like most things are read
| access by default which is why it seems cluttered. I believe
| Rust gets immutable-by-default right.
| vitiral wrote:
| Great article. One thing for me is that I think we named our
| variables wrong. We should be specifying the number of bytes, not
| bits. I use
|
| U1, U2, U4, U8, I1, I2, etc
|
| Also S for "slot" aka unsigned pointer sized integer (usize_t)
|
| Another big point is formatting code to line-up instead of with
| an autoformatter. When you are doing something which is almost
| the same but slightly different it helps readability
| considerably. It is also a sign of a well-loved codebase, since
| I've never seen an autoformatter that can do it.
|
| Maybe we could make formatters at least auto _detect_ that code
| is already aligned and to just leave that code alone. Some kind
| of "love heuristic"
| ggliv wrote:
| Could you elaborate on the reasoning behind using byte length
| instead of bit length?
|
| Most of the time when I use fixed-width int types I'm trying to
| create guarantees for bitwise operators. From my perspective I
| feel like it therefore makes the most sense to name types on a
| per-bit level.
| stephc_int13 wrote:
| One nice thing about short types (i32 f32 u16 etc.) is that they
| can easily be extended for vector types such as f32x4 or u8x16
| etc.
|
| Consistency, conciseness and clarity (you don't have to guess
| much about those, once you've understood the naming scheme)
| Pathogen-David wrote:
| I actually have the opposite feeling. I like the non-sized type
| names like float extended as float4 or float4x4.
|
| My brain wants to read f32x4 as a 32 by 4 matrix of floats.
|
| (That being said I'd definitely be interested in trying your
| convention in the context of something like Rust.)
| stephc_int13 wrote:
| float4 is nice, but in video games, especially graphics, the
| vast majority of our floats are 32 bits but we also use 16
| bits floats quite extensively, and in this case it is quite
| practical to follow the same convention. f32 -> f16 f32x4 ->
| f16x4
|
| And when working on some heavily optimized SIMD code on the
| CPU side, I tend to use the default types even less.
| Pathogen-David wrote:
| My preference for float4 comes from HLSL so I'm aware. For
| vectors of 16 bit floats my preference would be half4 etc.
| (Although I do concede that in HLSL land that doesn't
| actually do what you want on older language versions.)
|
| IMO including the number of bits in all primitive types is
| usually an overcorrection from trauma caused by C/C++'s
| historic loose definitions of primitive type sizes. However
| I don't write much CPU SIMD code and can definitely see how
| you'd develop your preference from that context.
| [deleted]
| jeffrallen wrote:
| I came back to C after a good long time in Go, and I found that
| my C style had picked up some of these same good ideas, which I
| attributed to Go. In particular I also swore off NUL terminated
| strings, and started using structure returns to send back
| multiple values.
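|
| A sketch of what those two habits can look like together. The
| str and cut_result types and the str_cut function are
| illustrative names, not taken from the article or from Go.

```c
#include <assert.h>
#include <stddef.h>

/* Counted string: pointer plus length, no NUL terminator required. */
typedef struct {
    const char *data;
    ptrdiff_t   len;
} str;

/* Several results returned together instead of through out-parameters. */
typedef struct {
    str head;
    str tail;
    int found;
} cut_result;

/* Split s at the first occurrence of sep; tail excludes the separator. */
cut_result str_cut(str s, char sep)
{
    cut_result r;
    r.head  = s;
    r.tail  = (str){ s.data + s.len, 0 };
    r.found = 0;
    for (ptrdiff_t i = 0; i < s.len; i++) {
        if (s.data[i] == sep) {
            r.head  = (str){ s.data, i };
            r.tail  = (str){ s.data + i + 1, s.len - i - 1 };
            r.found = 1;
            break;
        }
    }
    assert(r.head.len + r.tail.len <= s.len);
    return r;
}
```

| Because both halves are views into the original buffer, nothing
| is copied and no terminator has to be written.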
| thetic wrote:
| > #define sizeof(x) (size)sizeof(x)
|
| Undefined behavior[1]
|
| > #define assert(c) while (!(c)) __builtin_unreachable()
|
| Undefined behavior[1]
|
| > I'll cast away the const if needed.
|
| Undefined behavior[2]
|
| > The assignments are separated by sequence points, giving them
| an explicit order.
|
| I don't believe assignments are sequence points and only the
| function call is.
|
| [1]
| https://en.cppreference.com/w/c/language/identifier#Reserved...
|
| [2] https://en.cppreference.com/w/c/language/const
| LegionMammal978 wrote:
| >> I'll cast away the const if needed.
|
| > Undefined behavior[2]
|
| How so? As the page you linked mentions, simply _casting_
| 'const T *' to regular 'T *' is well-defined; it's only
| _modifying_ a const object through the pointer that's UB (C17
| 6.7.3/7).
|
| > I don't believe assignments are sequence points and only the
| function call is.
|
| Assignments within expressions don't create sequence points.
| However, the expression of an expression statement is a full
| expression (i.e., not a subexpression of another expression),
| and there is a sequence point between each pair of full
| expressions (C17 6.8/4). In other words, the semicolons create
| sequence points.
| WalterBright wrote:
| > #define assert(c) while (!(c)) __builtin_unreachable()
|
| And people keep telling me that nobody uses the C preprocessor
| to define their own syntax any more!
| [deleted]
| dundarious wrote:
| > > I'll cast away the const if needed.
|
| > Undefined behavior[2]
|
| To be clear, it's only UB if the object was defined const,
| which _is_ the case given he wrote:
|
| > One small exception: I still like it as a hint to place
| static tables in read-only memory closer to the code. I'll cast
| away the const if needed.
|
| So you are correct on this point. Funnily enough, such objects
| are relatively rare IME, so I had to double-check to see that
| he was advocating it specifically in the rare case where it
| _must_ not be applied.
| comex wrote:
| > Undefined behavior[2]
|
| Given that this particular undefined behavior usually causes
| crashes in practice, I expect the author is talking about
| casting away the const but not actually writing to the pointer.
| Which is legal.
| thetic wrote:
| The legal cases in which he needs to cast away const could be
| avoided if the arguments to called functions were
| appropriately qualified.
| LegionMammal978 wrote:
| He never said he _needs_ to cast away const to do what he
| is attempting to do, he just said that he wants to cast
| away const to reduce clutter, even though the program would
| have the same semantics as if he kept the const.
| thetic wrote:
| If only there were a way to indicate the function
| argument isn't mutated. </s>
|
| My spidey senses tingle whenever I see const-ness cast
| away because it almost always means something is wrong.
| Either a function is missing a qualifier on an argument,
| or something very unsafe is happening. Why force callers
| to cast away const-ness in hopes that everything will be
| fine when you can just write the correct function
| signature.
| paulddraper wrote:
| Or a common situation is mutable -> const -> mutable.
|
| And that is legal.
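|
| A sketch of that mutable -> const -> mutable round trip, using
| hypothetical function names. The write through the cast pointer
| is only well-defined because the underlying array was never
| defined const.

```c
#include <assert.h>

/* Hypothetical API that takes a const pointer but hands the address back. */
const int *find_first_negative(const int *values, int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (values[i] < 0) {
            return &values[i];
        }
    }
    return 0;
}

/* Replace the first negative entry with zero; returns 1 if one was found. */
int clamp_first_negative(int *values, int count)
{
    const int *hit = find_first_negative(values, count);
    if (!hit) {
        return 0;
    }
    /* The caller's array is mutable, so this cast and write are defined. */
    *(int *)hit = 0;
    return 1;
}
```

| Passing an array that was itself defined const to
| clamp_first_negative would need a cast at the call site, and the
| write would then be undefined behavior.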
| EPWN3D wrote:
| lol "no const". This is really groundbreaking stuff. Let's take a
| memory-unsafe language and make it even less safe.
| bandrami wrote:
| Rugby has a much lower injury rate than American football; it's
| often argued that this is because rugby players don't use
| helmets and padding and so are less willing to make
| catastrophically dangerous hits.
| cshenton wrote:
| Really lovely. A lot here reminds me of design in Odin lang.
| Short integral types, no const, composite returns over out
| params. Big fan of the approach of designing for a single
| translation unit and exploiting the optimisations that provides
| from RVO etc.
| jpcfl wrote:
| _Parameters and functions_
|
| _No const._
|
| Please don't. `const` is incredibly valuable, not only to the
| reader, but to the compiler.
|
| Take for example:
|
|     int Foo_bar(Foo const* self);
|
| Just looking at this signature, I know that calling `bar()` will
| not modify the state of the object. This is incredibly valuable
| information to the reader.
|
| Furthermore, if I want to create a `Foo` constant, I can only
| call this function if it is `const`.
|
|     static Foo const a_foo = FOO_INIT(&some_params);
|     return Foo_bar(&a_foo); // Will not compile without 'const' in function
|
| `const` is valuable to the compiler, since `a_foo` can be placed
| into ROM on some platforms like MCUs, saving precious RAM.
| paulddraper wrote:
| Agreed; const is one of those features that is so good I wish a
| lot of other languages (e.g. java) had it.
| epcoa wrote:
| const in C and C++ are an abomination. On a pointer they
| don't tell the compiler to do shit, because they can't.
|
| That I can agree with TFA. However I agree with the GP that
| dismissing it entirely is a little misplaced. It serves as a
| hint/documentation and I think the article undersells the
| value of rodata (not the pointer use of const which is
| basically shit).
|
| I mean I have seen at least a few SIGSEGV/aborts due to
| attempted writes to ro memory. Also like, one of the few
| modern justifications for C, embedded, const still has
| important link time meaning.
| skovati wrote:
| FWIW, Java does have the "final" keyword.
| flakes wrote:
| Final only protects the variable from being assigned a new
| reference (similar to a const pointer). It doesn't protect
| any of the underlying data held by the object from being
| changed, unless the entire hierarchy has every field
| declared final as well. I still use final heavily in all of
| my Java code, but it doesn't convey the full intent I would
| like it to.
| _old_dude_ wrote:
| I remember James Gosling saying, a long time ago, that
| the whole class should be either mutable or not so you do
| not need to tag some methods with const.
|
| The consequence is that you may define two classes, one
| non-mutable and one mutable like String/StringBuilder.
| taylorius wrote:
| Java has many great qualities, but concision is not one
| of them.
| layer8 wrote:
| It means you have to triplicate each mutable class,
| because besides the immutable variant you also need the
| common interface (e.g. CharSequence), in order to pass
| mutable instances to read-only functions.
| paulddraper wrote:
| No, there are two classes -- mutable and immutable --
| that both implement the immutable interface.
| layer8 wrote:
| Yes, so three classes. I'm counting a Java interface as a
| class, because it is the same as a purely abstract class.
| In any case, three different named types.
|
| As a side note, I would say the interface is
| unmodifiable, not immutable, because references of the
| interface type may refer to mutable instances that can
| mutate while you use it through the interface. Immutable
| = doesn't change state, unmodifiable = _you_ can't change
| its state via that reference (but it might change its
| state due to other concurrent code holding a mutable
| reference). This nomenclature comes from the
| "unmodifiable" collection wrappers in Java, which _don't_
| make the underlying object immutable.
| david2ndaccount wrote:
| You only know there are no mutations if Foo itself does not
| contain any indirections. Additionally, the compiler generally
| cannot assume that Foo_bar does not modify Foo as it is legal
| to cast away const as long as it is not originally a variable
| declared as const (so in your static Foo example it would be UB
| to cast away const).
|
| static + const is valuable, but const parameters are merely a
| convention, there is no actual enforcement around them and due
| to aliasing the compiler generally can't assume the parameter
| doesn't actually change anyway.
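|
| The aliasing point can be sketched as follows, assuming a
| hypothetical function sum_twice (not from the article). The
| parameter `limit` is a pointer to const, yet the value it points
| at can still change during the call when `out` refers to the same
| object:

```c
/* Hypothetical example: `limit` is a pointer to const, but that only
   restricts writes made through `limit` itself. */
static int sum_twice(const int *limit, int *out)
{
    int before = *limit;
    *out = 0;            /* modifies *limit when out == limit */
    int after = *limit;  /* so the compiler has to reload the value */
    return before + after;
}
```

| Called with two distinct objects the two reads agree; called as
| sum_twice(&a, &a) they differ, which is why the qualifier alone
| does not let the compiler cache *limit.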
| lelanthran wrote:
| > Additionally, the compiler generally cannot assume that
| Foo_bar does not modify Foo as it is legal to cast away const
|
| No, but _it can warn you!_
|
| The type is meant to capture programmer intention, and if you
| use `const` the compiler can warn you that your intention
| does not match the intention of the existing code (like, the
| intention of the author who wrote Foo_Bar).
| jpcfl wrote:
| True. I should rephrase to say `const` strongly suggests that
| a function does not change the observable state of a
| variable.
| glitchc wrote:
| I'm afraid you are mistaken. In particular for pointers, const
| does not guarantee that the memory at the location pointed to
| won't change. Const only guarantees that the address itself
| doesn't change.
| jpcfl wrote:
| > const does not guarantee that the memory at the location
| pointed to won't change
|
| I didn't say this. I said a `const` function tells the reader
| that the state of an object doesn't change.
|
| Another reader correctly pointed out that there are ways to
| modify the state of `const` parameters (indirection and
| const cast), but I would argue that such an API is poorly-
| designed.
|
| To qualify my original comment, a reader only _knows_ a
| function doesn't change an object's state if the API is
| well-designed.
| kibibu wrote:
| Should perhaps be
|
|     int Foo_bar(const Foo * self);
| jenadine wrote:
| `const Foo*` and `Foo const*` are exactly the same and just
| a question of style (east-const vs. west-const)
|
| Not to be confused with `Foo *const`
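|
| A short sketch of the three spellings, using a hypothetical Foo
| type and function name chosen only for illustration:

```c
typedef struct { int value; } Foo;   /* hypothetical type */

static int const_placement_demo(void)
{
    Foo a = {1}, b = {2};

    const Foo *west = &a;   /* pointer to const Foo */
    Foo const *east = &a;   /* same type as `west`, different spelling */
    Foo *const fixed = &a;  /* const pointer to a mutable Foo */

    west = &b;              /* OK: the pointer itself can be reseated */
    east = &b;              /* OK for the same reason */
    fixed->value = 10;      /* OK: the pointee is mutable */

    /* west->value = 3;     rejected: the pointee is const */
    /* fixed = &b;          rejected: the pointer is const */

    return west->value + east->value + a.value;
}
```

| Only the position relative to the `*` matters: const to the left
| of the star qualifies the pointee, const to the right qualifies
| the pointer.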
| PennRobotics wrote:
| For anyone thinking, "wtf const pointer order??" fall
| back on the spiral rule:
|
| https://c-faq.com/decl/spiral.anderson.html
| glitchc wrote:
| Even then, some other function can change the memory at the
| address of self while this one is executing, especially in
| concurrent systems. Additionally, any other pointer
| pointing to the same address can also modify self's memory.
| const in this case is really just "scout's honour".
| __MatrixMan__ wrote:
| Sorry for going off topic, but something fun I've learned
| lately is that in Nim (which compiles to C) changing:
|
| let x = foo()
|
| ... to ...
|
| const x = foo()
|
| ...runs foo at compile time to get the value. I dunno I just
| thought it was neat.
| im3w1l wrote:
| > #define assert(c) while (!(c)) __builtin_unreachable()
|
| This seems like a bad idea, because the whole point of an assert
| is that something shouldn't happen, but might due to a (future?)
| bug.
| haimez wrote:
| > This seems like a bad idea, because the whole point of an
| assert is that something shouldn't happen, but might due to a
| (future?) bug.
|
| And so it's a bad idea because...?
|
| The whole idea is to notice a bug before it ships. Asserts are
| usually enabled in test and debug builds. So having an assert
| hit the "unreachable" path should be a good way to notice "hey,
| you've achieved the unexpected" in a bad way. You're going to
| need to clarify in more detail why you think that's a bad
| thing. I'm guessing because you would prefer this to be a real
| runtime check in non debug builds?
| im3w1l wrote:
| It's undefined behavior if the assert triggers in production.
| It's too greedy for minor performance benefit at the risk of
| causing strange issues.
| haimez wrote:
| Yikes. I did have to go down a little rabbit hole to
| understand the semantics of that builtin (I don't normally
| write C if that wasn't immediately obvious from the
| question) but that seems like a really questionable
| interpretation of "this should never happen". I would
| expect the equivalent of a fault being triggered and
| termination of the program, but I guess this is what the
| legacy of intentionally obtuse undefined behavior handling
| in compilers gets you.
| im3w1l wrote:
| The builtin itself is fine. It works exactly as it's
| intended. It says "I've double and triple checked this.
| Trust me compiler. Just go fast". But you should not use
| it to construct an assert.
| LoganDark wrote:
| this is a true unconditional assert, e.g. "I assert that
| this condition is true". Problem is that's too much power
| to throw at most use cases.
| josephg wrote:
| Eh. I absolutely get what you're saying. And this is for
| sure flying very close to the knife's edge. But if your
| assertion checks don't run in release mode, and due to
| some bug, those invariants don't hold, well, your program
| is already going to exhibit undefined behaviour. Why not
| let the compiler know about the undefined behaviour so it
| can optimize better?
|
| The nice thing about this approach is that the assertion
| provides value both in debug and release mode. In debug
| mode, it checks your invariants. And in release mode, it
| makes your program smaller and faster.
|
| Personally I quite like rust's choice to have a pair of
| assert functions: _assert!()_ and _debug_assert!()_. The
| standard assert function still does its check in both
| debug and release mode. And honestly that's a fine default
| these days. Sure, it makes the binary slightly bigger and
| the program slightly slower, but on modern computers it
| usually doesn't matter. And when it does matter (like
| your assertion check is expensive), we have
| _debug_assert_ instead.
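|
| A rough C analogue of that two-macro split might look like the
| sketch below. The names always_check, debug_check and clamp_index
| are hypothetical, and the builtins assume GCC or Clang:

```c
#include <stdlib.h>

/* Always checked: stops the program in every build. */
#define always_check(c) do { if (!(c)) abort(); } while (0)

/* Checked in debug builds. With NDEBUG defined it turns into an
   optimizer hint, so a violated condition is undefined behavior. */
#ifdef NDEBUG
#  define debug_check(c) do { if (!(c)) __builtin_unreachable(); } while (0)
#else
#  define debug_check(c) do { if (!(c)) __builtin_trap(); } while (0)
#endif

static int clamp_index(int i, int len)
{
    always_check(len > 0);   /* cheap, and a failure must never go unnoticed */
    debug_check(i >= 0);     /* invariant the caller is trusted to uphold */
    return i < len ? i : len - 1;
}
```

| The trade-off discussed above is entirely in the NDEBUG branch:
| swapping __builtin_unreachable() for __builtin_trap() there gives
| up the optimization hint in exchange for a defined crash.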
| dxhdr wrote:
| > But if your assertion checks don't run in release mode,
| and due to some bug, those invariants don't hold, well,
| your program is already going to exhibit undefined
| behaviour. Why not let the compiler know about the
| undefined behaviour so it can optimize better?
|
| Usually in release mode you want to log the core dump and
| then fix the bug.
| travisgriggs wrote:
| Pretty much agree with most of this. My own personal evolved
| style is pretty similar. I suspect this kind of pragmatic style
| offends the theoretic and the academic. That can be intimidating.
| I'm glad at least one other person out there is like me.
| waffletower wrote:
| Would be great to see the author's treatment of memory allocation
| and lifecycle for complex data types.
| ok123456 wrote:
| See his posts about arena allocations.
| neilv wrote:
| > _While I still prefer ALL_CAPS for constants, I've adopted
| lowercase for function-like macros because it's nicer to read._
|
| "ALL_CAPS" in C was not for constants, but for preprocessor
| macros. It's shouting in all-caps, because it means "Look out!
| There's a cpp macro expansion here!"
|
| Related, please stop using "ALL_CAPS" for constants in other
| languages. Not only does shouting _constants_ as the most
| prominent syntax in the code make no sense, but there are _much_
| better uses for shouting in a programming language.
|
| (For an example of a good use of "ALL_CAPS": if your language
| ever acquires Scheme-like template-based hygienic macro
| transformers, "ALL_CAPS" (or "ALL-CAPS") is excellent for making
| template pattern variables stand out within the otherwise literal
| code blocks.)
| eviks wrote:
| Indeed, these are less readable and harder to type for no
| benefit
| sdk77 wrote:
| The case of what is a constant, and whether or not it even
| really is, is not always clear in C. As an embedded developer
| (almost always on bare metal), variables declared with the
| const modifier are usually (but not always, it depends on the
| linker script) placed in read only memory. For those kind of
| variables (read only ones, they're not really constants as in
| C++ constexpr) I don't use all caps. But for preprocessor
| macros, always. Even "#define MY_CONSTANT 10" is a macro, and
| not a constant or a variable. And it should be treated with
| caution, because it is dangerous (inexperienced programmers
| might change it to #define MY_CONST 2 * OTHER_CONSTANT, which
| opens up a can of worms).
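|
| To make the can of worms concrete, here is a small sketch with
| hypothetical macro and function names. Because the macro body is
| pasted in textually, an unparenthesized product changes meaning
| next to a division:

```c
#define OTHER_CONSTANT 5
#define MY_CONST_BAD  2 * OTHER_CONSTANT     /* expands textually */
#define MY_CONST_GOOD (2 * OTHER_CONSTANT)   /* parenthesized */

/* Expands to total / 2 * 5, i.e. (total / 2) * 5. */
static int per_item_bad(int total)  { return total / MY_CONST_BAD; }

/* Expands to total / (2 * 5), which is what was meant. */
static int per_item_good(int total) { return total / MY_CONST_GOOD; }
```

| With total = 100 the first version yields 250 and the second 10.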
| harpiee wrote:
| I do it mainly as a form of namespacing and aid in readability.
|
|     do_thing(foo); // foo is variable
|     do_thing(FOO); // FOO is constant (i.e this call should always do the same thing)
|     foo = FOO; // I wanna name a variable the same as a constant
| swah wrote:
| Same, but maybe alternatively I'd accept:
|
|     do_thing(Constants.foo)
|
| as equally clear...
| adamrezich wrote:
| nah, sorry, I'm going to keep using ALL_CAPS_CONSTANTS (and
| enum members, because they're also constants). I like having
| constants be visually distinct from the rest of my code.
| P_I_Staker wrote:
| Yeah, I don't see the problem. I believe this is default for
| pylint, and I see no issue with convention.
| rollcat wrote:
| It's the reason Rust makes us use an exclamation mark with
| macro calls: beware! magic! here!
|
| I like this as a convention, but not necessarily as a grammar
| rule. Printing values is common enough that it shouldn't
| require shouting for constant attention, simple code shouldn't
| trigger sensory overload.
| Sharlin wrote:
| I do think that the default convention of SHOUT_CASE for
| constants in Rust is too in-your-face given that there's
| nothing about Rust constants that would particularly require
| them to stand out. I might have gone with CamelCase given
| that some things in Rust already straddle the "types are
| CamelCase, terms are snake_case" delineation (enum variants,
| even data-less ones, are CamelCase, as well as the implicitly
| defined constant `Foo` for any unit struct `Foo`).
| afdbcreid wrote:
| There is some gotcha: they're copied on use, which means
| you could end up with more than one copy in your binary
| (unlikely with an optimizing compiler), or, worse, that if
| you have interior mutability in them it just won't work.
| jraph wrote:
| Interesting take, but good luck with this fight against all
| caps (snake case) constants, at this point it's almost a
| consensus, a shared culture element, a deeply ingrained habit
| that the vast majority of developers have and recognize.
|
| Changing this would be a huge undertaking I'd be afraid of
| engaging, if I cared that much.
| avgcorrection wrote:
| Thanks for bringing this up. There is no reason for modern
| languages that have proper constants (not preprocessor macros
| which happen to sometimes be used for them) to use this tedious
| style.
|
| Constants are so innocent and useful. Why indirectly discourage
| their use by making their usage an eye-bleed?
| barbs wrote:
| What alternative would you suggest?
| yakubin wrote:
| In C++ I do it Google style:
|
|     static constexpr int kConstantName = 42;
| _moof wrote:
| This is the old Mac style too. Might just be my
| upbringing but I prefer it.
| avgcorrection wrote:
| What's the point of the `k` prefix?
| smithza wrote:
| k for konstant
| avgcorrection wrote:
| Yes, that's what it stands for. What's the point of using
| any prefix at all?
| wrs wrote:
| So you don't have to wonder if it's a variable. Old Mac
| style uses prefixes: kConstant, gGlobalVar, TType,
| mMemberVar. Remember this was when all coding was done in
| black and white in a plain text editor.
| colejohnson66 wrote:
| Hungarian notation
| avgcorrection wrote:
| Writing in a normal way in snake case or camel case or
| whatever the convention is.
| fefe23 wrote:
| Why would anyone care what the favorite whatever style of some
| dude on the Internet is?
|
| I like petunias! Now what? How does that help anyone?
| forgotpwd16 wrote:
| Your preference doesn't help anyone. That's true. But coding
| styles may improve its reader's programming.
| awestroke wrote:
| I'll probably copy everything in the post for my next C
| project. Super nice stuff.
| fredrb wrote:
| Sharing preferences and opinions on code ergonomics certainly
| has value for me, and I bet it has to other people too. This
| is, after all, a developer's forum.
|
| I'm certain your opinion on petunias and your possible distaste
| for orchids will be welcomed in a flower-news type orange site.
| :-)
| drpixie wrote:
| While there are a few disagreeable points, I like the article.
|
| I've always felt that C is unfairly maligned. Yes, it's very low
| level, it's meant to be. Yes, it lets you shoot yourself in the
| foot, but what language doesn't?
|
| Most of the problems with C are really issues with the standard
| library, the Unix (now Posix) interfaces, and the string type.
|
| None of these are actually part of C, but are part of how C is
| normally used. So those problems can be avoided, and use C for
| what it's good at.
| edvinbesic wrote:
| > I've always felt that C is unfairly maligned. Yes, it's very
| low level, it's meant to be. Yes, it lets you shoot yourself in
| the foot, but what language doesn't
|
| Isn't it a beauty of lower level languages that creating higher
| level abstractions provides more value?
|
| edit: typo
| eldenring wrote:
| > Yes, it lets you shoot yourself in the foot, but what
| language doesn't?
|
| Good lord.
| drpixie wrote:
| Well ... Name a language in which you categorically cannot
| "shoot yourself in the foot"!
| jcrites wrote:
| The issue is that it's a spectrum: how easily you can shoot
| yourself in the foot, especially on accident, without
| awareness of the risks. And perhaps what the consequences
| are when you do. Risk and consequence. C is high risk and
| also high consequence.
|
| In higher level languages, you can't shoot yourself in the
| foot nearly as easily in such a way as to trivially create
| a correctness problem and security vulnerability (like a
| buffer under/overflow). Languages like Java and C# make it
| pretty difficult to shoot yourself in the foot this way
| (though you still can in other ways, like with incorrect
| concurrency). Rust makes it a lot harder to shoot yourself
| in the foot across the board, especially on accident (i.e.,
| without being aware that you're doing something dangerous and
| low-level, viz. `unsafe`).
| LoganDark wrote:
| And even in Rust, the unsafe code guidelines and UB are
| extremely hotly debated and well-defined whenever
| possible.
| eviks wrote:
| The "categorically" part is a useless qualification, you
| don't program in a binary world, the ease with which a
| footgun is possible in a language is very important and
| can't be reduced to isPossible
| blix wrote:
| I do most of my programming in a binary world.
| slimsag wrote:
| It's not unfairly maligned, it's just that everyone remembers
| their college/university 'learning experience' which made no
| distinction between C/C++, they were told to use the Borland
| compiler, and when trying to learn printing "hello world" they
| only got a `segmentation fault` error instead of a stack trace.
| When they asked why it's so hard, they were told C/C++ is hard
| - so they dropped the class.
|
| Then they picked up a JS or Python class, were told high-level
| languages are easy and voilà! they started to understand
| programming.
|
| That's the reason people are spiteful of it. They had a
| terrible learning experience right out the gate.
| jes5199 wrote:
| do compilers like gcc support stack traces now?
| ndesaulniers wrote:
| No; it's up to the program author to link against a library
| that provides back-traces (and maybe install a signal
| handler to call into that unwinder). Even then, some kind
| of information needs to be retained in the binary that's
| normally not (-gmlt comes to mind).
|
| Usually folks attach a debugger to capture a stack trace.
| Usually the debugger uses debug info to determine where the
| program is, and its stack trace. Or it can walk frame
| pointers. Depends on if either are even used, which is a
| compile time decision.
| uwagar wrote:
| looks like u tabstop at 4, i prefer 2. other than that our styles
| match :)
| WalterBright wrote:
| Interesting how my experience has led me in a different
| direction:
|
| https://dlang.org/blog/2023/10/02/crafting-self-evident-code...
|
| (The article is crafted around D, but the principles apply to C
| as well.)
| keyle wrote:
| Fun read.
|
| > What happened to the conditional expressions? Move them to the
| > interiors of doX() and doZ().
|
| That was an interesting point. Not sure that it's always valid
| but I guess it depends where you want the abstraction to lay,
| and how it affects the mental construct around the code.
|
| e.g.
|
|     deleteRecords();
|
| is not better than
|
|     if let x = deadRecords()
|         deleteRecords(x);
|
| Sure, it looks messier but there is value in showing upfront
| that you're pruning and not wiping.
|
| If the author wisely renames his function e.g.
| pruneDeadProjects(), yes. But merely moving the condition
| within the function can be dangerous for context and be a leaky
| abstraction.
| WalterBright wrote:
| Finding the right abstraction isn't always easy. Sometimes if
| I just put it down and let it slosh around in my brain for a
| few days, it comes to me.
|
| Like your idea of pruneDeadProjects()!
| tmtvl wrote:
| The programming strategy of "less typing, more thinking",
| it's a good one for avoiding RSI.
| lelanthran wrote:
| In general, the style the author has adopted is to introduce
| brevity where he can, and use wrappers over what would have been
| standard and idiomatic C code. In most situations, these
| conventions aren't good in a non-solo project, because they
| simply aren't as obvious to the programmer.
|
| He says so himself:
|
| > I don't intend to use these names in isolation, such as in code
| snippets (outside of this article). If I did, examples would
| require the typedefs to give readers the complete context. That's
| not worth extra explanation. Even in the most recent articles
| I've used ptrdiff_t instead of size.
|
| You require extra work to understand his basic types before
| reading even a short snippet, so he doesn't use it when he wants
| people to read short snippets.
|
| Introducing additional stuff the programmer must remember that
| does not add any safety is pointless busywork.
|
| A non-complete summary of his conventions:
|
| 1. typedef standard typenames to 3-char symbols,
|
| 2. remove qualifiers like const,
|
| 3. use macros to reduce the amount of typing the programmer does,
|
| 4. typedef all structs (and enums too, I assume)
|
| 5. A macro-ized string type with a prefixed length.
|
| > Starting with the fundamentals, I've been using short names for
| primitive types. The resulting clarity was more than I had
| expected,
|
| This _isn't_ clear: `int8_t` is a lot clearer to a C programmer
| than `i8`, because a C programmer has already internalised the
| pattern of the stdint.h types. This is going to lead to subtle
| bugs as well: quick, according to his convention, what is the %
| specifier for `byte`?
|
| You can use %c, but that gives you an ascii character (which is
| not what we think of when we say 'byte').
|
| If you use PRIu8 the compiler might give warnings because `char`
| might be signed. The best option is to just not use `byte` and
| use `uint8_t` instead (or, in his system, `u8`).
|
| Same with `b32` vs `i32` - it's a distinction without a
| difference and mixing these types won't give compiler warnings,
| while it is almost certainly an error on the part of the
| developer. Use `bool` if you don't like `_Bool`.
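|
| A small sketch of why the distinction matters, assuming typedefs
| along the lines the article describes (the two conversion
| functions are hypothetical). Since `b32` and `i32` are the same
| underlying type, an arbitrary integer passes through unchanged
| and unreported, whereas `bool` normalizes it:

```c
#include <stdbool.h>
#include <stdint.h>

typedef int32_t i32;
typedef int32_t b32;   /* identical type to i32 as far as the compiler cares */

/* No diagnostic and no conversion: a mask value stays a mask value. */
static b32 to_b32(i32 x)
{
    return x;
}

/* Conversion to bool collapses any nonzero value to 1. */
static bool to_bool(i32 x)
{
    return x;
}
```

| So a check such as `flag == 1` behaves differently depending on
| which of the two types carried the value.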
|
| In general I try to take advantage of whatever typing C provides;
| I don't try to subvert it because I _want_ the compiler to warn
| me when my intention doesn't match the code I wrote.
|
| > No const. It serves no practical role in optimization, and I
| cannot recall an instance where it caught, or would have caught,
| a mistake.
|
| I disagree with dropping `const`.
|
| 1. It's useful as an indicator to the caller that the returned
| value must/must not be freed. It's a convention I use that makes
| it easy to visually spot memory leaks.
|
| Of the two functions below, it's clear to me which one needs the
| returned value `free()`ed and which one doesn't.
|
|     const char *replace_substring (const char *src, const char *pat,
|                                    const char *replacement);
|     char *replace_substring (const char *src, const char *pat,
|                              const char *replacement);
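|
| The convention could be sketched like this, with two hypothetical
| functions (level_name and dup_upper are illustrative, not from
| the article). The const return type marks storage the caller
| must not free; the non-const one marks a heap allocation the
| caller owns:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a pointer to static storage: const signals "do not free". */
static const char *level_name(int level)
{
    static const char *const names[] = {"debug", "info", "error"};
    return (level >= 0 && level < 3) ? names[level] : "unknown";
}

/* Returns a heap copy: non-const signals "caller must free()". */
static char *dup_upper(const char *src)
{
    size_t n = strlen(src);
    char *out = malloc(n + 1);
    if (!out) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = (char)toupper((unsigned char)src[i]);
    }
    out[n] = '\0';
    return out;
}
```

| Nothing in the language enforces this; it is a reading aid that
| only works if the whole codebase follows it.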
|
| 2. It actually _does_ catch a lot of problems, because the
| compiler warns me when I attempt to modify a value that some
| other code I wrote never intended to be modified. It's about
| intention, and when I _know_ it is safe to modify the `const`
| value, then I have to explicitly cast away the const to compile
| my program. Anyone reading the program will _know_ that the
| modification of the const-qualified value is intentional, and
| therefore safe.
|
| > #define s8(s) (s8){(u8 *)s, lengthof(s)}
|
| This is interesting. I will try this out in my next project. I do
| think that there'll be quite a few compiler warnings for sign-
| mismatch though. This is the second "I wonder what the sign is"
| question for programmers reading his code - it means that his
| code has to compile with the flags that he compiles it with (I
| assume he's passing a flag to force chars to a particular sign).
| You can't simply compile his code in another project unless you
| copy his flags, and those flags may conflict with the new
| project's flags.
|
| I also wish that he'd showed a few examples of how having the
| length helps - what is presented in the post doesn't show any
| additional string safety over using nul-terminated strings. All
| those macros, including the one that creates the struct, could be
| written to operate on null-terminated strings. In essence, the
| length can be simply unused for everything! Where's the safety!?
|
| > It's also led to a style of defining a zero-initialized return
| value at the top of the function, i.e. ok is false, and then use
| it for all return statements. On error, it can bail out with an
| immediate return.
|
| I use a similar pattern, but I use `goto cleanup` on all errors;
| you can't, as a general pattern, return early in a non-trivial C
| function without leaking resources. You can, as a general
| pattern, `goto cleanup` in every C function to clean up
| resources. I prefer the general pattern that I use everywhere
| rather than having to ensure that all resources acquired _up to
| that particular return statement_ are released.
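|
| A minimal sketch of that `goto cleanup` shape, using a
| hypothetical function first_byte that holds two resources. The
| result starts out as the failure value and every error path
| jumps to the single release point:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function: returns the first byte of a file, or -1 on
   any failure. */
static int first_byte(const char *path)
{
    int result = -1;            /* pessimistic default */
    FILE *f = NULL;
    unsigned char *buf = NULL;

    f = fopen(path, "rb");
    if (!f) goto cleanup;

    buf = malloc(1);
    if (!buf) goto cleanup;

    if (fread(buf, 1, 1, f) != 1) goto cleanup;

    result = buf[0];

cleanup:
    free(buf);                  /* free(NULL) is a no-op */
    if (f) fclose(f);
    return result;
}
```

| Initializing every resource to NULL up front is what lets one
| label serve all exits, regardless of how far the function got.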
|
| > rather than include windows.h, write the prototypes out by hand
| using custom types.
|
| I think this is a very bad idea: you can't depend on the headers
| not changing after a compiler or library update. Sure, _maybe_ in
| practice, all the Windows types and declarations don't change
| all that much, but I wouldn't want to be the developer trying to
| hunt down a bug because the interface to some function has
| changed and the compiler isn't giving me errors.
|
| All in all, I dunno if I would look forward to working on a team
| with these conventions - the code is harder to read, doesn't work
| in isolation, needs custom flags, and introduces a string type
| without introducing any string safety with it.
| sylware wrote:
| I would not use typedefs as this should not be in the C syntax
| (like enum, switch, and much more).
|
| The primitive type names should be native, but to "fix" C I
| prefer using the C preprocessor (I use it for namespace/name
| mangling too). This is not perfect, but should be already way
| more than enough.
|
| With proper preprocessor usage (without going amok), one can
| write one compilation unit software roughly easily.
| lionkor wrote:
| Isn't defining byte = char a bit wrong? char may be signed or
| unsigned, and may be more or less than 1 byte, right? So why
| that?
| tmtvl wrote:
| > _char may be signed or unsigned,_
|
| Correct, the standard does not specify whether char is signed
| or unsigned, so it's implementation-specific.
|
| > _and may be more or less than 1 byte, right?_
|
| Wrong, char is specified as a single byte character, so the
| following will always be true:
|
|     sizeof(char) == 1;
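|
| Both answers can be checked at compile time or run time. The
| sketch below assumes a C11 compiler for _Static_assert, and the
| function name is hypothetical:

```c
#include <limits.h>

/* sizeof(char) is 1 by definition; "byte" in C means the size of a char. */
_Static_assert(sizeof(char) == 1, "guaranteed by the C standard");

/* A byte has at least 8 bits, but may be wider on unusual targets. */
_Static_assert(CHAR_BIT >= 8, "guaranteed by the C standard");

/* Whether plain char is signed is implementation-defined. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

| So `byte = char` is always exactly one byte in C's terms; what
| varies between platforms is its signedness and, rarely, the
| number of bits in that byte.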
| Uptrenda wrote:
| IMO, defining your own types is one step too far. Now everyone
| who is already familiar with C types has to learn your own quirky
| system to understand one program. I think it does probably make
| sense to be specific about the sizes though e.g. using uint32_t
| over just uint (and expecting to receive some architecture-
| dependent size you might not get with uint.) These types should
| be defined in the right header (I think it depends on compiler?)
| It's been a while since I wrote any amount of C so my apologies
| if this isn't correct.
| Lockal wrote:
| Just a note: defining your own integer types makes sense for
| resource-limited platforms. Most common type I see is something like
| "dim_t", which is 32-bit or 64-bit depending on use-case.
| 32-bit integers are often used even on 64-bit platforms in
| pointer compression schemes (for example, allocate your own
| heap and only store 32-bit offsets). This not only gives 2x
| improvement on memory usage for <4GB workloads, but it also
| improves performance due to better cache locality.
| adrianN wrote:
| How does it improve locality?
| rocqua wrote:
| More 'pointers' (32 bit offset ints) fit on a single cache
| line. Or, put differently, a list of offsets is half as
| long and hence everything on the list is twice as close to
| everything else on the list.
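|
| As a rough sketch of the idea, here is a linked list that stores
| 32-bit indices into a fixed pool instead of pointers. The Node
| and Pool types, the pool size, and the function names are all
| hypothetical:

```c
#include <stdint.h>

typedef struct {
    int32_t value;
    uint32_t next;      /* index into the pool; 0 means "none" */
} Node;

typedef struct {
    Node nodes[1024];   /* slot 0 is reserved as the null node */
    uint32_t used;
} Pool;

/* Appends a node and returns its index, or 0 when the pool is full. */
static uint32_t pool_push(Pool *p, int32_t value, uint32_t next)
{
    if (p->used == 0) p->used = 1;     /* skip the reserved slot */
    if (p->used >= 1024) return 0;     /* out of space */
    uint32_t i = p->used++;
    p->nodes[i].value = value;
    p->nodes[i].next = next;
    return i;
}

/* Walks the list by index rather than by pointer. */
static int64_t pool_sum(const Pool *p, uint32_t head)
{
    int64_t total = 0;
    for (uint32_t i = head; i != 0; i = p->nodes[i].next) {
        total += p->nodes[i].value;
    }
    return total;
}
```

| Each Node here is 8 bytes, where a node holding a 64-bit pointer
| would typically be 16 after padding, so twice as many fit in a
| cache line.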
| Nursie wrote:
| > These types should be defined in the right header
|
| stdint.h
|
| It's always been amazing to me how many different projects I've
| worked on (not that I've been in professional C for about 7
| years now) that include their own painstaking recreation of
| this file.
|
| Reusing them and effectively translating them just to your own
| name is just annoying to the reader IMHO. I am reminded of a
| C++ project I worked on, where I questioned the extensive use
| of typedefs around collections of things, various forms of
| references and compound objects etc. I was informed by one of
| the more experienced C++ folks that it made the code easier to
| comprehend.
|
| Later I saw the typedef cheat-sheet sellotaped to the side of
| his monitor...
| gallier2 wrote:
| For bool it's even worse.
| cesarb wrote:
| > It's always been amazing to me how many different projects
| I've worked on (not that I've been in professional C for
| about 7 years now)) that include their own painstaking
| recreation of this file.
|
| How many of them started before stdint.h existed? AFAIK, it's
| a somewhat recent addition to the C language, and IIRC, for a
| long time even after it became part of the C standard, some
| popular C compilers still didn't have it.
| Nursie wrote:
| As recently as eight years ago, on projects started within
| the previous handful of years. It's more to do with a lot
| of C programmers being stuck in a sort of stasis IMHO. (I'm
| sure I was too in many ways).
|
| And yes, Microsoft were the outlier and absolutely dragged
| their heels on stdint, but you could always grab a
| compliant implementation from one of the FOSS projects that
| produced one.
| uxp8u61q wrote:
| inttypes.h was added to C99. A quarter of a century ago.
| yakubin wrote:
| Many C codebases predate that. And Visual Studio for a
| long time didn't support anything newer than C89.
| eqvinox wrote:
| There's no requirement for a born-1995 codebase to still
| build on a 1995 system in 2023.
|
| I work on a born-1995 codebase. We started requiring an
| ISO C11 _plus GNU extensions_ [1] several years ago and are
| actively removing "compatibility" checks and kludges that
| are outdated.
|
| [1] to be fair - not needing to support Windows is a
| godsend for any C project.
| kazinator wrote:
| The u32/i64/... naming system is pretty common.
|
| Assuming you can trust those types to be what they look like,
| the code is readable.
|
| I've worked with C for well over 30 years; custom typedefs are
| par for the course. Work with OpenMAX libs? You have OMX_U32.
| On Windows? You have DWORD. Using Glib? guint32 ...
| the_mitsuhiko wrote:
| The big issue with custom integer types is that while they are
| awesome in the implementation files, they are problematic for
| libraries in headers. And if you want to avoid a divergence
| between header and implementation files you're kinda stuck with
| the inttypes.h ones in practice.
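|
| A minimal sketch of that split, assuming a hypothetical library
| function lib_sum32 (the name and the file layout are illustrative,
| not from the article): the exported prototype spells out the
| <stdint.h> names, while the short aliases stay private to the
| implementation file.

```c
#include <stddef.h>
#include <stdint.h>

/* --- what would go in the public header (hypothetical lib.h) --- */
uint32_t lib_sum32(const uint8_t *data, size_t len);

/* --- what would stay in the implementation file (hypothetical lib.c) --- */
typedef uint8_t  u8;   /* private short aliases, never exported */
typedef uint32_t u32;

uint32_t lib_sum32(const uint8_t *data, size_t len)
{
    const u8 *p = data;
    u32 sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += p[i];   /* unsigned arithmetic: wraps modulo 2^32 */
    }
    return sum;
}
```

| Because u8 and uint8_t are the same type, the definition matches the
| prototype, and users of the header never see the short names.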
| rkagerer wrote:
| The author _did_ qualify it with _personal_ coding style.
| Frankly the standard types are too verbose and I wish this
| guy's elegant and clear list had been the one that was adopted way
| back when.
| iforgotpassword wrote:
| That ship sailed ages ago. There are some things you
| should just accept about C, or any programming language
| really. Just because you _can_ do something doesn't mean you
| _should_ do it. I don't know how many years of
| experience in C this guy has, but this is a "been there, done
| that" case for me. I stick to stdint and stdbool today, and
| even if only half the code/libs I interface with do that,
| it's already worth the extra _t-typing all the time. Just the
| fact that he uses the i prefix for signed and s for string
| means there's a high chance that his s8 string type gets
| confused with an 8-bit signed int.
|
| But as you say, it's a personal style, and the author seems
| to be aware of that:
|
| > I'm not saying everyone should write C this way, and when I
| contribute code to a project I follow their local style.
|
| Because that's by far the most important rule to follow in
| _any_ language.
|
| I think the rest is less controversial, the 0 vs. NULL thing
| has been going on forever; I didn't check recently but I'd
| assume "const somestruct *foo" would still sometimes help out
| the compiler to optimize vs. the non-const version.
| stephc_int13 wrote:
| I use very similar types as the author in my own libs and
| framework.
|
| I think this is perfectly legitimate, in the same way that
| I don't use std libs directly but always behind wrappers or
| my own implementation.
|
| The C std lib and default types are often what is keeping
| the language back.
|
| And they should be used when you have no other choice.
|
| Short name for scalar types is also pretty much the new
| standard for modern languages such as Zig.
| marwis wrote:
| In case of winapi you can actually generate your own headers
| following the style you want, see for example
| https://github.com/microsoft/cppwin32
| lelanthran wrote:
| > The author did qualify it with personal coding style.
| Frankly the standard types are too verbose and I wish this
| guy's elegant and clear list had been the one that was
| adopted way back when.
|
| They didn't adopt it for the same reason that it is a bad
| idea now - too many programs already contained at least one
| variable named after his types.
|
| If the standard had adopted his convention, too many programs
| would have broken, which is why his convention is currently
| unsuitable for any existing project.
| otikik wrote:
| Wouldn't existing programs just continue working? What the
| author did was adding new types, not modifying or removing
| existing ones.
| iforgotpassword wrote:
| Because those existing programs surely don't use the same
| identifiers for other stuff? Certainly there is no code
| out there using s8 for "signed char" instead of "utf8
| string"? :-)
| lelanthran wrote:
| > Wouldn't existing programs just continue working?
|
| Only ones which don't have variables named `i8` or `b32`
| (which is common, but not for booleans).
|
| I've seen many projects which used the pattern
| [a-z][1-9]+ as variables. Those programs with a variable
| called `i8` won't compile if the standard made a type
| called `i8`.
|
| In particular, the standard reserves entire patterns for
| itself, so it could not reserve the pattern [a-z][0-9]+,
| which existing code was already using. They could, and did,
| reserve the pattern *int*_t for themselves.
| otikik wrote:
| But that problem exists for any C project that uses an
| external library. If the library defines something that
| the project already uses, then the project will not work.
|
| In my mind that's not a problem with the decisions taken
| by the author of the article, it's more of a symptom of
| C's limitations.
| lelanthran wrote:
| > But that problem exists for any C project that uses an
| external library. If the library defines something that
| the project already uses, then the project will not work.
|
| For libraries, yes, but we're talking about why the
| _standard_ didn't do it.
|
| The standard _did not want_ [1] to reserve keywords that
| current programs were already using.
|
| A library that conflicts on keywords will only break with
| those programs that use it. A standard that conflicts on
| keywords breaks all programs in that language.
|
| > In my mind that's not a problem with the decisions
| taken by the author of the article, it's more of a
| symptom of C's limitations.
|
| One of the constraints of taking decisions is to work
| within the limits of the existing framework - if you're avoiding
| the alternatives _that don't break_, then it's the
| decision-maker's bug, not the framework's.
|
| The framework has limitations, widely published and
| known. You make decisions within those limitations.
|
| [1] (Although they do do it, it's only with reluctance,
| not on a whim to avoid typing a few characters.)
| Communitivity wrote:
| No program should ever have variable names following the
| [a-z][1-9]+ pattern, except perhaps loop indices - and
| not even then.
| lelanthran wrote:
| > No program should ever have variable names following
| the [a-z][1-9]+ pattern, except perhaps loop indices - and
| not even then.
|
| What's that got to do with not breaking existing
| programs?
| bdw5204 wrote:
| Those may be terrible variable names but they were
| understandable back in the 70s and 80s when disk space
| was at a premium and compilers only cared about the first
| 6 characters in a variable or function name. That's the
| downside of a 50 or so year old programming language: you
| have to worry about not breaking legacy code that did
| things based on the hardware limitations of that time.
| gdprrrr wrote:
| Typedefs and variable names don't live in the same
| namespace, do they?
| lelanthran wrote:
| > Typedefs and variable names don't live in the same
| namespace, do they?
|
| Depends. See this snippet:
| https://www.godbolt.org/z/5T5jz47q4
|
| Cannot declare a variable called `u8` when there is a
| typedef of `u8`.
|
| And even when you _can_ declare a variable called (for
| example) `int`, that effectively "breaks" the program by
| not being even a tiny bit readable anymore.
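|
| A minimal sketch of the "depends" above (this is not the linked
| godbolt snippet; the function name is hypothetical). Typedef names
| and variables share C's "ordinary identifier" name space, so they
| clash when declared in the same scope, but a declaration in an
| inner scope may hide the typedef.

```c
typedef unsigned char u8;

/* At file scope, `int u8;` here would be rejected: the name is already
 * declared in this scope as a typedef. */

static int shadow_demo(void)
{
    u8 before = 200;   /* u8 still names the typedef at this point */
    int u8 = 7;        /* legal: a block-scope variable hides the typedef */
    /* `u8 after = 1;` would no longer compile here: u8 is now a variable */
    return before + u8;
}
```

| So a standard header adding a typedef called i8 would break any
| program with a file-scope object or function of that name, while
| block-scope uses would keep compiling and merely hide the type.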
| mattpallissard wrote:
| > Now everyone who is already familiar with C types has to
| learn your own quirky system
|
| Oh I dunno. On one hand yeah learning a quirky system is an
| annoyance at times. On the other hand when you're coming from a
| language with a real type system dealing with custom types is
| standard operating procedure.
|
| I've had to patch a lot of C over the years. I can't say I've
| ever been bothered by types. It's always the usual suspects;
| hard coded offsets peppered throughout the codebase, stack
| smashing, baby's first callback implementation, "parsing" that
| omits lexing/tokenizing, archaic business logic that may-or-may
| not have ever been correct.
| voxl wrote:
| they're not quirky types in the least...
| guidoism wrote:
| I agree. A lot of languages have settled on those same names
| or something similar. We don't live in a world with a single
| word size anymore so carrying bit length in the name is
| critical, and so is keeping identifier names short. His trade
| off is exactly the one I would make.
| WalterBright wrote:
| > carrying bit length in the name is critical
|
| I beg to disagree. In D:
|
|     byte  - 8 bits
|     short - 16 bits
|     int   - 32 bits
|     long  - 64 bits
|
| absolutely nobody is confused about this.
| eviks wrote:
| Of course plenty of people are confused; the overhead of
| "short/long" just makes no sense. It's yet another bad
| design from the past, carefully preserved.
| WalterBright wrote:
| Haven't run into a confused one yet, and D has been
| around 20 years.
| iforgotpassword wrote:
| Because we've used those names since forever, but that's
| archaic random crap really. Nothing apart from maybe
| "byte" makes sense here, the rest is completely arbitrary
| historic cruft. Could as well have called the rest timmy,
| britney and hulk.
| runiq wrote:
| > Nothing apart from maybe "byte" makes sense here
|
| Lest we forget:
| https://web.archive.org/web/20170403130829/http://www.bobbem...
| layer8 wrote:
| Java uses exactly the same, and has a huge developer
| mindshare. While rooted in historic accidents, it's well-
| established.
| another2another wrote:
| In that case D should probably start to have an internal
| conversation about what they're going to call 128 bits
| then, 'cause it's going to become a thing sooner or later.
|
| stdint already has that covered though: (u)int128_t
| gallier2 wrote:
| It has defined the types for a very long time: they are cent and
| ucent. They aren't completely implemented yet, but they
| have been named and defined forever.
| fps_doug wrote:
| So will the 256bit one be dollar or euro?
| WalterBright wrote:
| We'll think of something. Perhaps `bright`?
| another2another wrote:
| That's really interesting, and for me a totally
| unexpected name, having never seen that nomenclature
| before - would be interesting to see how consensus around
| that was arrived at - but hey, we gotta call it
| something! (But not DoubleQuadWord please ... )
| gallier2 wrote:
| and
|
|     ubyte  - 8 bits
|     ushort - 16 bits
|     uint   - 32 bits
|     ulong  - 64 bits
|     ucent  - 128 bits
|     float  - 32 bits
|     double - 64 bits
|     real   - maximum precision hardware allows (80 bits on x87).
| seanw444 wrote:
| When we move into the 128-bit CPU era, will we call
| 128-bit integers "super long"? Maybe "elongated". Maybe
| "huge"?
|
| Or, you know, we could just name them all by bit length
| and completely future-proof this system.
| ogogmad wrote:
| Why would we ever need 128-bit CPUs? I remember the PS2
| had something like that (with details and caveats I don't
| understand), but subsequent games consoles went back to a
| more usual register size:
| https://en.wikipedia.org/wiki/128-bit_computing
| seanw444 wrote:
| All I know is we keep having this issue with saying "nah,
| this is it. Nobody will ever need more than this." And
| then inevitably the time comes when we need more.
| richard_todd wrote:
| We used to call 16 bytes a paragraph, so the nostalgic
| geek in me would love to see 'para' catch on. I never
| thought I'd be slinging around whole paragraphs of memory
| in registers!
| WalterBright wrote:
| We call them "cent" and "ucent" in D :-)
| outsomnia wrote:
| In isolation, they're not crazy.
|
| But much C code is bringing in library headers which contain
| their author's own pet choices for these, which inevitably
| are not the same and the result is extremely confusing when
| you have that in play as well as the stdint.h ones.
|
| The kernel contains a mixture of "pet" types like u32 and
| stdint ones, it's already confusing.
|
| He also does make a "crazy" choice later to call his string
| class "s8" which clashes with his nomenclature here.
| wiseowise wrote:
| > which clashes with his nomenclature here.
|
| How?
| asalahli wrote:
| Because it can be mistaken for 8-bit signed integer.
| lelanthran wrote:
| > they're not quirky types in the least...
|
| But they _are_ buggy (correct code cannot depend on the sign
| of `char`), which is usually the result of typedefing
| primitive types to save typing 3 characters on each use.
| WalterBright wrote:
| The reality is that C `int` is 32 bits in size.
|
| Sure, that's not true for 16 bit targets. But are you really
| going to port a 5Mb program to 16 bits? It's not worth worrying
| about. Your code is highly unlikely to be portable to 16 bits
| anyway.
|
| The problem is with `long`, which is 32 bits on some machines
| and 64 bits on others. This is just madness. Fortunately, `long
| long` is always 64 bits, so it makes sense to just abandon
| `long`.
|
| So there it is:
|
|     char      - 8 bits
|     short     - 16 bits
|     int       - 32 bits
|     long long - 64 bits
|
| Done!
|
| (Sheesh, all the endless hours wasted on the size of an `int`
| in C.)
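|
| A small sketch for checking that table on whatever target you build
| for, rather than taking it on faith. The macro and function names
| are hypothetical, and "width" here means storage size in bits
| (sizeof times CHAR_BIT), which is the usual meaning on mainstream
| targets.

```c
#include <limits.h>

/* Storage size of a type in bits on the current target. */
#define BITS_OF(T) ((int)(sizeof(T) * CHAR_BIT))

/* 1 if this target matches the 8/16/32/64 table above, else 0. */
static int matches_common_widths(void)
{
    return BITS_OF(char) == 8 && BITS_OF(short) == 16 &&
           BITS_OF(int) == 32 && BITS_OF(long long) == 64;
}

/* The one that varies: typically 32 on 64-bit Windows, 64 on 64-bit Linux. */
static int long_width(void)
{
    return BITS_OF(long);
}
```

| The same conditions could be written as _Static_assert lines in C11
| so that a port to an unusual target fails at compile time.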
| rurban wrote:
| C is also used on embedded. It's even the default
| language there.
| ultrarunner wrote:
| Yeah, almost the only time I'm writing C anymore is
| embedded, where I want to reason about type widths (while
| taking on as light a cognitive load as is possible). I have
| enough code that gets compiled to an 8, 16, or 32 bit
| target depending on context that having the bit width right
| on the tin is valuable. And it doesn't even cost me "hours
| and hours".
| ryandrake wrote:
| Also: Embedded is almost the only time you really, truly
| need to care about how many bits a type is, and only when
| you're interacting with actual hardware.
|
| For almost every other routine task in programming, I
| would argue that it really doesn't matter if your int is
| 32 bits wide or 64 bits wide. Why go through the trouble
| of insisting on int32_t or int64_t? It probably doesn't
| matter for the things you are counting.
|
| Some programmers will say "Well, we should use int64_t
| here because int32_t might overflow!" OK, so why weren't
| you checking for overflow if it was an expected case?
| int64_t might overflow too, are you checking after every
| operation? Probably not. "OK, let's use uint64_t then,
| now we get 2x as many numbers!" Now you have other
| overflow (and subtraction) problems to handle.
|
| Nowadays, I just use int and move on with my life. It's
| one of those lessons from experience: "When I was
| younger, I used int and char because I didn't know any
| better. When I was older, I created this complex,
| elaborate type system because I knew better. Now that I'm
| wise, I just use int and char."
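|
| For the "why weren't you checking for overflow" question, a
| minimal sketch of what a check on plain int can look like. The
| helper name checked_add is hypothetical; GCC and Clang also offer
| overflow builtins, but this version uses only <limits.h>.

```c
#include <limits.h>
#include <stdbool.h>

/* Adds a and b. Returns false, leaving *out untouched, if the
 * mathematical result does not fit in an int. The test runs before
 * the addition, so no signed overflow is ever evaluated. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

| The same shape works for any width; a wider type only moves the
| point at which the check starts to matter.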
| WalterBright wrote:
| > It's one of those lessons from experience: "When I was
| younger, I used int and char because I didn't know any
| better. When I was older, I created this complex,
| elaborate type system because I knew better. Now that I'm
| wise, I just use int and char."
|
| Right on, dude. I've gone full circle on that, too.
|
| I also spent years wandering the desert being enamored
| with the power of the C preprocessor. Eventually, I just
| ripped it out as much as possible, replacing it with
| ordinary C code. C is actually a decent language if you
| eschew the damned preprocessor.
| terracottalite wrote:
| > 2x as many numbers
|
| Minor correction, 2^32x as many numbers. Though I agree
| with your point.
|
| Edit: added x to the number for consistency and clarity.
| ryandrake wrote:
| LOL yea, that's what was in my brain but somehow 2x got
| typed. Good catch.
| [deleted]
| slikrick wrote:
| so... what I'm seeing is that C got it wrong relative to the
| way things actually work and get used.
|
| the fact that you had to have tribal knowledge about all of
| this is why C shouldn't stay for the long term and we should
| phase out languages into ones with stronger more correct
| defaults.
|
| would a new programmer use "long long"? would they notice
| immediately that things didn't work if they didn't use it?
|
| Rust got it correct by labeling the bits with the type
| directly
| lelanthran wrote:
| You realise that C had labeled types long before Rust was
| conceived?
| kazinator wrote:
| Rust's integer types are poorly abstracted. The use of
| specifically sized types for quantities that are not
| related to hardware is comically ridiculous.
|
| In the C world, only the goofballs do things like use
| _char_ or _int8_t_ for the number of children in a family,
| or wheels on a car.
|
| yet that is what Rust code looks like. Almost every Rust
| code sample I've ever seen sets off my bozon detector just
| for this reason.
| Findecanor wrote:
| Yet another issue is that `char` is signed on some platforms
| but unsigned on others. It is signed on x86 but unsigned on
| RISC-V. On ARM it could be either (ARM standard is unsigned,
| Apple does signed).
|
| I therefore use typedefs called `byte` and `ubyte` wherever
| the data is 8-bit but not character data. I also use the
| aliases `ushort`, `uint` and `ulong` to cut down on typing.
| On the other hand, the types in <stdint.h> are often
| recognised by syntax colouring in editors where user-defined
| types aren't.
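|
| A minimal sketch of how the signedness of plain char leaks into
| ordinary code, with hypothetical function names. The naive test for
| "byte has its high bit set" gives different answers depending on the
| platform; converting to unsigned char first does not.

```c
#include <limits.h>

/* 1 if plain char is signed with the current compiler and flags. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Non-portable: always 0 where char is signed, since c never exceeds 127. */
static int is_high_byte_naive(char c)
{
    return c >= 0x80;
}

/* Portable: look at the byte value, not at the sign. */
static int is_high_byte(char c)
{
    return (unsigned char)c >= 0x80;
}
```

| This is the kind of bug that typedefs like `byte` and `ubyte` avoid
| by making the choice explicit at the declaration.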
| WalterBright wrote:
| Yes, the optional sign on char is also madness. C had a
| chance in 1989 to make it unsigned, and muffed it. (When
| C89 decided between value-preserving and sign-preserving
| semantics, they could have also said char was unsigned, and
| saved generations of programmers from grief.)
|
| D's `char` type is unsigned. Done. No more problems.
| hasmanean wrote:
| Default signed/unsigned is not a platform convention but a
| compiler one. You can change it with a compiler switch, in
| the makefile.
| WalterBright wrote:
| Um, it was dependent on how the CPU handled it back in
| those days. That problem went away, though.
| chmod775 wrote:
| Then you're better off using custom types - that way
| people will immediately know your type is non-default - as
| opposed to hiding your customization away in a makefile,
| pranking people who expect built-ins to behave a certain
| way.
| munch117 wrote:
| The people who understand that it can be either,
| depending on a compiler switch, are exactly the people
| who use an explicit sign (typically via a typedef) to
| ensure their code always works.
|
| The people who say that char is de facto signed and
| everyone should just deal with it, are the people who end
| up writing broken code.
| consp wrote:
| Which are minima, but in practice they represent the width.
| kazinator wrote:
| The Motorola 68K family has been targeted by C compilers
| configured with 16 bit int.
| WalterBright wrote:
| "has been", i.e. obsolete
| Am4TIfIsER0ppos wrote:
| Wait... where's plain "long"? I know, you probably know, but
| that is why you use explicit sizes where you can.
| WalterBright wrote:
| I quit using "long" because sometimes a long is 32 bits and
| sometimes 64, and I can never remember which compiler does
| which. But "int" is 32 bits and "long long" is always 64
| bits, so that's what I stick with.
|
| C's "long" should not be used in new code.
| sovande wrote:
| Exactly this (plus floating point types and unsigned
| qualifier) and done. It's standard C, there is no need to
| invent yet another unnecessary "type" system for standard C
| native types. I do like bool though.
| stephc_int13 wrote:
| Long Long is ridiculous and often confusing.
|
| Now, what about SIMD types?
|
| What about 16bits floats?
|
| Using the short size convention we have easy and logical
| answers.
|
| The reason why new languages like Rust and Zig are using
| those conventions is not random, types naming (and stdlib) is
| a weak point of C (and C++).
|
| Luckily they are not set in stone, we can choose different
| and reasonable conventions.
| WalterBright wrote:
| I use `halffloat` for 16 bit floats. But be careful, there
| are several different encodings of 16 bit floats, so
| float16 isn't enough in and of itself.
|
| SIMD types in D are done with:
| __vector(byte[16]), __vector(int[8])
|
| and an alias (typedef for the C folk) for this is commonly
| used, like `byte16` and `int8`.
| sovande wrote:
| > Long Long is ridiculous and often confusing
|
| It might be ridiculous, but it's hardly confusing for a C
| programmer. But, yeah, in an ideal world 'long' should just
| be defined as 64 bits
| self_awareness wrote:
| Isn't "int" 64-bit on some (rare) architectures?
| grotorea wrote:
| Not mentioned on the table I know
| https://en.cppreference.com/w/cpp/language/types
|
| edit: Oh you're right
|
| > Other models are very rare. For example, ILP64 (8/8/8:
| int, long, and pointer are 64-bit) only appeared in some
| early 64-bit Unix systems (e.g. UNICOS on Cray).
| terracottalite wrote:
| My computer being one of those (rare?) architectures.
| Though I think it is not entirely dependent on the
| processor and the OS choice also affects this.
| colejohnson66 wrote:
| Not sure about 64-bit `int`, but it is 16-bit on some
| 16-bit micros, such as the AVR line (used by the original
| Arduino).
| galangalalgol wrote:
| Ti dsps have 48bits for long.
| i_am_a_peasant wrote:
| yeah i had 48 bit pointers on one asic. upper 16 bits
| selected memory partition/region
| [deleted]
| aidenn0 wrote:
| I swear I've seen 128-bit "long long" types.
| kazinator wrote:
| Choosing integer sizes in C is pretty easy. The standard
| guarantees certain minimum ranges.
|
| 1. Consider the char and short types only if saving storage
| is important. Do not declare "char number_of_wheels" for a
| car, just because no car has anywhere near 127 wheels, unless
| it is really important to get it down to one byte.
|
| 2. Prefer signed types to unsigned types, when saving storage
| is not important. Unsigned types bend the rules of arithmetic
| around zero, and mixtures of signed and unsigned arithmetic
| add complexity and pitfalls. Do use unsigned for bitmasks and
| bitfields.
|
| 3. Two's complement is ubiquitous: feel free to assume that
| signed char gives you -128, and short gives you -32768, etc.
| ISO C now requires two's complement.
|
| 4. Use the lowest ranking type whose range is adequate, in
| light of the above rules: rule out the chars and shorts, and
| unsigned types, unless saving space or working with bits.
|
| For instance, for a value that ranges from 0 to 65535, we
| would choose _long_, since _int_ is only guaranteed to reach
| 32767. If it were important to save storage,
| then _unsigned short_.
|
| The ISO C minimum required ranges are:
|
|     char                0..255 if unsigned; -128..127 if signed;
|                         therefore: 0..127
|     signed char         -128..127
|     unsigned char       0..255
|     short               -32768..32767
|     unsigned short      0..65535
|     int                 -32768..32767
|     unsigned int        0..65535
|     long                -2147483648..2147483647
|     unsigned long       0..4294967295
|     long long           -9223372036854775808..9223372036854775807
|     unsigned long long  0..18446744073709551615
|
| If you're working with bitfields, and saving storage isn't
| important, start with unsigned int, and pick the type that
| holds all the bits required. For arrays of bitfields, prefer
| unsigned int; it's likely to be fast on a given target. It's
| good to leave that configurable in the program. E.g. a good
| "bignum" library can easily be tuned to have "limbs" of
| different sizes: 16, 32 or 64 bit, and mostly hides that at
| the API level.
|
| If you're working with a numeric quantity, remove the
| unsigned types, shorts and chars, unless you need to save
| storage (and don't need negative values). Then pick the
| lowest ranking one that fits.
|
| E.g. if saving storage, and don't need negative values,
| search in this order: char, signed char, unsigned char,
| short, unsigned short, int, unsigned int, long, unsigned long,
| long long, unsigned long long.
|
| If saving storage, and negatives are required: signed char,
| short, int, long, long long.
|
| If not saving storage: int, long, long long.
|
| If the quantity is positive, and doesn't fit into long long,
| but does fit into unsigned long long, that's what it may have
| to be.
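|
| One of the pitfalls behind rule 2, as a minimal sketch with
| hypothetical function names: when a signed and an unsigned operand
| of the same rank meet in a comparison, the signed one is converted
| to unsigned first, so -1 stops being less than 1.

```c
/* Mixed comparison: a is converted to unsigned int (a huge value)
 * before the comparison, so the result is 0. */
static int minus_one_lt_one_mixed(void)
{
    int a = -1;
    unsigned int b = 1;
    return a < b;
}

/* All-signed comparison: ordinary arithmetic, result is 1. */
static int minus_one_lt_one_signed(void)
{
    int a = -1;
    int b = 1;
    return a < b;
}
```

| Compilers can flag the first form (for example with GCC's
| -Wsign-compare), but it is not an error by default.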
| epcoa wrote:
| Unsigned doesn't really bend any rules and their behavior
| around wraparound is well defined.
|
| Therefore there is another use case : circular buffer
| indices.
| kazinator wrote:
| Yes it does bend rules. Say that _a_, _b_ and _c_ are
| small integers (we don't worry about addition overflow).
| Given an inequality formula like:
|
|     a < b + c
|
| we can safely perform this derivation (add -b to both
| sides):
|
|     a - b < c
|
| This is not true if _a_, _b_ and _c_ are unsigned. Or
| even if just one of them is, depending on which one.
|
| What I mean by "bend the rules of arithmetic" is that if
| we decrement from zero, we suddenly get a large value.
|
| This is rarely what you want, except in specific
| circumstances, when you opt into it.
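|
| A worked instance of that derivation, as a sketch with hypothetical
| function names. With a = 1, b = 2, c = 0 the original inequality
| holds for both kinds of type, but the rearranged form only survives
| with signed operands, because 1 - 2 wraps to a huge unsigned value.

```c
/* Original form: a < b + c */
static int original_signed(int a, int b, int c)
{
    return a < b + c;
}

static int original_unsigned(unsigned a, unsigned b, unsigned c)
{
    return a < b + c;
}

/* Rearranged form: a - b < c */
static int rearranged_signed(int a, int b, int c)
{
    return a - b < c;
}

static int rearranged_unsigned(unsigned a, unsigned b, unsigned c)
{
    return a - b < c;   /* a - b wraps around when a < b */
}
```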
|
| Unsigned tricks with circular buffer indices will not do
| the right thing unless the circular buffer is power-of-
| two sized.
|
| Using masking on a power-of-two-sized index will work
| with signed, due to the way two's complement works. For
| instance, say we have a [0] to [15] circular buffer.
| The mask is 15 / 0xF. A negative index like -2 masks to
| the correct value 14: -2 & 15 == 14. So if we happen to
| be decrementing we _can_ do this:
|
|     index = (index - 1) & MASK
|
| even if index is _int_.
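|
| The masking idea as a small sketch, with hypothetical names
| (RING_SIZE, ring_prev, ring_next). It relies on two's complement
| representation for the negative intermediate value, which is what
| every mainstream target uses.

```c
#define RING_SIZE 16                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1)    /* 15, i.e. 0xF */

/* Step backwards; from 0 the intermediate -1 masks to 15. */
static int ring_prev(int index)
{
    return (index - 1) & RING_MASK;
}

/* Step forwards; from 15 the intermediate 16 masks to 0. */
static int ring_next(int index)
{
    return (index + 1) & RING_MASK;
}
```

| The index stays a plain int at the API level, and the mask is the
| only place where the buffer size appears.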
| epcoa wrote:
| > What I mean by "bend the rules of arithmetic" is that
| if we decrement from zero, we suddenly get a large value.
|
| Yes completely consistent with rules of modular
| arithmetic. A programmer ought to be able to extend math
| horizons beyond preschool. Which is ironic because I can
| explain this concept to my 6 year old on a clock face and
| it's easy for them to grasp.
|
| > Unsigned tricks with circular buffer indices will not
| do the right thing unless the circular buffer is power-
| of-two sized.
|
| How will they "not do the right thing?". With power of 2
| you avoid expensive modulo operations, but nothing breaks
| if you choose to use a non power of 2.
|
| > two's complement
|
| Two's complement is not even mandated in C. You are
| invoking implementation defined behavior here. Meanwhile
| I can just increment or decrement the unsigned value
| without even masking the retained value and know the
| result is well defined.
|
| Like I get 2s complement is the overwhelming case, but
| why be difficult, why not just use the well defined
| existing mechanism?
|
| And there's no tricks here, literally just using the
| fucking type as it was designed and specified, why
| clutter things with extra masking.
|
| There's also the pragmatic atomicity benefit.
| kazinator wrote:
| In the N3096 working draft it is written: "The sign
| representation defined in this document is called two's
| complement. Previous revisions of this document
| additionally allowed other sign representations."
|
| Non-two's complement machines are museum relics, and are
| no longer going to be supported by ISO C.
|
| > _why clutter things with extra masking._
|
| Because even if the circular buffer is a power of two,
| its size doesn't necessarily line up with the range of a
| given unsigned type.
|
| If the buffer doesn't have a width of 256, 65536, or
| 4294967296, then you're out of luck; you can't just use
| uint8_t, uint16_t or uint32_t as the circular buffer
| index without masking to the actual power-of-two size.
|
| (Note that uint16_t and uint8_t promote to int (on the
| overwhelming majority of platforms where their range fits
| into that type), so you don't get away from reasoning
| about signed arithmetic for those.)
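|
| A sketch of both points with a free-running 8-bit counter
| (function names are hypothetical). When the counter wraps from 255
| to 0, a ring of 16 slots moves from slot 15 to slot 0 as expected,
| but a ring of 10 slots jumps from slot 5 to slot 0, skipping 6..9.
| The last function shows the promotion to int.

```c
#include <stdint.h>

/* Slot selected by a free-running counter in a ring of `size` entries. */
static unsigned slot(uint8_t counter, unsigned size)
{
    return counter % size;
}

/* Slot selected right after the counter wraps from 255 to 0. */
static unsigned slot_after_wrap(unsigned size)
{
    uint8_t counter = 255;
    counter++;                 /* wraps to 0; well defined */
    return slot(counter, size);
}

/* uint8_t promotes to int, so 0 - 1 is -1 here, not 255. */
static int promoted_difference(void)
{
    uint8_t zero = 0;
    return zero - 1;
}
```

| Reducing the index modulo the size on every update, instead of
| letting it run free, avoids the discontinuity for any buffer size.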
| epcoa wrote:
| So then what is the advantage of using a signed type in
| this case?
|
| And C++20 already standardized it, I know; I already
| acknowledged this.
|
| Should I go back and rewrite all the old correct code so
| you feel better?
| kazinator wrote:
| The main advantage is not foisting unsigned on the user
| of the API.
|
| (You can do that while using unsigned internally, but
| then you have to convert back and forth.)
|
| The most important decision is what is the index type at
| the API level of the circular buffer, not what is inside
| it. But it's nicer if you can just use the API one
| inside.
|
| The sizeof operator yielding the type size_t which is
| unsigned has done a lot of harm. Particularly the way it
| spread throughout the C library. Why do we have size_t
| being unsigned? Because on small systems, where we have
| 16 bit sizes, signed means limiting to 32767 bytes, which
| is a problem. In all other ways, it's a downer. Whenever
| you mention sizeof, you have unsigned arithmetic creeping
| into the calculation.
|
| The author of the above blog article has the right idea
| to want a sizeof operator that yields ptrdiff_t instead
| of size_t. (Unfortunately, the execution is bungled; he
| redefined a language keyword as a macro, and on top of
| that didn't wrap the macro expansion in parentheses,
| even.)
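|
| A sketch of the same idea without touching the keyword, under the
| assumption that a new name is acceptable: ssizeof is a hypothetical
| macro name, fully parenthesized, and countof is built on it. The
| two comparison functions show what the unsigned result of plain
| sizeof does to a negative int.

```c
#include <stddef.h>

/* Signed size operator under its own name; sizeof itself is untouched. */
#define ssizeof(x)  ((ptrdiff_t)sizeof(x))
#define countof(a)  (ssizeof(a) / ssizeof(*(a)))

/* -1 is converted to size_t (a huge value) first, so this is 0. */
static int minus_one_lt_sizeof(void)
{
    int i = -1;
    return i < sizeof(int);
}

/* With a signed size the comparison behaves arithmetically: 1. */
static int minus_one_lt_ssizeof(void)
{
    int i = -1;
    return i < ssizeof(int);
}
```

| As with any macro of this shape, countof only makes sense on real
| arrays, not on pointers.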
| epcoa wrote:
| > If the buffer doesn't have a width of 256, 65536, or
| 4294967296, then you're out of luck
|
| Why so much hyperbole? You're not out of luck. You can
| atomic increment/add the unsigned no matter the buffer
| size. You don't worry about overflow like you would with
| a signed type. You can mask after.
|
| And you continue to avoid answering the simple question:
| what is the advantage of the signed type. I've already
| outlined the one with unsigned, especially with atomics.
| kazinator wrote:
| Comment said "why clutter with extra masking" (just use
| the unsigned types).
|
| Although unsigned types have no overflow, running to them
| as some sort of safe refuge is a mistaken knee-jerk
| reaction.
| colejhudson wrote:
| I mean, being a bit glib here, but a lot of programming is
| dealing with someone else's type system.
|
| Moreover, for those of us who write C fairly often, the
| mnemonics here are familiar.
|
| Actually, as custom type systems go, this one is pretty
| elegant. Reminds me of Rust.
| oaiey wrote:
| Yeah but not for basic types. Also, most code mingles sooner
| or later with other code. Then this is just ugly.
| circuit10 wrote:
| It's better than dealing with needlessly long type names
| like uint32_t though
| uxp8u61q wrote:
| Long type name?? uint32_t is literally 8 characters long.
| There's not much to cut here.
| slikrick wrote:
| if you like typing _, go ahead, but it's not 1 key press
| and shouldn't be treated as a keystroke like the letter
| a...
| uxp8u61q wrote:
| On my keyboard layout it's one keypress. And since code
| is read about 100 times as much as it's written, I don't
| particularly care about reaching 250 WPM while writing
| code. The difficulty of writing code is thinking about
| it, not actually physically writing it.
| circuit10 wrote:
| Integer types are very common so it does still get a bit
| tedious. I guess it causes clutter when reading too
| seanw444 wrote:
| How about... u32?
| uxp8u61q wrote:
| Sure, that's totally readable and meaningful.
| slikrick wrote:
| _gasp_ blasphemy! C programmers literally can't have
| nice things
| circuit10 wrote:
| It's longer than it needs to be when you're typing it out
| so many times
| billfruit wrote:
| What are we trying to optimize, the number of characters
| to type or clarity/readability?
| circuit10 wrote:
| Both I guess? I don't think the extra length helps with
| readability at all
| oaiey wrote:
| No. It would be better if the standardization groups
| would have done that, but not when every developer has a
| different scheme
| circuit10 wrote:
| It's not great but they're just aliases so they're
| interchangeable, which means you can keep everything
| consistent within a project and it won't cause any
| problems when interacting with outside code
| lelanthran wrote:
| Until you include a header written by someone with the
| same opinion, and now you get compile errors because they
| both defined 'u8'.
|
| I gotta be honest, all of those style suggestions look
| good until you try them in a non-solo and non-isolated
| project, and then you see what a mess you created.
|
| We've all been there, as C programmers, and we've all
| done that in the past, which is why we don't do it
| anymore.
| circuit10 wrote:
| Unless they were defined to completely different types,
| that shouldn't be an error
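|
| A sketch of that case, with the two definitions standing in for two
| hypothetical headers. Since C11, repeating a typedef with the same
| type in the same scope is allowed; in C99 and earlier it was a
| constraint violation, although some compilers accepted it anyway.

```c
/* As if from a hypothetical header_a.h */
typedef unsigned char u8;

/* As if from a hypothetical header_b.h: same name, same type.
 * Accepted in C11 and later. */
typedef unsigned char u8;

/* `typedef signed char u8;` here would be an error in every version:
 * the two definitions would name conflicting types. */

static int u8_size(void)
{
    return (int)sizeof(u8);
}
```

| So the clash is only harmless when both authors picked the same
| underlying type and the project builds as C11 or newer.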
| billfruit wrote:
| uint32_t isn't that long, and is quite clear what type it
| means.
| Sharlin wrote:
| Rust made the correct choice: things used most often should
| be assigned the shortest names. This "Huffman encoding"
| style is what natural languages have evolved toward as
| well. In 2023, if I were to write C, and didn't have
| existing guidelines to adhere to, I'd most probably
| introduce the same typedefs as the author here has done.
| [deleted]
| kristopolous wrote:
| Speaking of type systems, I read glib as g-lib a few times
| and tried to understand how you were talking about the GNU
| lib in that sentence.
| userbinator wrote:
| I did too. Humans use context to resolve ambiguities in
| language, and in this case the context was very much
| statistically favouring the library; if you're using it,
| glib _is_ literally "someone else's type system".
| zombot wrote:
| Wouldn't that have to be `glibc`?
| wyldfire wrote:
| It is unfortunate that the two have such similar names
| because there's a lot of room for confusion. It doesn't
| help that they have somewhat adjacent functionality
| almost.
| slondr wrote:
| https://en.wikipedia.org/wiki/GLib?wprov=sfti1
| zombot wrote:
| I see. Thanks for that link!
| petabytes wrote:
| It's pretty much the same types you see in Rust or Zig, and I
| think Linux even uses some of the same types.
| oaiey wrote:
| Yeah, and C/C++ have had these for three times the age of Rust.
| I also find them beautiful but not consistent.
| pwdisswordfishc wrote:
| With slightly different semantics; as I recall, the Linux uXX
| and iXX types have natural alignment (equal to size), while
| stdint.h types are not required to.
| pwdisswordfishc wrote:
| IMO, defining your own aliases for stdint.h types is innocent
| enough, but manually defining prototypes for Win32 calls
| instead of including standard headers is one step too far. You
| don't own the ABI here: things like HANDLE and WPARAM have
| already changed sizes once, who's to say ULONG_PTR will stay
| the same as uintptr_t for all future architectures? There is a
| good reason why doing the same on Unix is discouraged.
| quelsolaar wrote:
| You can declare flexible array member strings with literals like
| this:
|
| #define DECLARE_STRING(variable_name, c_string) \
|     struct{size_t allocated; size_t used; char string[sizeof(c_string)];} \
|     variable_name ## internal = {.allocated = sizeof(c_string) - 1, \
|     .used = sizeof(c_string) - 1, .string = c_string}; \
|     MyString *variable_name = &variable_name ## internal
|
| It's a lot of C99 magic, so it may not be what you want, but it is
| possible.
| gilcot wrote:
| When I started reading the first section, about his own types, I
| couldn't help thinking: oh my, sounds like "Hungarian
| notation"[1] :)
|
| I think those definitions go to a header file. But how different
| will it be if he uses existing types with an abbreviation system?
| And is this feature available with some IDE?
|
| [1] https://en.wikipedia.org/wiki/Hungarian_notation
| ww520 wrote:
| This is an excellent writeup. I especially like the string
| treatment and the fat-pointer struct parameters and return
| values.
| bullen wrote:
| This is my personal style:
|
| http://edit.rupy.se/?host=move.rupy.se&path=/file/game.cpp&s...
|
| It's messy and in many ways ugly but it compiles and runs for an
| eternity.
| 0xedd wrote:
| > code style > personal
|
| You missed the point of code style.
| beeforpork wrote:
| Hopefully I never need to review code written with these
| definitions. It's an awful idea to do this.
|
| The first section shows why some languages have no 'typedef':
| introducing another layer of aliases is just not a good idea.
| It's confusing, it changes the appearance to basically a new
| language. Just use the standard names, instead of redefining your
| language, like everyone else. This style is almost as bad as
| '#define begin {'.
|
| Many of the other defs are obfuscations or language changes --
| this coerces C to something else. I'd not like to read code
| written with this, as it heavily violates the principle of least
| astonishment (POLA).
|
| (As a side note, I don't understand the #define for sizeof. The
| operator sizeof returns size_t -- it's size_t's definition, so
| what is this for?)
| Pathogen-David wrote:
| > As a side note, I don't understand the #define for sizeof.
|
| Their `size` type is signed. It's `ptrdiff_t`, not `size_t`.
| beeforpork wrote:
| Ah, thanks! So code will look normal, but be subtly
| different. This is even worse than I thought!
| dathinab wrote:
| size_t not mapping to size but size being defined is a footgun
|
| I would have used isize instead, I think.
| jll29 wrote:
| The notion of "personal style" is problematic, even for hobby
| projects, because (good) programming is ultimately a social
| activity.
|
| Even Linux started as a personal project, but because of its
| quality and the need it met it quickly spread. So please write
| your code in such a way that experienced other C programmers can
| read it easily.
|
| In isolation, I like some of his ideas, but some issues with C
| remain, and he is perhaps just too comfortable with C to jump ship
| and embrace Rust, which has many things he likes and more (e.g.
| no buffer overflows by design).
| kazinator wrote:
| > _#define sizeof(x) (size)sizeof(x)_
|
| That breaks any macro that uses sizeof in its expansion, and
| subtly changes any code snippet you might bring into the code
| that uses sizeof, even if those macros are defined first.
|
| Speaking of which, if you define a macro for a C keyword _before_
| including any standard header, the behavior is undefined.
|
| It's an unparenthesized unary expression, which has a lower
| precedence than postfix. sizeof(x)[ptr] will turn into
| (size)sizeof(x)[ptr] which parses as (size) ( sizeof(x)[ptr] ).
| bsder wrote:
| > signed sizes are the way
|
| Well, I should probably just say "We're done here." and stop
| reading the rest of the article. "Signed sizes" are an
| _extremely_ surprising abstraction break that are just _asking_
| for disaster.
|
| > No const. It serves no practical role in optimization, and I
| cannot recall an instance where it caught, or would have caught,
| a mistake.
|
| Should you even be writing C if you haven't hit this? People mix
| up "in buffers" and "out buffers" _all the time_. "const" flags
| this _immediately_.
|
| > Declare all functions static except for entry points. Again,
| with everything compiled as a single translation unit there's no
| reason to do otherwise.
|
| And when you go trying to debug something and get at a variable
| or function that you can't find because everything is "static",
| you'll curse the one who wrote the code.
|
| > Another change has been preferring structure returns instead of
| out parameters.
|
| Which is a great way to accidentally return a pointer to your
| stack and open a big ass security hole. Passing in the output
| buffers makes clear the ownership semantics.
|
| This guy seems like he mostly writes code for 64-bit systems. The
| coding advice is ... okay, I guess? Maybe? In that domain?
|
| In a 32-bit embedded domain, some of these guidelines are a good
| way to get yourself into a lot of trouble in a real hurry.
| ripe wrote:
| > signed sizes are an extremely surprising abstraction break
| that are just asking for disaster.
|
| Bjarne Stroustrup wrote a detailed memo advocating for signed
| sizes:
|
| https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p14...
| josephg wrote:
| Eh, I hard disagree with this memo. He's either dismissing or
| unaware of the biggest advantage of unsigned types, namely
| they make invalid state unrepresentable. And essentially all
| of his criticism of unsigned types is really criticism of the
| sloppy way old C and C++ compilers let you mix signed and
| unsigned numbers in math operations.
|
| Modern C/C++ compilers can and will warn you (quite
| aggressively) if you mix signed and unsigned numbers without
| thinking about it.
|
| A lot of the examples also seem weird. Eg, he gives a
| negative example of a function:
|
|     unsigned area(unsigned x, unsigned y) { return x * y; }
|
| In this, he complains that you can still write buggy code:
|
|     area(height1-height2, length1-length2);
|
| He's right - that is potentially buggy. But that code would
| be buggy whether the area function took signed or unsigned
| numbers as input. However, the signed version of this
| function is still worse imo because it could hide the logic
| bug for longer. If the area function should always return a
| positive number, I'd much rather that invalid input results
| in an area number like 4294967250 than a small negative
| number.
|
| Similarly, accidentally passing a negative index to a vec is
| much more dangerous with signed indexes because v[-2] will
| probably quietly work (but corrupt memory). However,
| v[4294967294] will segfault on the problematic line of code.
| That'll be much easier to find & debug.
|
| And a lot of the examples he gives, you'd get nice clear
| compiler warnings in most modern compilers if you use
| unsigned integers. You won't get any warnings with signed
| integers. Your program will just misbehave. And that's much
| worse. I'd rather an easy to find bug than a hard to find bug
| any day of the week.
| uecker wrote:
| The advantage of using signed types is that you can
| reliably find overflow bugs using UBSan and protect against
| exploiting such errors by trapping at run time. For
| unsigned types, wrap-around bugs are much harder to find
| and your program will silently misbehave.
| harpiee wrote:
| With unsigned you can actually check for overflow
| yourself very easily:
|
|     z = x + y;
|     if (z < x || z < y) // overflow
|
| And bounds checks are just a single comparison against
| an upper bound (handles both over and underflow):
|
|     size = x + y; // or size = x - y
|     if (size < bound) // good to go
|
| Prior to C23 (stdckdint.h) it's very error prone to check
| for signed overflow since you have to rearrange equations
| to make sure no operation could ever possibly overflow.
| uecker wrote:
| You can write correct programs with both. The reality is
| that people often fail to do this. But you can
| automatically detect signed overflow and protect against
| it, while unsigned wrap detected at run-time could be a
| bug or could be just fine (e.g. because you did your own
| "overflow" check and handle it correctly). This makes it
| extremely hard to find unsigned wraparound bugs and
| impossible to trap at run-time.
| tom_ wrote:
| The no-const people will never be satisfied, so just use const
| as necessary, propagate as required, and ignore them when they
| complain. If they take it out, put it back in. They'll always
| get bored first. I've been doing this for 25 years, and I'm
| still here.
|
| (The static thing might depend on the tooling. I went static-
| by-default about 15 years ago, around the same time I went full
| size_t, and I've yet to have a problem with it.)
| thefourthchime wrote:
| I've been here 30 years. I've never found much use for const.
| I value brief simple code that doesn't rely on things like
| const to tell people what's going on.
|
| Codebases have their own conventions and design patterns. If
| you have that, const is a needless formality.
|
| Code should be simple and clean first; constantly
| stating things that are obvious 90% of the time isn't that.
| lsh123 wrote:
| I wrote and still maintain an open source C project for 20+
| years. Once a year I get a new guy coming in and telling me I am
| doing it wrong: you should typedef all data types, you should
| stop using const, and so on. It stopped being funny after the
| first couple times.
| f1shy wrote:
| Funny enough, the take on const is the only thing I agree with
| in the whole post...
| rurban wrote:
| I'm even not agreeing with the const part. The standard did
| define const API's, so why shouldn't we follow? I know that C
| const are only half of C++ consts, but still. Still catching
| const errors somewhere because I do use const in APIs.
|
| I only agree with "Declare all functions static except for
| entry points".
|
| s8(s) is only for literal strings, it should be called s8_c
| instead and keep s8 for the default ctor.
|
| The struct return part is okay, but I've never used it. This is
| not Common Lisp.
| AdamH12113 wrote:
| Can you explain the static functions thing? What's the
| benefit of declaring all functions static if you're
| compiling them as a single translation unit anyway?
| lelanthran wrote:
| You don't get the linker complaining about multiply defined
| symbols when your code is linked in as a library to some
| other project.
|
| If you're writing code that never gets reused, then it's
| fine, no need for static functions.
| laserbeam wrote:
| I'm getting an "I don't use const, and here's my view on it"
| vibe from the author much more than "you shouldn't use const".
| I'm really not getting any demand that you change your coding
| style, just someone reflecting on their work and explaining it
| to others. And... Whether I agree with their choices or not, I
| find that very cool and informative.
| outsomnia wrote:
| It wasn't clear to me if he's talking about const as a
| variable declaration qualifier - I never used it - or const
| in pointer types, which is very useful.
| P_I_Staker wrote:
| It's kinda both for me. If you have certainty from top to
| bottom that something will always be const, then maybe.
|
| This is often not true, and even if you think so, you're
| often wrong. I can't recall all the consequences of the
| flaws in the system (promote/demote const), but it's not
| fun to deal with.
|
| I've seen so many things wind up passed to a function or
| going through an interface eventually that's non-const (or
| let's not forget is "const'd for safety").
|
| This is where some would say you should give up on
| practical grounds... if the mission is to determine which
| const scenarios can be ensured, you argue this is not
| practically possible and throw the whole thing out.
| jstimpfle wrote:
| For me it's the other way around -- I use const for global
| variables because it makes a real difference, the data will
| be put in a .ro section.
|
| Pointer-to-const on the other hand (as in "const Foo *x")
| is a bit of a fluff and it spreads like cancer. I agree
| with the author that const is a waste of time. And it
| breaks in situations like showcased by _strstr()_.
|
| I use pointer-to-const in function parameter lists though
| (most of the time it does not actually break like in
| strstr()): as documentation, and to be compatible with code
| that zealously attaches const everywhere where there
| (currently) is no need to mutate.
|
| But overall my use of const is very very little and I
| generally do not waste my time (anymore) with it. I almost
| never have to use "const casts" so I suppose I can manage
| to keep it in check. In C++ it is a bit worse, when
| implementing interfaces, like const_iterator etc. That
| requires annotating constness much more religiously, and
| that can lead to quite a bit of cruft and repetition.
| [deleted]
| outsomnia wrote:
| You're right, I also mark array definitions as const to
| control the section they go in, I was thinking about
| things like const int a = 5; ... anything I want a simple
| const var for is done with #define for me.
|
| const is indeed viral when eg, used in apis, but it's a
| strong indication at a glance for api users what they can
| expect to happen to the memory the pointer points to,
| whether it's just for input or is modified... and the
| virality is only a pain (it can be a pain) if you didn't
| use it from the start so all the things it might call are
| already kitted out with it.
| jstimpfle wrote:
| const makes sense and rarely causes problems when used in
| function parameter lists.
|
| However, when used for members in datastructures, it's
| more often than not problematic.
| P_I_Staker wrote:
| I will say that using const can make a huge mess of code,
| especially if you care about adhering to (very reasonable)
| guidelines.
|
| There's a really good chance you will either have to promote
| or cast away the const, which I hate.
|
| I tend to agree regarding not using const. It's been a while,
| so I don't have an example off the top of my head, but it's
| incredibly easy to break the const mechanism and have to deal
| with these annoying flaws.
|
| I've just seen this go really bad with any kind of code that
| has a split responsibility between teams. Eventually you will
| have to pass to a non-const interface, that you aren't
| supposed to change.
|
| So perhaps it makes sense if you have control from the top
| down and can ensure that the constness is maintained, or
| completely not, if it ends up non-const (then you could also
| try to move the interface to const, if it truly is)...
|
| ... I also suspect in many projects you'd just have to come
| to the conclusion that nothing can be const'd, because it
| ends up non-const anyway. Thus leading to the conclusion
| "just don't use const".
|
| P.S. I'm a bad boy that didn't read TA yet. This is just
| based on my past experience where we didn't really have the
| authority to change stuff in the stack... often times there
| was eg an MCU interface at the end that was non-const...
| guess we could contact the silica manufacturer... sure
| they'll get right on that.
| SoftTalker wrote:
| And, he also says "I'm not saying everyone should write C
| this way, and when I contribute code to a project I follow
| their local style."
| rewgs wrote:
| I'm not a C developer, so I have to ask: why in the world would
| you not use const?
| P_I_Staker wrote:
| You want to avoid promoting or casting away the const. If you
| start using const, you almost inevitably wind up with a
| mismatch.
|
| This introduces flaws in the type system. I wish I had a
| better breakdown on the impact of these concerns, but I'd
| rather not worry at all.
|
| Anyway, if you don't use const, this goes away. Bear in mind
| the minor amount of "safety" it provides, because you can
| just ignore it later, as you arguably tend to be doing anyway
| when you pass a const to non-const or vice versa.
|
| Inevitably, outside of really small insular project (and
| often times even then), there's something down the line that
| winds up being non-const that you don't want to change.
|
| C developers of this mindset tend to just come to the
| conclusion that you will immediately break the type system,
| just give up on the whole game.
|
| Edit: adding at least one example:
|
| Example: You define as const and remove the const later. If
| anything writes to the non-const, this is undefined behavior
|
| Example: I believe the above is actually true for const
| promotion if you modify the non-const version... I think this
| is only after the call (edit: i.e. after it becomes const;
| really interested in the answer).
|
| No Undefined behavior
|
| /* I imagine this would be okay */
|
| si_non_const = si_non_const + GetMagicValue();
|
| /* Const is promoted here */
|
| const int fparam = si_non_const;
|
| /* Writing to fparam is undefined past here */
|
| f_const(&fparam);
|
| Undefined behavior
|
| /* I imagine writing, after using as const is also not
| defined, but is fine at this point */
|
| si_non_const = si_non_const + GetMagicValue();
|
| /* Here we now have a constant value that will never be
| written to */
|
| const int fparam = si_non_const;
|
| f_const(&fparam);
|
| /* I think this would also be UB, even though it's accessed
| through a different symbol */
|
| si_non_const = si_non_const + GetMagicValue();
|
| Interested in other opinion, maybe will think on later...
| would it be valid for the compiler to remove that last
| assignment?
|
| Edit: Sorry, this is unreadable, if you put a space between
| the not undefined, and undefined it's easier
| jstimpfle wrote:
| It's more work. Not only to put all the annotations
| correctly, but also because it causes some real headaches.
| It's easy (implicit) to transition from non-const to const 1
| pointer level deep. But the other way around -- it's really
| awkward to "remove" a const.
|
| The strstr() signature is probably the shortest example /
| explanation why. To implement strstr(), you have to hack the
| const away to create the return value. Alternatively, create
| a mutable_strstr() variant that does the exact same thing.
| This is the kind of boilerplate that we don't want in C (and
| that C is bad at generating automatically).
|
| Think about it this way: Real const data doesn't exist. It
| always gets created (written) somewhere, and usually removed
| later. One way where this works cleanly is where the data is
| created at compile time, so the data can be "truly" const,
| and be put in .ro section, and automatically destroyed when
| the process terminates. But often, we have situations where
| some part of the code needs to mutate the data that is only
| consumed as read only by other parts of the code. One man's
| const data is another man's mutable data.
|
| In C, the support for making this transition work fluently is
| just very limited (but I think it's not great in most other
| languages, either).
| mmoll wrote:
| > Real const data doesn't exist
|
| Ever seen a ROM?
|
| And the C library's hacks around not being able to overload
| functions (which is the only reason for strstr et al's
| weird signature) wouldn't stop me from using const. It can
| be really useful both for documentation and for
| correctness. Think memcpy, not strstr.
| jstimpfle wrote:
| > Ever seen a ROM?
|
| How does the data get onto the ROM?
|
| But read 2 sentences further, where I had addressed this
| already.
|
| > Think memcpy, not strstr.
|
| See my other comments, I do think that making const
| function parameters is generally good for documentation
| and compatibility. strstr() is only a showcase for the
| limitations. Typically, const works for function
| parameters but not data structures.
| gwd wrote:
| > The strstr() signature is probably the shortest example /
| explanation why. To implement strstr(), you have to hack
| the const away to create the return value.
|
| It seems to me the "hacking" is exactly the side-effect
| that is wanted. It's like the requirement in Rust to do
| certain kinds of things in an `unsafe { }` block (or using
| the `unsafe` package in Go): not that you want the compiler
| to prevent you from doing things completely, but that you
| want the compiler to prevent you from doing things by
| accident.
|
| > One man's const data is another man's mutable data.
|
| Yes; and the point of `const` for function parameters is to
| make sure that data isn't mutated unexpectedly.
| jstimpfle wrote:
| > It seems to me the "hacking" is exactly the side-effect
| that is wanted.
|
| It is not. It's broken at the surface level. If you
| passed a pointer that is already const on your side, you
| get a non-const pointer back that allows you to
| write to your const memory.
| [deleted]
| david2ndaccount wrote:
| I disagree about the structs vs out-parameters thing. I've found
| it makes functions that could return an error much harder to
| compose and leads to a proliferation of types all over the place.
| In practice almost all functions can fail (assuming you are
| handling OOM), so having a predictable style of returning errors
| is more important.
| JonChesterfield wrote:
| Returning option<foo> (or sum<foo, error>) is the right thing
| but a real pain to write in C. I'm not sure the pattern of `if
| (thing(...)) goto fail` on every function call is particularly
| wonderful either, though the Go crowd seem to like it.
|
| Otherwise there's thread_local mylibrary_errno, which might
| actually be the right thing for within a library, translating
| it to an enum return on the boundaries.
| staunton wrote:
| > assuming you are handling OOM
|
| Which almost no one ever does. It's very hard and almost never
| has any benefit. At that point you have way different problems
| than programming style choices...
| ComputerGuru wrote:
| I might just be a grumpy old dev but a lot of this stuff gets an
| immediate no from me because it's so unidiomatic. You have to
| unlearn the accepted way of doing things and you end up with a
| codebase that is just so foreign to anyone looking at even a
| small chunk of it, unless they are committed to really learning
| to do things your way.
|
| Everyone knows what a uint32_t is when they see it. The cognitive
| overhead (until it becomes second nature, obviously) just feels
| like a heavy price to pay in order to save yourself a few
| characters.
|
| (Some other stuff in the proposed coding style still gets a
| thumbs up from me, though.)
| benreesman wrote:
| I want to both be polite to the OP but also agree.
|
| Writing correct C is hard, so I'm not going to knock anyone who
| found stuff that helps them.
|
| But pound defining shit to things you know via your Hungarian
| notation? Write some elisp. My Haskell programs don't actually
| have Unicode lambda in them.
|
| Pascal strings? Yeah, that's probably the better call, but why
| not use C++ or Rust or something where a bunch of geniuses got
| it right already?
| mikewarot wrote:
| >Pascal strings? Yeah, that's probably the better call, but
| why not use C++ or Rust or something where a bunch of
| geniuses got it right already?
|
| I'll be diving into C fairly heavy for the first time ever
| next year. I intend to skip right past pascal strings and
| implement/use free pascal's AnsiString or UnicodeString, both
| of which are reference counted, have a length (with no limit)
| and are guaranteed null terminated. I've stored a gigabyte in
| them in a few milliseconds. There's no need to allocate or
| free memory either... it's like freaking magic.
| mhd wrote:
| Haven't seen that many refcounted string libraries for C in
| recent times, most common one is probably still the one
| from glib.
|
| I guess you're going to implement yours from scratch, or is
| there some prior art you're likely to use?
| mikewarot wrote:
| If at all possible, I'll just lift the one from Free
| Pascal. Otherwise, it's yet another chore in the process
| of bringing MStoical (a modern port of the STOIC
| language) to life
| saulpw wrote:
| Too many geniuses spoil the soup. We all see many
| thousands of recipes and techniques over the course of our
| careers and it makes sense that each of us are continuously
| curating the small subset that we reach for in every project.
| I enjoy seeing the workbenches of other craftsmen, and
| nothing here looks unfamiliar.
|
| Arthur Whitney however is nuts.
| dev_dwarf wrote:
| You'll find by looking at their older posts that the author
| has actually written quite a lot of elisp.
| loeg wrote:
| The u32, i8, etc type aliases are the least offensive parts of
| this to me, even though I rarely see them in C code. I think
| those are pretty clear.
|
| b32, size (ptrdiff_t), usize (size_t), nothing for ssize_t...
| what? Those are unidiomatic and also kind of weird. The
| macros... some are fine, some are weird.
|
| If this makes the author more productive in C, it might behoove
| them to see if a higher level language like Rust would meet
| their needs.
| mberning wrote:
| To be fair they did say that when contributing to a shared
| project they follow the prevailing standard.
|
| I don't see the harm in following this for your own passion
| projects. You aren't doing it for the world, you're doing it
| for yourself.
| ComputerGuru wrote:
| I mean to each their own, but in my own experience, I value
| being able to easily and reliably copy-and-paste code
| snippets across projects (and I have a million of them,
| across several evolutions of my own personal coding styles
| and conventions) or files without worrying about whether the
| typedefs are in scope, polluting a namespace with possibly
| conflicting names or macros, etc.
|
| I also have often found myself publishing "for my own use
| only" code as open source later and like to keep things
| understandable to maybe help teach someone something someday.
| [deleted]
| badsectoracula wrote:
| > I might just be a grumpy old dev [...] Everyone knows what a
| uint32_t is when they see it.
|
| You might not be old enough then :-P many codebases typedef
| their own int types. See glib (gint, gshort, gint32, etc), SDL
| (Sint32, Uint32, etc) off the top of my head and there are
| _many_ that define types like "int32" or "i32" like the linked
| article.
| ComputerGuru wrote:
| I cut my teeth on DWORD, PHALF_PTR, and friends, so my issue
| is not so much "don't know how to grok this" as it is "we
| finally have sane, universal type names and you're throwing
| them away."
|
| Sure, the _t suffix may be an eyesore but I'll take size_t
| over "size" any day.
| LoganDark wrote:
| I like Rust's approach of "isize", as in "size" is in the
| place of the bit width. "size" sounds stupid.
| Pannoniae wrote:
| Unpopular opinion: something being unusual does _not_
| necessarily mean it is bad. Yes, it will look foreign to random
| people looking at it, but if someone wants to seriously work
| with it, it will only take a few days to get familiarised with
| it. The justification of "cognitive overhead" is, from what I
| have seen, a shibboleth for rejecting "outsider" code written
| by someone not conforming to the language standards by claiming
| it is harder to understand. Personally, I would say that says
| more about the person's inflexibility and/or OCD, not the
| writer's style.
|
| I am not saying _every_ style is good (some simply obfuscate
| things and/or make things overly verbose or unreadable) but
| rejecting a style solely based on it being "non-idiomatic" is
| not a good thing.
| yura wrote:
| Well said. This is also something that I don't buy from the
| criticism towards Lisp. Something along the lines of: "Lisp
| did not become mainstream because everyone writes their own
| little language for their project, and so no one can
| understand other project's code."
|
| pg wrote excellent arguments against this criticism in "On
| Lisp" § 4.8 Density, which apply just as well to the
| discussion above:
|
| "If your code uses a lot of new utilities, some readers may
| complain that it is hard to understand. People who are not yet
| very fluent in Lisp will only be used to reading raw Lisp. In
| fact, they may not be used to the idea of an extensible language
| at all. When they look at a program which depends heavily on
| utilities, it may seem to them that the author has, out of pure
| eccentricity, decided to write the program in some sort of
| private language. [...]
|
| If people complain that using utilities makes your code hard to
| read, they probably don't realize what the code would look like
| if you hadn't used them. Bottom-up programming makes what would
| otherwise be a large program look like a small, simple one. This
| can give the impression that the program doesn't do much, and
| should therefore be easy to read. When inexperienced readers
| look closer and find that this isn't so, they react with
| dismay."
| thetic wrote:
| I don't even think that's a controversial opinion. Breaking
| convention isn't inherently bad; it has costs, some of which
| you described. In this case specifically, the novelty is not
| justified by any significant benefit.
| varispeed wrote:
| Been there. Writing u8 instead of uint8_t may seem like a time
| saver, but in reality it makes it more difficult to read and
| reason about.
|
| If you pack too much information, you are taxing your brain more,
| you are slower to analyse the code and you make it easier to make
| mistakes.
|
| Now I much prefer code that is as verbose as possible.
| Gibbon1 wrote:
| [flagged]
| petabytes wrote:
| What's wrong with it?
| superchroma wrote:
| Well, from a team perspective, it's extremely opinionated and
| hostile to newcomers and messes with core language features
| at the expense of readability. If it's your personal codebase
| then do whatever, obviously.
| aportnoy wrote:
| > extremely opinionated
|
| I have not seen a single codebase that widely uses uint8_t
| and does not typedef it to u8. It is the exact opposite of
| "extremely opinionated".
| loeg wrote:
| It doesn't mess with a core language feature to alias 'u8'
| to 'uint8_t'. It's a reasonable use for the name and one
| used in other languages (e.g., Rust). There's nothing in
| the C standard that defines or uses the 'u8' name.
| Dylan16807 wrote:
| Opinionated? Was there something else you wanted u8 to
| mean?
| dmr_92 wrote:
| Can you explain why---what is it about this style that puts you
| off?
| JonChesterfield wrote:
| > typedef char byte;
|
| That one is dubious. Char has magic aliasing properties that
| uint8_t might not have (iirc that was contentious in a GCC bug
| report) and it will be signed on some platforms and unsigned on
| others, which changes implicit integer conversions.
|
| Missing from this is to embrace __attribute__((overloadable)) and
| __attribute__((cleanup)).
|
| Overloadable is the sane, useful alternative to the thing
| standardised as _Generic. The C _Generic will let you define an
| overload set, with some weirdness around type conversions,
| provided you write the entire set out as a single _Generic
| expression, probably wrapped in a macro. If you want to dispatch
| on more than one argument, you nest _Generic expressions. If you
| want to declare different functions in different headers - maybe
| you want 'size(T)' defined on various types in the codebase - you
| can't. If you don't like the idea of thousands of lines of
| distracting nonsense in the preprocessed output, tough. Or - use
| overloadable, get open overload sets, minimal compile time cost,
| obvious intermediate IR, everything works. Prior art is _all of
| C++_, so we're talking decades of the tooling learning to deal
| with it.
|
| Cleanup is either a replacement for RAII, or a means to have
| debug builds yell at you when you miss a free. It looks like that
| got warped into a thing called 'defer' with different behaviour
| that didn't make it through the committee last time.
|
| Other than that, ad hoc code generators work _really_ well with
| C. Especially if you're willing to use some compiler extensions.
| Code generators + overloadable will give a fair approximation to
| templated data structures without going deep into the insanity of
| the preprocessor. If the overloadable functions are static inline
| forwarding things in a header they don't even mess up symbol
| names; you just get a straightforward translation to
| vector_float_size or whatever.
|
| Personally I've given up on ISO C. I'd quite like to code in a
| dialect of C99 with a few of the GNU extensions and the
| equivalent of `-fno-strict-aliasing`, but C with the pointer
| provenance modelling and an accretion of C++ features has no
| personal value. Currently still using clang with flags to make it
| behave like that but I'm conscious that's on borrowed time - the
| application performance friendly aliasing rules are the default
| and gaining popularity, and relying on opt-out flags is a means
| of opting into compiler bugs.
|
| Semi-actively seeking something that will let me write assembly
| without the hassle of manual register allocation and calling
| conventions. Old style C with some of the warts bashed off would
| be good for that.
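A minimal sketch of the cleanup attribute mentioned in the comment above, assuming GCC or Clang (it is a compiler extension, not ISO C). The names `free_ptr`, `frees_seen`, and `scoped_copy_len` are made up for illustration and do not come from the article or the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter, only here to make the handler's run observable. */
int frees_seen = 0;

/* A cleanup handler receives a pointer to the annotated variable. */
void free_ptr(char **p)
{
    free(*p);          /* free(NULL) is a no-op, so a failed malloc is fine */
    frees_seen++;
}

/* Copies s into a heap buffer and returns its length. The buffer is
 * released automatically on every path out of the function. */
size_t scoped_copy_len(const char *s)
{
    char *buf __attribute__((cleanup(free_ptr))) = malloc(strlen(s) + 1);
    if (!buf) {
        return 0;      /* free_ptr still runs here */
    }
    strcpy(buf, s);
    return strlen(buf); /* return value is computed before cleanup runs */
}
```

The handler runs when `buf` goes out of scope, which is what makes it usable either as a poor man's RAII or as a debug-build leak check.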
| pif wrote:
| > No const.
|
| I stopped reading there. I wish this guy a happy coding (and non-
| coding) life, but I hope we never work together.
| cryo wrote:
| I'd love to have him in a team. He truly cares exactly how code
| works and analyzes/fuzzes the hell out of everything.
|
| I don't agree with all stylistic choices in his code, but the
| level of experience and skills are far above most C developers.
| juped wrote:
| "Skills" are subjective, but significant "experience" would
| involve traumatic foot-shooting turning him off most of the
| things advocated in this post.
| Lockal wrote:
| > typedef ptrdiff_t size;
|
| This reminds me of "#define max ..." in Windows.h. Not as bad,
| but if you autoreplace `sizeof(ptrdiff_t)` with `sizeof(size)`,
| good luck, because it will yield the size of a variable named
| `size` instead, if one exists in the scope.
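The shadowing hazard described above can be sketched as follows. The function names are hypothetical; only the `typedef ptrdiff_t size;` line is taken from the article.

```c
#include <stddef.h>

typedef ptrdiff_t size;

/* At this point `size` names the type, so this is sizeof(ptrdiff_t). */
size_t sizeof_the_type(void)
{
    return sizeof(size);
}

/* A local variable named `size` hides the typedef inside this function,
 * so the same expression now measures the variable (a char). */
size_t sizeof_the_variable(void)
{
    char size = 0;
    (void)size;
    return sizeof(size);
}
```

Both functions contain the identical expression `sizeof(size)`, yet they can return different values, which is the trap an automated search-and-replace would walk into.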
| bandrami wrote:
| Yeah that's why the uppercase/lowercase hardline is a good one.
| Even Lisp, where free variable capture is a legitimate design
| pattern, has troubles with this.
| bArray wrote:
| > typedef float f32;
|
| > typedef double f64;
|
| Assuming float is 32 bits and double is 64 bits sounds like a
| foot-gun. OpenCV defines a float16_t [0], CUDA implements half-
| precision floats [1], micro-controllers implement whatever they
| want.
|
| C++23 introduces fixed width floating-point types [2], but I'm not
| aware of any way to enforce this in C. What I would suggest is to
| have a macro to check at compile time that data is not lost.
|
| Generally I agree with others, it might be better to leave some
| of these things as default for readability, even if it is not
| concise.
|
| [0]
| https://docs.opencv.org/4.x/df/dc9/classcv_1_1float16__t.htm...
|
| [1] https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH...
|
| [2] https://en.cppreference.com/w/cpp/types/floating-point
| CodeArtisan wrote:
| gcc has _Float<size> types:
|
|     typedef _Float32 f32;
|     typedef _Float64 f64;
|
| https://gcc.gnu.org/onlinedocs/gcc/Floating-Types.html
| sigsev_251 wrote:
| It's not just gcc, they are in the C23 standard along with
| the _Decimal<size> types.
| hyc_symas wrote:
| typedef all structs - yes, helps with conciseness. Use typedefs
| liberally, I say. But only typedef the things themselves, not
| pointers to the things. You can always use (type *) when you need
| a pointer. In particular, for function pointers, typedef the
| function, not the function pointer. Then you can use the function
| typedef for function declarations too, which gives you parameter
| type checking without needing to fix declarations everywhere if
| you change a function signature. I see most C codebases get this
| one wrong, typedef'ing the function pointer and still needing to
| manually write out all function declarations for that pointer
| definition.
|
| I'm not sold on the structs as return types thing. I prefer just
| a numeric error code as a return value, and out parameters for
| any other returns.
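A small sketch of the "typedef the function, not the function pointer" advice from the comment above. The names `binop`, `add`, `mul`, and `apply` are invented for illustration.

```c
/* Typedef the function type itself, not a pointer to it. */
typedef int binop(int a, int b);

/* Declarations reuse the typedef. If binop's signature changes, any
 * definition that no longer matches becomes a compile error. */
binop add;
binop mul;

/* Definitions still have to spell the signature out. */
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Pointers are written with an explicit star, as with any other type. */
int apply(binop *op, int a, int b)
{
    return op(a, b);
}
```

With a pointer typedef instead, the two `binop` declarations above could not be written, and each prototype would have to be kept in sync by hand.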
| jpcfl wrote:
| I prefer to use typedef's for opaque structs to emulate classes
| with all private fields, and use 'struct' for plain ol' data
| structures. Classes should only be accessed via functions,
| while structs can be accessed directly.
|
| I think this is more-or-less a C/POSIX standard convention.
| E.g., `pthread_t` vs. `struct stat`.
| yvdriess wrote:
| It's definitely part of the linux kernel coding style:
| https://www.kernel.org/doc/html/v4.10/process/coding-style.h...
| Warwolt wrote:
| That's all fine, but you cannot have nicely behaved stack
| allocated structs and use the data hiding method outlined
| in that blog post, which I think is a pretty big caveat
| lelanthran wrote:
| > I prefer to use typedef's for opaque structs to emulate
| classes with all private fields, and use 'struct' for plain
| ol' data structures. Classes should only be accessed via
| functions, while structs can be accessed directly.
|
| Totally agree, I even wrote this as a blog post:
| https://www.lelanthran.com/chap9/content.html
| PH95VuimJjqBqy wrote:
| > But only typedef the things themselves, not pointers to the
| things.
|
| I agree with this. One of the things I dislike about SDL_net,
| etc, is they do exactly what you're describing. It's a pointer
| but they typedef it as if it's a value type.
|
| I understand the intent but imo that's very icky.
| nspattak wrote:
| i love this blog, i hold Chris Wellons in very high esteem BUT
| i utterly disapprove of using macros so much, especially to
| wrap C standard types, functions, etc.
| jstimpfle wrote:
| I'm currently on a C++ (mostly C with C++ compiler) trip. It
| does make some things easier and some things harder. It makes
| it easier to work with C++ developers :-). I sometimes use the
| more involved C++ features but often regret it after because of
| complications.
|
| But one thing that makes it worth it is the removal of the
| struct tag space. I have a strong dislike for the struct tag
| boilerplate in C, but the alternative -- typedef boilerplate --
| in C is unbearable to the point that I have a macro to define
| structs in C that does this automatically.
|
|     #define STRUCT(name) typedef struct name name; struct name
|     STRUCT(Foo) { int x; int y; };
|
| But macros often come with disadvantages. In this case it's
| that many IDEs have trouble finding the struct definitions from
| a usage site.
| comex wrote:
| > #define sizeof(x) (size)sizeof(x)
|
| I'm guessing this is lacking an outer pair of parentheses (i.e.
| it's not `((size)sizeof(x))`) on the grounds that they're
| unnecessary. In terms of operator precedence, casting binds
| tightly, so if you write e.g. `sizeof(x) * 3`, it expands to
| `(size)sizeof(x) * 3`, which is equivalent to `((size)sizeof(x))
| * 3`: the cast happens before the multiplication. Indeed, casting
| binds more tightly than anything that could appear on the right
| of sizeof(x) - with one exception which is completely trivial.
|
| But just for fun, I'll point out the exception. It's this:
| (size)sizeof(x)[y]
|
| Indexing binds more tightly than casting, so the indexing happens
| before the cast. In other words, it's equivalent to
| `(size)(sizeof(x)[y])`, not `((size)sizeof(x))[y]`.
|
| But you would never see that in a real program, since the size of
| something is not a pointer or array that can be indexed. Except
| that technically, C allows you to write integer[pointer], with
| the same meaning as pointer[integer]. Not that anyone ever writes
| code like that intentionally. But you could. And if you do, it
| will compile and do the wrong thing, thanks to the macro lacking
| the extra parentheses.
|
| ...On a more substantive note, I quite disagree with the claim
| that signed sizes are better. If you click through to the
| previous arena allocator post, the author says that unsigned
| sizes are a "source of defects" and in particular the code he
| presents would have a defect if you changed the signed types to
| unsigned. Which is true - but the code as presented _also_ has a
| bug! Namely, it will corrupt memory if `count` is negative. You
| could argue that the code is correct as long as the arguments are
| valid, but it's very easy for overflow elsewhere in the code to
| make something accidentally go negative, so it's better for an
| allocator not to exacerbate the issue.
|
| With unsigned integers, a negative count is not even
| representable, and a similar overflow elsewhere in the program
| would instead give you an extremely high positive count, which
| the code already checks for.
|
| Personally I prefer to use unsigned integers but do as much as
| possible with bounds-checked wrappers that abort on overflow.
| Rarely does the performance difference actually matter.
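One possible shape for the "bounds-checked wrappers that abort on overflow" mentioned above, as a sketch only. `checked_mul` and `alloc_array` are hypothetical names, not functions from the article or from any library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: multiply two unsigned sizes, aborting instead of
 * silently wrapping when the product does not fit in size_t. */
size_t checked_mul(size_t a, size_t b)
{
    if (b != 0 && a > SIZE_MAX / b) {
        abort();
    }
    return a * b;
}

/* Hypothetical allocator front end built on the checked helper. A huge
 * count (for example, a negative value converted to unsigned) either
 * trips the check or makes malloc fail, rather than under-allocating. */
void *alloc_array(size_t count, size_t elem_size)
{
    return malloc(checked_mul(count, elem_size));
}
```

The division-based test is portable C; GCC and Clang also offer overflow builtins that could replace it, at the cost of relying on an extension.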
| bjourne wrote:
| That's a good catch. The moral of the story is that unless your
| macro definition expands to a single token (e.g. #define X 123)
| you should always, always, always surround it with parentheses.
| Because C's precedence rules are damn complicated.
| mananaysiempre wrote:
| > Because C's precedence rules are damn complicated.
|
| This particular part is not actually complicated: the postfix
| operators bind the most tightly, then the prefix ones, then
| the infix ones. (The last part is quite messy, though.)
|
| So (int)x[y] parses the same way as, for example, *p++, which
| should be familiar to a C programmer.
| jstimpfle wrote:
| > (size)(sizeof(x)[y])
|
| Actually, and this is probably surprising to many, this is
| equivalent to (size)(sizeof ((x)[y]))
|
| sizeof is not a function but a unary operator, and indexing (as
| well as function calling...) binds stronger than the sizeof
| operator. It is not a function, not even syntactically! Hence
| why I strongly prefer putting a space after the sizeof keyword,
| and to not use parens for the operand unless needed.
|
| https://en.cppreference.com/w/c/language/operator_precedence
|
| So the "correct" way to define the macro is
|
| #define sizeof(x) ((size)(sizeof (x)))
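The parsing rule described above can be checked directly. This sketch uses plain `sizeof` without the article's macro, and the function names are made up for illustration.

```c
#include <stddef.h>

/* sizeof is a unary operator, so the postfix [0] binds to (arr) first:
 * sizeof(arr)[0] means sizeof((arr)[0]), the size of one element. */
size_t element_size_by_accident(void)
{
    int arr[10];
    return sizeof(arr)[0];
}

/* Parenthesizing the whole sizeof expression keeps the array size. */
size_t whole_array_size(void)
{
    int arr[10];
    return (sizeof(arr));
}
```

Nothing is read from `arr` in either function, since `sizeof` does not evaluate its operand here; only the types matter.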
| comex wrote:
| (self-reply) One more thing.
|
| > I could use _Bool, but I'd rather stick to a natural word
| size and stay away from its weird semantics.
|
| This is even more subjective, but personally I like _Bool's
| semantics. They mean that if an expression works in an `if`
| statement:
|
|     if (flags & FLAG_ALLOCATED)
|
| then you can extract that same expression into a boolean
| variable:
|
|     _Bool need_free = flags & FLAG_ALLOCATED;
|
| The issue is that `flags & FLAG_ALLOCATED` doesn't equal '0 if
| unset, 1 if set', but '0 if unset, some arbitrary nonzero value
| if set'. (Specifically it equals FLAG_ALLOCATED if set, which
| might be 1 by coincidence, but usually isn't.) This kind of
| punning is fine in an `if` statement, since any nonzero value
| will make the check pass. And it's fine as written with
| `_Bool`, since any nonzero integer will be converted to 1 when
| the expression is implicitly converted to `_Bool`. But if you
| replace `_Bool` with `int`, then this neither-0-nor-1 value
| will just stick around in the variable. Which can cause strange
| consequences. It means that
|
|     if (need_free)
|
| will pass, but
|
|     if (need_free == true)
|
| will fail. And if you have another pseudo-bool, then
|
|     if (need_free == some_other_bool)
|
| might fail even if both variables are considered 'true' (i.e.
| nonzero), if they happen to have different values.
|
| _Bool solves this problem. Admittedly, the implicitness has
| downsides. If you're refactoring the code and you decide you
| don't really need a separate variable, you might try to replace
| all uses of `need_free` with its definition, not realizing that
| the implicit conversion to _Bool was doing useful work. So you
| might end up with incorrect code like:
|
|     if ((flags & FLAG_ALLOCATED) == true)
|
| Also, if you are reading a struct from disk or otherwise
| stuffing it with arbitrary bytes, and the struct has a _Bool,
| then you risk undefined behavior if the corresponding byte
| becomes something other than 0 or 1 - because the compiler
| assumes that the implicit conversion to 0 or 1 has been done
| already.
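A short sketch of the normalization described in the comment above. The flag value `0x4u` is an assumption chosen so that it is not 1, and the function names are hypothetical.

```c
#include <stdbool.h>

#define FLAG_ALLOCATED 0x4u   /* assumed value, deliberately not 1 */

/* Converting to _Bool (spelled bool via <stdbool.h>) yields 1 for any
 * nonzero value, so the result can be compared against true safely. */
bool need_free_as_bool(unsigned flags)
{
    return flags & FLAG_ALLOCATED;
}

/* An int keeps the raw masked value, here 4 when the flag is set. */
int need_free_as_int(unsigned flags)
{
    return (int)(flags & FLAG_ALLOCATED);
}
```

With the flag set, the `bool` version compares equal to `true`, while the `int` version is nonzero but not equal to `true`, which is exactly the mismatch the comment warns about.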
| billforsternz wrote:
| This is all very good and very, ahem, true. But (and it's a
| big butt);
|
| if (need_free == true)
|
| Is such a horrible code smell to me. You have a perfectly
| good boolean. Why compare it to a second boolean to get a
| third boolean?
|
| if (need_free)
|
| or
|
| if (!need_free)
|
| for the opposite case is so much better.
|
| I will admit that in my world this leaves
|
| if (need_free == some_other_bool)
|
| as something I don't have a particularly comfortable way of
| doing safely.
| hun3 wrote:
| Better example:
|
|     #define FLAG_63 (1ULL << 63)
|     long long flags = FLAG_63;
|
| In this case,
|
|     if (flags & FLAG_63) pass();
|
| will pass, but
|
|     typedef int BOOL;
|     BOOL set = flags & FLAG_63;
|     if (set) pass();
|
| won't pass, due to truncation.
|
| Question: Would you argue that a datatype that holds the
| _smallest_ (1-bit) datum should be as wide as the largest
| integer type _just_ to handle such cases?
|
| If so, that would be highly inefficient for storage
| purposes. Note that Win32 has 32-bit BOOL type, but
| internally NT uses 8-bit BOOLEAN type to store bools in
| structures.
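The truncation case above can be sketched as follows. One deliberate change from the comment: the narrow type here is `uint32_t` rather than `int`, so that the narrowing conversion is fully defined by the standard instead of implementation-defined. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_63 (1ULL << 63)

/* Narrowing to 32 bits keeps only the low bits, so bit 63 is lost and
 * the result is 0 even when the flag is set. */
uint32_t flag_as_u32(unsigned long long flags)
{
    return (uint32_t)(flags & FLAG_63);
}

/* Conversion to bool tests the full-width value against zero first,
 * so the flag survives regardless of which bit it occupies. */
bool flag_as_bool(unsigned long long flags)
{
    return flags & FLAG_63;
}
```

This is the one case where `_Bool` is not just tidier than an integer typedef but gives a different answer.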
| MilanTodorovic wrote:
| Couldn't we do it like (!need_free == !some_other_bool)?
| mariusor wrote:
| You can go the php way and do if (!!need_free)
| bmacho wrote:
| > if (need_free == true)
|
| > Is such a horrible code smell to me. You have a perfectly good
| boolean. Why compare it to a second boolean to get a third
| boolean?
|
| > if (need_free)
|
| You are probably interested in whether the `need_free` flag is
| set to true, not in whether `need_free`. It is true that `if
| (need_free)` has the same behaviour, but it is a few steps
| farther from what you are interested in.
| gjm11 wrote:
| This feels to me like you're introducing the same
| unnecessary extra layer into your text as in the original
| code. I mean, why not
|
| "You are probably interested in whether it's true that
| the 'need_free' flag is set to true"
|
| leading to
|
| > if ((need_free == true) == true)
|
| ? Answer: because that extra layer of indirection adds
| nothing, and just gives you a bit of extra cognitive load
| and an extra opportunity to make mistakes. I think the
| same is true about going from "need_free" to "need_free
| is set to true".
|
| (This becomes less clear if you have variable names like
| 'need_free_flag'. I say: so don't do that then! It's
| almost always appropriate to give boolean values and
| functions that return boolean values names that reflect
| _what it means when the value is true_.)
| [deleted]
| cozzyd wrote:
| I always just use !! for this case.
| fullstop wrote:
| Yes, I do this as well. It looks a little funny if you're
| not used to it, I suppose.
| moodguy wrote:
| For my hobby projects i do the following.
|
| I adopted the style of writing all macros in lower case with the
| prefix "macro_", so i can grep through all macros. So macro_
| becomes like a keyword. Same with enum.
|
| I use almost the same naming scheme for i32, f32, etc., but i
| typedef for example size_t const to usz, and size_t to usz_ (or
| use macro mut(x) x ## _). So all shorter type names are const by
| default. I use a single header of 35 sloc to do that.
|
| I see that this way of coding can be confusing for other people,
| so i avoid it when writing code that other people have to work
| with. But for personal code it's really enjoyable for me.
| thesnide wrote:
| comments here are a canonical example of bikeshedding...
| loondri wrote:
| I get that everyone has their own coding style, but ditching
| established conventions in C for personal aesthetic seems a bit
| much. Like, using u8 or i32 instead of the standard uint8_t or
| int32_t might save a few keystrokes, but it could confuse anyone
| else looking at the code. And the custom string type over null-
| terminated strings? C's built around those, and deviating from
| that just feels like making life harder for anyone else who might
| need to work with your code.
|
| And manually writing out Win32 API prototypes instead of
| including windows.h might shave off some compile time, but it's
| like ignoring a well-maintained highway to trek through the
| woods. Just seems like a lot of these changes are about personal
| preference rather than sticking to what makes C code easy for
| everyone to work with.
| JonChesterfield wrote:
| u16 etc show up a lot and are unlikely to confuse programmers.
|
| Where it does go to pieces is when two different programs both
| define u16, use them in header files, and then a third program
| tries to include both those header files at the same time. The
| big advantage of <stdint.h> is avoiding that failure mode.
|
| The namespaced library type equivalent is something like
| libname_u32, at which point it's tempting to write uint32_t
| instead of the libname:: or libname_ prefix.
| SkeuomorphicBee wrote:
| > Like, using u8 or i32 instead of the standard uint8_t or
| int32_t might save a few keystrokes [...]
|
| It is not about saving keystrokes, it is about reducing sensory
| load when reading it.
|
| Sorry, I know it may sound like I'm splitting hairs, but every
| single time when the argument of verbosity vs conciseness in
| programming languages comes around, this "keystrokes" argument
| is thrown and it is extremely flawed. The core belief that
| conciseness is only better for faster typing but that verbosity
| is somehow always better than conciseness for reading is just
| plain wrong and we should stop using it. And yes, verbosity has
| some advantages for reading comprehension, but so does
| conciseness, no side is a clear winner, it is all about the
| different compromises.
| __loam wrote:
| I think you completely missed the point of his comment. C
| isn't a new programming language. There are well worn
| conventions and making custom types because you don't like
| them is like forking your own custom dialect nobody can
| understand for very little benefit.
| adriangrigore wrote:
| Agree on no _t suffixes!
| Luker88 wrote:
| More than a few of these look like what you will find in Rust by
| default
|
| Nice to see some convergent evolution among C programmers, too
| zeroCalories wrote:
| I think C really needs an update to the standard library that
| includes these shorter types. Seems like a fine list though.
|
| Some of my own style changes this year:
|
| I try really hard to write functional code. Mainly try to keep
| functions pure, and write declarative code. I find that this
| makes the code easier to write (not necessarily read), and I'm
| less scared of bugs.
|
| I also avoid malloc unless I absolutely need it. You can usually
| preallocate space on the stack or use a fixed length buffer,
| which pretty much avoids all fears of memory leaks or use after
| free type bugs. You will sometimes waste memory by allocating
| more than you need, but it's a lot more predictable.
| juunpp wrote:
| How do you not use const for arguments of data types that are too
| costly to pass by value?
| progfix wrote:
| Usually if it is too costly, then you pass it via a pointer.
| da39a3ee wrote:
| > #define countof(a) (sizeof(a) / sizeof(*(a)))
|
| > #define lengthof(s) (countof(s) - 1)
|
| It makes no sense to use the word "length" to mean one less than
| the number of items. You could call it maxindexof perhaps.
|
| There may be good arguments for zero based indexing, but we have
| to also accept that there are downsides. One is that your code
| has to feature an artificial quantity obtained by subtracting one
| from a meaningful quantity.
| abareplace wrote:
| This is length of a null-terminated string, e.g.
| lengthof("abc").
| da39a3ee wrote:
| OK, thanks. I don't use C much and definitely forgot about
| that. But my point still stands, doesn't it, in that the
| author is using these preprocessor macros for general arrays,
| which don't have a special terminator sentinel.
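To make the distinction in this subthread concrete, here is a sketch using the two macros as quoted above (without the article's signed `sizeof` override, so the results are plain `size_t`). The array `primes` and the two functions are made up for illustration.

```c
#include <stddef.h>

#define countof(a)  (sizeof(a) / sizeof(*(a)))
#define lengthof(s) (countof(s) - 1)

/* An ordinary array has no terminator, so countof is the right macro. */
const int primes[] = {2, 3, 5, 7};

size_t prime_count(void)
{
    return countof(primes);
}

/* A string literal is a char array that includes the trailing '\0',
 * so lengthof subtracts it to get the string length. */
size_t abc_length(void)
{
    return lengthof("abc");
}
```

Applying `lengthof` to `primes` would compile and quietly return one less than the element count, which is the misuse the parent comment is worried about.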
| jeanlucas wrote:
| I wish I could work with C again
| Quekid5 wrote:
| I wish nobody would have to.
|
| (Haha, only serious.)
| [deleted]
| sys_64738 wrote:
| C is wonderful so if you can find a project at work with a lot
| of C code then it'll remain forever. All these other fad
| languages will die before C ever does.
| josephg wrote:
| I'm hopeful for Zig as a modern replacement for C. It feels
| modern like rust, but still with C's lightness.
___________________________________________________________________
(page generated 2023-10-09 23:02 UTC)