[HN Gopher] The C Programming Language: Myths and Reality
       ___________________________________________________________________
        
       The C Programming Language: Myths and Reality
        
       Author : lelanthran
       Score  : 194 points
       Date   : 2023-07-17 13:05 UTC (9 hours ago)
        
 (HTM) web link (www.lelanthran.com)
 (TXT) w3m dump (www.lelanthran.com)
        
       | mcnichol wrote:
       | I'm appreciative of the person who wrote the article.
       | 
       | These are my favorite types of debates and while I disagree with
       | it for much better articulated reasons already mentioned,
       | primarily the exploitative nature of header files used as
       | security through obscurity, I think it stirs up a lot of debate
       | and keeps the spaghetti meatball of knowledge around
       | patterns/anti-patterns, "best-practices", etc. moving forward in
       | a much more passionate way.
       | 
       | I think this is the truest sense of the term "hacker" which lives
       | up to the title of the site pushing us into these debates.
       | Putting stuff together that doesn't always work as intended or
       | expected but arguing for and against it.
       | 
       | A long-winded thank you to everyone, OP and all the threads
       | responding.
        
       | lelanthran wrote:
       | The first in a series of blog posts I am putting together around
       | the C programming language (Not the book!)
        
       | sfpotter wrote:
       | One of the advantages C has over C++ here is that the addition of
       | templates in C++ makes this kind of encapsulation more difficult.
       | See the proliferation of header-only libraries where nearly
       | everything is templatized, as well as the pimpl idiom. You still
       | get some encapsulation in the form of private, but with "C-style"
       | encapsulation, if the header file doesn't change and the source
       | files implementing what's declared in that header don't change,
       | then none of the files need to be recompiled when rebuilding.
       | This makes recompiling much faster.
       | 
       | Of course, you could do much the same thing if you didn't use
       | templates in C++, or if you were very disciplined and limited
       | with your use of templates, but this seems to go against the
       | grain of how I've seen C++ used.
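
The "C-style" encapsulation described above can be sketched in a single file. This is only an illustration: the `counter` names and the split into `counter.h` / `counter.c` are hypothetical and do not come from the article. Callers see nothing but an incomplete type and a few functions, so the struct layout can change without the header changing.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h (hypothetical): all that callers ever see ---- */
struct counter;                          /* incomplete ("opaque") type */
struct counter *counter_new(int start);
void counter_increment(struct counter *c);
int counter_value(const struct counter *c);
void counter_free(struct counter *c);

/* ---- counter.c (hypothetical): free to change without touching the header ---- */
struct counter {
    int value;
    int step;   /* adding or removing fields here does not affect callers */
};

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->value = start;
        c->step = 1;
    }
    return c;
}

void counter_increment(struct counter *c)
{
    assert(c != NULL);
    c->value += c->step;
}

int counter_value(const struct counter *c)
{
    assert(c != NULL);
    return c->value;
}

void counter_free(struct counter *c)
{
    free(c);
}
```

Code that includes only the header part cannot name `value` or `step`, and it only needs recompiling when the declarations in the header change.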
        
         | pjmlp wrote:
         | No longer the case since C++20 modules.
        
           | sfpotter wrote:
           | Thanks for the downvote...? Not sure what the point of that
           | was.
           | 
           | I haven't used C++ much since modules were introduced. From
           | doing some quick reading, it seems unclear whether they
           | significantly improve compilation times in practice. I
           | wasn't able to find anything which addressed how they relate
           | to the ABI... I suspect this is a pretty complex topic. If
           | you have any good references related to either of these
           | points, I'd be very eager to read them.
           | 
           | Either way, "C++ with modules" appears to me to be unlikely
           | to clear the bar set by "C with opaque types" (which, for all
           | intents and purposes, can be done in C++) in terms of 1) ABI
           | stability and 2) encapsulation. I consider point (1) to be
           | related to point (2), since details which are leaked into the
           | ABI are not encapsulated.
        
             | pjmlp wrote:
             | First of all, who told you I downvoted you?!?
             | 
             | Importing the whole standard library as defined by C++23
             | _import std;_ is quicker than doing a plain _#include
             | <iostream>_, as shared per Microsoft employees in some talk
             | they have done regarding upcoming C++23 support.
             | 
             | Unfortunately I can't remember which one it was, but someone
             | else can gladly share the link.
             | 
             | Second, template details are marked as private in the
             | module metadata, so they aren't directly exposed to consumers.
             | 
             | As for the ABI specifically, that is compiler dependent
             | anyway.
        
         | cozzyd wrote:
         | It is different, but provides much better encapsulation.
         | Keeping ABI compatibility is much easier this way.
        
       | throaway23423 wrote:
       | In practice, please don't do this... it breaks inlining and adds
       | unneeded allocations.
        
       | ryu2k2 wrote:
       | >You don't have inheritance [..]
       | 
       | Modern PL features are more or less wrappers around comparatively
       | complex C code. Object inheritance is actually one that isn't too
       | difficult or complex to implement:
       | https://www.youtube.com/watch?v=443UNeGrFoM&t=4275s
        
         | jll29 wrote:
         | > Modern PL features are more or less wrappers around
         | comparatively complex C code. Object inheritance is actually
         | one that isn't too difficult or complex to implement:
         | https://www.youtube.com/watch?v=443UNeGrFoM&t=4275s
         | 
         | True, and if you implement it as a preprocessor, that's exactly
         | how C++ started in 1979 (Stroustrup's "C with classes" Cfront
         | pre-processor:
         | https://en.cppreference.com/w/cpp/language/history).
        
       | commandlinefan wrote:
       | When I first encountered Object Oriented Programming around '97 or
       | so, all of the literature focused on the three pillars of OO:
       | encapsulation, inheritance and polymorphism. I was struck at the
       | time at how OO mostly just formalized and gave specific names to
       | what good programmers were already doing informally.
        
         | wvenable wrote:
         | A lot of comments/articles complaining about Object Oriented
         | Programming -- especially the style implemented by C++ and
         | similar languages -- start with the premise that OOP is some
         | academic prescription declared from on high, like Moses and the
         | 10 Commandments.
         | 
         | But the reality is that the best practices of imperative
         | programming were already much like object oriented programming
         | and that OOP is a formalization of those practices.
        
           | User23 wrote:
           | Strongly agree. Good C developers pretty much instinctively
           | do structured programming[1]. While it's superficially dated,
           | the core concepts are still all very much applicable to the
           | working programmer today and even largely paradigm
           | independent.
           | 
           | [1] https://dl.acm.org/doi/book/10.5555/1243380 (lousy
           | interface, but PDF download available)
        
           | pjmlp wrote:
           | Indeed, I see many of the concepts as extensible modules, as
           | one would use from Modula-2 or Object Pascal.
           | 
           | That is why Oberon takes the spartan approach of only having
           | extensible types, everything else is just like in Modula-2.
           | Later descendants adopted a more mainstream approach.
           | 
           | Likewise, how OOP is done in Ada or Modula-3 isn't quite like
           | the mainstream approach.
           | 
           | Or when modules can be manipulated like variables, and given
           | type signatures, we get Standard ML functors, with
           | overlapping capabilities to OOP.
        
         | rightbyte wrote:
         | The words are also confusing and use too much Latin.
         | 
         | Encapsulation - Hidden
         | 
         | Inheritance - Same fields in beginning
         | 
         | Polymorphism - Tagged struct
        
           | slavapestov wrote:
           | "Hidden" from whom and how? "Same fields in beginning"
           | doesn't characterize multiple inheritance. "Tagged structs"
           | are just one implementation strategy for one kind of
           | polymorphism, and "tagged" and "struct" are themselves
           | jargon. Using and knowing the precise terminology is
           | important when communicating technical concepts.
        
         | jcelerier wrote:
         | > I was struck at the time at how OO mostly just formalized and
         | gave specific names to what good programmers were already doing
         | informally.
         | 
         | I mean, that's the point of everything, isn't it? giving names
         | and defining good practices so that everyone can benefit from
         | them? because even today you can find a LOT of codebases which
         | are an imperative mess
        
       | gjulianm wrote:
       | I wouldn't call this equivalent to the private keyword:
       | 
       | * It does not work on a field, but on the whole struct. Either
       | all fields are public or all are private. The latter case forces
       | you to write getters/setters for properties you want users to
       | access, and that in C can be even more cumbersome as you need to
       | write the definition in the .h and the implementation in the .c.
       | 
       | * It breaks several actions without an explicit and specific
       | error message. As mentioned in the article, malloc isn't
       | possible, but neither are copies by value, sizeof() breaks... And
       | those will break with an "incomplete type" error message, not a
       | "this type is intentionally made private", which can add
       | confusion in some situations.
       | 
       | * Completely incompatible with inlining code. Considering a lot
       | of people still use C precisely for its performance, I think this
       | can be a drawback in a lot of usecases.
       | 
       | I honestly think that hiding struct declarations should be done
       | sparingly, and preferably limiting it to cases where it's
       | actually necessary (for example, a library that doesn't expose
       | internal struct fields so the same executable works with
       | different versions of the dynamic library; or proprietary
       | libraries that want to expose as little as possible). In the end
       | it's still easy to bypass, and the distinction between header and
       | code files already provides an indication of which functions you
       | should use and which ones you shouldn't.
        
         | falcrist wrote:
         | I guess I'm reading the article a different way, because what
         | I'm getting is that the author is suggesting the C analogue of
         | a class is an entire header/source "module".
         | 
         | That would make more sense, since you can use the header to
         | craft an interface in which some components are public and some
         | are private.
        
           | c-linkage wrote:
           | Brings back memories of CS101 back in 1992.
           | 
           | They called this "modular programming" where "classes" were
           | represented by opaque pointers and actions could only be
           | performed on those pointers using the functions defined in
           | the module.
        
             | falcrist wrote:
             | In the production C code I've written that makes use of
             | modules with interfaces that define public and private
             | variables and functions, I tend to avoid using pointers
             | where reasonable. I'd prefer to either give access to the
             | variable or provide getter and setter functions.
             | 
             | What's the benefit of "opaque" pointers?
             | 
             | For background, my C code is almost entirely on
             | microcontrollers. So I'm looking at it from that point of
             | view. If you're talking about event-based applications
             | running inside a full operating system, I've always stepped
             | up to something like C#, so I don't have much experience
             | with function pointers for that kind of work.
        
               | cozzyd wrote:
               | yes, there is no benefit on microcontrollers where
               | everything is usually compiled together as one image (ok,
               | I imagine you COULD design a microcontroller firmware
               | with dynamically loadable sections... but let's not think
               | about that).
               | 
               | But for dynamic linking, this is how you avoid breaking
               | ABI while maintaining forward flexibility.
        
               | c-linkage wrote:
               | For those accustomed to always using open source, the
               | idea of hiding the implementation must seem odd.
               | 
               | But consider the position of a developer who implements a
               | shared library that is distributed in binary form only.
               | In this case, the benefit of opaque pointers _for the
               | development of a library_ is that the implementation
               | remains private at the source level. One could, of
               | course, reverse-engineer the binary but few people would
               | do it.
               | 
               | If you define your structures in a public header -- and
               | this includes C++ classes and templates with private
               | members -- one can easily see the implementation and,
               | with a few casts, start munging the guts of your objects
               | and baking in a hard requirement on a specific layout and
               | / or version of your library.
        
               | falcrist wrote:
               | Ok I see what you're saying. Closed source libraries can
               | make use of this idea.
        
           | ricardo81 wrote:
           | That's how I read it.
           | 
           | Clearly you can abstract away the parts of data that the API
           | should not see.
        
           | gjulianm wrote:
           | Yes, that's usually how it goes, each header/source is like a
           | "class". And usually, what you have is a "main" struct that
           | represents the "class" itself. In this case that main struct
           | can't have both private and public fields.
        
             | lelanthran wrote:
             | > Yes, that's usually how it goes, each header/source is
             | like a "class". And usually, what you have is a "main"
             | struct that represents the "class" itself. In this case
             | that main struct can't have both private and public fields.
             | 
             | You can, in a defined way (i.e., no invoking of UB). I just
             | didn't put that in.
        
               | gjulianm wrote:
               | Out of curiosity, how would you do it? The ways I've seen
               | require using another "public" struct and either casting
               | public to private or using a nested pointer, each with
               | their set of problems. In either case, it's still a bit
               | hacky and still each struct is all-or-nothing public or
               | private.
        
               | lelanthran wrote:
               | Pretty much. All the casting and hackiness isn't visible
               | to the caller, and the implementation still maintains its
               | ABI when the private stuff changes.
        
             | falcrist wrote:
             | > And usually, what you have is a "main" struct that
             | represents the "class" itself.
             | 
             | Yes that tracks with the work I've done. It could even be
             | so simple that only one value needs to be exposed through a
             | getter (along with a couple "methods").
             | 
             | An ADC in an embedded system could operate like that.
        
         | throwawayiddqd2 wrote:
         | With gcc's -fplan9-extensions flag:
         | 
         |     struct Thing { int i; };
         | 
         | In an implementation C file:
         | 
         |     struct _Thing {
         |         struct Thing;
         |         int b;
         |     };
         | 
         |     struct Thing *new_thing()
         |     {
         |         struct _Thing *thing = malloc..
         |         thing->i = 0;
         |         thing->b = 1;
         |         return thing;
         |     }
        
         | Joker_vD wrote:
         | > It does not work on a field, but on the whole struct.
         | 
         |     struct PrivateFieldsOfMyShinyClass;
         | 
         |     struct MyShinyClass {
         |         int somePublicData;
         |         double morePublicData;
         |         struct PrivateFieldsOfMyShinyClass *p;
         |     };
         | 
         | > malloc isn't possible, but neither are copies by value,
         | sizeof() breaks
         | 
         | Those are implementation details which are deliberately being
         | hidden.
         | 
         | > Completely incompatible with inlining code.
         | 
         | They are as incompatible with code inlining as public/private
         | modifiers in C++ are. That is, LTO is your best friend here.
         | Also, have you ever tried to maintain binary compatibility with
         | several versions of a third-party C++ library that keeps
         | adding/removing private fields to/from its classes?
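
A fuller sketch of the public-fields-plus-private-pointer layout from the snippet above, with hypothetical `widget` names. The private part is a separate allocation, so every object costs two allocations and two frees.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- public header (hypothetical names) ---- */
struct widget_private;               /* never defined for callers */

struct widget {
    int width;                       /* public: callers may read and write */
    int height;
    struct widget_private *p;        /* private: opaque to callers */
};

struct widget *widget_new(int width, int height);
void widget_resize(struct widget *w, int width, int height);
int widget_resize_count(const struct widget *w);
void widget_free(struct widget *w);

/* ---- implementation file ---- */
struct widget_private {
    int resize_count;
};

struct widget *widget_new(int width, int height)
{
    struct widget *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->p = malloc(sizeof *w->p);     /* second allocation for private state */
    if (w->p == NULL) {
        free(w);
        return NULL;
    }
    w->width = width;
    w->height = height;
    w->p->resize_count = 0;
    return w;
}

void widget_resize(struct widget *w, int width, int height)
{
    assert(w != NULL);
    w->width = width;
    w->height = height;
    w->p->resize_count++;
}

int widget_resize_count(const struct widget *w)
{
    assert(w != NULL);
    return w->p->resize_count;
}

void widget_free(struct widget *w)
{
    if (w != NULL) {
        free(w->p);
        free(w);
    }
}
```

Callers can touch `w->width` directly, while `resize_count` is reachable only through `widget_resize_count()`.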
        
           | gjulianm wrote:
           | The code sample still isn't equivalent. Now you have yet
           | another pointer, which implies another allocation, another
           | source of possible memory leaks and mistakes, and a separate
           | memory space that will hurt the cache.
           | 
           | > Those are implementation details which are deliberately
           | being hidden.
           | 
           | I know they're being hidden deliberately, but in C++
           | "private" doesn't break malloc (new), copies by value or
           | sizeof. Or stack allocation, to add to the list.
           | 
           | > They are as incompatible with code inlining as
           | public/private modifiers in C++ are. That is, LTO is your
           | best friend here.
           | 
           | public/private aren't incompatible with inlining in C++. That
           | is, you can call class functions that access private members
           | and the compiler can inline those functions. Also, LTO is not
           | always enabled by default, and doesn't always inline the
           | things you want it to inline.
           | 
           | > Also, have you ever tried to maintain binary compatibility
           | with several versions of a third-party C++ library that keeps
           | adding/removing private fields to/from its classes?
           | 
           | I mentioned binary compatibility as one of the reasons one
           | might want to do this. However, if you have a third party
           | that doesn't care about API compatibility I doubt struct
           | fields are the only thing they're going to change constantly.
        
             | skribanto wrote:
             | Maybe have your public fields defined as a second struct,
             | and then you can cast the pointer to your struct to the
             | concrete struct that has all the public fields. This has
             | the restriction that all public fields must be at the
             | start, and you must make sure to maintain the same order
             | between the two structs.
             | 
             | At this point though, I think I honestly would prefer
             | setters/getters.
             | 
             |     struct MyClassPublic {
             |         int x;
             |         int y;
             |         ...
             |     };
             | 
             |     /* using it */
             |     MyClass *myclass = myclass_create();
             |     ((struct MyClassPublic *)myclass)->x = 5;
        
           | dfawcus wrote:
           |     struct public_stuff {
           |         ...
           |     };
           | 
           |     struct private_stuff {
           |         struct public_stuff public;
           |         ...
           |     };
           | 
           |     struct public_stuff *make_public(/* ... */) {
           |         struct private_stuff *prv = malloc(sizeof *prv);
           |         /* ... */
           |         return &prv->public;
           |     }
           | 
           | Then when passed in to functions, cast the passed in struct
           | pointer to the private one.
           | 
           | The public struct doesn't even have to be at the start if one
           | makes appropriate use of offsetof and ensures valid
           | alignment.
           | 
           | Nothing new under the sun...
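
The offsetof variant mentioned above might look like the following sketch. The `account` names are hypothetical, and the pointer arithmetic is the widely used "container of" idiom. It is only meaningful for pointers that were handed out by `account_new()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- public header (hypothetical names) ---- */
struct account_public {
    int id;
};

struct account_public *account_new(int id, long balance);
long account_balance(struct account_public *pub);
void account_free(struct account_public *pub);

/* ---- implementation file ---- */
struct account_private {
    long balance;                    /* private, deliberately placed first */
    struct account_public pub;       /* public part is NOT at offset 0 */
};

/* Step back from the embedded public member to the enclosing struct. */
static struct account_private *to_private(struct account_public *pub)
{
    return (struct account_private *)
        ((char *)pub - offsetof(struct account_private, pub));
}

struct account_public *account_new(int id, long balance)
{
    struct account_private *prv = malloc(sizeof *prv);
    if (prv == NULL)
        return NULL;
    prv->balance = balance;
    prv->pub.id = id;
    return &prv->pub;                /* callers only ever see the public part */
}

long account_balance(struct account_public *pub)
{
    assert(pub != NULL);
    return to_private(pub)->balance;
}

void account_free(struct account_public *pub)
{
    if (pub != NULL)
        free(to_private(pub));       /* free the enclosing allocation */
}
```

Because the public struct is complete in the header, nothing stops a caller from declaring one on the stack and passing it in, which would make `to_private()` compute a bogus pointer.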
        
           | jstimpfle wrote:
           | Hiding members in this way is only possible with pointer
           | indirection, which isn't satisfying.
           | 
           | However, having only a boolean private/public access state
           | isn't generally satisfying either. It often leads to
           | violation of the principle of separation of concerns when all
           | the functions (methods) acting on certain "private" fields
           | need to live in the same class.
           | 
           | In simple classes, like std::vector, it's possible to get
           | away with private. But in many cases that are more complex
           | than that, it seems to me that the best approach is still to
           | expose the data and to be just very clear about the exact
           | purpose of each member.
        
         | lelanthran wrote:
         | > I wouldn't call this equivalent to the private keyword:
         | 
         | I didn't mean to imply it is, I led with:
         | 
         | >> All too often someone, somewhere, on some forum ... will
         | lament the lack of encapsulation and isolation in the C
         | programming language. This happens with such regularity that I
         | now feel compelled to address the myth once and forever.
         | 
         | It's only about the myth that C doesn't have _any_ level of
         | encapsulation or isolation.
        
           | brabel wrote:
           | Nearly every rebuttal on the internet starts with
           | misinterpretation of what's being said. Good reading skills
           | are extremely rare, it seems.
           | 
           | To be clear: you tried to say "C offers
           | encapsulation/isolation". People read "This solution is
           | equivalent to 'private'", an almost completely unrelated
           | statement, and then respond to that.
           | 
           | That could typically be classified as a "Straw man
           | fallacy"[0], but I believe people who do this in many cases
           | simply do not have the necessary reading skills to understand
           | what proposition has been made, and therefore honestly
           | believe themselves to be reasoning correctly (i.e. without
           | fallacy).
           | 
           | Reading comprehension used to be a topic at school when I was
           | a child. I suppose that's no longer the case??
           | 
           | [0] https://en.wikipedia.org/wiki/Straw_man
        
             | gjulianm wrote:
             | As the other commenter said, and I want to reiterate, what
             | led me to start with the talk about the private keyword is
             | the big bold header that says "Myth: C has no equivalent to
             | "private"" and the first code snippet that shows how you
             | can have "private" fields in C++ and the following ones
             | that show an "equivalent" implementation in C. So I think
             | it's reasonable to infer that the author is talking about
             | encapsulation and "the private keyword" as somewhat
             | interchangeable (and I don't disagree, for this discussion
             | they're practically the same). Not only that, but the
             | points I made are about the implementation shown in the
             | article, which is independent of whether the talk is about
             | "private equivalency" or "lack of encapsulation": the gist
             | of it is "yes, you 'encapsulate' things but not in the same
             | way and it comes with disadvantages that aren't really
             | there in other languages".
             | 
             | With all that said, I don't think all that condescending
             | talk about the lack of reading comprehension or skills,
             | without actually going into the arguments themselves, is
             | really necessary or positive.
        
             | jasode wrote:
             | _> To be clear: you tried to say "C offers
             | encapsulation/isolation". People read "This solution is
             | equivalent to 'private'", an almost completely unrelated
             | statement, and then respond to that.
             | 
             | >That could typically be classified as a "Straw man
             | fallacy"[0], but I believe people who do this in many cases
             | simply do not have the necessary reading skills _
             | 
             | Fyi... the author's article that this thread is about has
             | in bold heading: _" Myth: C has no equivalent to
             | "private""_
             | 
             | So, a reasonable interpretation of the text following that
             | headline is how to use C Language constructs to dispel that
             | myth.
             | 
             | Doesn't seem like "straw man" applies here.
        
           | gjulianm wrote:
           | I mean, there's a big header that says "Myth: C has no
           | equivalent to "private"" and a code example about the private
           | keyword, so that's why I started saying that. Even then, my
           | points still apply: this isn't really equivalent to how
           | encapsulation works in other languages due to the lack of
           | granularity and the "extras" of all the usual language
           | behavior that stops being supported.
        
         | dmitrygr wrote:
         | >* Completely incompatible with inlining code
         | 
         | You were right 10 years ago. Today we have LTO
        
       | 0xfedbee wrote:
       | Great article lelanthran. It's always refreshing to see someone
       | going against the popular opinion here and piss everyone off.
       | Keep it up!
        
       | simias wrote:
       | I see where you're coming from but IMO this "Cheshire cat" idiom
       | to hide the implementation details is not exactly like private,
       | in fact it can do things that private can't do, and doesn't do
       | things private does.
       | 
       | The advantage of hiding your state behind an opaque struct with
       | builders and accessors is that you can change the size and layout
       | of said struct without it being a breaking API change. The code
       | remains binary compatible even, no need for a recompile if you're
       | shipping a shared lib. This is something just using private
       | members doesn't achieve since with private members the compiler
       | still knows and uses the layout of the struct, it just forbids
       | access to it.
       | 
       | That's why you can even find C++ libraries using this idiom even
       | though C++ obviously has `private`. It's about having a stable,
       | opaque API.
       | 
       | On the other hand because of this added indirection, there's
       | usually a greater performance hit to accessing these opaque
       | structs since code can't be inlined. With private since the
       | compiler can still see inside the struct, it's able to more
       | aggressively optimize the code. You can also store the objects
       | directly on the stack without requiring malloc.
       | 
       | IMO the right way to have private members in C structs is... to
       | document that members shouldn't be touched directly, perhaps
       | using a special naming convention or embedding the publicly-
       | accessible members in a dedicated sub-struct to prevent
       | confusion.
        
         | Joker_vD wrote:
         | > IMO the right way to have private members in C structs is...
         | to document that members shouldn't be touched directly, perhaps
         | using a special naming convention or embedding the publicly-
         | accessible members in a dedicated sub-struct to prevent
         | confusion.
         | 
         | Reminds me of that one time when glibc broke the whole of
         | Debian for s390 architecture by changing the fields in the
         | jmp_buf struct (which is public): [0].
         | 
         | [0] https://lwn.net/Articles/605607/
        
         | bluetomcat wrote:
         | To achieve a reasonable level of encapsulation in C, a header
         | file must be seen as a public-only interface. It should declare
         | only the structs that are relevant for the user of the module.
         | If that's "struct my_module_handle { ... }", declare it and
         | document the corresponding accessor and modifier functions.
         | Everything else must reside in the C source file with internal
         | linkage (static storage class). The whole source file is your
         | implementation.
         | 
         | There is an anti-pattern where header files are used for all
         | the declarations needed internally by the source file.
         | Including (pasting verbatim with the preprocessor) that file
         | from another module would bring in all the unnecessary
         | declarations.
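
A small sketch of that layout, using a hypothetical `stats` module: the header part declares only what users need, and helpers get internal linkage via `static`, so they are invisible to other translation units.

```c
#include <assert.h>
#include <stddef.h>

/* ---- stats.h (hypothetical): the public-only interface ---- */
int stats_mean(const int *values, size_t count);

/* ---- stats.c (hypothetical): everything else stays internal ---- */

/* 'static' gives internal linkage: other source files cannot call this,
 * and it never needs to appear in any header. */
static long sum_values(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

int stats_mean(const int *values, size_t count)
{
    assert(values != NULL && count > 0);
    return (int)(sum_values(values, count) / (long)count);
}
```

Another module that includes the header can call `stats_mean()` but gets a link error if it tries to reference `sum_values()`.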
        
           | hbossy wrote:
           | This is how it's supposed to be done, but you always end up
           | moving them to the header just to make writing unit tests less
           | painful.
        
             | 10000truths wrote:
             | This is a smell. Your unit tests should not have to rely on
             | internal implementation details.
        
               | icedchai wrote:
               | Are unit tests common in C? In the mid-2000's, I worked
               | on an "enterprise" system, written in C and C++. There
               | were about 300,000 lines of code, maybe 10 tests. This
               | thing was the core of a billion dollar business.
        
               | gpderetta wrote:
               | And in the worst case you can have module-private
               | headers. No need to pollute your interface.
        
               | coldtea wrote:
               | They don't have to and shouldn't, but it's convenient.
               | That's the parent's point ("to make writing unit tests
               | less painful").
        
           | menaerus wrote:
           | Opaque pointers usually impose the restriction on the API
           | such that in order to use the handle one has to dynamically
           | allocate the object on heap. That's a quite unfortunate
           | tradeoff IMO.
        
             | cozzyd wrote:
             | there are ways around this, if VLAs are allowed.
             | 
             |     // in <opaque_foo.h>
             |     typedef struct opaque_foo opaque_foo_t;
             |     size_t opaque_foo_sz(void);
             |     void opaque_foo_init(opaque_foo_t* foo);
             | 
             |     // in your code, which you could write a helper macro for
             |     // if you were so inclined
             |     char opaque_foo_mem[opaque_foo_sz()];
             |     opaque_foo_t * my_foo = (opaque_foo_t*) opaque_foo_mem;
             |     opaque_foo_init(my_foo);
        
               | kopecs wrote:
               | Doesn't this violate strict aliasing?
        
               | cozzyd wrote:
               | yes, though you can fix that with compiler flags (or,
               | #pragma if you want strict aliasing elsewhere in your
               | code)
               | 
               | alternatively, gcc supports VLAs in unions, but I don't
               | think clang does, but that makes it extra annoying to do.
               | 
               | edit: apparently you can probably apply the may_alias
               | attribute to the type? Or you could try using
               | transparent_union. No idea if clang supports either...
        
               | jenadine wrote:
               | And alignment?
        
               | cozzyd wrote:
               | yes, you may need an alignas depending on platform
               | (though you probably want it even if unaligned access is
               | supported).
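
A minimal sketch of the VLA approach discussed in this sub-thread, with the alignment concern handled by building the buffer out of `max_align_t` elements instead of `char`. The `opaque_foo_add` and `opaque_foo_total` functions and the struct's fields are hypothetical, added only so there is something to call. The header and implementation are shown in one file for brevity; in a real project they would be separate files. The strict-aliasing question raised above still applies in the strictest reading of the standard, so treat this as an illustration, not a portability guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* ---- would live in opaque_foo.h ---- */
typedef struct opaque_foo opaque_foo_t;      /* layout hidden from callers */
size_t opaque_foo_sz(void);                  /* size is only known at run time */
void   opaque_foo_init(opaque_foo_t *foo);
void   opaque_foo_add(opaque_foo_t *foo, int n);        /* hypothetical */
int    opaque_foo_total(const opaque_foo_t *foo);       /* hypothetical */

/* ---- caller code: sees only the declarations above ---- */
int caller_demo(void)
{
    /* Round the size up to whole max_align_t units; the resulting VLA is
     * aligned suitably for any object type and needs no heap allocation. */
    size_t units = (opaque_foo_sz() + sizeof(max_align_t) - 1)
                   / sizeof(max_align_t);
    max_align_t storage[units];
    opaque_foo_t *foo = (opaque_foo_t *)storage;

    opaque_foo_init(foo);
    opaque_foo_add(foo, 40);
    opaque_foo_add(foo, 2);
    return opaque_foo_total(foo);
}

/* ---- would live in opaque_foo.c ---- */
struct opaque_foo {
    int total;
    int calls;
};

size_t opaque_foo_sz(void) { return sizeof(struct opaque_foo); }

void opaque_foo_init(opaque_foo_t *foo)
{
    assert(foo != NULL);
    foo->total = 0;
    foo->calls = 0;
}

void opaque_foo_add(opaque_foo_t *foo, int n)
{
    foo->total += n;
    foo->calls += 1;
}

int opaque_foo_total(const opaque_foo_t *foo) { return foo->total; }
```

Note that `caller_demo` is compiled before `struct opaque_foo` is defined, so the compiler really cannot see the layout at the point of use.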
        
           | simias wrote:
           | I think what you say makes complete sense at module-level (as
           | in, for a standalone lib for instance) but I never bother
           | segregating things internally within a lib/module/exe and
           | rely on good documentation and coding practices to avoid
           | having member mutations all over the place.
           | 
           | If I code in Rust or C++ I can use namespacing and
           | public/private to give every single object in the codebase a
           | clean interface, but in C doing that is just frustrating, not
           | to mention potentially inefficient.
        
         | Dwedit wrote:
         | Link-time optimization means that you're probably not going to
         | take that much of a performance hit.
         | 
         | But yes, opaque structs do enforce that it will be treated as a
         | plain pointer, and the compiler (usually) cannot treat it as an
         | aggregate of variables.
        
           | Athas wrote:
           | If a linker did that, changing the layout would be an ABI-
           | breaking change. I think this opaque struct design is most
           | common for dynamically loaded libraries, where link time
           | optimisation does not occur (unless dynamic linkers got a lot
           | more fancy recently).
        
         | zokier wrote:
         | > The code remains binary compatible even, no need for a
         | recompile if you're shipping a shared lib. This is something
         | just using private members doesn't achieve since with private
         | members the compiler still knows and uses the layout of the
         | struct, it just forbids access to it.
         | 
         | There is somewhat common PIMPL idiom to work around the binary
         | compat issue. Iirc there were some macros floating around to
         | make it easier to manage.
        
           | maleldil wrote:
           | What do the macros do? Isn't this as easy as forward
           | declaring the impl class, adding std::unique_ptr<Impl> as a
           | private field and have public methods refer to the field? I'm
           | struggling to understand why macros would help here.
        
             | comex wrote:
             | Perhaps the macros help you forward methods on the outer
             | class to methods on the impl class? While your approach of
             | having public methods refer to the field also works, it's
             | nice to have public and private methods in the same place
             | (the impl class's definition) and using the same syntax
             | (neither having to go through the impl field).
        
             | saghm wrote:
             | I'm also only familiar with this idiom in C++, but based on
             | the description in the parent comment, I suspect that this
             | is sometimes used in C too, in which case you obviously
             | can't use unique_ptr or private fields; maybe macros might
             | be a way to avoid having to write a bunch of boilerplate to
             | achieve a similar effect?
        
         | c-linkage wrote:
         | I like the way that Windows does it, where they have as the
         | first element of the struct a double-word size (dwSize) element
         | that records in 32-bits the size of the structure. The size
         | essentially acts as a version identifier, as long as you never
         | rearrange the fields and only append fields for new versions.
         | The opaque functions test the value of the dwSize element to
         | see what actions can be performed on the object.
         | 
         | The code that _you_ develop can still access the member fields
         | directly, and those accesses can be inlined and optimized
         | aggressively by the compiler.
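
A rough sketch of the size-as-version idea described above. The struct names, fields, and `area_from_options` are hypothetical and are not taken from any Windows header; only the `dwSize` convention comes from the comment. The library copies as many bytes as the caller claims to have into a struct pre-filled with defaults, so older callers simply leave the newer fields at their default values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Version 1, as an old client compiled it. */
typedef struct {
    uint32_t dwSize;   /* caller sets this to sizeof its struct version */
    int      width;
} options_v1;

/* Version 2 only appends a field; existing offsets never change. */
typedef struct {
    uint32_t dwSize;
    int      width;
    int      height;   /* new in v2 */
} options_v2;

enum { DEFAULT_HEIGHT = 1 };

/* Library entry point: accepts any version and uses dwSize to decide
 * how many fields the caller actually provided. */
int area_from_options(const void *opts)
{
    uint32_t size;
    options_v2 local = {
        .dwSize = sizeof(options_v2),
        .width  = 0,
        .height = DEFAULT_HEIGHT
    };

    assert(opts != NULL);
    memcpy(&size, opts, sizeof size);        /* dwSize is always first */
    if (size < sizeof(options_v1))
        return -1;                           /* too small to be any version */

    /* Copy what the caller has; newer fields keep their defaults. */
    memcpy(&local, opts, size < sizeof local ? size : sizeof local);
    return local.width * local.height;
}
```

The size check is the branch asked about in the reply below: one comparison per call at the API boundary, while the caller's own code still reads and writes the fields directly.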
        
           | bluejekyll wrote:
           | This implies a branch statement in every function call,
           | doesn't it?
        
             | veltas wrote:
             | There's zero cost abstractions and then there's zero
             | features abstractions.
        
             | speed_spread wrote:
             | The cost of that branch is insignificant next to that of
             | the syscall you opted to make.
        
       | nvy wrote:
       | Jumping through all these hoops just to, what? Avoid the garbage
       | collector?
       | 
       | It's okay to admit C has shortcomings.
        
         | zer8k wrote:
         | The OP article seems to be someone trying to cram modern
         | concepts into a language that explicitly rejects them. A
         | symptom that the OP should've probably just used another
         | language.
        
         | lelanthran wrote:
         | > It's okay to admit C has shortcomings.
         | 
         | I did that, didn't I?
         | 
         | >> Just to be clear, C is an old language lacking many, many,
         | many modern features. One of the features it does not lack is
         | encapsulation and isolation.
        
       | bjourne wrote:
       | Sure, but it still goes against the grain of the language. You
       | could write a similar article explaining why everyone who thinks
       | C doesn't have automatic memory management is wrong. Opaque
       | structs are relatively annoying to work with especially if
       | sibling modules in the same package have good reasons to
       | manipulate the private parts directly. It is not nearly as
       | convenient as it is in a language with better support for
       | encapsulation (e.g. Java). Most of the C code I write does not
       | encapsulate anything. It's not worth the bother. Especially not
       | when unit-testing for which encapsulation would force you to
       | write lots of redundant getters and setters just for the unit
       | tests themselves. My view is that you simply shouldn't use C if
       | you need encapsulation.
        
       | jasode wrote:
       | To the author : your explanation can be interpreted as "correct"
       | but also be aware that -- for some readers -- your argument is a
       | variation of the Turing Tarpit:
       | https://en.wikipedia.org/wiki/Turing_tarpit
       | 
       | In other words, the 2 different possible receptions to your post:
       | 
       | - YES, file-level modularity with opaque structs is _equivalent_
       | to class private members --> for those mindsets already
       | sympathetic to C Language
       | 
       | - NO, using file-scoping rules and structs is _not equivalent_ to
       | class private members because it's a bunch of extra ceremonial
       | syntax to implement a workaround. (The "Turing Tarpit"). It's
       | using the opaque struct as a "design pattern" and as Peter Norvig
       | famously said, _"Design patterns are bug reports against your
       | programming language."_
        
         | alpaca128 wrote:
         | I don't think boilerplate code has much to do with Turing
         | Tarpits. It might be annoying but not Brainfuck-grade insanity.
        
         | colonwqbang wrote:
         | I would argue that C++ "pimpl" design pattern brings more
         | "ceremonial syntax" than the C equivalent.
         | 
         | C++ style (without "pimpl") requires recompilation of the whole
         | dependent tree when adding a new private member function. It's
         | encapsulation only in a formal sense
        
       | ReflectedImage wrote:
       | Actually, you can just have a public version of a struct and a
       | private version of a struct with more fields.
       | 
       | You can give callers the public version and then cast it to the
       | private version for internal usage.
        
         | bluejekyll wrote:
         | These will have different sizes though, so it's only safe to
         | cast when used off the heap and where its memory is allocated
         | with the larger variant, right?
         | 
         | I guess my point is that there's a whole bunch of caveats in
         | regards to safety that need to be considered in your solution.
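
One way the public/private pair could look, as a sketch under the assumption raised in the reply: the library always allocates the larger private variant. The `widget` names are hypothetical. Embedding the public struct as the first member of the private one is a choice made here so that converting between the two pointer types is well defined; the original comment does not specify how the two structs are related.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- public header: what callers see ---- */
typedef struct {
    int id;
} widget;

widget *widget_create(int id);
int     widget_use(widget *w);
void    widget_destroy(widget *w);

/* ---- private to the implementation file ---- */
typedef struct {
    widget pub;        /* must stay first so both pointers share an address */
    int    use_count;  /* hidden state callers cannot name */
} widget_private;

widget *widget_create(int id)
{
    /* Always allocate the larger, private variant. */
    widget_private *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->pub.id = id;
    p->use_count = 0;
    return &p->pub;
}

int widget_use(widget *w)
{
    /* Only valid for widgets that came from widget_create(). */
    widget_private *p = (widget_private *)w;
    assert(w != NULL);
    return ++p->use_count;
}

void widget_destroy(widget *w)
{
    free(w);  /* same address as the widget_private allocation */
}
```

The caveat from the reply shows up directly: a `widget` declared on the stack or copied by value has no hidden fields behind it, so passing it to `widget_use` would read past the object.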
        
       | zh3 wrote:
       | That seemed a complicated way of reminding us that C++ (at least
       | originally) is/was compiled down to pure C and thus can't do
       | anything that C can't do.
        
       | 3cats-in-a-coat wrote:
       | So let's do polymorphic virtual methods now.
        
       | adwn wrote:
       | > _Hell, they cannot even malloc() their own StringBuilder
       | instance, because even the size of the StringBuilder is hidden.
       | They have to use creation and deletion functions provided in the
       | implementation as specified in the interface._
       | 
       | And you've just made it impossible for the users of your
       | StringBuilder to pass it around by-value. Every instance has to
       | be malloc'ed by your library, even though it's just a tiny, word-
       | sized struct. Awesome! And each access needs to go through an
       | additional pointer indirection. All this just to pretend that C
       | supports proper encapsulation. Hooray!
       | 
       | I'm sorry that I'm targeting your blog post specifically, but
       | it's just so stereotypical of C proponents, that can't (or
       | won't?) realize that their favorite programming language is
       | inherently limiting and limited along several very important
       | dimensions. It makes me think that although some of them might be
       | excellent _programmers_, they make for terrible _software
       | engineers_.
        
         | deadbeeves wrote:
         | What do you mean? The fact that C can do this is an example of
         | how it's _not_ limited. A lot of other languages instead
         | _require_ you to allocate everything in the heap and there's
         | no possibility of passing things by copy, or of not accessing
         | things through anything other than a pointer. C at least is
         | capable of allocating and accessing at least some things
         | directly on the stack.
        
           | adwn wrote:
           | > _The fact that C can do this is an example of how it's not
           | limited._
           | 
           | No, C is limited because it's mutually exclusive: _either_
           | encapsulation, _or_ zero overhead by-value passing. Other
           | languages, like C++ or Rust, allow _both at the same time_.
        
             | rightbyte wrote:
             | You can do dummy defines of structs with the same size as
             | the real one if you want the struct on the stack and
             | encapsulation.
        
               | vore wrote:
               | At that point you might as well just name your fields
               | like DONTTOUCHTHIS_foo if you're having to keep the
               | private definition with fields in sync with the opaque
               | public definition (and making sure the alignment and
               | sizing are always in sync with the private one...)
        
               | e4m2 wrote:
               | I've done this before, works quite well actually, but
               | isn't very popular for some reason.
               | 
               | > You can do dummy defines of structs with the same size
               | 
               | Don't forget alignment. The general pattern is:
               | https://godbolt.org/z/6je9Yb3rf.
        
               | sfpotter wrote:
               | Maybe I'm missing the idea, but I'm not sure how this
               | idea is supposed to work without using something like
               | alloca.
               | 
               | If you have:
               |     struct foo;
               | 
               | in foo.h and:
               |     struct foo {
               |         int a;
               |         double x;
               |         ...
               |     };
               | 
               | in foo.c, you won't have sizeof(struct foo) available
               | from bar.c, so your construction won't work in bar.c. You
               | could define a function:
               |     size_t sizeof_foo(void);
               | 
               | which just returns sizeof(struct foo) from inside foo.c,
               | but since this size is now only known at runtime, you'll
               | need to resort to alloca or VLAs...
        
               | gpderetta wrote:
               | You define public_foo in the header with the public
               | members and an appropriately sized byte array for the
               | private members.
               | 
               | In the .c file you define private_foo, same as public_foo
               | except that the byte array is replaced with the actual
               | members.
               | 
               | You static assert that size and alignment match and cast
               | at function boundaries.
               | 
               | You hope not to have violated strict aliasing rules.
               | 
               | This is not completely unlike type erasure with small
               | buffer optimization done by some c++ classes like
               | std::function.
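
A sketch of the reserved-bytes layout described above, with one deliberate substitution: instead of casting `public_foo *` to a private struct type at function boundaries, the private state is copied in and out of the byte array with `memcpy`, which sidesteps the strict-aliasing worry (and the alignment requirement) at the cost of a small copy that compilers typically optimize away. The `foo_private` type, the function names, and the 16-byte figure are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* ---- header ---- */
#define PUBLIC_FOO_PRIVATE_BYTES 16   /* becomes part of the ABI once shipped */

typedef struct {
    int id;                                            /* public member */
    unsigned char reserved[PUBLIC_FOO_PRIVATE_BYTES];  /* opaque private bytes */
} public_foo;

void foo_init(public_foo *f, int id);
long foo_bump(public_foo *f);

/* ---- implementation file ---- */
typedef struct {
    long hits;
} foo_private;

/* Fails the build if the private state outgrows the reserved area. */
static_assert(sizeof(foo_private) <= PUBLIC_FOO_PRIVATE_BYTES,
              "reserved area too small for foo_private");

void foo_init(public_foo *f, int id)
{
    foo_private p = { 0 };
    f->id = id;
    memset(f->reserved, 0, sizeof f->reserved);
    memcpy(f->reserved, &p, sizeof p);
}

long foo_bump(public_foo *f)
{
    foo_private p;
    memcpy(&p, f->reserved, sizeof p);   /* load private state */
    p.hits += 1;
    memcpy(f->reserved, &p, sizeof p);   /* store it back */
    return p.hits;
}
```

Because `public_foo` is a complete type, callers can declare it on the stack, copy it by value, or put it in arrays, which is the combination the parent comments were arguing about. The spare bytes in `reserved` act as the room for future extensions.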
        
               | sfpotter wrote:
               | OK, I understand what you're doing now. Thanks for
               | clarifying.
               | 
               | The big downside here is that you're leaking the size of
               | the details into your ABI which wouldn't happen with a
               | fully opaque type... I could see some uses for it but
               | haven't felt a strong enough need to reach for it before,
               | although it has occurred to me.
        
               | gpderetta wrote:
               | Of course there is no way around that. A partial
               | mitigation is the same as done for network protocols:
               | reserve some space for future extensions.
        
             | deadbeeves wrote:
             | Eh. Arguably in C++ you don't get proper encapsulation just
             | with private members, because changing the layout for those
             | members changes the ABI.
        
               | adwn wrote:
               | Don't let Perfect be the enemy of Good. The quality of
               | encapsulation you can achieve in C++ is miles ahead of
               | that of C, even if it isn't all that could ever be.
        
               | pjmlp wrote:
               | For example, user defined types that behave like built-
               | ins, while preserving invariants.
               | 
               | Specially great in IoT instead of macros accessing
               | directly IO ports.
        
             | dasyatidprime wrote:
             | There's a third vertex to the triangle here: C++ and Rust
             | allow both at the same time by dropping separate
             | compilation. More thoroughly so in Rust than in C++, but
             | header-focused libraries move C++ further toward whole-
             | program compilation compared to C (maybe you could call it
             | "large-overlapping-chunks-of-program compilation").
        
               | jll29 wrote:
               | Very valid point. I keep fond memories of Modula-2, which
               | has DEFINITION modules and IMPLEMENTATION modules, such
               | that you can compile the former and the latter separately.
               | 
               | In Modula-2, I can specify an API in its DEFINITION
               | module, and after compiling it, client applications can
               | use such an API without the implementation being ready
               | yet, and still I can compile the client and check if it
               | is free of syntax errors.
        
         | _gabe_ wrote:
         | > And you've just made it impossible for the users of your
         | StringBuilder to pass it around by-value. Every instance has to
         | be malloc'ed by your library, even though it's just a tiny,
         | word-sized struct. Awesome! And each access needs to go through
         | an additional pointer indirection.
         | 
         | ...and?
         | 
         | I'm guessing the implication here is that you'll trash
         | performance by doing this. How can you assume that? The thing
         | about optimizing code is, you don't know where your hot paths
         | are until you _profile_ your code. And, the one thing
         | experience has taught me, my _intuitions_ about what the hot
         | spots will be _rarely_ match reality. There's nothing wrong
         | with that, complex systems are complex, and we have incredible
         | profiling tools to eat through that complexity and highlight
         | the hot spots for us.
         | 
         | Now, I know you'll probably go on about a death by a thousand
         | cuts etc. The thing is, well-crafted modules typically don't
         | encapsulate on a fine-grained level. You usually have larger
         | _systems_ that hide details. These systems are usually used a
         | fraction of the time the rest of your program is. So the
         | indirections end up usually being a very insignificant cost to
         | the overall program.
         | 
         | And if you _are_ coding in such a way that copying a string
         | builder by value and/or the indirection imposed by
         | encapsulating that information is a bottleneck, I highly doubt
         | that "fixing" this by copying by value and/or removing the
         | indirection will suddenly make your entire program performant.
         | 
         | > It makes me think that although some of them might be
         | excellent programmers, they make for terrible software
         | engineers.
         | 
         | You haven't actually highlighted any issues here and then go on
         | to finish your argument with an ad hominem. Instead of
         | attacking the competence of C programmers, you should
         | illustrate the actual real world impact that this design
         | philosophy results in. I know plenty of _really_ slow Java
         | libraries, and plenty of _really_ fast C libraries that use
         | this method of encapsulation. So if your argument is that using
         | this method trashes performance, it's a poor argument that
         | doesn't have many real world examples (unless you know of some
         | off the top of your head).
        
           | adwn wrote:
           | > _The thing about optimizing code is, you don't know where
           | your hot paths are until you profile your code._
           | 
           | Absolutely! So, you profile your program, and it turns out
           | that 95% of the runtime is caused by malloc/free in tight
           | loops, which you can't get rid of, because they're hidden
           | behind an API which had to choose between encapsulation and
           | efficiency.
           | 
           | > _And if you are coding in such a way that copying a string
           | builder by value and/or the indirection imposed by
           | encapsulating that information is a bottleneck [...]_
           | 
           | You don't seem to realize that the _StringBuilder_ was just
           | an example to illustrate this style of encapsulation?
           | Oftentimes you want to encapsulate actual "value structs",
           | where it is sensible to create millions of them in an array.
           | In C, you're forced to choose between following good software
           | engineering practices (=> encapsulation) and getting good
           | performance.
        
             | _gabe_ wrote:
             | > So, you profile your program, and it turns out that 95%
             | of the runtime is caused by malloc/free in tight loops,
             | which you can't get rid of, because they're hidden behind
             | an API which had to choose between encapsulation and
             | efficiency.
             | 
             | I have literally never run into a library that was written
             | so badly that using the library encouraged you to use the
             | API to create millions of small objects. That's what I'm
             | saying. Sure, this _can_ happen, but in _reality_ I've
             | never seen it. Can you show me where this hypothetical
             | scenario is occurring and trashing people's performance? We
             | probably want to avoid using those libraries.
             | 
             | Instead, I usually see encapsulation used like it is in
             | GLFW, or libcurl, or stbi. The encapsulation covers
             | _systems_ and not tiny objects, which encourages the user
             | of the library to not make API calls millions of times or
             | construct millions of tiny objects.
             | 
             | > You don't seem to realize that the StringBuilder was just
             | an example to illustrate this style of encapsulation?
             | Oftentimes you want to encapsulate actual "value structs",
             | where it is sensible to create millions of them in an
             | array.
             | 
             | I did realize this. Encapsulation is typically useful on
             | larger systems. Once you get to the point of millions of
             | objects, you usually have a larger system managing those
             | millions of objects. And ideally, those millions of objects
             | should be POD. If they're POD, encapsulating the data makes
             | no sense at that point, because it makes more sense to
             | encapsulate whatever is managing that data.
             | 
             | > In C, you're forced to choose between following good
             | software engineering practices (=> encapsulation) and
             | getting good performance.
             | 
             | This is a false dichotomy. There are plenty of large C
             | projects that follow good software engineering practices
             | (which is entirely subjective, what is "good"?). Look at
             | any OS kernel, or the libraries I mentioned above.
             | 
             | So, once again, I'm curious if you know of any C libraries
             | (ab)using encapsulation in the hypothetical scenario you've
             | laid out. If there aren't any libraries that do this, then
             | this is a non-issue and attacking the competence of C
             | developers is entirely unwarranted since you've built up a
             | strawman that doesn't exist in reality.
        
         | xbar wrote:
         | The best software engineers I ever worked with were all C
         | programmers.
         | 
         | Further, and quite separate, the term "software engineer" was
         | coined on behalf of assembly programmers, whose language lacks
         | even further features.
         | 
         | Finally, I am not sure what you mean when you say they make bad
         | software engineers. Perhaps we have different definitions of
         | software engineer.
        
           | pjmlp wrote:
           | The term Software Engineer is something that in plenty of
           | countries is validated by the engineering organization and is
           | a legal title, not something one feels like calling
           | themselves.
           | 
           | Which also validates that any university teaching software
           | engineering has a certain quality level, and portfolio of
           | lectures, to create a general background across all subjects
           | of engineering practices besides writing code.
        
           | adwn wrote:
           | If someone doesn't recognize when and how their tools limit
           | the quality of their work, they can't be good craftsmen.
           | There might still be good reasons to use those tools (e.g.,
           | no better alternatives, or an existing ecosystem), but if you
           | don't realize that your programming language is fundamentally
           | limiting in ways that other languages are not, then you'll
           | never know how to build better software.
        
         | c-linkage wrote:
         | Passing around potentially large objects by value is wasteful
         | and prone to move / copy semantics.
         | 
         | Most languages pass objects by reference (C# and Java chief
         | among them).
         | 
         | Still, if you really want to pass by value -- even though
         | you'll likely end up with pointer ownership problems -- you
         | just add a few functions to the API to do so.
         | 
         | Creating an opaque type on the stack _can_ be done, you just
         | need a little more work.
        
       | oneeyedpigeon wrote:
       | Great article, OP, but you forgot to populate the href on your
       | "swig" link.
        
         | lelanthran wrote:
         | Thanks, fixed.
        
       | mojosam wrote:
       | I think there's a better example, but whether it applies depends
       | on one of two major divisions of C code: that designed to
       | run on systems with a MMU (as typically used for Linux and other
       | large OSes) -- where virtual memory makes dynamic memory
       | allocation practical -- and those without -- which today is
       | primarily the very large world of embedded devices.
       | 
       | For the latter, the industry best practice is to avoid malloc(),
       | except maybe at init time, and instead allocate memory
       | statically. And in that use case, you break your code into
       | modules, which can contain private data, public data, private
       | functions, and public functions.
       | 
       | In other words, building an app out of C modules is a lot like
       | building an app in a more modern language just using static
       | classes, with no instantiation. And in that design pattern --
       | which is extremely common in the embedded world -- we have a
       | direct equivalent to the "private" qualifier, which is "static",
       | which restricts the rest of the app from accessing so-marked
       | file-scope variables and functions.
       | 
       | Where this breaks down -- as always with C -- is when you need
       | multiple instantiations of a module, which modern programming
       | languages refer to as an object. The closest we can get in C is
       | to pass the module's public functions a struct with some sort of
       | data structure containing the object's non-static data. And as the
       | author explains, there are standard ways to make that data
       | structure opaque to calling code, but those are definitely
       | workarounds to language shortcomings.
       | 
       | But the bottom line is that those language shortcomings -- the
       | lack of objects and a private qualifier for its members -- are
       | only shortcomings if you need those features, and in the embedded
       | world, most applications don't, they only require all the
       | advantages offered by C. So as always, this is about picking the
       | right language for the project, there's no one size fits all.
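
The single-instance module pattern described in the comment above can be sketched roughly as follows. The LED module and all of its names are hypothetical, and the "pin write" is a stand-in for a real register access. Everything marked `static` has internal linkage and is invisible to other translation units, while the non-static functions form the public interface; no `malloc()` is involved because all state is statically allocated.

```c
#include <assert.h>
#include <stdint.h>

/* ---- would be led.h: the public functions ---- */
void     led_init(void);
void     led_toggle(void);
int      led_is_on(void);
uint32_t led_toggle_count(void);

/* ---- would be led.c ---- */
static int      led_state;      /* private data, statically allocated */
static uint32_t toggle_count;   /* private data */

/* Private function: stands in for writing a hardware register. */
static void write_pin(int level)
{
    assert(level == 0 || level == 1);
    led_state = level;
}

void led_init(void)
{
    toggle_count = 0;
    write_pin(0);
}

void led_toggle(void)
{
    toggle_count++;
    write_pin(!led_state);
}

int led_is_on(void) { return led_state; }

uint32_t led_toggle_count(void) { return toggle_count; }
```

As the comment notes, this only gives one instance per module; a second LED would need either a second module or a struct passed to each function.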
        
       | ape4 wrote:
       | I've seen code that prefixes private members with an underscore
       | and adds a comment saying that it's private. Not saying that's
       | great but it does send a message.
        
       ___________________________________________________________________
       (page generated 2023-07-17 23:00 UTC)