[HN Gopher] The C Programming Language: Myths and Reality
___________________________________________________________________
The C Programming Language: Myths and Reality
Author : lelanthran
Score : 194 points
Date : 2023-07-17 13:05 UTC (9 hours ago)
(HTM) web link (www.lelanthran.com)
(TXT) w3m dump (www.lelanthran.com)
| mcnichol wrote:
| I'm appreciative of the person who wrote the article.
|
| These are my favorite types of debates and while I disagree with
| it for much better articulated reasons already mentioned,
| primarily the exploitative nature of header files used as
| security through obscurity, I think it stirs up a lot of debate
| and keeps the spaghetti meatball of knowledge around
| patterns/anti-patterns, "best-practices", etc. moving forward in
| a much more passionate way.
|
| I think this is the truest sense of the term "hacker" which lives
| up to the title of the site pushing us into these debates.
| Putting stuff together that doesn't always work as intended or
| expected but arguing for and against it.
|
| A long winded thank you to everyone, OP and all the threads
| responding.
| lelanthran wrote:
| The first in a series of blog posts I am putting together around
| the C programming language (Not the book!)
| sfpotter wrote:
| One of the advantages C has over C++ here is that the addition of
| templates in C++ makes this kind of encapsulation more difficult.
| See the proliferation of header-only libraries where nearly
| everything is templatized, as well as the pimpl idiom. You still
| get some encapsulation in the form of private, but with "C-style"
| encapsulation, if the header file doesn't change and the source
| files implementing what's declared in that header don't change,
| then none of the files need to be recompiled when rebuilding.
| This makes recompiling much faster.
|
| Of course, you could do much the same thing if you didn't use
| templates in C++, or if you were very disciplined and limited
| with your use of templates, but this seems to go against the
| grain of how I've seen C++ used.
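| A minimal single-file sketch of the "C-style" encapsulation
| described above, assuming a hypothetical counter module (the
| counter_* names are made up for illustration). The comments mark
| what would live in the header and what would live in the source
| file; callers only ever see the incomplete type.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h (hypothetical public header) ---- */
typedef struct counter counter;      /* incomplete type: layout is hidden */
counter *counter_new(int start);
void     counter_increment(counter *c);
int      counter_value(const counter *c);
void     counter_free(counter *c);

/* ---- counter.c (hypothetical implementation file) ---- */
struct counter {
    int value;   /* fields can change here without touching the header */
};

counter *counter_new(int start)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

void counter_increment(counter *c)    { c->value++; }
int  counter_value(const counter *c)  { return c->value; }
void counter_free(counter *c)         { free(c); }

/* ---- caller: can only go through the declared functions ---- */
int counter_demo(void)
{
    counter *c = counter_new(41);
    if (c == NULL)
        return -1;
    counter_increment(c);
    int v = counter_value(c);
    assert(v == 42);
    counter_free(c);
    return v;
}
```

| As long as the header part stays the same, files that include it
| should not need recompiling when the struct definition changes.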
| pjmlp wrote:
| No longer the case since C++20 modules.
| sfpotter wrote:
| Thanks for the downvote...? Not sure what the point of that
| was.
|
| I haven't used C++ much since modules were introduced. From
| doing some quick reading, it seems unclear whether they
| significantly improve compilation times in practice. I
| wasn't able to find anything which addressed how they relate
| to the ABI... I suspect this is a pretty complex topic. If
| you have any good references related to either of these
| points, I'd be very eager to read them.
|
| Either way, "C++ with modules" appears to me to be unlikely
| to clear the bar set by "C with opaque types" (which, for all
| intents and purposes, can be done in C++) in terms of 1) ABI
| stability and 2) encapsulation. I consider point (1) to be
| related to point (2), since details which are leaked into the
| ABI are not encapsulated.
| pjmlp wrote:
| First of all, who told you I downvoted you?!?
|
| Importing the whole standard library as defined by C++23
| _import std;_ is quicker than doing a plain _#include
| <iostream>_, as shared per Microsoft employees in some talk
| they have done regarding upcoming C++23 support.
|
| Unfortunately I can't remember which one it was, but someone
| else can gladly share the link.
|
| Second, by having template details marked as private in the
| module metadata, they aren't directly exposed to consumers.
|
| As for the ABI specifically, that is compiler dependent
| anyway.
| cozzyd wrote:
| It is different, but provides much better encapsulation.
| Keeping ABI compatibility is much easier this way.
| throaway23423 wrote:
| In practice, please don't do this... it breaks inlining and adds
| unneeded allocations.
| ryu2k2 wrote:
| >You don't have inheritance [..]
|
| Modern PL features are more or less wrappers around comparatively
| complex C code. Object inheritance is actually one that isn't too
| difficult or complex to implement:
| https://www.youtube.com/watch?v=443UNeGrFoM&t=4275s
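| One common way to get single inheritance in plain C, sketched
| here under the assumption of hypothetical shape/circle types
| (not taken from the linked video): embed the "base" struct as
| the first member of the "derived" struct, so a pointer to the
| derived object can be converted to a pointer to the base.

```c
#include <assert.h>

struct shape {                /* "base class" */
    int x;
    int y;
};

struct circle {               /* "derived class" */
    struct shape base;        /* must be the first member */
    int radius;
};

/* Works on anything that starts with a struct shape. */
void shape_move(struct shape *s, int dx, int dy)
{
    s->x += dx;
    s->y += dy;
}

int inheritance_demo(void)
{
    struct circle c = { { 1, 2 }, 5 };

    /* A pointer to a struct, suitably converted, points to its
       first member, so this conversion is well defined. */
    shape_move((struct shape *)&c, 10, 20);

    assert(c.base.x == 11 && c.base.y == 22 && c.radius == 5);
    return c.base.x + c.base.y;
}
```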
| jll29 wrote:
| > Modern PL features are more or less wrappers around
| comparatively complex C code. Object inheritance is actually
| one that isn't too difficult or complex to implement:
| https://www.youtube.com/watch?v=443UNeGrFoM&t=4275s
|
| True, and if you implement it as a preprocessor, that's roughly
| how C++ started in 1979 (Stroustrup's "C with Classes", whose
| successor Cfront translated C++ to C:
| https://en.cppreference.com/w/cpp/language/history).
| commandlinefan wrote:
| When I first encountered Object-Oriented Programming around '97 or
| so, all of the literature focused on the three pillars of OO:
| encapsulation, inheritance and polymorphism. I was struck at the
| time at how OO mostly just formalized and gave specific names to
| what good programmers were already doing informally.
| wvenable wrote:
| A lot of comments/articles complaining about Object Oriented
| Programming -- especially the style implemented by C++ and
| similar languages -- start with the premise that OOP is some
| academic prescription declared from up high like Moses and the 10
| Commandments.
|
| But the reality is that the best practices of imperative
| programming were already much like object oriented programming
| and that OOP is a formalization of those practices.
| User23 wrote:
| Strongly agree. Good C developers pretty much instinctively
| do structured programming[1]. While it's superficially dated,
| the core concepts are still all very much applicable to the
| working programmer today and even largely paradigm
| independent.
|
| [1] https://dl.acm.org/doi/book/10.5555/1243380 (lousy
| interface, but PDF download available)
| pjmlp wrote:
| Indeed, I see many of the concepts as extensible modules, as
| one would use from Modula-2 or Object Pascal.
|
| That is why Oberon takes the spartan approach of only having
| extensible types, everything else is just like in Modula-2.
| Later descendants adopted a more mainstream approach.
|
| Likewise how OOP is done in Ada or Modula-3 isn't quite like
| the mainstream approach.
|
| Or when modules can be manipulated like variables, and given
| type signatures, we get Standard ML functors, with
| overlapping capabilities to OOP.
| rightbyte wrote:
| The words are also confusing and too much Latin.
|
| Encapsulation - Hidden
|
| Inheritance - Same fields in beginning
|
| Polymorphism - Tagged struct
| slavapestov wrote:
| "Hidden" from whom and how? "Same fields in beginning"
| doesn't characterize multiple inheritance. "Tagged structs"
| are just one implementation strategy for one kind of
| polymorphism, and "tagged" and "struct" are themselves
| jargon. Using and knowing the precise terminology is
| important when communicating technical concepts.
| jcelerier wrote:
| > I was struck at the time at how OO mostly just formalized and
| gave specific names to what good programmers were already doing
| informally.
|
| I mean, that's the point of everything, isn't it? giving names
| and defining good practices so that everyone can benefit from
| them? because even today you can find a LOT of codebases which
| are an imperative mess
| gjulianm wrote:
| I wouldn't call this equivalent to the private keyword:
|
| * It does not work on a field, but on the whole struct. Either
| all fields are public or all are private. The latter case forces
| you to write getters/setters for properties you want users to
| access, and that in C can be even more cumbersome as you need to
| write the definition in the .h and the implementation in the .c.
|
| * It breaks several actions without an explicit and specific
| error message. As it's mentioned in the article, malloc isn't
| possible, but neither are copies by value, sizeof() breaks... And
| those will break with an "incomplete type" error message, not a
| "this type is intentionally made private", which can add
| confusion in some situations.
|
| * Completely incompatible with inlining code. Considering a lot
| of people still use C precisely for its performance, I think this
| can be a drawback in a lot of usecases.
|
| I honestly think that hiding struct declarations should be done
| sparingly, and preferably limiting it to cases where it's
| actually necessary (for example, a library that doesn't expose
| internal struct fields so the same executable works with
| different versions of the dynamic library; or proprietary
| libraries that want to expose as little as possible). In the end
| it's still easy to bypass, and the distinction between header and
| code files already provides an indication of which functions you
| should use and which ones you shouldn't.
| falcrist wrote:
| I guess I'm reading the article a different way, because what
| I'm getting is that the author is suggesting the C analogue of
| a class is an entire header/source "module".
|
| That would make more sense, since you can use the header to
| craft an interface in which some components are public and some
| are private.
| c-linkage wrote:
| Brings back memories of CS101 back in 1992.
|
| They called this "modular programming" where "classes" were
| represented by opaque pointers and actions could only be
| performed on those pointers using the functions defined in
| the module.
| falcrist wrote:
| In the production C code I've written that makes use of
| modules with interfaces that define public and private
| variables and functions, I tend to avoid using pointers
| where reasonable. I'd prefer to either give access to the
| variable or provide getter and setter functions.
|
| What's the benefit of "opaque" pointers?
|
| For background, my C code is almost entirely on
| microcontrollers. So I'm looking at it from that point of
| view. If you're talking about event-based applications
| running inside a full operating system, I've always stepped
| up to something like C#, so I don't have much experience
| with function pointers for that kind of work.
| cozzyd wrote:
| yes, there is no benefit on microcontrollers where
| everything is usually compiled together as one image (ok,
| I imagine you COULD design a microcontroller firmware
| with dynamically loadable sections... but let's not think
| about that).
|
| But for dynamic linking, this is how you avoid breaking
| ABI while maintaining forward flexibility.
| c-linkage wrote:
| For those accustomed to always using open source, the
| idea of hiding the implementation must seem odd.
|
| But consider the position of a developer who implements a
| shared library that is distributed in binary form only.
| In this case, the benefit of opaque pointers _for the
| development of a library_ is that the implementation
| remains private at the source level. One could, of
| course, reverse-engineer the binary but few people would
| do it.
|
| If you define your structures in a public header -- and
| this includes C++ classes and templates with private
| members -- one can easily see the implementation and,
| with a few casts, start munging the guts of your objects
| and baking in a hard requirement on a specific layout and
| / or version of your library.
| falcrist wrote:
| Ok I see what you're saying. Closed source libraries can
| make use of this idea.
| ricardo81 wrote:
| That's how I read it.
|
| Clearly you can abstract away the parts of data that the API
| should not see.
| gjulianm wrote:
| Yes, that's usually how it goes, each header/source is like a
| "class". And usually, what you have is a "main" struct that
| represents the "class" itself. In this case that main struct
| can't have both private and public fields.
| lelanthran wrote:
| > Yes, that's usually how it goes, each header/source is
| like a "class". And usually, what you have is a "main"
| struct that represents the "class" itself. In this case
| that main struct can't have both private and public fields.
|
| You can, in a defined way (i.e., no invoking of UB). I just
| didn't put that in.
| gjulianm wrote:
| Out of curiosity, how would you do it? The ways I've seen
| require using another "public" struct and either casting
| public to private or using a nested pointer, each with
| their set of problems. In either case, it's still a bit
| hacky and still each struct is all-or-nothing public or
| private.
| lelanthran wrote:
| Pretty much. All the casting and hackiness isn't visible
| to the caller, and the implementation still maintains its
| ABI when the private stuff changes.
| falcrist wrote:
| > And usually, what you have is a "main" struct that
| represents the "class" itself.
|
| Yes that tracks with the work I've done. It could even be
| so simple that only one value needs to be exposed through a
| getter (along with a couple "methods").
|
| An ADC in an embedded system could operate like that.
| throwawayiddqd2 wrote:
| With plan9-extensions gcc flag:
|
|     struct Thing { int i; };
|
| In an implementation c file:
|
|     struct _Thing { struct Thing; int b; };
|
|     struct Thing *new_thing() {
|         struct _Thing *thing = malloc(sizeof *thing);
|         thing->i = 0;
|         thing->b = 1;
|         return thing;
|     }
| Joker_vD wrote:
| > It does not work on a field, but on the whole struct.
|
|     struct PrivateFieldsOfMyShinyClass;
|
|     struct MyShinyClass {
|         int somePublicData;
|         double morePublicData;
|         struct PrivateFieldsOfMyShinyClass *p;
|     };
|
| > malloc isn't possible, but neither are copies by value,
| sizeof() breaks
|
| Those are implementation details which are deliberately being
| hidden.
|
| > Completely incompatible with inlining code.
|
| They are as incompatible with code inlining as public/private
| modifiers in C++ are. That is, LTO is your best friend here.
| Also, have you ever tried to maintain binary compatibility with
| several versions of a third-party C++ library that keeps
| adding/removing private fields to/from its classes?
| gjulianm wrote:
| The code sample still isn't equivalent. Now you have yet
| another pointer, which implies another allocation, another
| source of possible memory leaks and mistakes, and a separate
| memory space that will hurt the cache.
|
| > Those are implementation details which are deliberately
| being hidden.
|
| I know they're being hidden deliberately, but in C++
| "private" doesn't break malloc (new), copies by value or
| sizeof. Or stack allocation, to add to the list.
|
| > They are as incompatible with code inlining as
| public/private modifiers in C++ are. That is, LTO is your
| best friend here.
|
| public/private aren't incompatible with inlining in C++. That
| is, you can call class functions that access private members
| and the compiler can inline those functions. Also, LTO is not
| always enabled by default, and doesn't always inline the
| things you want it to inline.
|
| > Also, have you ever tried to maintain binary compatibility
| with several versions of a third-party C++ library that keeps
| adding/removing private fields to/from its classes?
|
| I mentioned binary compatibility as one of the reasons one
| might want to do this. However, if you have a third party
| that doesn't care about API compatibility I doubt struct
| fields are the only thing they're going to change constantly.
| skribanto wrote:
| Maybe have your public fields defined as a second struct,
| and then you can cast the pointer to your struct to the
| concrete struct that has all the public fields. This has
| the restriction that all public fields must be at the
| start, and you must make sure to maintain the same order
| between the two structs.
|
| At this point though, I think I honestly would prefer
| setters/getters.
|
|     struct MyClassPublic {
|         int x;
|         int y;
|         ...
|     };
|
|     /* using it */
|     MyClass *myclass = myclass_create();
|     ((struct MyClassPublic *)myclass)->x = 5;
| dfawcus wrote:
|     struct public_stuff { ... };
|
|     struct private_stuff {
|         struct public_stuff public;
|         ...
|     };
|
|     struct public_stuff *make_public(/* ... */) {
|         struct private_stuff *prv = malloc(sizeof *prv);
|         /* ... */
|         return &prv->public;
|     }
|
| Then when passed in to functions, cast the passed in struct
| pointer to the private one.
|
| The public struct doesn't even have to be at the start if one
| makes appropriate use of offsetof and ensures valid
| alignment.
|
| Nothing new under the sun...
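| A sketch of the offsetof variant mentioned above, assuming
| hypothetical field names (x, secret, pub) and helper functions
| that are not part of the original snippet. The public struct is
| deliberately not the first member, and the implementation walks
| back from the public pointer to the enclosing private struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct public_stuff {
    int x;
};

struct private_stuff {
    int secret;                  /* private field placed first on purpose */
    struct public_stuff pub;     /* public part is not at offset 0 */
};

struct public_stuff *make_public(int x, int secret)
{
    struct private_stuff *prv = malloc(sizeof *prv);
    if (prv == NULL)
        return NULL;
    prv->secret = secret;
    prv->pub.x = x;
    return &prv->pub;
}

/* Recover the enclosing object. Only valid for pointers that
   really came from make_public(). */
static struct private_stuff *to_private(struct public_stuff *p)
{
    return (struct private_stuff *)
        ((char *)p - offsetof(struct private_stuff, pub));
}

int  get_secret(struct public_stuff *p)     { return to_private(p)->secret; }
void destroy_public(struct public_stuff *p) { free(to_private(p)); }

int offsetof_demo(void)
{
    struct public_stuff *p = make_public(5, 37);
    if (p == NULL)
        return -1;
    int result = p->x + get_secret(p);
    assert(result == 42);
    destroy_public(p);
    return result;
}
```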
| jstimpfle wrote:
| Hiding members in this way is only possible with pointer
| indirection, which isn't satisfying.
|
| However, having only a boolean private/public access state
| isn't generally satisfying either. It often leads to
| violation of the principle of separation of concerns when all
| the functions (methods) acting on certain "private" fields
| need to live in the same class.
|
| In simple classes, like std::vector, it's possible to get
| away with private. But in many cases that are more complex
| than that, it seems to me that the best approach is still to
| expose the data and to be just very clear about the exact
| purpose of each member.
| lelanthran wrote:
| > I wouldn't call this equivalent to the private keyword:
|
| I didn't mean to imply it is, I led with:
|
| >> All too often someone, somewhere, on some forum ... will
| lament the lack of encapsulation and isolation in the C
| programming language. This happens with such regularity that I
| now feel compelled to address the myth once and forever.
|
| It's only about the myth that C doesn't have _any_ level of
| encapsulation or isolation.
| brabel wrote:
| Nearly every rebuttal on the internet starts with
| misinterpretation of what's being said. Good reading skills
| are extremely rare, it seems.
|
| To be clear: you tried to say "C offers
| encapsulation/isolation". People read "This solution is
| equivalent to 'private'", an almost completely unrelated
| statement, and then respond to that.
|
| That could typically be classified as a "Straw man
| fallacy"[0], but I believe people who do this in many cases
| simply do not have the necessary reading skills to understand
| what proposition has been made, and therefore honestly
| believe themselves to be reasoning correctly (i.e. without
| fallacy).
|
| Reading comprehension used to be a topic at school when I was
| a child. I suppose that's no longer the case??
|
| [0] https://en.wikipedia.org/wiki/Straw_man
| gjulianm wrote:
| As the other commenter said, and I want to reiterate, what
| led me to start with the talk about the private keyword is
| the big bold header that says "Myth: C has no equivalent to
| "private"" and the first code snippet that shows how you
| can have "private" fields in C++ and the following ones
| that show an "equivalent" implementation in C. So I think
| it's reasonable to infer that the author is talking about
| encapsulation and "the private keyword" as somewhat
| interchangeable (and I don't disagree, for this discussion
| they're practically the same). Not only that, but the
| points I made are about the implementation shown in the
| article, which is independent of whether the talk is about
| "private equivalency" or "lack of encapsulation": the gist
| of it is "yes, you 'encapsulate' things but not in the same
| way and it comes with disadvantages that aren't really
| there in other languages".
|
| With all that said, I don't think all that condescending
| talk about the lack of reading comprehension or skills,
| without actually going into the arguments themselves, is
| really necessary or positive.
| jasode wrote:
| _> To be clear: you tried to say "C offers
| encapsulation/isolation". People read "This solution is
| equivalent to 'private'", an almost completely unrelated
| statement, and then respond to that.
|
| >That could typically be classified as a "Straw man
| fallacy"[0], but I believe people who do this in many cases
| simply do not have the necessary reading skills _
|
| Fyi... the author's article that this thread is about has
| in bold heading: _"Myth: C has no equivalent to
| "private""_
|
| So, a reasonable interpretation of the text following that
| headline is how to use C Language constructs to dispel that
| myth.
|
| Doesn't seem like "straw man" applies here.
| gjulianm wrote:
| I mean, there's a big header that says "Myth: C has no
| equivalent to "private"" and a code example about the private
| keyword, so that's why I started saying that. Even then, my
| points still apply: this isn't really equivalent to how
| encapsulation works in other languages due to the lack of
| granularity and the "extras" of all the usual language
| behavior that stops being supported.
| dmitrygr wrote:
| >* Completely incompatible with inlining code
|
| You were right 10 years ago. Today we have LTO
| 0xfedbee wrote:
| Great article lelanthran. It's always refreshing to see someone
| going against the popular opinion here and piss everyone off.
| Keep it up!
| simias wrote:
| I see where you're coming from but IMO this "Cheshire cat" idiom
| to hide the implementation details is not exactly like private,
| in fact it can do things that private can't do, and doesn't do
| things private does.
|
| The advantage of hiding your state behind an opaque struct with
| builders and accessors is that you can change the size and layout
| of said struct without it being a breaking API change. The code
| remains binary compatible even, no need for a recompile if you're
| shipping a shared lib. This is something just using private
| members doesn't achieve since with private members the compiler
| still knows and uses the layout of the struct, it just forbids
| access to it.
|
| That's why you can even find C++ libraries use this idiom even
| though C++ obviously has `private`. It's about having a stable,
| opaque API.
|
| On the other hand because of this added indirection, there's
| usually a greater performance hit to accessing these opaque
| structs since code can't be inlined. With private since the
| compiler can still see inside the struct, it's able to more
| aggressively optimize the code. You can also store the objects
| directly on the stack without requiring malloc.
|
| IMO the right way to have private members in C structs is... to
| document that members shouldn't be touched directly, perhaps
| using a special naming convention or embedding the publicly-
| accessible members in a dedicated sub-struct to prevent
| confusion.
| Joker_vD wrote:
| > IMO the right way to have private members in C structs is...
| to document that members shouldn't be touched directly, perhaps
| using a special naming convention or embedding the publicly-
| accessible members in a dedicated sub-struct to prevent
| confusion.
|
| Reminds me of that one time when glibc broke the whole of
| Debian for s390 architecture by changing the fields in the
| jmp_buf struct (which is public): [0].
|
| [0] https://lwn.net/Articles/605607/
| bluetomcat wrote:
| To achieve a reasonable level of encapsulation in C, a header
| file must be seen as a public-only interface. It should declare
| only the structs that are relevant for the user of the module.
| If that's "struct my_module_handle { ... }", declare it and
| document the corresponding accessor and modifier functions.
| Everything else must reside in the C source file with internal
| linkage (static storage class). The whole source file is your
| implementation.
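| A single-file sketch of that layout, with hypothetical names
| (only struct my_module_handle comes from the comment above; the
| functions and fields are assumptions). Everything marked static
| has internal linkage and cannot be named from another
| translation unit.

```c
#include <assert.h>

/* ---- my_module.h (hypothetical): only what users need ---- */
struct my_module_handle {
    int total;   /* documented as read-only; use the functions below */
};
void my_module_init(struct my_module_handle *h);
void my_module_add(struct my_module_handle *h, int amount);
int  my_module_call_count(void);

/* ---- my_module.c (hypothetical): everything else is static ---- */
static int call_count;                 /* internal linkage */

static int clamp_amount(int amount)    /* helper with internal linkage */
{
    if (amount < 0)
        return 0;
    if (amount > 100)
        return 100;
    return amount;
}

void my_module_init(struct my_module_handle *h) { h->total = 0; }

void my_module_add(struct my_module_handle *h, int amount)
{
    call_count++;
    h->total += clamp_amount(amount);
}

int my_module_call_count(void) { return call_count; }

/* ---- caller ---- */
int my_module_demo(void)
{
    struct my_module_handle h;
    my_module_init(&h);
    my_module_add(&h, 30);
    my_module_add(&h, 500);   /* clamped to 100 */
    my_module_add(&h, -7);    /* clamped to 0 */
    assert(h.total == 130);
    return h.total;
}
```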
|
| There is an anti-pattern where header files are used for all
| the declarations needed internally by the source file.
| Including (pasting verbatim with the preprocessor) that file
| from another module would bring in all the unnecessary
| declarations.
| hbossy wrote:
| This is how it's supposed to be done but you always end-up
| moving them to header just to make writing unit tests less
| painful.
| 10000truths wrote:
| This is a smell. Your unit tests should not have to rely on
| internal implementation details.
| icedchai wrote:
| Are unit tests common in C? In the mid-2000's, I worked
| on an "enterprise" system, written in C and C++. There
| were about 300,000 lines of code, maybe 10 tests. This
| thing was the core of a billion dollar business
| gpderetta wrote:
| And in the worst case you can have module-private
| headers. No need to pollute your interface.
| coldtea wrote:
| They don't have to and shouldn't, but it's convenient.
| That's the parent's point ("to make writing unit tests
| less painful").
| menaerus wrote:
| Opaque pointers usually impose the restriction on the API
| such that in order to use the handle one has to dynamically
| allocate the object on heap. That's a quite unfortunate
| tradeoff IMO.
| cozzyd wrote:
| there are ways around this, if VLAs are allowed.
|
|     // in <opaque_foo.h>
|     typedef struct opaque_foo opaque_foo_t;
|     size_t opaque_foo_sz(void);
|     void opaque_foo_init(opaque_foo_t* foo);
|
|     // in your code, which you could write a helper macro for
|     // if you were so inclined
|     char opaque_foo_mem[opaque_foo_sz()];
|     opaque_foo_t * my_foo = (opaque_foo_t*) opaque_foo_mem;
|     opaque_foo_init(my_foo);
| kopecs wrote:
| Doesn't this violate strict aliasing?
| cozzyd wrote:
| yes, though you can fix that with compiler flags (or,
| #pragma if you want strict aliasing elsewhere in your
| code)
|
| alternatively, gcc supports VLAs in unions, but I don't
| think clang does, but that makes it extra annoying to do.
|
| edit: apparently you can probably apply the may_alias
| attribute to the type? Or you could try using
| transparent_union. No idea if clang supports either...
| jenadine wrote:
| And alignment?
| cozzyd wrote:
| yes, you may need an alignas depending on platform
| (though you probably want it even if unaligned access is
| supported).
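| One possible variation that sidesteps both the VLA and the
| aliasing question, sketched under assumptions: the header
| publishes a fixed-size, aligned byte buffer (OPAQUE_FOO_SIZE is
| a made-up constant that becomes part of the ABI), and the
| implementation moves its private struct in and out with memcpy
| instead of casting. The field names a and b are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <string.h>

/* ---- opaque_foo.h (hypothetical) ---- */
#define OPAQUE_FOO_SIZE 16   /* assumed upper bound on the private size */

typedef struct {
    /* alignas only matters if the implementation ever casts instead
       of copying; it is kept here as a precaution */
    alignas(max_align_t) unsigned char bytes[OPAQUE_FOO_SIZE];
} opaque_foo_t;

void opaque_foo_init(opaque_foo_t *foo, int a, int b);
int  opaque_foo_sum(const opaque_foo_t *foo);

/* ---- opaque_foo.c (hypothetical) ---- */
struct foo_impl {
    int a;
    int b;
};

static_assert(sizeof(struct foo_impl) <= OPAQUE_FOO_SIZE,
              "OPAQUE_FOO_SIZE is too small for struct foo_impl");

void opaque_foo_init(opaque_foo_t *foo, int a, int b)
{
    struct foo_impl impl = { a, b };
    memcpy(foo->bytes, &impl, sizeof impl);   /* no aliasing cast needed */
}

int opaque_foo_sum(const opaque_foo_t *foo)
{
    struct foo_impl impl;
    memcpy(&impl, foo->bytes, sizeof impl);
    return impl.a + impl.b;
}

/* ---- caller ---- */
int opaque_foo_demo(void)
{
    opaque_foo_t foo;            /* on the stack: no malloc, no VLA */
    opaque_foo_init(&foo, 40, 2);
    opaque_foo_t copy = foo;     /* copying by value works too */
    return opaque_foo_sum(&copy);
}
```

| The trade-off is that the buffer size can no longer grow without
| breaking callers, which is exactly what the fully opaque pointer
| avoids.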
| simias wrote:
| I think what you say makes complete sense at module-level (as
| in, for a standalone lib for instance) but I never bother
| segregating things internally within a lib/module/exe and
| rely on good documentation and coding practices to avoid
| having member mutations all over the place.
|
| If I code in Rust or C++ I can use namespacing and
| public/private to give every single object in the codebase a
| clean interface, but in C doing that is just frustrating, not
| to mention potentially inefficient.
| Dwedit wrote:
| Link-time optimization means that you're probably not going to
| take that much of a performance hit.
|
| But yes, opaque structs do enforce that it will be treated as a
| plain pointer, and the compiler (usually) cannot treat it as an
| aggregate of variables.
| Athas wrote:
| If a linker did that, changing the layout would be an ABI-
| breaking change. I think this opaque struct design is most
| common for dynamically loaded libraries, where link time
| optimisation does not occur (unless dynamic linkers got a lot
| more fancy recently).
| zokier wrote:
| > The code remains binary compatible even, no need for a
| recompile if you're shipping a shared lib. This is something
| just using private members doesn't achieve since with private
| members the compiler still knows and uses the layout of the
| struct, it just forbids access to it.
|
| There is somewhat common PIMPL idiom to work around the binary
| compat issue. Iirc there were some macros floating around to
| make it easier to manage.
| maleldil wrote:
| What do the macros do? Isn't this as easy as forward
| declaring the impl class, adding std::unique_ptr<Impl> as a
| private field and have public methods refer to the field? I'm
| struggling to understand why macros would help here.
| comex wrote:
| Perhaps the macros help you forward methods on the outer
| class to methods on the impl class? While your approach of
| having public methods refer to the field also works, it's
| nice to have public and private methods in the same place
| (the impl class's definition) and using the same syntax
| (neither having to go through the impl field).
| saghm wrote:
| I'm also only familiar with this idiom in C++, but based on
| the description in the parent comment, I suspect that this
| is sometimes used in C too, in which case you obviously
| can't use unique_ptr or private fields; maybe macros might
| be a way to avoid having to write a bunch of boilerplate to
| achieve a similar effect?
| c-linkage wrote:
| I like the way that Windows does it, where they have as the
| first element of the struct a double-word size (dwSize) element
| that records in 32-bits the size of the structure. The size
| essentially acts as a version identifier, as long as you never
| rearrange the fields and only append fields for new versions.
| The opaque functions test the value of the dwSize element to
| see what actions can be performed on the object.
|
| The code that _you_ develop can still access the member fields
| directly, and those accesses can be inlined and optimized
| aggressively by the compiler.
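| A rough sketch of that size-as-version scheme, assuming a
| hypothetical options struct (this is not an actual Windows API;
| only the dwSize name comes from the comment above). Version 2
| appends a field, and the library fills in a default when an
| older, smaller struct is passed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Version 1 of a hypothetical options struct. */
struct options_v1 {
    uint32_t dwSize;   /* caller sets this to sizeof its own struct */
    int width;
};

/* Version 2 only appends a field; nothing is rearranged. */
struct options_v2 {
    uint32_t dwSize;
    int width;
    int height;
};

/* Library function built against v2 that still accepts v1 callers. */
int options_area(const void *opts)
{
    uint32_t size;
    memcpy(&size, opts, sizeof size);     /* dwSize is always first */
    if (size < sizeof(struct options_v1))
        return -1;                        /* not a layout we understand */

    struct options_v2 o = { 0, 0, 1 };    /* default height for v1 callers */
    memcpy(&o, opts, size < sizeof o ? size : sizeof o);
    return o.width * o.height;
}

int options_demo_old_caller(void)
{
    struct options_v1 a = { sizeof(struct options_v1), 7 };
    return options_area(&a);
}

int options_demo_new_caller(void)
{
    struct options_v2 b = { sizeof(struct options_v2), 7, 6 };
    return options_area(&b);
}
```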
| bluejekyll wrote:
| This implies a branch statement in every function call,
| doesn't it?
| veltas wrote:
| There's zero cost abstractions and then there's zero
| features abstractions.
| speed_spread wrote:
| The cost of that branch is insignificant next to that of
| the syscall you opted to make.
| nvy wrote:
| Jumping through all these hoops just to, what? Avoid the garbage
| collector?
|
| It's okay to admit C has shortcomings.
| zer8k wrote:
| The OP article seems to be someone trying to cram modern
| concepts into a language that explicitly rejects them. A
| symptom that the OP should've probably just used another
| language.
| lelanthran wrote:
| > It's okay to admit C has shortcomings.
|
| I did that, didn't I?
|
| >> Just to be clear, C is an old language lacking many, many,
| many modern features. One of the features it does not lack is
| encapsulation and isolation.
| bjourne wrote:
| Sure, but it still goes against the grain of the language. You
| could write a similar article explaining why everyone who thinks
| C doesn't have automatic memory management is wrong. Opaque
| structs are relatively annoying to work with especially if
| sibling modules in the same package have good reasons to
| manipulate the private parts directly. It is not nearly as
| convenient as it is in a language with better support for
| encapsulation (e.g. Java). Most of the C code I write does not
| encapsulate anything. It's not worth the bother. Especially not
| when unit-testing for which encapsulation would force you to
| write lots of redundant getters and setters just for the unit
| tests themselves. My view is that you simply shouldn't use C if
| you need encapsulation.
| jasode wrote:
| To the author : your explanation can be interpreted as "correct"
| but also be aware that -- for some readers -- your argument is a
| variation of the Turing Tarpit:
| https://en.wikipedia.org/wiki/Turing_tarpit
|
| In other words, the 2 different possible receptions to your post:
|
| - YES, file-level modularity with opaque structs is _equivalent_
| to class private members --> for those mindsets already
| sympathetic to C Language
|
| - NO, using file-scoping rules and structs is _not equivalent_ to
| class private members because it's a bunch of extra ceremonial
| syntax to implement a workaround. (The "Turing Tarpit"). It's
| using the opaque struct as a "design pattern" and as Peter Norvig
| famously said, _"Design patterns are bug reports against your
| programming language."_
| alpaca128 wrote:
| I don't think boilerplate code has much to do with Turing
| Tarpits. It might be annoying but not Brainfuck-grade insanity.
| colonwqbang wrote:
| I would argue that C++ "pimpl" design pattern brings more
| "ceremonial syntax" than the C equivalent.
|
| C++ style (without "pimpl") requires recompilation of the whole
| dependent tree when adding a new private member function. It's
| encapsulation only in a formal sense
| ReflectedImage wrote:
| Actually, you can just have a public version of a struct and a
| private version of a struct with more fields.
|
| You can give callers the public version and then cast it to the
| private version for internal usage.
| bluejekyll wrote:
| These will have different sizes though, so it's only safe to
| cast when used off the heap and where its memory is allocated
| with the larger variant, right?
|
| I guess my point is that there's a whole bunch of caveats in
| regards to safety that need to be considered in your solution.
| zh3 wrote:
| That seemed a complicated way of reminding us that C++ (at least
| originally) is/was compiled down to pure C and thus can't do
| anything that C can't do.
| 3cats-in-a-coat wrote:
| So let's do polymorphic virtual methods now.
| adwn wrote:
| > _Hell, they cannot even malloc() their own StringBuilder
| instance, because even the size of the StringBuilder is hidden.
| They have to use creation and deletion functions provided in the
| implementation as specified in the interface._
|
| And you've just made it impossible for the users of your
| StringBuilder to pass it around by-value. Every instance has to
| be malloc'ed by your library, even though it's just a tiny, word-
| sized struct. Awesome! And each access needs to go through an
| additional pointer indirection. All this just to pretend that C
| supports proper encapsulation. Hooray!
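|
| For readers who have not seen the pattern under discussion, a
| rough sketch of an opaque StringBuilder follows. The function
| names (sb_create, sb_append, sb_cstr, sb_destroy) and the field
| layout are assumptions made for illustration, not the article's
| actual code. The two halves would normally be split into a header
| and a .c file; they share one listing here only so it compiles on
| its own.

```c
#include <stdlib.h>
#include <string.h>

/* --- interface (header): the struct body is not visible to callers --- */
typedef struct StringBuilder StringBuilder;

StringBuilder *sb_create(void);
int sb_append(StringBuilder *sb, const char *text);
const char *sb_cstr(const StringBuilder *sb);
void sb_destroy(StringBuilder *sb);

/* --- implementation (.c file): the only place the layout is known --- */
struct StringBuilder {
    char *data;   /* NUL-terminated buffer */
    size_t len;   /* characters stored, excluding the NUL */
    size_t cap;   /* bytes allocated for data */
};

StringBuilder *sb_create(void)
{
    StringBuilder *sb = malloc(sizeof *sb);
    if (sb == NULL)
        return NULL;
    sb->cap = 16;
    sb->len = 0;
    sb->data = malloc(sb->cap);
    if (sb->data == NULL) {
        free(sb);
        return NULL;
    }
    sb->data[0] = '\0';
    return sb;
}

int sb_append(StringBuilder *sb, const char *text)
{
    size_t add = strlen(text);
    if (sb->len + add + 1 > sb->cap) {
        /* Double the capacity until the new text and its NUL fit. */
        size_t new_cap = sb->cap;
        while (new_cap < sb->len + add + 1)
            new_cap *= 2;
        char *grown = realloc(sb->data, new_cap);
        if (grown == NULL)
            return -1;
        sb->data = grown;
        sb->cap = new_cap;
    }
    memcpy(sb->data + sb->len, text, add + 1);  /* copies the NUL too */
    sb->len += add;
    return 0;
}

const char *sb_cstr(const StringBuilder *sb)
{
    return sb->data;
}

void sb_destroy(StringBuilder *sb)
{
    if (sb != NULL) {
        free(sb->data);
        free(sb);
    }
}
```

| Because callers only ever see the incomplete type, they can hold
| a StringBuilder * but cannot declare a StringBuilder by value or
| take its sizeof, which is the trade-off being criticised here.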
|
| I'm sorry that I'm targeting your blog post specifically, but
| it's just so stereotypical of C proponents, that can't (or
| won't?) realize that their favorite programming language is
| inherently limiting and limited along several very important
| dimensions. It makes me think that although some of them might be
| excellent _programmers_ , they make for terrible _software
| engineers_.
| deadbeeves wrote:
| What do you mean? The fact that C can do this is an example of
| how it's _not_ limited. A lot of other languages instead
| _require_ you to allocate everything in the heap and there's
| no possibility of passing things by copy, or of not accessing
| things through anything other than a pointer. C at least is
| capable of allocating and accessing at least some things
| directly on the stack.
| adwn wrote:
| > _The fact that C can do this is an example of how it's not
| limited._
|
| No, C is limited because it's mutually exclusive: _either_
| encapsulation, _or_ zero overhead by-value passing. Other
| languages, like C++ or Rust, allow _both at the same time_.
| rightbyte wrote:
| You can do dummy defines of structs with the same size as
| the real one if you want the struct on the stack and
| encapsulation.
| vore wrote:
| At that point you might as well just name your fields
| like DONTTOUCHTHIS_foo if you're having to keep the
| private definition with fields in sync with the opaque
| public definition (and making sure the alignment and
| sizing are always in sync with the private one...)
| e4m2 wrote:
| I've done this before, works quite well actually, but
| isn't very popular for some reason.
|
| > You can do dummy defines of structs with the same size
|
| Don't forget alignment. The general pattern is:
| https://godbolt.org/z/6je9Yb3rf.
| sfpotter wrote:
| Maybe I'm missing the idea, but I'm not sure how this
| idea is supposed to work without using something like
| alloca.
|
| If you have:
|
|     struct foo;
|
| in foo.h and:
|
|     struct foo {
|         int a;
|         double x;
|         ...
|     };
|
| in foo.c, you won't have sizeof(struct foo) available
| from bar.c, so your construction won't work in bar.c. You
| could define a function:
|
|     size_t sizeof_foo(void);
|
| which just returns sizeof(struct foo) from inside foo.c,
| but since this size is now only known at runtime, you'll
| need to resort to alloca or VLAs...
| gpderetta wrote:
| You define public_foo in the header with the public
| members and an appropriately sized byte array for the
| private members.
|
| In the .c file you define private_foo, same as public_foo
| except that the byte array is replaced with the actual
| members.
|
| You static assert that size and alignment match and cast
| at function boundaries.
|
| You hope not to have violated strict aliasing rules.
|
| This is not completely unlike type erasure with small
| buffer optimization done by some c++ classes like
| std::function.
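|
| The steps above can be sketched roughly as follows. The field
| names, the 16-byte opaque size and the functions foo_init and
| foo_sum are all made up for illustration. One deliberate
| deviation: instead of casting at the function boundary, this
| version copies through memcpy, which sidesteps the strict
| aliasing question at the cost of a copy per call.

```c
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

/* --- header: public member plus an opaque, suitably aligned blob --- */
#define FOO_OPAQUE_BYTES 16   /* assumed size of the private members */

struct public_foo {
    int id;                                             /* public */
    _Alignas(double) unsigned char opaque[FOO_OPAQUE_BYTES];
};

void foo_init(struct public_foo *pub, int id, double x, double y);
double foo_sum(const struct public_foo *pub);

/* --- .c file: same layout, with the blob replaced by real members --- */
struct private_foo {
    int id;
    double x;   /* private */
    double y;   /* private */
};

/* Fail the build, not the program, if the two layouts ever drift apart. */
_Static_assert(sizeof(struct public_foo) == sizeof(struct private_foo),
               "public_foo and private_foo differ in size");
_Static_assert(_Alignof(struct public_foo) == _Alignof(struct private_foo),
               "public_foo and private_foo differ in alignment");
_Static_assert(offsetof(struct public_foo, opaque) ==
               offsetof(struct private_foo, x),
               "private members do not start where the opaque blob does");

void foo_init(struct public_foo *pub, int id, double x, double y)
{
    struct private_foo priv = { .id = id, .x = x, .y = y };
    memcpy(pub, &priv, sizeof priv);
}

double foo_sum(const struct public_foo *pub)
{
    struct private_foo priv;
    memcpy(&priv, pub, sizeof priv);
    return priv.x + priv.y;
}
```

| With this layout a caller can declare struct public_foo on the
| stack, copy it by value, or put it in an array, while the private
| members stay unnamed outside the .c file.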
| sfpotter wrote:
| OK, I understand what you're doing now. Thanks for
| clarifying.
|
| The big downside here is that you're leaking the size of
| the details into your ABI which wouldn't happen with a
| fully opaque type... I could see some uses for it but
| haven't felt a strong enough need to reach for it before,
| although it has occurred to me.
| gpderetta wrote:
| Of course there is no way around that. A partial
| mitigation is the same as done for network protocols:
| reserve some space for future extensions.
| deadbeeves wrote:
| Eh. Arguably in C++ you don't get proper encapsulation just
| with private members, because changing the layout for those
| members changes the ABI.
| adwn wrote:
| Don't let Perfect be the enemy of Good. The quality of
| encapsulation you can achieve in C++ is miles ahead of
| that of C, even if it isn't all that could ever be.
| pjmlp wrote:
| For example, user defined types that behave like built-
| ins, while preserving invariants.
|
| Especially great in IoT, instead of macros directly accessing
| IO ports.
| dasyatidprime wrote:
| There's a third vertex to the triangle here: C++ and Rust
| allow both at the same time by dropping separate
| compilation. More thoroughly so in Rust than in C++, but
| header-focused libraries move C++ further toward whole-
| program compilation compared to C (maybe you could call it
| "large-overlapping-chunks-of-program compilation").
| jll29 wrote:
| Very valid point. I keep fond memories of Modula-2, which
| has DEFINITION modules and IMPLEMENTATION modules, such
| that you can compile the former and the latter separately.
|
| In Modula-2, I can specify an API in its DEFINITION
| module, and after compiling it, client applications can
| use such an API without the implementation being ready
| yet, and still I can compile the client and check if it
| is free of syntax errors.
| _gabe_ wrote:
| > And you've just made it impossible for the users of your
| StringBuilder to pass it around by-value. Every instance has to
| be malloc'ed by your library, even though it's just a tiny,
| word-sized struct. Awesome! And each access needs to go through
| an additional pointer indirection.
|
| ...and?
|
| I'm guessing the implication here is that you'll trash
| performance by doing this. How can you assume that? The thing
| about optimizing code is, you don't know where your hot paths
| are until you _profile_ your code. And, the one thing
| experience has taught me, my _intuitions_ about what the hot
| spots will be _rarely_ match reality. There's nothing wrong
| with that, complex systems are complex, and we have incredible
| profiling tools to eat through that complexity and highlight
| the hot spots for us.
|
| Now, I know you'll probably go on about a death by a thousand
| cuts etc. The thing is, well-crafted modules typically don't
| encapsulate on a fine-grained level. You usually have larger
| _systems_ that hide details. These systems are usually used a
| fraction of the time the rest of your program is. So the
| indirections end up usually being a very insignificant cost to
| the overall program.
|
| And if you _are_ coding in such a way that copying a string
| builder by value and/or the indirection imposed by
| encapsulating that information is a bottleneck, I highly doubt
| that "fixing" this by copying by value and/or removing the
| indirection will suddenly make your entire program performant.
|
| > It makes me think that although some of them might be
| excellent programmers, they make for terrible software
| engineers.
|
| You haven't actually highlighted any issues here and then go on
| to finish your argument with an ad hominem. Instead of
| attacking the competence of C programmers, you should
| illustrate the actual real world impact that this design
| philosophy results in. I know plenty of _really_ slow Java
| libraries, and plenty of _really_ fast C libraries that use
| this method of encapsulation. So if your argument is that using
| this method trashes performance, it's a poor argument that
| doesn't have many real world examples (unless you know of some
| off the top of your head).
| adwn wrote:
| > _The thing about optimizing code is, you don't know where
| your hot paths are until you profile your code._
|
| Absolutely! So, you profile your program, and it turns out
| that 95% of the runtime is caused by malloc/free in tight
| loops, which you can't get rid of, because they're hidden
| behind an API which had to choose between encapsulation and
| efficiency.
|
| > _And if you are coding in such a way that copying a string
| builder by value and/or the indirection imposed by
| encapsulating that information is a bottleneck [...]_
|
| You don't seem to realize that the _StringBuilder_ was just
| an example to illustrate this style of encapsulation?
| Oftentimes you want to encapsulate actual "value structs",
| where it is sensible to create millions of them in an array.
| In C, you're forced to choose between following good software
| engineering practices (=> encapsulation) and getting good
| performance.
| _gabe_ wrote:
| > So, you profile your program, and it turns out that 95%
| of the runtime is caused by malloc/free in tight loops,
| which you can't get rid of, because they're hidden behind
| an API which had to choose between encapsulation and
| efficiency.
|
| I have literally never run into a library that was written
| so badly that using the library encouraged you to use the
| API to create millions of small objects. That's what I'm
| saying. Sure, this _can_ happen, but in _reality_ I've
| never seen it. Can you show me where this hypothetical
| scenario is occurring and trashing people's performance? We
| probably want to avoid using those libraries.
|
| Instead, I usually see encapsulation used like it is in
| GLFW, or libcurl, or stbi. The encapsulation covers
| _systems_ and not tiny objects, which encourages the user
| of the library to not make API calls millions of times or
| construct millions of tiny objects.
|
| > You don't seem to realize that the StringBuilder was just
| an example to illustrate this style of encapsulation?
| Oftentimes you want to encapsulate actual "value structs",
| where it is sensible to create millions of them in an
| array.
|
| I did realize this. Encapsulation is typically useful on
| larger systems. Once you get to the point of millions of
| objects, you usually have a larger system managing those
| millions of objects. And ideally, those millions of objects
| should be POD. If they're POD, encapsulating the data makes
| no sense at that point, because it makes more sense to
| encapsulate whatever is managing that data.
|
| > In C, you're forced to choose between following good
| software engineering practices (=> encapsulation) and
| getting good performance.
|
| This is a false dichotomy. There are plenty of large C
| projects that follow good software engineering practices
| (which is entirely subjective, what is "good"?). Look at
| any OS kernel, or the libraries I mentioned above.
|
| So, once again, I'm curious if you know of any C libraries
| (ab)using encapsulation in the hypothetical scenario you've
| laid out. If there aren't any libraries that do this, then
| this is a non-issue and attacking the competence of C
| developers is entirely unwarranted since you've built up a
| strawman that doesn't exist in reality.
| xbar wrote:
| The best software engineers I ever worked with were all C
| programmers.
|
| Further, and quite separate, the term "software engineer" was
| coined on behalf of assembly programmers, whose language lacks
| even further features.
|
| Finally, I am not sure what you mean when you say they make bad
| software engineers. Perhaps we have different definitions of
| software engineer.
| pjmlp wrote:
| The term Software Engineer is something that in plenty of
| countries is validated by the engineering organization and is
| a legal title, not something one feels like calling
| themselves.
|
| Which also validates that any university teaching software
| engineering has a certain quality level, and portfolio of
| lectures, to create a general background across all subjects
| of engineering practices besides writing code.
| adwn wrote:
| If someone doesn't recognize when and how their tools limit
| the quality of their work, they can't be good craftsmen.
| There might still be good reasons to use those tools (e.g.,
| no better alternatives, or an existing ecosystem), but if you
| don't realize that your programming language is fundamentally
| limiting in ways that other languages are not, then you'll
| never know how to build better software.
| c-linkage wrote:
| Passing around potentially large objects by value is wasteful
| and prone to move / copy semantics.
|
| Most languages pass objects by reference (C# and Java chief
| among them).
|
| Still, if you really want to pass by value -- even though
| you'll likely end up with pointer ownership problems -- you
| just add a few functions to the API to do so.
|
| Creating an opaque type on the stack _can_ be done, you just
| need a little more work.
| oneeyedpigeon wrote:
| Great article, OP, but you forgot to populate the href on your
| "swig" link.
| lelanthran wrote:
| Thanks, fixed.
| mojosam wrote:
| I think there's a better example, but whether it applies
| depends on one of two major divisions of C code: that designed to
| run on systems with a MMU (as typically used for Linux and other
| large OSes) -- where virtual memory makes dynamic memory
| allocation practical -- and those without -- which today is
| primarily the very large world of embedded devices.
|
| For the latter, the industry best practice is to avoid malloc(),
| except maybe at init time, and instead allocate memory
| statically. And in that use case, you break your code into
| modules, which can contain private data, public data, private
| functions, and public functions.
|
| In other words, building an app out of C modules is a lot like
| building an app in a more modern language just using static
| classes, with no instantiation. And in that design pattern -- which
| is extremely common in the embedded world -- we have a direct
| equivalent to the "private" qualifier, which is "static", which
| restricts the rest of the app from accessing so-marked file-scope
| variables and functions.
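|
| As a rough illustration of "static" standing in for "private",
| here is a single-instance module of the kind described above.
| The ticker_* names and the 1000-tick limit are invented for the
| example; the point is only that tick_count and clamp cannot be
| named from any other source file, and that no malloc() is
| involved.

```c
#include <stdint.h>

/* Private data: file scope + static, so other files cannot link to it. */
static uint32_t tick_count;

/* Private helper: static gives the function internal linkage too. */
static uint32_t clamp(uint32_t v, uint32_t max)
{
    return v > max ? max : v;
}

/* Public functions: these would be declared in the module's header. */
void ticker_reset(void)
{
    tick_count = 0;
}

void ticker_advance(uint32_t n)
{
    /* Saturate at an arbitrary limit of 1000 ticks. */
    tick_count = clamp(tick_count + n, 1000u);
}

uint32_t ticker_read(void)
{
    return tick_count;
}
```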
|
| Where this breaks down -- as always with C -- is when you need
| multiple instantiations of a module, which modern programming
| languages refer to as an object. The closest we can get in C is
| to pass the module's public functions a struct with some sort of
| data structure containing the object's non-static data. As the
| author explains, there are standard ways to make that data structure
| opaque to calling code, but those are definitely workarounds to
| language shortcomings.
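|
| One way the multiple-instance case might look without malloc() is
| sketched below: an opaque handle type whose instances come from a
| statically allocated pool. The Led type, the led_* functions and
| the pool size of 4 are assumptions chosen for the example, not
| something taken from the article.

```c
#include <stddef.h>

/* --- interface (header): callers only see an incomplete type --- */
typedef struct Led Led;

Led *led_acquire(int pin);        /* returns NULL when the pool is full */
void led_set(Led *led, int on);
int led_is_on(const Led *led);

/* --- implementation (.c file) --- */
#define LED_POOL_SIZE 4

struct Led {
    int pin;
    int on;
    int in_use;
};

/* Statically allocated instances: sized at build time, private to this file. */
static struct Led led_pool[LED_POOL_SIZE];

Led *led_acquire(int pin)
{
    for (size_t i = 0; i < LED_POOL_SIZE; i++) {
        if (!led_pool[i].in_use) {
            led_pool[i].in_use = 1;
            led_pool[i].pin = pin;
            led_pool[i].on = 0;
            return &led_pool[i];
        }
    }
    return NULL;  /* pool exhausted */
}

void led_set(Led *led, int on)
{
    led->on = (on != 0);
}

int led_is_on(const Led *led)
{
    return led->on;
}
```

| Each handle carries the per-instance data that a single-instance
| module would have kept in static variables, and the fixed pool
| keeps memory use known at link time.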
|
| But the bottom line is that those language shortcomings -- the
| lack of objects and a private qualifier for its members -- are
| only shortcomings if you need those features, and in the embedded
| world, most applications don't, they only require all the
| advantages offered by C. So as always, this is about picking the
| right language for the project, there's no one size fits all.
| ape4 wrote:
| I've seen code that prefixes private members with an underscore
| and adds a comment saying that it's private. Not saying that's
| great but it does send a message.
___________________________________________________________________
(page generated 2023-07-17 23:00 UTC)