[HN Gopher] Exploring Polymorphism in C: Lessons from Linux and ...
___________________________________________________________________
Exploring Polymorphism in C: Lessons from Linux and FFmpeg's Code
Design (2019)
Author : dreampeppers99
Score : 208 points
Date : 2025-03-06 14:23 UTC (4 days ago)
(HTM) web link (leandromoreira.com)
(TXT) w3m dump (leandromoreira.com)
| KerrAvon wrote:
| > The interface type in golang is much more powerful than Java's
| similar construct because its definition is totally disconnected
| from the implementation and vice versa. We could even make each
| codec a ReadWriter and use it all around.
|
| This paragraph completely derailed me -- I'm not familiar with
| golang, but `interface` in Java is like `@protocol` in
| Objective-C -- it defines an interface without an implementation
| for the class to implement, decoupling it entirely from the
| implementation. Seems to be exactly the same thing?
| mananaysiempre wrote:
| The difference between Go and Java is that in Go a type need
| not declare its adherence to an interface up front--any type
| that has methods of appropriate names and signatures is
| considered to implement the interface, even if its designers
| were not aware of the interface's existence. (This involves a
| small bit of dynamism in the runtime; easily cached, though, as
| the set of methods of a given type and the set of all
| interfaces are both fixed by the time the program runs.)
| Whether that's a good thing depends on your design
| sensibilities (ETA: nominal vs structural).
| jeroenhd wrote:
| For those wishing Java had a similar feature, there's
| Manifold: https://github.com/manifold-systems/manifold/tree/master/man...
|
| Manifold is a very interesting project that adds a lot of
| useful features to Java (operator overloading, extension
| classes, and a whole bunch more). I don't know if it's smart
| to use it in production code because you basically go from
| writing Java to writing Manifold, but I still think it's a
| fun project to experiment with.
| sitkack wrote:
| http://manifold.systems/
|
| > Manifold is a Java compiler plugin, its features include
| Metaprogramming, Properties, Extension Methods, Operator
| Overloading, Templates, a Preprocessor, and more.
|
| Neat tool. It is like having a programmable compiler built
| into your language.
| kazinator wrote:
| GNU C++ once had this feature; it was called Signatures. It
| was removed, though.
|
| A signature declaration resembled an abstract base class. The
| target class did not have to inherit the signature: just have
| functions with matching names and types.
|
| The user of the class could cast a pointer to an instance of
| the class to a pointer to a compatible signature. Code not
| knowing anything about the class could indirectly call all
| the functions through the signature pointer.
| JavierFlores09 wrote:
| The funny thing about Java is that while its design is to be
| entirely nominally typed, the way it is implemented in the
| JVM is compatible with structural typing, but there are
| artificial limitations set to follow the intended design
| (though of course, if one were to disable these limitations
| then modeled type safety goes out of the window as Java was
| simply not designed to be used that way). One community which
| takes advantage of this fact is the Minecraft modding space,
| as it is the basis[1] of how modding platforms like Fabric
| work.
|
| 1: https://github.com/SpongePowered/Mixin/wiki/Introduction-to-...
| owlstuffing wrote:
| >The difference between Go and Java is that in Go a type need
| not declare its adherence to an interface up front.
|
| Go _can't_ declare adherence up front, and in my view that's
| a problem. Most of the time, explicitly stating your intent
| is best, for both humans reading the code and tools analyzing
| it. That said, structural typing has its moments, like when
| you need type-safe bridging without extra boilerplate.
| duskwuff wrote:
| You can assert that your type implements an interface at
| compile time, though, e.g.
|
|     var _ AssertedInterface = &MyType{}
| relistan wrote:
| One of the main uses for interfaces in Go is defining the
| contract for _your_ dependencies. Rather than saying your
| function takes a socket, if you only ever call Write(), you
| can take a Writer, or define another interface that is only
| the set of functions you need. This is far more powerful
| than declaring that your type implements an interface up
| front. It allows for things like e.g. multiple image
| libraries to implement your interface without knowing it,
| enabling your project to use them interchangeably. And as
| another commenter said, you can have the compiler verify
| your compliance with an interface with a simple (though
| admittedly odd looking) declaration.
| williamdclt wrote:
| > It allows for things like e.g. multiple image libraries
| to implement your interface without knowing it
|
| That virtually never happens. Seriously, what would be
| the odds? It's so much more usual to purposefully
| implement an interface (e.g. a small wrapper around the writer
| thingy that has the expected interface) than to use
| something that happens to fit the expected interface by
| pure chance.
|
| It's not a structural vs nominal problem either:
| TypeScript is structural but has the implements keyword
| so that the interface compliance is checked at
| declaration, not at the point of use. You don't have to
| use it and it will work just like Go, but I found that in
| 99% of cases it's what I want: the whole point of me
| writing this class is because I need an interface
| implementation, might as well enforce it at this point.
| billfruit wrote:
| So golang supports 'duck typing'?
| theLiminator wrote:
| I think in a static context, it's generally referred to as
| structural typing, but yeah.
| williamdclt wrote:
| I don't agree it's a structural VS nominal difference.
| Typescript is structural, but it does have the "implements"
| keyword.
|
| Which makes a million times more sense to me, because
| realistically when do you ever have a structure that usefully
| implements an interface without being aware of it?? The
| common use-case is to implement an existing interface (in
| which case might as well enforce adherence to the interface
| at declaration point), not to plug an implementation into an
| unrelated functionality that happens to expect the right
| interface.
| cognisent wrote:
| TypeScript doesn't require a class to use it, though,
| because it's structurally typed. All that "implements Foo"
| in this example does is make sure that you get a type error
| on the definition of "One" if it doesn't have the members
| of "Foo".
|
| If "Two" didn't have a "name: string" member, then the
| error would be on the call to "test".
|
|     interface Foo { name: string }
|
|     class One implements Foo {
|         constructor(public name: string) {}
|     }
|
|     class Two {
|         constructor(public name: string) {}
|     }
|
|     function test(thing: Foo): void {
|         //...
|     }
|
|     test(new One('joe'));
|     test(new Two('jane'));
| juwjfoobar wrote:
| interfaces in Go are structural. Interfaces in Java are nominal
| and require immediate declaration of intent to implement at
| type definition.
| psychoslave wrote:
| Shouldn't this be named _phenomenal_ rather than
| _structural_? In both cases there is a structure assumed, but
| one is implicitly inferred while the other one is explicitly
| required.
| relistan wrote:
| I think you're making a joke, but in Go you get both. You
| can have the compiler enforce that you implement an
| interface with a simple declaration. Most people do.
| psychoslave wrote:
| No intended joke in that case, but it's nice to have the
| feedback that it might be seen as if it was.
|
| I don't know Go to be frank, just had a very shallow look
| at it once because of an interview, and apart from the big names
| behind it, it didn't shine in any obvious way -- but
| that's also maybe aligned with the "boring" tech label it
| seems associated with (that is, in positive manner for
| those who praise it).
| layer8 wrote:
| The difference is that in Go, an interface is assumed to
| match if the method signatures match. In other words, the
| match is done on the type structure of the interface, hence
| the "structural" designation. Nominal typing, on the other
| hand, considers that interfaces tend to be associated with
| important semantic requirements in addition to the type
| signature, and that mere type-structure matching doesn't at
| all guarantee a semantic match. For that reason, the
| semantics are implicitly bound to the declared name of the
| interface, and the way for an implementation to claim
| conformance to those semantics is to explicitly bind itself
| to that name.
| social_quotient wrote:
| I spend a ton of time in FFmpeg, and I'm still blown away by how
| it uses abstractions to stay modular--especially for a project
| that's been around forever and still feels so relevant. Those
| filtergraphs pulling off polymorphism-like tricks in C? It's such
| an elegant way to manage complex pipelines. e.g.
|
| ffmpeg -i input.wav -filter_complex " [0:a]asplit=2[a1][a2];
| [a1]lowpass=f=500[a1_low]; [a2]highpass=f=500[a2_high];
| [a1_low]volume=0.5[a1_low_vol]; [a2_high]volume=1.5[a2_high_vol];
| [a1_low_vol][a2_high_vol]amix=inputs=2[a_mixed];
| [a_mixed]aecho=0.8:0.9:1000:0.3[a_reverb] " -map "[a_reverb]"
| output.wav
|
| That said, keeping those interfaces clean and consistent as the
| codebase grows (and ages) takes some real dedication.
|
| Also recently joined the mailing lists and it's been awesome to
| get a step closer to the pulse of the project. I recommend it if you
| want to casually get more exposure to the breadth of the project.
|
| https://ffmpeg.org/mailman/listinfo
| MuffinFlavored wrote:
| how similar are the C abstractions in ffmpeg and qemu given
| they were started by the same person?
| variadix wrote:
| I haven't worked with ffmpeg's code, but I have worked with
| QEMU. QEMU has a lot of OOP (implemented in C obviously) that
| is supported by macros and GCC extensions. I definitely think
| it would have been better (and the code would be easier to
| work with) to use C++ rather than roll your own object model
| in C, but QEMU is quite old so it's somewhat understandable.
| I say that as someone who mostly writes C and generally
| doesn't like using C++.
| shmerl wrote:
| What's the reason for ffmpeg to use C, also historic?
| defrost wrote:
| Fabrice _also_ wrote the Tiny C compiler, so very much
| his language of choice ..
|
| For those used to the language it was seen as "lighter"
| and easier to add OO like abstractions to your C usage
| than bog down in the weight and inconsistencies of
| (early) C++
|
| https://bellard.org/
|
| https://en.wikipedia.org/wiki/Fabrice_Bellard
| lelanthran wrote:
| > weight and inconsistencies of (early) C++
|
| Since very little is ever removed from C++, all the
| inconsistencies in C++ are still there.
| maccard wrote:
| Every language has inconsistencies, and C is no stranger
| to that. Much of C++'s baggage is due to C and you carry
| the same weight. That's not to say that initialization
| isn't broken in C++, but just like many features in many
| languages (off the top of my head in C - strcpy, sprintf,
| ctime are like hand grenades with the pin pre pulled for
| you) don't use them. There's a subset of C++17 that to me
| solves so many issues with C and C++ that it just makes
| sense to use. An example from a codebase I spend a lot of
| time in is
|
|     int val;
|     bool valueSet = getFoo(&val);
|     if (valueSet) {}
|     printf("%d", val); // oops
|
| The bug can be avoided entirely with C++
|
|     // if you control getFoo this could be a reference, which
|     // makes the null check in getFoo a compile time check
|     if (int val; getFoo(&val)) {}
|     printf("%d", val); // this doesn't compile.
| ablob wrote:
| You could write this in C, no?
|
|     {
|         int val;
|         if (getFoo(&val)) { ... }
|     }
|
| Both ways of expressing this are weird, but stating that
| this can't be achieved with C is dishonest in my opinion.
| i_am_a_peasant wrote:
| it's not _that_ weird to explicitly limit the scope of
| certain variables that logically belong together.
|
| But i agree the C++ if(init;cond) thing was new to me.
| maccard wrote:
| If I reformat this,
|
|     {
|         int val;
|         if (getFoo(&val)) {
|         }
|         printf("%d", val);
|     }
|
| The bug is still possible, as you've introduced an extra
| scope that doesn't exist in the C++ version.
|
| Also, this was one example. There are plenty of other
| examples.
| ho_schi wrote:
| For C users. And C++ users:
|
| In C++ we can declare variable in the while or if
| statement:
|
| https://en.cppreference.com/w/cpp/language/while
|
| https://en.cppreference.com/w/cpp/language/if
|
| Its value is the value of the decision. This is not
| possible with C [1].
|
| Since C++17 the if condition can contain an initializer:
| _Ctrl+F_ if statements with initializer
|
| https://en.cppreference.com/w/cpp/language/if
|
| Which sounds like the same? Now you can declare a
| variable and its value is not directly evaluated, you also
| can compare it in a condition. I think both are neat
| features of C++, without adding complexity.
|
| [1] Also not possible with Java.
| trelane wrote:
| Declaring a variable in a loop or if statement is
| supported since C99: https://en.wikipedia.org/wiki/C99
|
| Also in Java: https://www.geeksforgeeks.org/for-loop-java-important-points...
| tialaramex wrote:
| Variables like "valueSet" scream out that the language
| lacks a Maybe type instead. One of the worst things about
| C++ is that it's content to basically not bother
| improving on the C type system.
| rowanG077 wrote:
| C++ has a maybe type. It's called std::optional.
| uecker wrote:
| Here is my experimental maybe type for C:
| https://godbolt.org/z/YxnsY7Ted
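
The "valueSet" complaint above can be made concrete in plain C. What follows is a minimal sketch written independently of the linked experiment: `struct maybe_int`, `get_foo`, and `maybe_int_or` are hypothetical names, and `get_foo` is only a stand-in for the thread's `getFoo`. The idea is that the value and its validity flag travel in one struct, so a caller cannot receive one without the other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical "maybe int": the flag and the value are returned together. */
struct maybe_int {
    bool has_value;
    int  value;
};

struct maybe_int some_int(int v)
{
    return (struct maybe_int){ .has_value = true, .value = v };
}

struct maybe_int no_int(void)
{
    return (struct maybe_int){ .has_value = false, .value = 0 };
}

/* Stand-in for getFoo(): succeeds only for non-negative keys (an assumption
 * made up for this example). */
struct maybe_int get_foo(int key)
{
    return key >= 0 ? some_int(key * 2) : no_int();
}

/* Accessor that forces the caller to choose a fallback for the empty case. */
int maybe_int_or(struct maybe_int m, int fallback)
{
    return m.has_value ? m.value : fallback;
}
```

This is a convention rather than a guarantee: nothing in C stops a caller from reading `.value` without checking `.has_value`, which is part of the point being argued in this subthread.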
| maccard wrote:
| C++ has optional, but I wanted to demonstrate that you
| could wrap a C API in a safer more ergonomic way.
|
| If you rewrote it in a more modern way and changed the
| API
|
|     std::optional<int> getFoo();
|     if (auto val = getFoo()) {}
|
| There are lots of improvements over C's type system -
| std::array, span, view, hell even a _string_ class
| uecker wrote:
| This will be in C2Y and is already supported by GCC 15:
| https://godbolt.org/z/szb5bovxq
| emmelaich wrote:
| Much easier to link / load into other language binaries
| surely.
| adastra22 wrote:
| extern "C" works just fine.
| josefx wrote:
| Only if your entire API doesn't contain any C++.
| pjmlp wrote:
| C is also C++, at least the C89 subset.
| procaryote wrote:
| it actually isn't
|
| https://www.stroustrup.com/bs_faq.html#C-is-subset
| pjmlp wrote:
| Do you usually post links without reading them?
|
| "In the strict mathematical sense, C isn't a subset of
| C++. There are programs that are valid C but not valid
| C++ and even a few ways of writing code that has a
| different meaning in C and C++. However, C++ supports
| every programming technique supported by C. Every C
| program can be written in essentially the same way in C++
| with the same run-time and space efficiency. It is not
| uncommon to be able to convert tens of thousands of lines
| of ANSI C to C-style C++ in a few hours. Thus, C++ is as
| much a superset of ANSI C as ANSI C is a superset of K&R
| C and much as ISO C++ is a superset of C++ as it existed
| in 1985.
|
| Well written C tends to be legal C++ also. For example,
| every example in Kernighan & Ritchie: "The C Programming
| Language (2nd Edition)" is also a C++ program. "
| josefx wrote:
| > For example, every example in Kernighan & Ritchie: "The
| C Programming Language (2nd Edition)" is also a C++
| program. "
|
| That is rather dated, they do things like explicitly cast
| the void* pointer returned by malloc, but point out in
| the appendix that ANSI C dropped the cast requirement for
| pointer conversions involving void*, C++ does not allow
| implicit void* conversions to this day.
| pjmlp wrote:
| > Well written C tends to be legal C++ also
|
| The "well written" remark is relevant.
|
| Many style guides will consider implicit void conversions
| not well written C.
|
| Naturally we are now on C23, and almost every C developer
| considers language extensions as being C, so whatever.
| uecker wrote:
| Well written idiomatic C is certainly not valid C++.
| pjmlp wrote:
| Depends on the beholder, however it hardly matters on the
| days of C23, as mentioned.
| gjvc wrote:
| _Do you usually post links without reading them?_
|
| all the time -- I call it "crowd-sourcing intelligence"
| :-)
| p_l wrote:
| Only in certain limited cases, for example, can't have
| static class instances or anything else that could
| require calling before a call from "extern C" API.
|
| Also now you have to build enough of a C API to expose
| the features, extra annoying when you want the API to be
| fast so it better not involve extra level of indirections
| through marshalling (hello, KDE SMOKE)
|
| At some point you're either dealing with limited non-C++
| API, or you might find yourself doing a lot of the work
| twice.
| layer8 wrote:
| C has less moving parts -- it's more difficult to define
| a subset of C++ that actually works across all platforms
| featuring a C++ compiler, not to mention of all the
| binary-incompatible versions of the C++ standard library
| that tend to exist -- and C is supported on a wider
| variety of platforms. If you want to maximize
| portability, C is the way to go, and you run into much
| fewer problems.
| bonzini wrote:
| QEMU's abstractions were added when Fabrice was almost
| completely inactive already (starting in 2008 for some
| command line support).
| glouwbug wrote:
| *int (*encode)(*int);
|
| Why not compile your snippets? Heads up to the author.
| sitkack wrote:
| I'd say at 20kloc of C, https://www.lua.org/ gets you as far up
| the Object Oriented tower as you want.
| quietbritishjim wrote:
| The article is about using OO techniques directly in C code.
| Lua is implemented in C but it's an entirely separate language.
| Does its implementation use OO techniques as part of its C
| source code? If not, then it's not really relevant.
| sitkack wrote:
| I don't see how a distinction here is anything but
| semantically arbitrary.
|
| Transitively, it most definitely uses OO techniques.
| Furthermore, by having such a clean C ffi (in both
| directions) it allows for the weaving of the Lua based OO
| techniques back into C code.
| cbarrick wrote:
| For the record, this design pattern is called a virtual method
| table, or vtable.
|
| I'm surprised that this article never mentioned the term.
|
| C++ programmers will know this pattern from the `virtual`
| keyword.
| rzzzt wrote:
| You can take it a step further:
|
| - instead of setting the same function pointers on structs over
| and over again, point to a shared (singleton) struct named
| "vtable" which keeps track of all function pointers for this
| "type" of structs
|
| - create a factory function that allocates memory for the
| struct, initializes fields ("vtable" included), let's call it a
| "constructor"
|
| - make sure all function signatures in the shared struct start
| with a pointer to the original struct as the first parameter, a
| good name for this argument would be "this"
|
| - encode parameter types in the function name to support
| overloading, e.g. "func1_int_int"
|
| - call functions in the form of
| "obj->vtable->func1_int_int(obj, param1, param2)"
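
The steps listed above can be sketched in a few dozen lines of C. This is an illustration under assumptions, not code from the article or from any real project: `struct shape`, `rect_new`, `tri_new`, and `total_area` are hypothetical names, and the first parameter is called `self` instead of `this` so the snippet also stays clear of the C++ keyword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct shape;  /* hypothetical example type */

/* Shared table of function pointers: one instance per "type" of shape. */
struct shape_vtable {
    /* Every entry takes the object itself as its first parameter. */
    int  (*area)(const struct shape *self);
    /* Parameter type encoded in the name, as suggested in the list above. */
    void (*scale_int)(struct shape *self, int factor);
    void (*destroy)(struct shape *self);
};

struct shape {
    const struct shape_vtable *vtable;  /* points at the shared table */
    int width;
    int height;
};

static int rect_area(const struct shape *self) { return self->width * self->height; }
static int tri_area(const struct shape *self)  { return self->width * self->height / 2; }

static void shape_scale_int(struct shape *self, int factor)
{
    self->width *= factor;
    self->height *= factor;
}

static void shape_destroy(struct shape *self) { free(self); }

/* The singleton vtables; instances never copy these, they only point at them. */
static const struct shape_vtable rect_vtable = {
    .area = rect_area, .scale_int = shape_scale_int, .destroy = shape_destroy,
};
static const struct shape_vtable tri_vtable = {
    .area = tri_area, .scale_int = shape_scale_int, .destroy = shape_destroy,
};

/* "Constructor": allocates the struct and initializes fields, vtable included. */
static struct shape *shape_new(const struct shape_vtable *vt, int w, int h)
{
    struct shape *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->vtable = vt;
    obj->width = w;
    obj->height = h;
    return obj;
}

struct shape *rect_new(int w, int h) { return shape_new(&rect_vtable, w, h); }
struct shape *tri_new(int w, int h)  { return shape_new(&tri_vtable, w, h); }

/* Caller code that knows nothing about rectangles or triangles. */
int total_area(struct shape *const *shapes, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        assert(shapes[i] != NULL);
        sum += shapes[i]->vtable->area(shapes[i]);
    }
    return sum;
}
```

A call then reads `obj->vtable->scale_int(obj, 2)`, which is the `obj->vtable->func1_int_int(obj, ...)` shape from the last bullet.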
| drivebyhooting wrote:
| Is this satire? That's almost exactly the C++ way.
| relistan wrote:
| It's not satire, it's how you do full OO in plain C.
| actionfromafar wrote:
| It's overloaded - it _is_ satire, but also, it isn't.
| rzzzt wrote:
| Exactly. Thank you for your time, I will and won't be
| here all week!
| procaryote wrote:
| Some say C++ is satire
| p_l wrote:
| It's how you do it in many C++ implementations, but IIRC
| it's not actually mandated in any way unless you strive for
| GCC's IA-64 ABI compatibility (the effective standard on
| Linux for C++)
|
| C++'s vtables are also, in my experience, especially bad
| compared to Objective-C or COM ones (MSVC btw generates
| vtables specifically aligned for use with COM, IIRC). Mind
| you it's been 15 years since I touched that part of crazy.
| magicalhippo wrote:
| This is pretty much how you do COM[1] in C[2].
|
| [1]: https://learn.microsoft.com/en-us/windows/win32/com/com-tech...
|
| [2]: https://www.codeproject.com/Articles/13601/COM-in-plain-C
| starspangled wrote:
| > You can take it a step further:
|
| No, that _is_ essentially what Linux does in this article
| (and by the looks of it also ffmpeg).
|
| struct file does not have a bunch of pointers to functions,
| it has a pointer to a struct file_operations, and that is set
| to a (usually / always?) const global struct defined by a
| filesystem.
|
| As you can see, the function types of the pointers in that
| file_operations struct take a struct file pointer as the
| first argument. This is not a hard and fast rule in Linux,
| arguments even to such ops structures are normally added as
| required not just-in-case (in part because ABI stability is
| not a high priority). Also the name is not mangled like that
| because it would be silly. But otherwise that's what these
| are, a "real" vtable.
|
| Surely this kind of thing came before C++ or the name vtable?
| The Unix V4 source code contains pointers to functions (one
| in file name lookup code, even) (though not in a struct but
| passed as an argument). "Object oriented" languages and
| techniques must have first congealed out of existing
| practices with earlier languages, you would think.
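
The arrangement described in this comment (an object holding a pointer to a const, global operations struct whose entries take the object as first argument) might look roughly like the sketch below. The names `my_file`, `my_file_ops`, and `zero_ops` are hypothetical and heavily simplified; they are not the kernel's real definitions, and the -1 error return is a placeholder chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

struct my_file;

/* Operations table; an entry may be left NULL if unsupported. */
struct my_file_ops {
    long (*read)(struct my_file *f, char *buf, size_t len);
    long (*write)(struct my_file *f, const char *buf, size_t len);
};

struct my_file {
    const struct my_file_ops *ops;  /* one pointer, not a set of function pointers */
    size_t pos;
};

/* A toy "filesystem" that yields zero bytes and cannot be written to. */
static long zero_read(struct my_file *f, char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
    f->pos += len;
    return (long)len;
}

/* Const global table defined by the "filesystem"; .write stays NULL. */
static const struct my_file_ops zero_ops = { .read = zero_read };

void my_file_open_zero(struct my_file *f)
{
    f->ops = &zero_ops;
    f->pos = 0;
}

/* Generic entry points: dispatch through the table, fail if the op is absent. */
long my_read(struct my_file *f, char *buf, size_t len)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->read ? f->ops->read(f, buf, len) : -1;
}

long my_write(struct my_file *f, const char *buf, size_t len)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->write ? f->ops->write(f, buf, len) : -1;
}
```

Leaving a slot NULL and checking it in the generic wrapper is one way to let an implementation opt out of an operation without writing a stub.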
| rzzzt wrote:
| The proto-C++ transpiler used C with this and similar
| techniques behind the scenes:
| https://en.wikipedia.org/wiki/Cfront
| chrsw wrote:
| I've noticed many large C projects resort to these sorts of
| OOP-like patterns to manage the complexity of the design and
| size of the code base. But I'm not aware of any one standard
| way of doing this in C. It seems C++ standardized a lot of
| these concepts, or C++ developers adopted standard patterns
| somehow.
| jcelerier wrote:
| and C++ also supports optimizing them, especially when you
| use `final` keyword and LTO which is able to devirtualize at
| the scale of a whole program.
| i_am_a_peasant wrote:
| Interesting, in Rust those optimizations are more implicit
| since there's no "final" keyword when you use dynamic
| dispatch via trait objects. + you also got LTO.
|
| I wonder if there are many cases where C++ will
| devirtualize and Rust won't.
|
| But then again Rust devs are more likely to use static
| dispatch via generics if performance is critical.
| tialaramex wrote:
| > But then again Rust devs are more likely to use static
| dispatch via generics if performance is critical.
|
| Put another way, in C++ the dynamic dispatch is implicit,
| so you might write code which (read literally) has
| dynamic dispatch but the optimizer will devirtualize it.
| However in Rust dynamic dispatch is explicit, so, you
| just would not write the dynamic dispatch - it's not
| really relevant whether an optimizer would "fix" that if
| you went out of your way to get it wrong. It's an
| idiomatic difference I'd say.
| discreteevent wrote:
| > I've noticed many large C projects resort to these sorts of
| OOP-like patterns to manage the complexity of the design and
| size of the code base
|
| The Power of Interoperability: Why Objects Are Inevitable
|
| https://www.cs.cmu.edu/~aldrich/papers/objects-essay.pdf
| trelane wrote:
| I learned it from a textbook. I think it was an earlier
| printing of https://docs.freebsd.org/en/books/design-44bsd/
| pjmlp wrote:
| The language is called Go.
|
| Other than that, yeah doing by hand what C++ and Objective-C do
| automatically.
| programmarchy wrote:
| For one, ffmpeg is 9 years older than Go. Plus, when dealing
| with video files a garbage collected language probably isn't
| going to cut it. C++ and Obj-C also feel overkill for ffmpeg.
| pjmlp wrote:
| Apparently someone has not read the article, otherwise you
| would have understood my point about Go.
|
| Secondly, Apple and Microsoft, do just fine with Objective-C
| and C++ for their video codecs, without having to manually
| implement OOP in C.
| high_na_euv wrote:
| Who would have thought that OOP could be useful!
| vkazanov wrote:
| This is not oop but polymorphism that is useful. And various
| forms of polymorphism are used in all kinds of programming
| paradigms.
|
| Also, this is a nice way to get the damn banana without getting
| lost in the jungle.
| jcelerier wrote:
| polymorphism used that way is 100% the most traditional OOP
| possible ever
| vkazanov wrote:
| Yes, a similar approach to polymorphism is supertypical for
| popular OOP-focused languages, along with other concepts.
| inopinatus wrote:
| This is an excellent pattern in C. The Dovecot mail server has
| many fine examples of the style as well e.g.
|
|     struct dict dict_driver_ldap = {
|         .name = "ldap",
|         .v = {
|             .init = ldap_dict_init,
|             .deinit = ldap_dict_deinit,
|             .wait = ldap_dict_wait,
|             .lookup = ldap_dict_lookup,
|             .lookup_async = ldap_dict_lookup_async,
|             .switch_ioloop = ldap_dict_switch_ioloop,
|         }
|     };
|
| defines the virtual function table for the LDAP module, and any
| other subsystem that looks things up via the abstract dict
| interface can consequently be configured to use the ldap service
| without concrete knowledge of it.
|
| (those interested in a deeper dive might start at
| https://github.com/dovecot/core/blob/main/src/lib-dict/dict-...)
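
To show how a caller can pick such a driver purely by configuration, here is a small sketch of a name-keyed driver table. It only borrows the `.name` plus nested `.v` layout from the snippet above; `struct kv_driver`, `kv_driver_find`, `kv_lookup`, and the two toy drivers are hypothetical and say nothing about how Dovecot actually registers or resolves its dict drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver descriptor: a name plus a nested table of operations. */
struct kv_driver {
    const char *name;
    struct {
        const char *(*lookup)(const char *key);
    } v;
};

/* Two toy drivers with hard-coded data. */
static const char *fruit_lookup(const char *key)
{
    return strcmp(key, "apple") == 0 ? "red" : NULL;
}

static const char *metal_lookup(const char *key)
{
    return strcmp(key, "gold") == 0 ? "yellow" : NULL;
}

static const struct kv_driver kv_driver_fruit = {
    .name = "fruit",
    .v = { .lookup = fruit_lookup },
};

static const struct kv_driver kv_driver_metal = {
    .name = "metal",
    .v = { .lookup = metal_lookup },
};

static const struct kv_driver *const kv_drivers[] = {
    &kv_driver_fruit,
    &kv_driver_metal,
};

/* Resolve a driver from a configuration string; NULL if unknown. */
const struct kv_driver *kv_driver_find(const char *name)
{
    for (size_t i = 0; i < sizeof kv_drivers / sizeof kv_drivers[0]; i++) {
        if (strcmp(kv_drivers[i]->name, name) == 0)
            return kv_drivers[i];
    }
    return NULL;
}

/* Caller-facing lookup that never names a concrete driver. */
const char *kv_lookup(const char *driver_name, const char *key)
{
    const struct kv_driver *drv = kv_driver_find(driver_name);
    assert(key != NULL);
    return drv != NULL ? drv->v.lookup(key) : NULL;
}
```

Adding a backend then means defining one more `struct kv_driver` and listing it in the table; the calling code stays unchanged.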
| dividuum wrote:
| So does the good old Quake 2 rendering API. The game exported a
| bunch of functions to the renderer via refimport_t and the
| renderer in return provided functions via refexport_t. The only
| visible symbol in a rendering DLL is GetRefAPI_t:
| https://github.com/id-Software/Quake-2/blob/master/client/re...
|
| I remember being impressed by this approach, so I shamelessly
| copied it for my programming game:
| https://github.com/dividuum/infon/blob/master/renderer.h :)
| dfox wrote:
| I somehow suspect that the reason why Quake2 does this lies
| in the legacy of Quake1 written in DJGPP. DJGPP supports
| dynamically loaded libraries (although the API is technically
| unsupported and internal-only), but does not have any kind of
| dynamic linker, thus passing around pair of such structs
| during library initialization is the only way to make that
| work.
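
The struct exchange described in these two comments can be sketched without any dynamic loading at all. The types and functions below (`host_import`, `module_export`, `get_module_api`, `host_render`) are hypothetical, only loosely modeled on the refimport_t / refexport_t idea, and both sides live in one file here purely so the example is self-contained.

```c
#include <assert.h>
#include <string.h>

/* Functions the host hands to the module. */
struct host_import {
    int (*get_setting)(const char *name);
};

/* Functions the module hands back to the host. */
struct module_export {
    int api_version;
    int (*draw_frame)(int frame_no);
};

/* ---- module side ---- */

static struct host_import host;  /* the module's saved copy of the host table */

static int module_draw_frame(int frame_no)
{
    /* The module reaches the host only through the imported table. */
    return frame_no * host.get_setting("scale");
}

/* The single entry point a loader would have to resolve by name. */
struct module_export get_module_api(struct host_import imports)
{
    host = imports;
    return (struct module_export){
        .api_version = 1,
        .draw_frame = module_draw_frame,
    };
}

/* ---- host side ---- */

static int host_get_setting(const char *name)
{
    return strcmp(name, "scale") == 0 ? 3 : 0;
}

int host_render(int frame_no)
{
    struct host_import imports = { .get_setting = host_get_setting };
    struct module_export mod = get_module_api(imports);
    assert(mod.api_version == 1);  /* reject a module built against another API */
    return mod.draw_frame(frame_no);
}
```

Because every cross-boundary call goes through one of the two tables, only `get_module_api` needs to be a visible symbol, which matches the "no dynamic linker" constraint mentioned above.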
| jrmg wrote:
| Reminds me of Apple's CoreFoundation.
| loph wrote:
| My recollection (which could be rusty, it has been >30 years) is
| that the Motif API, coded in C, implemented a kind of
| polymorphism.
| codr7 wrote:
| No discussion about polymorphism in C is complete without
| mentioning this macro:
|
| https://stackoverflow.com/questions/15832301/understanding-c...
| jdefr89 wrote:
| What is with the incorrect function declarations? I see:
|
| *int (*func)().
|
| Maybe you meant: int * (*func)(void)?
|
| Don't mean to be pedantic. Just wanted to point it out so you can
| fix it.
| brcmthrowaway wrote:
| Does ffmpeg support SVE?
___________________________________________________________________
(page generated 2025-03-10 23:01 UTC)