[HN Gopher] Type Erasure in C++ Explained
       ___________________________________________________________________
        
       Type Erasure in C++ Explained
        
       Author : aptxkid
       Score  : 34 points
       Date   : 2021-04-15 16:03 UTC (6 hours ago)
        
 (HTM) web link (blog.the-pans.com)
 (TXT) w3m dump (blog.the-pans.com)
        
       | AlexanderDhoore wrote:
       | "Well, let's just add another level of indirection."
       | 
       | Modern software development in a nutshell.
        
         | slver wrote:
         | Indirection is the price of scale. As a system grows, you need
         | breaking points, boundaries. These often result in adding one
         | more level of indirection.
         | 
         | But we're also moving in the other direction, for example Java
         | is introducing value types ("inline classes") and low-level
         | memory allocation/manipulation APIs. That's a big step for a
         | language that started with the idea of everything is an object
         | (and an object is an indirection).
        
           | rwoerz wrote:
           | I would call indirection the price for abstraction.
        
         | winstonchecksin wrote:
         | Modern software development is a convoluted mess of poor
         | abstractions, new frameworks and flavors of the week, and
         | essentially a million different ways of solving the same
         | problem.
         | 
         | I work with some of the most brilliant people in the world (in
         | my opinion) and the problems we are working on are how to grab
         | people's attention and show them relevant ads. And we don't call
         | them "ads" but recommendations.
         | 
         | Sorry I'm working right now and wondering what I'm doing with
         | my life.
        
         | ChrisSD wrote:
         | Nothing modern about it. This is basically the C philosophy in
         | a nutshell. Assuming "indirection" means "pointer indirection".
        
       | stephc_int13 wrote:
       | I am not sure this is an improvement.
       | 
       | What is the added value? What is the cost? Virtual methods are
       | not free. Indirections are not free (when you read the code)
        
         | aptxkid wrote:
         | Whether this is an improvement or not depends on the use case.
         | E.g. std::function does exactly this.
         | 
         | What it achieves is exactly erasing the type (conforming to an
         | affordance). The cost is definitely there. You pay vtable
         | lookup when using std::function.
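         | 
         | A minimal sketch of what that erasure looks like with
         | std::function: a function pointer, a functor, and a lambda
         | (three unrelated types) stored in one vector and called through
         | the same interface. The names twice, AddN, apply_all, and
         | demo_function_erasure are made up for illustration, and how
         | std::function dispatches internally is implementation-specific.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A plain function, a functor, and (below) a lambda: three unrelated types.
int twice(int x) { return 2 * x; }

struct AddN {
    int n;
    int operator()(int x) const { return x + n; }
};

// Applies every callable in turn; the caller never sees the concrete types.
int apply_all(const std::vector<std::function<int(int)>>& fs, int x) {
    for (const auto& f : fs) {
        x = f(x);  // indirect call through the erased wrapper
    }
    return x;
}

// Hypothetical demo helper: builds a mixed list and runs it on 2.
int demo_function_erasure() {
    std::vector<std::function<int(int)>> fs;
    fs.push_back(twice);                        // function pointer
    fs.push_back(AddN{3});                      // functor
    fs.push_back([](int x) { return x * x; });  // lambda
    return apply_all(fs, 2);                    // ((2 * 2) + 3) squared
}
```

         | Calling demo_function_erasure() goes through one indirect call
         | per element, which is the cost being discussed.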
        
         | AnimalMuppet wrote:
         | It lets you treat things that aren't from the same class
         | hierarchy as if they _are_ from the same hierarchy. You
         | probably wouldn't do this unless you had that problem. (And,
         | in my entire career, I have _never_ had this problem.)
         | 
         | If you _do_ have that problem, the alternative is to write an
         | adapter or wrapper by hand. You may regard this as an
         | improvement, or not, depending on your specific circumstances.
        
           | LaLaLand122 wrote:
           | It's a strictly better solution. You don't do it because the
           | standard library doesn't give you tools to do it easily, but
           | if it were the same effort everybody would do it.
           | 
           | After all, Inheritance Is The Base Class of Evil
           | (https://www.youtube.com/watch?v=2bLkxj6EVoM).
        
             | AnimalMuppet wrote:
             | Cute title, I guess, but a YouTube video proves nothing.
             | You can (I presume) find YouTube videos "proving" that the
             | Earth is flat. So, if you want me to believe this position,
             | you're going to have to give me something better than the
             | title of a video.
             | 
             | Inheritance is just fine, used well. You've got a
             | counterargument? Give it. Not a video; give me the actual
             | argument.
        
               | LaLaLand122 wrote:
               | I didn't give you only "the title of a video" (which
               | sure, tries to be "cute"), I gave you the link to the
               | video, in which none other than Sean Parent gives you the
               | arguments.
               | 
               | The summary is in
               | https://www.youtube.com/watch?v=2bLkxj6EVoM&t=1275s
               | 
               | Sure, inheritance is fine, I use it all the time. If it
               | were _that_ bad we would already have Virtual Concepts in
               | the standard. But duck-typing is better, if only because of
               | the argument you already gave: "lets you treat things
               | that aren't from the same class hierarchy as if they are
               | from the same hierarchy". It's a lie that I want
               | something that inherits from my class, I simply want
               | something I can draw() with, but we keep telling that lie
               | all the time... which sure, is not the end of the world.
        
               | AnimalMuppet wrote:
               | Yes, you gave me a link to the video. No, I'm not going
               | to watch a video to find out what your point is, so what
               | you actually gave me is just the title.
               | 
               | Why am I not going to watch it? Well, how long is the
               | video? 5 minutes? 30 minutes? An hour? Two hours? But it
               | took me 30 seconds to read your post here. (I'm hoping
               | that the last paragraph was a summary of the video's
               | argument.) Even if it took you five minutes to write,
               | your five minutes plus my 30 seconds is still a big win
               | compared to a half-hour video.
               | 
               | But we're on a public forum. If 10 people, or 100, have
               | to go watch the video to figure out what your point is,
               | that gets _really_ inefficient. Which is why I yell at
               | people - not just you - about making the readers do the
               | work to figure out what the poster is talking about.
               | 
               | > But duck-typing is better, if only because the argument
               | you already gave: "lets you treat things that aren't from
               | the same class hierarchy as if they are from the same
               | hierarchy".
               | 
               | Right; duck typing is better _when you have that
               | problem_. But in 25 years of using C++, I have _never_
               | had that problem. So I'm pushing back on you stating
               | "duck typing is better" like it's a universal. It's not.
               | 
               | > It's a lie that I want something that inherits from my
               | class, I simply want something I can draw() with, but we
               | keep telling that lie all the time...
               | 
               | Sometimes I want more than that. I want something I can
               | draw() that also satisfies the constraints of my class,
               | at which point I _do_ want something that inherits from
               | my class, which makes it not a lie.
               | 
               | Look, this approach has its place. That place is not
               | everywhere, but a more limited set of places. I don't
               | have a problem with people using this approach. I don't
               | have a problem with people teaching others how to use
               | this approach. I have a problem with people talking like
               | this is the one right way. It's not.
        
       | kazinator wrote:
       | > _But the result is beautiful._
       | 
       | Ugly as hell, ouch! Really?
       | 
       | GNU C++ had something called signatures years ago, which was
       | removed. It was far more elegant.
       | 
       | You could declare a _signature_ which was a class-like thing:
       | function declarations in curly braces.
       | 
       | Having that signature declared, you could lift a pointer of that
       | type to any object which had those functions (without any
       | relationship to the signature having to be declared by that
       | object).
       | 
       | Found a nice document on it:
       | 
       | https://csc.lsu.edu/~gb/Signatures/index.html
       | 
       | So, here is how the code would look:
       | 
       |     class Bar {
       |       // nothing to inherit here
       |     public:
       |       void doSomething() {  }
       |     };
       | 
       |     signature Do {
       |       void doSomething();
       |     };
       | 
       |     void foo(Do &doer)
       |     {
       |        doer.doSomething();
       |     }
       | 
       |     int main()
       |     {
       |        Bar bar;
       |        foo(bar);
       |     }
       | 
       | When _foo_ is called with _bar_, a _Do &_ signature reference is
       | taken to _bar_. This is allowed because the type Bar has all the
       | functions declared in the signature type Do, making it
       | compatible.
       | 
       | Sure, the implementation has to bend over backwards. But
       | signatures are more declarative, so the implementation has a
       | clearer idea of your intent. It can do whatever magic is
       | required.
       | 
       | It seems clear to me that there is a static way to bind the Do
       | signature reference to the Bar type. You probably have to
       | construct some vtable like object which does the right sort of
       | indirection.
       | 
       | The translation unit could emit some hidden __Do__Bar_table item
       | which does exactly that: it's a vtable-like table made in the
       | shape of Do, which is filled with pointers (perhaps fat pointers
       | with offsets and whatever is necessary) to the matching functions
       | in Bar.
       | 
       | If that can be set up at compile time, then maybe the _Do &doer_
       | argument just has to be some sort of fat reference consisting of
       | the pointer to _bar_, and to the __Do__Bar_table which translates
       | the Do calls into Bar calls.
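       | 
       | Since `signature` is gone from the compiler, here is a rough
       | hand-written approximation of that fat reference in ordinary
       | C++. It is only a sketch of the idea, not what GNU C++ generated:
       | DoTable, do_table_for, DoRef, the calls counter in Bar, and
       | demo_signature are all hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Hand-written stand-in for `signature Do`: one slot per declared function.
struct DoTable {
    void (*doSomething)(void* self);
};

// One table per concrete type, filled with a thunk that forwards to T.
template <typename T>
const DoTable* do_table_for() {
    static const DoTable table = {
        [](void* self) { static_cast<T*>(self)->doSomething(); }
    };
    return &table;
}

// The fat reference: a pointer to the object plus a pointer to its table.
class DoRef {
public:
    // Binds to any T that has doSomething(); no inheritance required.
    template <typename T,
              typename = typename std::enable_if<
                  !std::is_same<T, DoRef>::value>::type>
    DoRef(T& obj) : obj_(&obj), table_(do_table_for<T>()) {}

    void doSomething() const { table_->doSomething(obj_); }

private:
    void* obj_;
    const DoTable* table_;
};

struct Bar {  // nothing to inherit here
    int calls = 0;  // added only so the effect is observable
    void doSomething() { ++calls; }
};

void foo(DoRef doer) { doer.doSomething(); }

// Hypothetical demo helper: calls foo twice on the same Bar.
int demo_signature() {
    Bar bar;
    foo(bar);
    foo(bar);
    return bar.calls;
}
```

       | No heap allocation and no wrapper object per call: the
       | reference is two pointers, and each call is one indirect jump
       | through the table.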
       | 
       | It seems cheaper than the convolution presented in this article.
       | 
       | Signatures didn't make it into ISO C++, but since that time, a
       | lot of cruft that is worse has.
       | 
       | Looks like this 1999 commit may be what removed signatures:
       | 
       | https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6eabb2412f6c4c...
       | 
       | It doesn't point to any information about the removal. We
       | probably have to dig into mailing lists. At least it gives a
       | date, thanks to which we can find this posting. Unfortunately,
       | the one from which it quotes is missing for some reason:
       | 
       | https://gcc.gnu.org/pipermail/gcc/1999-August/035433.html
       | 
       | "This patch removes support for `signature', a g++ extension that
       | is little-used and which Jason and I agreed should go. The
       | reduction in complexity elsewhere in the front-end will be a big
       | win."
       | 
       | OMG, you would absolutely not see this today in C++ development.
       | "Jason and I" decided that some C++ gadget is too little used and
       | we will remove it.
       | 
       | How naive that seems; these people had no idea about the deluge
       | of garbage that was coming down the pipe into standard C++ over
       | the following two decades, that they would have to implement.
       | 
       | They removed a good thing on a whim.
        
         | varajelle wrote:
         | In C++20, this is called a "concept"
        
           | thechao wrote:
           | It's more closely related to the `concept` from the Indiana
           | proposal from C++0x (which is structural) rather than the one
           | in C++20, which is most like the expression syntax. (That
           | syntax was lifted nearly unscathed from Spad.)
           | 
           | They're both missing the Indiana `concept_map` which were
           | runtime-free functors -- like -- _proper_ functors.
        
       | cobaltoxide wrote:
       | > But the result is beautiful.
       | 
       | Uhhhh, gonna have to disagree there.
        
         | aptxkid wrote:
         | LOL sure. I think it's arguably terrible behind the curtain.
         | But the interface provided is pretty elegant IMO.
        
           | kazinator wrote:
           | 1. Any piece of code that the program has to provide, rather
           | than the implementation, is by definition in front of the
           | curtain.
           | 
           | 2. The run time semantics is ugly. Let's see:
           | 
           | _"Now if we pass our Bar1 to foo it will first implicitly
           | construct a Bar object with a pointer to BarWrapper<Bar1>
           | when bar.doSomething() is called inside foo, it will trigger
           | vtable lookup and find BarWrapper<Bar1>::doSomething which
           | then calls Bar1::doSomething which is exactly what we want."_
           | 
           | Which reads to me like:
           | 
           | _"Now if we pass our Bar1 to foo it will first wastefully
           | construct an overhead object, with a pointer to overhead,
           | when bar.doSomething() is called inside foo, it will trigger
           | overhead lookup to find some overhead wrapper which then
           | finally calls the piece of code which is exactly what we
           | want."_
           | 
           | By the time you've done all this, a dynamic language function
           | call starts to look good.
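           | 
           | For reference, the mechanism in that quote can be sketched
           | roughly as follows. Bar, BarWrapper<T>, Bar1, foo, and
           | doSomething are the names from the quote; BarBase, Bar2,
           | the std::string return type (used only so the result can be
           | checked), and demo_wrapper are assumptions for illustration,
           | not the article's actual code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical abstract base that provides the vtable.
struct BarBase {
    virtual ~BarBase() = default;
    virtual std::string doSomething() = 0;
};

// One wrapper instantiation per concrete type; forwards the call to it.
template <typename T>
struct BarWrapper : BarBase {
    explicit BarWrapper(T impl) : impl_(std::move(impl)) {}
    std::string doSomething() override { return impl_.doSomething(); }
    T impl_;
};

// The erased type: implicitly constructible from anything with doSomething().
class Bar {
public:
    template <typename T>
    Bar(T impl) : ptr_(new BarWrapper<T>(std::move(impl))) {}  // heap allocation

    std::string doSomething() { return ptr_->doSomething(); }  // virtual call

private:
    std::unique_ptr<BarBase> ptr_;
};

// Two unrelated types; neither inherits from anything.
struct Bar1 { std::string doSomething() { return "Bar1"; } };
struct Bar2 { std::string doSomething() { return "Bar2"; } };

std::string foo(Bar bar) { return bar.doSomething(); }

// Hypothetical demo helper: passes both types through the same foo.
std::string demo_wrapper() {
    return foo(Bar1{}) + "," + foo(Bar2{});
}
```

           | Each call to foo here pays for one allocation, one virtual
           | dispatch, and one forwarding call, which are the three
           | pieces of overhead listed above.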
        
       | einpoklum wrote:
       | That's not how you do type erasure in C++.
       | 
       | Have a look at std::any :
       | https://en.cppreference.com/w/cpp/utility/any/
       | 
       | you can place a value of any type T in it (well, obviously it
       | needs to be copy-constructible etc.), or leave it empty, which is
       | like an "Empty" or "Nothing" indicator that is not in T.
       | 
       | Then you can pass the `any` around without knowing its type.
       | Finally, when you want to restore the typed value, you use
       | std::any_cast<T>(). It will succeed if T is the correct type, and
       | throw an exception otherwise.
       | 
       | This was introduced into C++ in the C++17 version of the
       | standard. Before, it existed as a Boost library facility.
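       | 
       | A small sketch of that round trip, assuming a C++17 compiler. In
       | the standard library the typed value is retrieved with
       | std::any_cast<T>, a failed cast throws std::bad_any_cast, and the
       | empty state is checked with has_value(). The helpers holds and
       | demo_any are made-up names.

```cpp
#include <any>
#include <cassert>
#include <string>

// Returns true if `a` currently holds exactly a value of type T.
template <typename T>
bool holds(const std::any& a) {
    return std::any_cast<T>(&a) != nullptr;  // pointer form never throws
}

// Hypothetical demo helper: stores an int, reads it back, then empties the any.
int demo_any() {
    std::any a = 42;
    assert(a.has_value());
    assert(holds<int>(a));
    assert(!holds<std::string>(a));

    int n = std::any_cast<int>(a);  // correct type: succeeds

    bool threw = false;
    try {
        (void)std::any_cast<std::string>(a);  // wrong type: throws
    } catch (const std::bad_any_cast&) {
        threw = true;
    }
    assert(threw);

    a.reset();  // back to the empty state
    assert(!a.has_value());
    return n;
}
```

       | Note that the caller has to name T to get anything useful out
       | of the std::any.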
        
         | aptxkid wrote:
         | However the Type Erasure described here doesn't need to know T
         | even when you use it. E.g. it enables things like
         | 
         | for (auto x : vec) { x.foo(); }
        
           | LaLaLand122 wrote:
           | And if it were in the standard there wouldn't be at least
           | five well known libraries implementing it (including from
           | Adobe and from Facebook):
           | https://github.com/boost-ext/te#similar-libraries
           | 
           | As far as I know (I don't really follow the committee work)
           | the latest attempt to introduce run-time duck-typing in the
           | standard was https://github.com/andyprowl/virtual-concepts,
           | which seems dead.
        
           | jhgb wrote:
           | Is this the thing that's called "Voldemort types" in D?
        
       | ok123456 wrote:
       | Wouldn't this particular example be better served through C++2a
       | concepts? That is we can specify constraints about template type
       | parameters.
        
         | pjmlp wrote:
         | C++20, the standard has already been ratified.
        
         | dataflow wrote:
         | Do people end up finding concepts useful? They seem nice in
         | theory but I have yet to find a practical case where I want to
         | reach for them. The errors aren't (at least currently)
         | noticeably better than normal template errors, and they can end
         | up being redundant in practice.
        
           | CoastalCoder wrote:
           | > Do people end up finding concepts useful?
           | 
           | I'll get back to you when my work projects are allowed to use
           | a C++ version newer than C++11. </rant>
           | 
           | But, more seriously: I'm curious if so few people are
           | actually using C++20 (for work, at least) that it will take a
           | while to answer questions like yours.
        
           | ok123456 wrote:
           | I think a major benefit is you can use auto parameter types
           | and have the compiler propagate constraints about what
           | methods and fields are required at compile time.
        
         | pfultz2 wrote:
         | Concepts specify constraints on templates, but the types are
         | still templates, so you can't put the function definition in a
         | .cpp file nor put them into a vector (which is what type erasure
         | allows).
        
       | zackees wrote:
       | This article is not a good explainer of type erasure because it
       | uses inheritance to do it.
       | 
       | I'm waiting for the day that someone figures out and does a blog
       | post showing that every std data structure can be rewritten with
       | type erasure.
       | 
       | Here's how it works:
       | 
       | Let's use vector because it's simple. This vector impl works with
       | just plain void* memory with no type information.
       | 
       | The vector type wrapper (matching std::vector) is a template but
       | instead of recreating the entire data structure with a different
       | type, it will instead create the void* implementing vector in a
       | unique_ptr and pass boilerplate functions so that the vector
       | knows:
       | 
       | 1. the size of T
       | 2. how to construct a T in place at the void*
       | 3. how to destroy a T in place at the void*
       | 4. how to copy-construct a T from a void* src to a void* dst
       | 
       | The wrapper vector<T> that owns the vector impl then does the
       | necessary casts back to T. For example vector::at could be
       | implemented as such:
       | 
       |     template <typename T>
       |     class vector { // type wrapper.
       |       T& at(size_t i) {
       |         void* p = vector_impl_->get(i);
       |         T* t = static_cast<T*>(p);
       |         return *t;
       |       }
       |       ...
       |     };
       | 
       | In this example, vector<T> wrapper would still be inlined
       | everywhere, however VectorImpl could be defined exactly once in a
       | cpp file. The template bloat problem is reduced from the entire
       | data structure to just the wrapper casting back and forth from
       | void* <--> T&.
       | 
       | This can be extrapolated to complex containers like std::map. And
       | as a bonus the polymorphic wrapper can do all the boilerplate
       | generation so that the interface could be made to match
       | std::map.
       | 
       | Testing the bloat size reduction could be performed on a code
       | base with significant usage of std::map across multiple types, to
       | see if the optimizing compiler will reduce the final binary size
       | with the type-erased map swapped in.
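       | 
       | A rough sketch of the scheme, under several simplifying
       | assumptions: only push_back, size, and at are provided; elements
       | are relocated by copy-and-destroy on growth; T is assumed not to
       | be over-aligned; there is no exception safety. TypeOps, the
       | members of VectorImpl, the sketch namespace, and
       | demo_erased_vector are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Per-type operations handed to the untyped implementation.
struct TypeOps {
    std::size_t size;                          // sizeof(T)
    void (*destroy)(void* p);                  // in-place destructor
    void (*copy)(void* dst, const void* src);  // in-place copy constructor
};

// Non-template implementation: could be compiled once in a single .cpp file.
class VectorImpl {
public:
    explicit VectorImpl(const TypeOps& ops) : ops_(ops) {}
    VectorImpl(const VectorImpl&) = delete;
    VectorImpl& operator=(const VectorImpl&) = delete;
    ~VectorImpl() {
        for (std::size_t i = 0; i < size_; ++i) ops_.destroy(get(i));
        ::operator delete(data_);
    }

    std::size_t size() const { return size_; }
    void* get(std::size_t i) { return data_ + i * ops_.size; }

    // Assumes src does not point into this vector's own storage.
    void push_back(const void* src) {
        if (size_ == cap_) grow();
        ops_.copy(get(size_), src);
        ++size_;
    }

private:
    void grow() {
        std::size_t new_cap = cap_ ? cap_ * 2 : 4;
        char* fresh = static_cast<char*>(::operator new(new_cap * ops_.size));
        for (std::size_t i = 0; i < size_; ++i) {
            ops_.copy(fresh + i * ops_.size, get(i));  // relocate by copy
            ops_.destroy(get(i));
        }
        ::operator delete(data_);
        data_ = fresh;
        cap_ = new_cap;
    }

    TypeOps ops_;
    char* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t cap_ = 0;
};

namespace sketch {

// Thin typed wrapper: the only part instantiated per T.
template <typename T>
class vector {
public:
    vector() : vector_impl_(new VectorImpl(ops())) {}
    void push_back(const T& v) { vector_impl_->push_back(&v); }
    std::size_t size() const { return vector_impl_->size(); }
    T& at(std::size_t i) { return *static_cast<T*>(vector_impl_->get(i)); }

private:
    static const TypeOps& ops() {
        static const TypeOps t = {
            sizeof(T),
            [](void* p) { static_cast<T*>(p)->~T(); },
            [](void* dst, const void* src) {
                ::new (dst) T(*static_cast<const T*>(src));
            }
        };
        return t;
    }
    std::unique_ptr<VectorImpl> vector_impl_;
};

}  // namespace sketch

// Hypothetical demo helper: ten pushes force the storage to grow more than once.
std::string demo_erased_vector() {
    sketch::vector<std::string> v;
    for (int i = 0; i < 10; ++i) v.push_back(std::string(1, char('a' + i)));
    v.at(0) = "A";
    return v.at(0) + v.at(9);
}
```

       | sketch::vector<int> and sketch::vector<std::string> share the
       | same VectorImpl code; only the small shims and the TypeOps table
       | differ per type.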
        
         | pfultz2 wrote:
         | That seems more like type erasure as defined in Java where the
         | container is the same for all data types and the compiler
         | implicitly inserts casts from the base type to the parameter
         | type.
        
         | giomasce wrote:
         | I once thought about doing something similar too, but in the
         | end I never tried. I guess it's not difficult, you just have to
         | write a good deal of code to support all the operations.
        
         | adamrezich wrote:
         | wouldn't all of this templating lead to immensely increased
         | compile times?
        
           | haneefmubarak wrote:
           | Template heavy C++ (which is a lot of nontrivial C++) DOES
           | have immensely increased compile times hahaha. It's not too
           | bad if you have a decent number of hardware threads and
           | incrementally (re)build in parallel though.
        
           | edflsafoiewq wrote:
           | It would presumably take less compile time than a regular
           | vector because the template parts are small shims that call
           | out to the non-template type-erased implementation. A regular
           | vector has to recompile the implementations for each
           | instantiation.
        
       ___________________________________________________________________
       (page generated 2021-04-15 23:02 UTC)