[HN Gopher] Fast Virtual Functions: Hacking the VTable for Fun a...
       ___________________________________________________________________
        
       Fast Virtual Functions: Hacking the VTable for Fun and Profit
        
       Author : danny00
       Score  : 26 points
       Date   : 2024-03-19 12:39 UTC (2 days ago)
        
 (HTM) web link (medium.com)
 (TXT) w3m dump (medium.com)
        
       | 2716057 wrote:
       | I'm wondering why the "final" keyword is not mentioned in the
       | article.
        
         | repelsteeltje wrote:
         | What does "final" mean in C++? Is that new? Maybe you're
         | confusing it with Java?
        
           | canucker2016 wrote:
           | C++11 adds support for specifying "final" on derived virtual
           | functions to prevent further overriding of the function by
           | any derived classes of the current "final" class.
        
           | sudosysgen wrote:
           | https://en.cppreference.com/w/cpp/language/final
           | 
           | Final in C++ indicates that the function/class can no longer
           | be overridden in further child classes. This enables
           | devirtualization as the compiler can then know the function
           | pointer will not change, if it can infer the type at compile
           | time.
        
           | epcoa wrote:
           | > Is that new?
           | 
           | No. In fact in a few short months post C++11 will have
           | overtaken pre-C++11 as the majority of the 26 year history of
           | standardized C++ (and similar to prior standards, compilers
           | for the fortunate implemented much of the behavior prior to
           | the official publication).
        
         | cout wrote:
         | This is a good point. Using final and a static cast to the
         | derived type should eliminate the indirect call.
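A minimal sketch of what this sub-thread describes, assuming hypothetical `Shape` and `Square` types that are not from the article. Marking the derived class `final` leaves only one possible target for a call made through the derived type, so the compiler is allowed (not required) to replace the indirect call with a direct, inlinable one.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// `final` on the class: nothing can derive from Square, so a call made
// through a Square& or Square* has exactly one possible target.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// Call through the base type: the dynamic type is unknown here, so this is
// normally an indirect call through the vtable.
int sides_via_base(const Shape& s) { return s.sides(); }

// Call through the final derived type: the compiler may emit a direct call.
// The static_cast is only valid if `s` really refers to a Square, which the
// debug-only assert checks.
int sides_via_final(const Shape& s) {
    assert(dynamic_cast<const Square*>(&s) != nullptr);
    return static_cast<const Square&>(s).sides();
}
```

Whether the second function actually compiles to a direct call depends on the compiler and optimization level; comparing the generated assembly of the two functions is the way to confirm it.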
        
       | Dwedit wrote:
       | Branch prediction of Virtual Calls has traditionally been to look
       | at the address of the calling instruction, and assume it will be
       | the same target as last time.
       | 
       | But the real penalty for using Virtual Calls is the potential
       | loss of inlining. It depends on how smart your compiler is.
        
         | kazinator wrote:
         | Virtual function APIs have to be designed such that they don't
         | suffer from lack of inlining.
         | 
         | E.g. if you're making a graphics display driver, don't have
         | objref->putpixel(x, y) as your only virtual function, which has
         | to be called thousands of times to draw a line or millions to
         | fill an area.
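A rough sketch of the coarser-grained design suggested above, assuming a hypothetical `Display` interface with a `fill_span` method (these names are illustrative, not from any real driver API). The virtual call happens once per horizontal span, and the per-pixel work stays in a plain loop inside the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interface: one virtual call per span instead of per pixel.
struct Display {
    virtual ~Display() = default;
    virtual void fill_span(int x, int y, int count, std::uint32_t color) = 0;
};

// Hypothetical in-memory framebuffer used only for illustration.
struct MemoryDisplay final : Display {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;  // row-major, zero-initialized

    MemoryDisplay(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h)) {}

    void fill_span(int x, int y, int count, std::uint32_t color) override {
        // Debug-only bounds check; a real driver would clip instead.
        assert(x >= 0 && y >= 0 && y < height && count >= 0 && x + count <= width);
        std::uint32_t* row =
            pixels.data() + static_cast<std::size_t>(y) * width + x;
        // Tight inner loop with no virtual calls.
        for (int i = 0; i < count; ++i) row[i] = color;
    }
};

// Filling a w*h rectangle costs h virtual calls rather than w*h.
void fill_rect(Display& d, int x, int y, int w, int h, std::uint32_t color) {
    for (int row = 0; row < h; ++row) d.fill_span(x, y + row, w, color);
}
```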
        
         | ack_complete wrote:
         | Modern CPUs support indirect branch prediction based on global
         | branch history; they've advanced past simple predict-not-taken
         | or predict-last for indirect calls. Intel has been doing it
         | since Haswell.
        
       | smallmancontrov wrote:
       | They are also useful for hacking in the reverse engineering
       | sense: an object's vtable pointer usually leads to very strong
       | hints about what the object is, which is very important
       | information in the very common situation where the data and code
       | don't make it super clear :)
        
       | andy_xor_andrew wrote:
       | on this topic, it's interesting how my mindset changes when I'm
       | writing Rust vs Python, with regards to vtables, dispatching, and
       | allocation.
       | 
       | Writing Rust for a toy project: "I MUST avoid allocation, and the
       | dyn keyword, at all cost!!"
       | 
       | Writing the same toy project in Python: "lol just copy the string
       | who cares"
        
         | kukkamario wrote:
         | Perspective really changes when simple integer addition can
         | take more time than allocation in a compiled language.
        
       | mandarax8 wrote:
       | Just go all out C-style: forgo virtual functions entirely and
       | use a struct of function pointers instead of this weird
       | construction.
        
         | 95014_refugee wrote:
         | You... haven't worked on a large codebase, have you?
         | 
         | The part that baffles me about this entire post is that it's
         | trivial to obtain pointers to member functions legally, without
         | the fragility associated with guessing VTable offsets.
        
           | int_19h wrote:
           | If you're referring to C++ pointer-to-member types, the
           | semantics of those still requires them to perform virtual
           | dispatch at point of use. That is, given:
           | 
           |     struct Base { virtual void foo(); };
           |     struct Derived : Base { virtual void foo(); };
           |     auto p = &Base::foo;
           |     Derived d;
           |     (d.*p)();
           | 
           | The last line must invoke Derived::foo, not Base::foo. Which
           | in turn means that the representation of a pointer-to-member-
           | function cannot be a simple function pointer in the most
           | general case. The representation must store enough
           | information to know whether it's pointing to virtual or non-
           | virtual member, and then store either direct pointer to code
           | or vtable offset depending on that.
           | 
           | Consequently, an indirect call via pointer-to-member has to
           | do the virtual/non-virtual check first, and then look up
           | vtable by index if virtual, before performing the actual
           | call. Which is in fact _more_ work than doing the virtual
           | call directly (since the first step is unnecessary in that
           | case), and thus pointers-to-members cannot be used to
           | optimize away the overhead.
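The behaviour described above can be checked with a small self-contained sketch. The `call` helper, the `bar` member, and the string return values are additions for illustration only; the point is that a pointer to a virtual member still dispatches on the dynamic type of the object it is applied to.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string foo() const { return "Base::foo"; }
    std::string bar() const { return "Base::bar"; }  // non-virtual
};

struct Derived : Base {
    std::string foo() const override { return "Derived::foo"; }
};

// Invoke an arbitrary Base member through a pointer-to-member-function.
// The same call expression has to handle both virtual and non-virtual
// targets, which is why the pointer cannot be a bare code address.
std::string call(const Base& obj, std::string (Base::*pmf)() const) {
    return (obj.*pmf)();
}

void demo_pointer_to_member() {
    Derived d;
    Base b;
    assert(call(d, &Base::foo) == "Derived::foo");  // virtual dispatch
    assert(call(b, &Base::foo) == "Base::foo");
    assert(call(d, &Base::bar) == "Base::bar");     // non-virtual target
}
```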
        
             | o11c wrote:
             | > the representation of a pointer-to-member-function cannot
             | be a simple function pointer
             | 
             | Sure it can, you just need to have emitted separate
             | functions for "concrete call" (the traditional one stored
             | in the vtable) and "virtual call" (a bit of code that calls
             | through the vtable).
             | 
             | If this is not done it's probably due to the fact that
             | actually emitting the virtual calls involves more code than
             | just storing the vtable index. But if anyone is
             | implementing a new language I strongly suggest paying the
             | cost anyway, for sanity.
        
               | int_19h wrote:
               | There are other complications at play for your proposed
               | solution.
               | 
               | If you emit a stub for each virtual member function that
               | performs the vtable dispatch, at which point do you emit
               | it? If you do it if and when the address is actually
               | taken, then pointers to members in different compilation
               | units will have different stubs (COMDAT folding can take
               | care of this for static linking, but not for dynamic) -
               | which means that &Foo::bar taken in one unit will not
               | compare equal to the same taken in another, which is
               | counter to the C++ specification.
               | 
               | Alternatively, you could emit such a stub proactively for
               | each vtable entry on the basis that its address _might_
               | be taken in another compilation unit - but that means a
               | lot of unused stubs.
               | 
               | The other problem is multiple inheritance combined with
               | upcasting. Consider:
               | 
               |     struct Base1 { ... };
               |     struct Base2 { virtual void foo(); };
               |     struct Derived : Base1, Base2 { ... };
               |     void (Derived::*pf)() = &Base2::foo;
               |     Derived d;
               |     (d.*pf)();
               | 
               | The problem here is that your stub for Base2::foo expects
               | `this` to point at the beginning of Base2. But then when
               | it is invoked on an instance of Derived, what's readily
               | available at call site is a pointer to the beginning of
               | Derived. Which is _not the same_, because Base2 is the
               | _second_ subobject of the Derived instance - so you need
               | to adjust that pointer to Derived by sizeof(Base1) chars
               | to upcast it to a pointer to Base2. However, you don't
               | know at call site that your Derived::* actually points to
               | a member of Base2 in the first place - so the information
               | about this adjustment has to be stored in the pointer, as
               | well, and has to be dynamically loaded and applied at
               | call site (which is yet another step making it slower
               | than a regular virtual call, by the way). And this cannot
               | be done with pre-generated stubs since you don't know in
               | advance which classes in other compilation units might
               | inherit from Base2 later on, and what layout they are
               | going to have.
               | 
               | If anyone is designing a new language, I strongly suggest
               | not doing anything like C++ unbound member function
               | pointers in the first place; the practical need for such
               | a thing is very rare, since in most cases you want the
               | pointer to be bound to a particular object (and then
               | vtable resolution can be done at the point where the
               | pointer is produced, rather than at the point where it's
               | used). And, on the other hand, in those rare cases where
               | such a thing is needed, it can be trivially done with
               | closures if the language has those (and it should, given
               | their much broader utility). If C++ had lambdas from the
               | get go, I very much doubt that it would also have
               | pointers to member functions.
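As a sketch of the closure alternative mentioned above, assuming the same `Base1`/`Base2`/`Derived` shape with member bodies filled in for illustration (`bind_foo` and `unbound_foo` are hypothetical names). In both forms the upcast from `Derived` to `Base2` happens where the static type is known, so no adjustment has to be stored in the callable. Note that this simple version still performs the virtual call when the closure is invoked; it does not resolve the vtable entry up front.

```cpp
#include <cassert>
#include <functional>
#include <string>

struct Base1 {
    virtual ~Base1() = default;
    int padding = 0;  // makes Base1 non-empty, so Base2 sits at an offset
};

struct Base2 {
    virtual ~Base2() = default;
    virtual std::string foo() { return "Base2::foo"; }
};

struct Derived : Base1, Base2 {
    std::string foo() override { return "Derived::foo"; }
};

// Bound form: the closure captures the object. The caller must keep `obj`
// alive for as long as the returned callable is used.
std::function<std::string()> bind_foo(Base2& obj) {
    return [&obj] { return obj.foo(); };
}

// Unbound form: a closure that takes the object as a parameter, playing the
// role of a pointer to member function.
const auto unbound_foo = [](Base2& obj) { return obj.foo(); };

void demo_closures() {
    Derived d;
    Base2 b;
    assert(bind_foo(d)() == "Derived::foo");  // Derived& -> Base2& upcast here
    assert(unbound_foo(d) == "Derived::foo");
    assert(unbound_foo(b) == "Base2::foo");
}
```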
        
           | mandarax8 wrote:
           | I meant instead of casting the vtable pointer and guessing
           | the layout (imagine doing that in a large codebase...), just
           | define the struct yourself and disallow the virtual functions
           | in this case specifically if you're chasing the performance
           | that badly.
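One way to read this suggestion, as a minimal sketch with hypothetical names (`ShapeOps`, `ShapeRef`, `Rect`): the table of function pointers is an ordinary struct whose layout is defined in the source, so nothing has to be guessed about compiler-generated vtables.

```cpp
#include <cassert>

// Hand-written "vtable": a plain struct of function pointers.
struct ShapeOps {
    int (*area)(const void* self);
    int (*sides)(const void* self);
};

struct Rect {
    int w;
    int h;
};

static int rect_area(const void* self) {
    const Rect* r = static_cast<const Rect*>(self);
    return r->w * r->h;
}

static int rect_sides(const void*) { return 4; }

// One shared table per concrete type.
const ShapeOps rect_ops = {rect_area, rect_sides};

// A handle pairs the table with the object it describes.
struct ShapeRef {
    const ShapeOps* ops;
    const void* self;
};

int area(ShapeRef s) { return s.ops->area(s.self); }

void demo_struct_of_pointers() {
    Rect r{3, 4};
    ShapeRef s{&rect_ops, &r};
    assert(area(s) == 12);
    assert(s.ops->sides(s.self) == 4);
}
```

The trade-off is that type safety is now manual: passing a `ShapeRef` whose `self` does not match its `ops` is undefined behaviour that the compiler will not catch.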
        
       | aappleby wrote:
       | Having spent a good chunk of my career optimizing C++ game
       | engines - don't do this.
       | 
       | If virtual vs non-virtual function calls are causing you
       | performance problems, you're calling way too many functions per
       | frame. Batch the objects by concrete type and process them all in
       | a loop.
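A small sketch of the batching advice, assuming hypothetical `Particle` and `Enemy` types and a `World` container that are not from any real engine. Each concrete type lives in its own contiguous array and is updated in its own loop, so there is no per-object dispatch at all.

```cpp
#include <cassert>
#include <vector>

struct Particle {
    float x;
    float vx;
};

struct Enemy {
    float x;
    int hp;
};

// One homogeneous array per concrete type instead of one array of
// base-class pointers.
struct World {
    std::vector<Particle> particles;
    std::vector<Enemy> enemies;
};

void update(World& w, float dt) {
    // Each loop calls non-virtual code on contiguous data, which the
    // compiler is free to inline and vectorize.
    for (Particle& p : w.particles) p.x += p.vx * dt;
    for (Enemy& e : w.enemies) {
        if (e.hp > 0) e.x += 1.0f * dt;  // only living enemies move
    }
}

void demo_batching() {
    World w;
    w.particles.push_back(Particle{0.0f, 2.0f});
    w.enemies.push_back(Enemy{0.0f, 10});
    w.enemies.push_back(Enemy{0.0f, 0});
    update(w, 0.5f);
    assert(w.particles[0].x == 1.0f);
    assert(w.enemies[0].x == 0.5f);
    assert(w.enemies[1].x == 0.0f);
}
```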
        
         | nly wrote:
         | GCC even has a "bound member functions" extension to help here
         | 
         | https://gcc.gnu.org/onlinedocs/gcc/Bound-member-functions.ht...
        
       ___________________________________________________________________
       (page generated 2024-03-21 23:01 UTC)