[HN Gopher] Fast Virtual Functions: Hacking the VTable for Fun a...
___________________________________________________________________
Fast Virtual Functions: Hacking the VTable for Fun and Profit
Author : danny00
Score : 26 points
Date : 2024-03-19 12:39 UTC (2 days ago)
(HTM) web link (medium.com)
(TXT) w3m dump (medium.com)
| 2716057 wrote:
| I'm wondering why the "final" keyword is not mentioned in the
| article.
| repelsteeltje wrote:
| What does "final" mean in C++? Is that new? Maybe you're
| confusing it with Java?
| canucker2016 wrote:
| C++11 adds support for specifying "final" on derived virtual
| functions to prevent further overriding of the function by
| any derived classes of the current "final" class.
| sudosysgen wrote:
| https://en.cppreference.com/w/cpp/language/final
|
| Final in C++ indicates that a virtual function can no longer be
| overridden in further child classes, or that a class can no
| longer be derived from. This enables devirtualization: if the
| compiler can infer at compile time that the type is the final
| one, it knows the function pointer will not change.
| epcoa wrote:
| > Is that new?
|
| No. In fact in a few short months post C++11 will have
| overtaken pre-C++11 as the majority of the 26 year history of
| standardized C++ (and similar to prior standards, compilers
| for the fortunate implemented much of the behavior prior to
| the official publication).
| cout wrote:
| This is a good point. Using final and a static cast to the
| derived type should eliminate the indirect call.
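|
| A minimal sketch of the "final plus static cast" idea, assuming
| hypothetical Shape and Circle types that are not from the
| article. Whether the indirect call actually disappears depends on
| the compiler and optimization level; the code only shows the
| shape of the technique.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

// `final` promises that nothing derives from Circle, so Circle::name
// cannot be overridden any further.
struct Circle final : Shape {
    std::string name() const override { return "circle"; }
};

// Plain virtual call: the compiler generally has to go through the
// vtable unless it can prove the dynamic type.
std::string via_base(const Shape& s) {
    return s.name();
}

// The caller asserts the dynamic type with static_cast. Because Circle
// is final, the call target is known statically, so a compiler may emit
// a direct (and inlinable) call. Undefined behavior if `s` is not
// actually a Circle.
std::string via_final(const Shape& s) {
    return static_cast<const Circle&>(s).name();
}
```

| Both functions return the same result for a Circle; only the
| generated call sequence may differ.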
| Dwedit wrote:
| Branch prediction of Virtual Calls has traditionally been to look
| at the address of the calling instruction, and assume it will be
| the same target as last time.
|
| But the real penalty for using Virtual Calls is the potential
| loss of inlining. It depends on how smart your compiler is.
| kazinator wrote:
| Virtual function APIs have to be designed such that they don't
| suffer from lack of inlining.
|
| E.g. if you're making a graphics display driver, don't have
| objref->putpixel(x, y) as your only virtual function, which has
| to be called thousands of times to draw a line or millions to
| fill an area.
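|
| One way to read this advice, sketched with a hypothetical
| Display interface (the names and the in-memory framebuffer are
| assumptions, and bounds checking is left out): make the virtual
| boundary a whole primitive, and keep the per-pixel work in a
| non-virtual helper the compiler can inline.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class Display {
public:
    virtual ~Display() = default;
    // One virtual call per primitive, not one per pixel.
    virtual void fill_rect(int x, int y, int w, int h, std::uint32_t color) = 0;
};

class MemoryDisplay final : public Display {
public:
    MemoryDisplay(int width, int height)
        : width_(width),
          pixels_(static_cast<std::size_t>(width) * height, 0) {}

    void fill_rect(int x, int y, int w, int h, std::uint32_t color) override {
        for (int row = y; row < y + h; ++row)
            for (int col = x; col < x + w; ++col)
                put_pixel(col, row, color);  // non-virtual, inlinable
    }

    std::uint32_t pixel_at(int x, int y) const {
        return pixels_[static_cast<std::size_t>(y) * width_ + x];
    }

private:
    // No bounds checks: the caller is assumed to stay inside the buffer.
    void put_pixel(int x, int y, std::uint32_t color) {
        pixels_[static_cast<std::size_t>(y) * width_ + x] = color;
    }

    int width_;
    std::vector<std::uint32_t> pixels_;
};
```

| Filling a 3x2 rectangle through a Display& costs one indirect
| call here instead of six.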
| ack_complete wrote:
| Modern CPUs support indirect branch prediction based on global
| branch history; they've advanced past simple predict-not-taken
| or predict-last for indirect calls. Intel has been doing it
| since Haswell.
| smallmancontrov wrote:
| They are also useful for hacking in the reverse engineering
| sense: an object's vtable pointer usually leads to very strong
| hints about what the object is, which is very important
| information in the very common situation where the data and code
| don't make it super clear :)
| andy_xor_andrew wrote:
| on this topic, it's interesting how my mindset changes when I'm
| writing Rust vs Python, with regards to vtables, dispatching, and
| allocation.
|
| Writing Rust for a toy project: "I MUST avoid allocation, and the
| dyn keyword, at all cost!!"
|
| Writing the same toy project in Python: "lol just copy the string
| who cares"
| kukkamario wrote:
| Perspective really changes when simple integer addition can
| take more time than allocation in a compiled language.
| mandarax8 wrote:
| Just go all out C-style: forgo virtual functions entirely and
| use a plain struct-of-pointers instead of this weird
| construction.
| 95014_refugee wrote:
| You... haven't worked on a large codebase, have you?
|
| The part that baffles me about this entire post is that it's
| trivial to obtain pointers to member functions legally, without
| the fragility associated with guessing VTable offsets.
| int_19h wrote:
| If you're referring to C++ pointer-to-member types, the
| semantics of those still require them to perform virtual
| dispatch at point of use. That is, given:
|
|     struct Base { virtual void foo(); };
|     struct Derived : Base { virtual void foo(); };
|
|     auto p = &Base::foo;
|     Derived d;
|     (d.*p)();
|
| The last line must invoke Derived::foo, not Base::foo. Which
| in turn means that the representation of a pointer-to-member-
| function cannot be a simple function pointer in the most
| general case. The representation must store enough
| information to know whether it's pointing to virtual or non-
| virtual member, and then store either direct pointer to code
| or vtable offset depending on that.
|
| Consequently, an indirect call via pointer-to-member has to
| do the virtual/non-virtual check first, and then look up
| vtable by index if virtual, before performing the actual
| call. Which is in fact _more_ work than doing the virtual
| call directly (since the first step is unnecessary in that
| case), and thus pointers-to-members cannot be used to
| optimize away the overhead.
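|
| A small runnable sketch of that behavior. The functions return
| strings so the result can be checked, and the helper names are
| made up for illustration. The qualified call is included only as
| a contrast: it is the ordinary way to suppress virtual dispatch.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string foo() { return "Base::foo"; }
};

struct Derived : Base {
    std::string foo() override { return "Derived::foo"; }
};

// Calls through an unbound pointer-to-member. The pointer was formed
// from &Base::foo, yet the call still dispatches on the dynamic type
// of `obj`.
std::string call_through_pmf(Base& obj) {
    std::string (Base::*p)() = &Base::foo;
    return (obj.*p)();
}

// A qualified call names the exact function, so no virtual dispatch
// takes place.
std::string call_qualified(Base& obj) {
    return obj.Base::foo();
}
```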
| o11c wrote:
| > the representation of a pointer-to-member-function cannot
| be a simple function pointer
|
| Sure it can, you just need to have emitted separate
| functions for "concrete call" (the traditional one stored
| in the vtable) and "virtual call" (a bit of code that calls
| through the vtable).
|
| If this is not done it's probably due to the fact that
| actually emitting the virtual calls involves more code than
| just storing the vtable index. But if anyone is
| implementing a new language I strongly suggest paying the
| cost anyway, for sanity.
| int_19h wrote:
| There are other complications at play for your proposed
| solution.
|
| If you emit a stub for each virtual member function that
| performs the vtable dispatch, at which point do you emit
| it? If you do it if and when the address is actually
| taken, then pointers to members in different compilation
| units will have different stubs (COMDAT folding can take
| care of this for static linking, but not for dynamic) -
| which means that &Foo::bar taken in one unit will not
| compare equal to the same taken in another, which is
| counter to the C++ specification.
|
| Alternatively, you could emit such a stub proactively for
| each vtable entry on the basis that its address _might_
| be taken in another compilation unit - but that means a
| lot of unused stubs.
|
| The other problem is multiple inheritance combined with
| upcasting. Consider:
|
|     struct Base1 { ... };
|     struct Base2 { virtual void foo(); };
|     struct Derived : Base1, Base2 { ... };
|
|     void (Derived::*pf)() = &Base2::foo;
|     Derived d;
|     (d.*pf)();
|
| The problem here is that your stub for Base2::foo expects
| `this` to point at the beginning of Base2. But then when
| it is invoked on an instance of Derived, what's readily
| available at call site is a pointer to the beginning of
| Derived. Which is _not the same_, because Base2 is the
| _second_ subobject of the Derived instance - so you need
| to adjust that pointer to Derived by sizeof(Base1) chars
| to upcast it to a pointer to Base2. However, you don't
| know at call site that your Derived::* actually points to
| a member of Base2 in the first place - so the information
| about this adjustment has to be stored in the pointer, as
| well, and has to be dynamically loaded and applied at
| call site (which is yet another step making it slower
| than a regular virtual call, by the way). And this cannot
| be done with pre-generated stubs since you don't know in
| advance which classes in other compilation units might
| inherit from Base2 later on, and what layout they are
| going to have.
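|
| The adjustment can be observed with a sketch like the one
| below. The data members and helper functions are assumptions
| added for illustration, and the nonzero offset is what common
| ABIs produce for this layout rather than something the standard
| guarantees.

```cpp
#include <cstddef>

struct Base1 {
    virtual ~Base1() = default;
    int a = 1;
};

struct Base2 {
    virtual ~Base2() = default;
    virtual int foo() { return b; }  // reads a Base2 member through `this`
    int b = 2;
};

struct Derived : Base1, Base2 {};

// Byte distance between the start of the Derived object and its Base2
// subobject. On common ABIs this is nonzero because Base1 comes first.
std::ptrdiff_t base2_offset(Derived& d) {
    auto* whole = reinterpret_cast<unsigned char*>(&d);
    auto* part  = reinterpret_cast<unsigned char*>(static_cast<Base2*>(&d));
    return part - whole;
}

// &Base2::foo converts implicitly to a pointer to member of Derived.
// The call only returns the right value if `this` is adjusted to the
// Base2 subobject before foo runs.
int call_via_derived_pmf(Derived& d) {
    int (Derived::*pf)() = &Base2::foo;
    return (d.*pf)();
}
```

| If the call site passed the unadjusted Derived pointer, foo
| would read `b` from the wrong location.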
|
| If anyone is designing a new language, I strongly suggest
| not doing anything like C++ unbound member function
| pointers in the first place; the practical need for such
| a thing is very rare, since in most cases you want the
| pointer to be bound to a particular object (and then
| vtable resolution can be done at the point where the
| pointer is produced, rather than at the point where it's
| used). And, on the other hand, in those rare cases where
| such a thing is needed, it can be trivially done with
| closures if the language has those (and it should, given
| their much broader utility). If C++ had lambdas from the
| get go, I very much doubt that it would also have
| pointers to member functions.
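|
| For the closure alternative, a rough C++ sketch with
| hypothetical Animal and Dog types. Note that this lambda still
| performs a virtual call when invoked; resolving the target at
| bind time, as described above, would be up to the language
| implementation.

```cpp
#include <functional>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string speak() const { return "..."; }
};

struct Dog : Animal {
    std::string speak() const override { return "woof"; }
};

// Binds the call to one particular object. The closure captures the
// object by reference, so it must not outlive that object.
std::function<std::string()> bind_speak(const Animal& a) {
    return [&a] { return a.speak(); };
}
```

| The caller ends up with a plain callable and never needs a
| pointer-to-member type.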
| mandarax8 wrote:
| I meant instead of casting the vtable pointer and guessing
| the layout (imagine doing that in a large codebase...), just
| define the struct yourself and disallow the virtual functions
| in this case specifically if you're chasing the performance
| that badly.
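|
| Roughly what a hand-written table could look like, using a
| made-up Counter type. Since the layout is declared explicitly,
| reading a function pointer out of it involves no guessing.

```cpp
struct Counter;

// Hand-written "vtable": its layout is whatever this struct says.
struct CounterOps {
    void (*increment)(Counter&);
    int  (*value)(const Counter&);
};

struct Counter {
    const CounterOps* ops;
    int n;
};

static void counter_increment(Counter& c) { c.n += 1; }
static int  counter_value(const Counter& c) { return c.n; }

static const CounterOps kCounterOps = {counter_increment, counter_value};

Counter make_counter() {
    return Counter{&kCounterOps, 0};
}
```

| A hot loop can cache `c.ops->increment` in a local function
| pointer once and call it directly afterwards.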
| aappleby wrote:
| Having spent a good chunk of my career optimizing C++ game
| engines - don't do this.
|
| If virtual vs non-virtual function calls are causing you
| performance problems, you're calling way too many functions per
| frame. Batch the objects by concrete type and process them all in
| a loop.
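|
| A minimal sketch of batching by concrete type, with invented
| Particle and Enemy types. Each loop calls a non-virtual function
| on a contiguous array, so the calls can be inlined.

```cpp
#include <vector>

struct Particle {
    float x;
    float vx;
    void update(float dt) { x += vx * dt; }
};

struct Enemy {
    float x;
    int hp;
    void update(float dt) { x -= dt; }
};

// One container per concrete type instead of a single
// std::vector<Base*> that is walked with virtual calls.
struct World {
    std::vector<Particle> particles;
    std::vector<Enemy> enemies;

    void update(float dt) {
        for (auto& p : particles) p.update(dt);
        for (auto& e : enemies) e.update(dt);
    }
};
```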
| nly wrote:
| GCC even has a "bound member functions" extension to help here
|
| https://gcc.gnu.org/onlinedocs/gcc/Bound-member-functions.ht...
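|
| A hedged sketch of how that extension is typically used. The
| types and the helper function are made up, the cast syntax
| follows the GCC manual, and the code falls back to an ordinary
| virtual call on compilers other than GCC.

```cpp
struct Base {
    virtual ~Base() = default;
    virtual int foo() { return 1; }
};

struct Derived : Base {
    int foo() override { return 2; }
};

// Plain function pointer that takes `this` as its first argument.
using FooFn = int (*)(Base*);

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpmf-conversions"
// GCC extension: resolve the virtual target once for this object,
// then call it repeatedly through a plain function pointer.
int call_foo_repeatedly(Base& obj, int times) {
    int (Base::*pmf)() = &Base::foo;
    FooFn fn = (FooFn)(obj.*pmf);
    int sum = 0;
    for (int i = 0; i < times; ++i) sum += fn(&obj);
    return sum;
}
#pragma GCC diagnostic pop
#else
// Portable fallback: ordinary virtual dispatch on every iteration.
int call_foo_repeatedly(Base& obj, int times) {
    int sum = 0;
    for (int i = 0; i < times; ++i) sum += obj.foo();
    return sum;
}
#endif
```

| The resolved pointer is only valid for objects of the same
| dynamic type as the one it was extracted from.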
___________________________________________________________________
(page generated 2024-03-21 23:01 UTC)