Subj : Re: OO compilers and efficiency
To   : comp.programming
From : websnarf
Date : Wed Jul 20 2005 11:32 am

Flavius Vespasianus wrote:
> Brian <brian@rohan.sdsu.edu> wrote in news:dbj5f7$rdv$1@gondor.sdsu.edu:
> > Anyway, those are some of the things that kind of nag at me.
> > I think it's a safe statement to say C can beat any OO compiled
> > program pound for pound given the same programmer skill and
> > adherence to language goals.
>
> I once reviewed a book manuscript for a publisher, the author of which
> would entirely agree with you -- people who think they can code faster
> than the compiler can optimize. The book consisted of all kinds of tricks
> in C to make your code faster.
>
> I benchmarked all of his examples and EVERY one of his optimization
> tricks resulted in slower code.

Here's a trick I'd like to see "improved C++ code" for (requires
bstrlib, which can be found at http://bstring.sf.net/):

   /* The C++ way */
   a = CBString ("end");
   for (i=0; i < 100000; i++) {
       a = CBString (&tbstrTable[i%8]) + a;
   }

   /* The C way */
   a = bfromcstr ("end");
   for (i=0; i < 100000; i++) {
       bstring b = bstrcpy (&tbstrTable[i%8]);
       bconcat (b, a);
       bdestroy (a);
       a = b;
   }

Per iteration, the C++ way invokes 2 constructors, 2 destructors, 2
string copies and 1 concatenation.  The C way, though more complicated
looking, invokes 1 constructor (equivalent), 1 destructor (equivalent),
1 copy and 1 concatenation.  If there is any C++ compiler that is able
to optimize the first sequence to the equivalent of the second one, I'd
like to know about it.

(Of course, the real way to optimize this is to use the insert function
(a.insert(&table[i%8], 0), or binsert(a, &table[i%8], 0)) but one can
easily imagine slightly more complicated scenarios where this
simplification is not possible.)

But the reason that C is superior here is that it doesn't rely on
implicit storage for the base of the object.  The "a = b" line is just
a pointer assignment, which essentially swaps the object base without
copying any string data.
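
To make the pointer-swap point concrete without depending on bstrlib,
here is a minimal sketch of the same loop shape.  The str_buf type and
the str_prepend_new, str_free and build_prefixed functions are
hypothetical names invented for this illustration; they are not
bstrlib's API, and the sketch builds each new string in one allocation
rather than with a copy followed by a concatenation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal string type; NOT bstrlib's struct tagbstring. */
typedef struct {
    size_t len;   /* bytes in use, excluding the terminating '\0' */
    char  *data;  /* buffer of len + 1 bytes */
} str_buf;

/* Allocate a new string holding prefix followed by suffix.
   Returns NULL if either allocation fails. */
static str_buf *str_prepend_new(const char *prefix, const str_buf *suffix)
{
    size_t plen;
    str_buf *s;

    assert(prefix != NULL && suffix != NULL);
    plen = strlen(prefix);
    s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->len = plen + suffix->len;
    s->data = malloc(s->len + 1);
    if (s->data == NULL) { free(s); return NULL; }
    memcpy(s->data, prefix, plen);
    memcpy(s->data + plen, suffix->data, suffix->len + 1); /* copies '\0' */
    return s;
}

static void str_free(str_buf *s)
{
    if (s != NULL) { free(s->data); free(s); }
}

/* Prepend table[i % count] to "end" for i = 0 .. n-1. */
static str_buf *build_prefixed(const char *const *table, size_t count,
                               size_t n)
{
    str_buf empty = { 0, "" };   /* read-only seed, never freed */
    str_buf *a = str_prepend_new("end", &empty);
    size_t i;

    for (i = 0; a != NULL && i < n; i++) {
        str_buf *b = str_prepend_new(table[i % count], a);
        str_free(a);  /* one destroy per iteration */
        a = b;        /* pointer assignment only; no string data moves */
    }
    return a;         /* NULL if any allocation failed */
}
```

The caller owns the result and releases it with str_free.  Each
iteration does one construct, one destroy and one pointer rebind, which
is the property the "a = b" line is being credited with.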

A similar case is block initialization.  STL's vectors, for example,
require each element to be constructed individually as the vector
grows, including empty constructors for "anticipated elements".  A
comparable implementation in C can just malloc big blocks of pointers
and set them all to NULL in a single for-loop or memset or whatever.
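
A rough sketch of that C-side block initialization follows.  The
ptr_table type and ptr_table_reserve function are hypothetical names
made up for this example.  It uses the for-loop variant because the C
standard does not promise that all-bits-zero is a null pointer, even
though memset works on the usual platforms.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable table of pointers. */
typedef struct {
    void  **slots;  /* cap entries, unused ones are NULL */
    size_t  cap;
} ptr_table;

/* Grow the table to at least new_cap slots.  Existing entries are
   kept; every new slot is set to NULL in a single loop.
   Returns 0 on success, -1 on failure (table left unchanged). */
static int ptr_table_reserve(ptr_table *t, size_t new_cap)
{
    void **p;
    size_t i;

    assert(t != NULL);
    if (new_cap <= t->cap) return 0;
    if (new_cap > (size_t)-1 / sizeof *p) return -1;  /* size overflow */

    p = realloc(t->slots, new_cap * sizeof *p);
    if (p == NULL) return -1;

    for (i = t->cap; i < new_cap; i++)
        p[i] = NULL;

    t->slots = p;
    t->cap = new_cap;
    return 0;
}
```

Start from a table initialized to { NULL, 0 } and free t.slots when
done.  No per-element constructor runs; the only per-slot work is the
NULL store.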

This also applies to the creation of complicated data structures, say,
like tries.  In C you can make "custom allocators" which will allocate
many nodes at once in a single malloc, and hand them to your ADT
creator in a fast stack-like manner.  Then when you need to destroy the
ADT, you collapse the memory by freeing it a block at a time.
Since in C++ the base of your object's memory is implicit and
individualized, you can't leverage these sorts of "block" allocation
schemes.  This doesn't just slow things down with respect to memory
management, but worsens your cache density (since each entry becomes
individually allocated, and thus incurs some overhead per object.)
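
As an illustration of the kind of block allocator meant here, a small
sketch follows.  All of the names (trie_node, node_block, node_pool,
pool_alloc, pool_destroy) and the block size of 256 are assumptions
chosen for the example, and the trie is binary only to keep it short.

```c
#include <assert.h>
#include <stdlib.h>

#define NODES_PER_BLOCK 256  /* arbitrary size chosen for this sketch */

/* Hypothetical binary trie node. */
typedef struct trie_node {
    struct trie_node *child[2];
    int terminal;
} trie_node;

/* One malloc'd block holding many nodes, handed out stack-style. */
typedef struct node_block {
    struct node_block *next;  /* previously filled block, or NULL */
    size_t used;              /* nodes handed out from this block */
    trie_node nodes[NODES_PER_BLOCK];
} node_block;

typedef struct {
    node_block *head;         /* block currently being filled */
} node_pool;

/* Hand out the next node, calling malloc only when a block is full.
   Returns NULL if a new block cannot be allocated. */
static trie_node *pool_alloc(node_pool *pool)
{
    trie_node *n;

    assert(pool != NULL);
    if (pool->head == NULL || pool->head->used == NODES_PER_BLOCK) {
        node_block *b = malloc(sizeof *b);
        if (b == NULL) return NULL;
        b->next = pool->head;
        b->used = 0;
        pool->head = b;
    }
    n = &pool->head->nodes[pool->head->used++];
    n->child[0] = n->child[1] = NULL;
    n->terminal = 0;
    return n;
}

/* Destroy every node at once: one free per block, not per node. */
static void pool_destroy(node_pool *pool)
{
    node_block *b;

    assert(pool != NULL);
    b = pool->head;
    while (b != NULL) {
        node_block *next = b->next;
        free(b);
        b = next;
    }
    pool->head = NULL;
}
```

Nodes from the same block sit next to each other in memory with no
per-node allocator header between them, which is the cache density
argument.  The trade-off in this sketch is that individual nodes cannot
be freed; the whole pool goes away together.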

Perhaps there is something fundamental about C++, or OO in general that
I am not understanding, but how does C++ address this?

> > Does everyone drop into C for critical code?
>
> Never. C is a horrible programming language that has caused the
> software industry decades of suffering.

Well, C is a *dangerous* language with a pitiful standard library.
But one can write one's own libraries and use tools and standards that
help with coding safety.

-- 
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/
