Newsgroups: comp.lang.c
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!sarah!bingnews!kym
From: kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell)
Subject: Re: Log Library - How is it done in the library code?
Message-ID: <1991Mar20.204034.28931@bingvaxu.cc.binghamton.edu>
Organization: State University of New York at Binghamton
References: <702@newave.UUCP> <1991Mar16.201655.6104@bingvaxu.cc.binghamton.edu> <1991Mar20.173249.3819@zoo.toronto.edu>
Date: Wed, 20 Mar 1991 20:40:34 GMT

In article <1991Mar20.173249.3819@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <1991Mar16.201655.6104@bingvaxu.cc.binghamton.edu> kym@bingvaxu.cc.binghamton.edu (R. Kym Horsell) writes:
>>So we see that on _some_ hardware (like 68k's) the library routines are
>>at an apparent _big_ disadvantage...
>
>No, actually, we see that on some hardware/software combinations the library
>routines are at a big disadvantage.  In particular, on that Sun 3/60, did
>you compile with -f68881 and use the inlining facility for the math library?
>If not, you were timing the calling overhead, not the log function.

No, usually I just say `-O4' and let it go at that. However, if you
_wanna_ see what happens with `inlining', on a Sun 3 you get this
(I was surprised):

-O4         -f68881     -O4/-f68881   -fsoft      -O4/-fsoft

0.356631    0.414894    0.280899      0.360215    0.355993


Apparently the `inline' option -f68881 does cut down somewhat
on (presumably) the overhead of calling (essentially only) the log
function.

The global analysis (plus a few other fancy things that don't really
apply to this program) done by -O4 is _almost_ as good as inlining (i.e.
calling overhead of about (0.415-0.357)/0.415 = 14% seems to have been
eliminated).

However, look at the combination of -O4 and -f68881! A bit hard to
understand how things can go _backwards_ for the library routine by
simply doing both things. Instead of my little subroutine running only
about 3 times faster, combining both switches makes it run almost 4
times faster than the library routine! Amazing! Perhaps Henry might
explain this one (my brain is hurting at the moment)?

As a kind of joke -- and a slight counterexample to Henry's statement
above -- I tried the -fsoft option that, I presume (from the man page
anyway), restricts everything to using software fp routines.
Almost the same comparison as the original -O4 result. Perhaps there
_isn't_ that much variation on a given hardware, despite the various fancy
compiler options. (Although I presume a DIFFERENT compiler and library on 
the same platform might behave quite differently).

Cheers,

-kym
