
   ***** THERE IS ABSOLUTELY NO WARRANTY ON THIS PACKAGE. *****

I provide this as a service to the Linux Community.  Don't sue me if
it's broken.

--------------------------------------------------------------------

            ELF shared LAPACK and BLAS libraries for Linux

INTRODUCTION:

LAPACK is a comprehensive FORTRAN library for linear algebra
operations, including matrix inversion, least-squares solutions of
linear systems of equations, eigenanalysis, SVD etc. It is a
reputable package that has found extensive use in the scientific
community. It supersedes LINPACK and EISPACK, combining both in one
package. The BLAS (Basic Linear Algebra Subprograms) perform
operations such as matrix multiplication and Givens rotations, and
are usually hand optimized for specific machines (as in DEC's DXML
library). In this case, the original FORTRAN version is used. You can
get more information from
http://www.netlib.org/lapack/lug/lapack_lug.html.

Dave Weber (weber@young.ece.cmu.edu) recently (Nov. 1995) released a
binary ELF distribution of LAPACK for Linux, compiled with f2c and
gcc-2.7.0.  This release is compiled with g77 (GNU Fortran), giving a
significant improvement in performance (almost a factor of two!).  The
library can be used with g77 or with f2c+gcc.  It was compiled using
g77 version 0.5.17 built into gcc version 2.7.2.  I used a version of
ETIME written by David Klein.  The patch (modifications to the
Makefiles, and etime.c) is included; it was written by Dave Weber.
(All I did was compile the stuff; most of this README file was also
shamelessly stolen from Dave Weber's distribution.)

INSTALLATION:

Install the libraries as follows: Copy liblapack.so.2.0.1 and
libblas.so.1.0.1 (or static/libblas.a, see the section on performance)
to /usr/lib.  Then type "ldconfig -v".

You will also need a libf2c.  This comes with the f2c translator, or
with the g77 package.  I have included the version of the library that
comes with g77 version 0.5.17 (are all versions identical?) as
f2c/libf2c.a.  NOTE: This is a static library; I didn't think about
shared libs when I compiled g77, and I have since erased the source.
The package does work fine with the shared versions of libf2c
supplied with f2c.

Link the libraries by adding them to your compile/link command line,
as in these examples:

  gcc try.c -o try -llapack -lblas -lf2c -lm
or
  g77 try.f -o try -llapack -lblas -lm
or
  fort77 try.f -o try -llapack -lblas -lm

(Use the first version for C/C++ programs, the second and third for
Fortran programs.)
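
For C programs, keep in mind that the routines follow Fortran
conventions.  The sketch below is not part of this distribution.  It
assumes the usual g77/f2c behavior: external names are lower case
with a trailing underscore (DGESV becomes dgesv_), every argument is
passed by pointer, and matrices are stored column by column.  The
helper name row_to_col_major is hypothetical, and the prototype shown
in the comment assumes that Fortran INTEGER matches C int on your
system (f2c.h uses its own typedefs, so check there first).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed prototype for the LAPACK linear solver when called from C
 * (an assumption, not taken from this package; verify the integer
 * type against your f2c.h):
 *
 *   void dgesv_(int *n, int *nrhs, double *a, int *lda,
 *               int *ipiv, double *b, int *ldb, int *info);
 */

/*
 * Hypothetical helper: copy a row-major C matrix (rows x cols) into
 * column-major order, the layout Fortran-compiled routines expect.
 * dst must have room for rows * cols doubles and must not overlap src.
 */
void row_to_col_major(const double *src, int rows, int cols, double *dst)
{
    int i, j;

    assert(src != NULL && dst != NULL && rows > 0 && cols > 0);
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            dst[(size_t)j * rows + i] = src[(size_t)i * cols + j];
}
```

After the conversion, the column-major array and the leading
dimension (here equal to rows) would be handed to the routine by
pointer, and info checked for zero on return.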

If you are new to ELF binaries, check out the ELF howto at 
http://www.cs.unc.edu/linux-docs/linux.html


HOW WAS IT COMPILED?

The entire library compiles fine with f2c+gcc or g77 and performs
almost flawlessly in the extensive testing supplied with LAPACK and
the BLAS. The outputs of all the test routines are included, and the
routines that failed are listed here.

List of tests that failed (g77):

DGEES:
 DES:    1 out of 3270 tests failed to pass the threshold
DGEESX:
 DSX:    1 out of 3502 tests failed to pass the threshold
(see the file TESTING/ded.out)

I have no idea why these fail, but each is only 1 failure in over
3000 tests (versus two failures if compiled with f2c+gcc).  If the
estimate of the rounding error returned by DLAMCH is doubled, all
tests pass.
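
To see why doubling the rounding-error estimate can turn a failure
into a pass, here is a minimal sketch of a threshold test.  It is not
the LAPACK test code: the function names, and the idea that the ratio
is simply error / (scale * eps), are assumptions made for
illustration.  The point is only that the ratio is inversely
proportional to eps, so a ratio that lies less than a factor of two
above the threshold drops below it when eps is doubled.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of LAPACK: express a raw error in
 * units of the rounding-error estimate eps, scaled by a problem-
 * dependent factor (for example a matrix norm times its dimension).
 */
double test_ratio(double error, double scale, double eps)
{
    assert(scale > 0.0 && eps > 0.0);
    return error / (scale * eps);
}

/* Return 1 if the scaled error stays below the threshold, else 0. */
int passes_threshold(double error, double scale, double eps, double thresh)
{
    return test_ratio(error, scale, eps) < thresh;
}
```

With made-up numbers, an error of 3.0e-15 with scale 1.0 and eps
1.0e-16 gives a ratio of about 30 and fails a threshold of 20; with
eps doubled to 2.0e-16 the ratio is about 15 and the test passes.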

ALL STANDARD DISCLAIMERS APPLY (USE AT YOUR OWN PERIL). SEND YOUR
LAWYERS TO /dev/null.


PERFORMANCE

I have compared the speed of the level 3 BLAS routines, using a
benchmark program I found on Netlib.  I ran it using Dave Weber's
f2c+gcc compiled library, and also using g77 compiled static and
dynamic libraries.  Here are the results:

			   ------- Speed in Megaflops -------
Routine			f2c+gcc		g77		g77
			(shared)	(shared)	(static)
=================================================================
DSYMM			2.9		5.3		5.6
DSYRK			2.7		5.5		5.6
DSYR2K			3.2		6.0		6.0
DTRMM			2.6		4.6		5.1
DTRSM			2.4		4.1		4.5
-----------------------------------------------------------------
Average			2.76		5.10		5.36
-----------------------------------------------------------------
rel. to f2c+gcc 	100%		185%		194%
rel. to g77(sh)		54%		100%		105%
rel. to g77(st)		51%		95%		100%
=================================================================

So I see almost a factor of two increase in performance!  I also see
the expected 5% penalty of the shared libraries, coming from losing
one register.  I think it might make sense to use a static BLAS with
a dynamic LAPACK, since most of the hard work AFAIK is done in the
BLAS part, and since the BLAS library is rather small.  The penalty
is slightly larger executables, but as long as only one program
(using BLAS) is in memory at a time, it shouldn't use more RAM.  Of
course, if two programs are running simultaneously and just a little
swapping occurs, the 5% gain is lost.  I have provided both the
static and the dynamic version of BLAS in this distribution.
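
The summary rows of the table can be re-derived from the per-routine
figures.  The sketch below does that arithmetic; the array and
function names are made up for illustration, and the numbers are
copied from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* Megaflops per routine, in table order:
   DSYMM, DSYRK, DSYR2K, DTRMM, DTRSM. */
const double f2c_gcc[5]    = {2.9, 2.7, 3.2, 2.6, 2.4};
const double g77_shared[5] = {5.3, 5.5, 6.0, 4.6, 4.1};
const double g77_static[5] = {5.6, 5.6, 6.0, 5.1, 4.5};

/* Arithmetic mean of n values (the "Average" row). */
double mean(const double *v, int n)
{
    double sum = 0.0;
    int i;

    assert(v != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum / n;
}

/* Speed of one build as a percentage of a reference build
   (the "rel. to" rows). */
double relative_percent(double value, double reference)
{
    assert(reference != 0.0);
    return 100.0 * value / reference;
}
```

For example, relative_percent(mean(g77_shared, 5), mean(f2c_gcc, 5))
is 5.10 / 2.76, which rounds to the 185% shown in the table.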


AVAILABILITY:

The package is available from sunsite.unc.edu in
/pub/Linux/devel/lang/fortran/lapack-linux-2.0.1-elf.tar.gz

It is also available from my web page:
http://nils.wustl.edu/schiotz/lapack-linux.html 


CREDITS:

The original version of LAPACK and the LAPACK Users Guide can be found
at http://www.netlib.org/lapack/

The Linux port was made by Dave Weber, weber@young.ece.cmu.edu or 
http://www.ece.cmu.edu/afs/ece/usr/weber/.home-page.html.

etime.c was written by David Klein.

--
Jakob Schiøtz
Email:  schiotz@howdy.wustl.edu
WWW:    http://nils.wustl.edu/schiotz.html
