This directory contains the code for testing the PRISM matrix multiply
routine called BiMMeR.  The code has been converted to MPI.  When you
run the code (prism_bimmer_test), the output is written to a single
file specified by the last program argument.

All files have the following notice:
----------------------------------------------------------------------
   COPYRIGHT U.S. GOVERNMENT 
   
   This software is distributed without charge and comes with
   no warranty.

   Please feel free to send questions, comments, and problem reports
   to prism@super.org. 
----------------------------------------------------------------------


1) To make the standard version of the code, refer to the file
../../README_BIMMER for information on setting things up.  Normally
the make is done from that directory.  You can also issue a make from
this directory to make only the BiMMeR test codes.  However, this
requires that the libraries in the other directories have already
been made.

2) Next, you will probably want to run some jobs.  The standard
scripts are named "ARCH"-"COMM".run, where ARCH and COMM take the same
values as for the Makefile (see README_BIMMER).  These scripts use the
environment variable PRISM_PROG_DIR, which is set in the script.  The
output file goes into the directory $PRISM_PROG_DIR/outmm, which will
be created for you if it does not already exist.  A number of the
ARCHs don't depend on the COMM, so their scripts for the different
COMMs are the same.  When COMM != nx, the script runs the MPI
collective operations and the PRISM collective operations using MPI.
For the sun4 and CS2, the scripts use mpirun from ANL/MS.  You need to
have $PRISM_MPI_HOME/util in your path and
$PRISM_MPI_HOME/util/machines/machines."arch" must list the
appropriate machines (this is usually done by the MPI administrator).

For each run, as long as the reported error is small, everything is
ok.  You get the times for the slowest and fastest nodes, as well as
the average, for the whole code and for the three main parts.
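
To make the three reported numbers concrete, here is a small sketch in
Python of how a slowest/fastest/average summary is formed from
per-node times.  The test program computes these values itself; the
function name summarize_times and the sample times are made up for
illustration and do not come from a real run.

```python
def summarize_times(node_times):
    """Return (slowest, fastest, average) for a list of per-node times."""
    if not node_times:
        raise ValueError("need at least one node time")
    slowest = max(node_times)   # the node that took the longest
    fastest = min(node_times)   # the node that finished first
    average = sum(node_times) / len(node_times)
    return slowest, fastest, average


# Hypothetical times (in seconds) for a run on three nodes.
slowest, fastest, average = summarize_times([1.0, 2.0, 3.0])
```

A large gap between the slowest and fastest node usually points to
load imbalance, which is why both are worth looking at and not just
the average.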


The routine prism_bimmer_test has the following 16 input parameters:

M N K lambda vdim nw_x nw_y rows cols m_width n_width k_width spc
transa transb file

#1: The number of rows of matrix A and the number of rows of the
result matrix C.

#2: The number of columns of matrix B and the number of columns of the
result matrix C.

#3: The number of columns of A and rows of B.

#4: Factor to use when generating data for testing.  The exact value
is not very important.  0.5 is fine but 0.0 is a bad choice.

#5: The number of virtual processors to use.  If you are using the 2D
MPI topology of size (X,Y) then this value is typically the least
common multiple of X and Y [LCM(X,Y)].  For example, on a (2,2) 2D
topology, the value is 2.  On a (3,2) 2D topology, the value is 6.

#6: The x coordinate of the node where the data should begin in the 2D
topology.  For unshifted matrices this is 0.

#7: The y coordinate of the node where the data should begin in the 2D
topology.  For unshifted matrices this is 0.

#8: The number of nodes in the row direction.  On an (X,Y) 2D
topology, this would normally be X.

#9: The number of nodes in the column direction.  On an (X,Y) 2D
topology, this would normally be Y.

#10: The width of the panels in dimension M.  This is the size of the
blocks.  You can use 1.

#11: The width of the panels in dimension N.  This is the size of the 
blocks.  You can use 1.

#12: The width of the panels in dimension K.  This is the size of the 
blocks.  You can use 1.

#13: The spacing between panels.  Often this is 1.  If not, the
value must divide the number of virtual processors, vdim.

#14: If false, uses A; if true, uses A^t. (A^t = A transpose).

#15: If false, uses B; if true, uses B^t. (B^t = B transpose).

Note:  combinations of #14 & #15 can compute C = A*B, C = A^t*B, and
C = A*B^t. (C = A^t*B^t has not been implemented yet).

#16: The name of the file in which to place the output.  You can use stderr
or stdout, in addition to file names.
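
The constraints in the parameter descriptions above can be collected
into a small helper that assembles the 16 arguments in order.  This is
only a sketch in Python, not part of the PRISM distribution.  The
function name bimmer_test_args and its defaults are hypothetical, and
the sketch ASSUMES that the transpose flags (#14, #15) are passed as 0
or 1; check prism_bimmer_test.c for the form the program actually
expects.

```python
import math


def bimmer_test_args(m, n, k, rows, cols, outfile="stdout", lam=0.5,
                     nw_x=0, nw_y=0, m_width=1, n_width=1, k_width=1,
                     spc=1, transa=False, transb=False, vdim=None):
    """Return the 16 prism_bimmer_test arguments as a list of strings."""
    if vdim is None:
        # Typical choice from #5: the least common multiple of the
        # topology sizes, e.g. 2 for (2,2) and 6 for (3,2).
        vdim = rows * cols // math.gcd(rows, cols)
    if lam == 0.0:
        raise ValueError("lambda = 0.0 is a bad choice (see #4)")
    if spc < 1 or vdim % spc != 0:
        raise ValueError("spc must divide vdim (see #13)")
    if transa and transb:
        raise ValueError("C = A^t*B^t has not been implemented (see note)")
    # ASSUMPTION: transa and transb are written as 0 (false) or 1 (true).
    values = (m, n, k, lam, vdim, nw_x, nw_y, rows, cols,
              m_width, n_width, k_width, spc,
              int(transa), int(transb), outfile)
    return [str(v) for v in values]
```

For example, " ".join(bimmer_test_args(100, 100, 100, rows=3, cols=2))
gives an argument string for unshifted matrices on a (3,2) topology
with vdim set to 6.  How the program is launched (for example through
mpirun) depends on the machine, so use the "ARCH"-"COMM".run scripts
as the reference for that part.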



The following files are included in this directory:

Makefile: Input for make.

README: This file.

prism_bimmer_test.c: This is the main routine.  It accepts input,
makes the logic choices and outputs the results.

prism_v_error_chk.c: Checks the result of the matrix multiply from
BiMMeR to see whether it is correct.

"ARCH"-"COMM".run: These are scripts to run test jobs.  The allowed
"ARCH" and "COMM" are described in README_BIMMER.


The routines have a large number of cpp options.  Whenever you change
options you should do a "make cleanobjs; make" to make sure that the new
cpp option(s) are compiled for all routines.  THIS IS NOT AUTOMATIC.
The common CPP options are in README_BIMMER.  The options are:

-DPRISM_MPI_COLL: If set, then use MPI routines to perform the
collective operations.  You cannot use this with PRISM_NX.

-DPRISM_BC_READY: If you don't use -DPRISM_MPI_COLL then you get the
PRISM routine.  With the PRISM routine, you can set this option to use
ready type sends in the broadcast.  The routine switches at run time
to a standard send if the message is too small (the threshold is
defined in the routine).

-DPRISM_SKEW_READY: Does for skew what PRISM_BC_READY does for
broadcast.  Since the amount of data sent by each node can vary, not
all nodes will necessarily do ready sends.


Machine Specific Information
----------------------------

Paragon:

The Makefile uses extensions other than .o so that versions with
different cpp options can be made and maintained.  The icc linker will
give a message such as:
----------------------------------------------------------------------
icc-warning-file with unknown suffix passed to linker: prism_bimmer_test.o_mc
----------------------------------------------------------------------
This is only a warning message and can be ignored.


Delta:

The Makefile uses extensions other than .o so that versions with
different cpp options can be made and maintained.  The icc linker will
give a message such as:
----------------------------------------------------------------------
icc-warning-file with unknown suffix passed to linker: prism_bimmer_test.o_mc
----------------------------------------------------------------------
This is only a warning message and can be ignored.


SP:

The Makefile uses extensions other than .o so that versions with
different cpp options can be made and maintained.  The xlf linker will
give a message such as:
----------------------------------------------------------------------
xlf: 1501-218 file prism_test.o_ge_mc contains an incorrect file suffix
----------------------------------------------------------------------
This is only a warning message and can be ignored.


CM5:

The Makefile uses extensions other than .o so that versions with
different cpp options can be made and maintained.  The cmmd-ld linker will
give a message such as:
----------------------------------------------------------------------
cc: Warning: File with unknown suffix (prism_test_.o_ge_br) passed to ld
----------------------------------------------------------------------
This is only a warning message and can be ignored.


