Briefly:

Run htool from this directory.

Click "graph:" and select lu.gr.

Click "config" and fill in at least one host and a cost for each subroutine.

Click "store", "cost.mat" and "ok".

Click "build", "write wrappers", "write makefile", "yes", "make clean",
	"make", "make install" (whew!)

Click "start pvm".

Click "execute".

The output should be a matrix that looks like this:

    6.0000   18.0000  -12.0000    3.0000
    0.6667   -9.0000   10.0000    2.0000
    0.5000   -0.8889   24.8889    1.2778
    0.3333    0.2222   -0.0089    0.5670

To see/modify the input matrix, go into "compose" mode and shift-right-click
the initarray node (you may have to enlarge the window and/or click "cleanup"
before it appears on the screen).

The stuff that looks like:

    static double _a[] = {
	3,  17,  10,  1,
	2,   4 , -2,  2,
	6,  18, -12,  3,
	4,   3,   2,  4,
    };
 
is the input matrix.  Add or delete rows and columns as you please, but keep
it square: the current implementation only works for square matrices.

If you change the size of the matrix, then also right-click on the initarray
module and change the n parameter to reflect the new size of the array.

Also ensure that nb, the block size, evenly divides n.
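
The two constraints on n and nb can be checked before running anything.
This is only a sketch: params_ok and blocks_per_side are hypothetical helper
names and are not part of htool or of this example's subroutines.

```c
#include <assert.h>

/* Hypothetical helper: nonzero if an n x n matrix can be split into
   nb x nb blocks, i.e. both values are positive and nb divides n. */
static int params_ok(int n, int nb)
{
    return n > 0 && nb > 0 && n % nb == 0;
}

/* Hypothetical helper: number of blocks along one side of the matrix. */
static int blocks_per_side(int n, int nb)
{
    assert(params_ok(n, nb));
    return n / nb;
}
```

For the default 4x4 matrix, nb may be 1, 2 or 4; nb = 3 is rejected.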

The pivot vector (the row permutation chosen by partial pivoting) is computed
but not printed.  Printing it could easily be added to the output node if
desired.
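
To check the output by hand, or to see what a printed pivot vector might look
like, here is a small unblocked reference sketch.  It is not the code behind
the graph nodes: lu_ref, print_lu and dabs are made-up names, the matrix is
assumed to be stored row-major, and the pivot layout (piv[k] is the 0-based
row swapped with row k at step k) is an assumption that may differ from what
the pivot subroutine actually stores.

```c
#include <assert.h>
#include <stdio.h>

static double dabs(double x) { return x < 0 ? -x : x; }

/* In-place LU factorization with partial pivoting of a row-major n x n
   matrix a.  On return a holds U on and above the diagonal and the
   multipliers of L below it.  Returns 0 on success, -1 on a zero pivot. */
static int lu_ref(int n, double *a, int *piv)
{
    int i, j, k;

    assert(n > 0);
    for (k = 0; k < n; k++) {
        int p = k;

        for (i = k + 1; i < n; i++)          /* largest entry in column k */
            if (dabs(a[i*n + k]) > dabs(a[p*n + k]))
                p = i;
        piv[k] = p;
        if (a[p*n + k] == 0.0)
            return -1;
        for (j = 0; j < n; j++) {            /* swap rows k and p */
            double t = a[k*n + j];
            a[k*n + j] = a[p*n + j];
            a[p*n + j] = t;
        }
        for (i = k + 1; i < n; i++) {        /* eliminate below the pivot */
            a[i*n + k] /= a[k*n + k];
            for (j = k + 1; j < n; j++)
                a[i*n + j] -= a[i*n + k] * a[k*n + j];
        }
    }
    return 0;
}

/* Print the factored matrix followed by the pivot vector. */
static void print_lu(int n, const double *a, const int *piv)
{
    int i, j;

    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++)
            printf("%10.4f", a[i*n + j]);
        printf("\n");
    }
    printf("piv:");
    for (i = 0; i < n; i++)
        printf(" %d", piv[i]);
    printf("\n");
}
```

Called with n = 4 on the default input matrix, lu_ref followed by print_lu
gives the same four rows as the expected output above, then "piv: 2 3 2 3".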

The equivalent sequential algorithm is:

    for(j = 0; j < n; j += nb) {                        /* loop */
        pivot(j, n, nb, a, nca, piv);
        for(l = 0; l <= ((n/nb-j/nb)-1)-1; l++) {       /* fanout */
            int jl;

            jl = (l+1)*nb+j;
            trisolve(nb, nb, &A(j,j), nca, &A(j,jl), nca);
        }
        for(q = 0; q <= ((n/nb-j/nb)-1)-1; q++) {       /* fanout */
            for(r = 0; r <= ((n/nb-j/nb)-1)-1; r++) {   /* fanout */
                int jq, jr;

                jq = (q+1)*nb+j;
                jr = (r+1)*nb+j;
                update2(nb, nb, nb, &A(jq,j),nca, &A(j,jr),nca, &A(jq,jr),nca);
            }
        }
    }
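The loop nest above leaves out the subroutine bodies and the A() macro.  The
sketch below fills them in so the structure can be compiled and tried outside
htool.  Everything in it is an assumption: the row-major A() macro, the
argument meanings guessed from the calls above, and the bodies of pivot,
trisolve and update2 are stand-ins, not the real subroutines.  lu_blocked and
dabs are made-up names.

```c
#include <assert.h>

#define A(i, j) a[(i) * nca + (j)]      /* assumed row-major, nca columns */

static double dabs(double x) { return x < 0 ? -x : x; }

/* Assumed: factor the nb columns starting at (j,j) with partial pivoting,
   swapping full rows of a and recording each pivot row in piv. */
static void pivot(int j, int n, int nb, double *a, int nca, int *piv)
{
    int i, k, c;

    for (k = j; k < j + nb; k++) {
        int p = k;

        for (i = k + 1; i < n; i++)
            if (dabs(A(i, k)) > dabs(A(p, k)))
                p = i;
        piv[k] = p;
        for (c = 0; c < n; c++) {       /* swap rows k and p */
            double t = A(k, c);
            A(k, c) = A(p, c);
            A(p, c) = t;
        }
        for (i = k + 1; i < n; i++) {   /* update only inside the panel */
            A(i, k) /= A(k, k);
            for (c = k + 1; c < j + nb; c++)
                A(i, c) -= A(i, k) * A(k, c);
        }
    }
}

/* Assumed: solve L*X = B in place, L being m x m unit lower triangular. */
static void trisolve(int m, int k, const double *l, int ldl,
                     double *b, int ldb)
{
    int i, r, c;

    for (i = 1; i < m; i++)
        for (r = 0; r < i; r++)
            for (c = 0; c < k; c++)
                b[i*ldb + c] -= l[i*ldl + r] * b[r*ldb + c];
}

/* Assumed: C -= X*Y, with X m x k, Y k x p and C m x p. */
static void update2(int m, int k, int p, const double *x, int ldx,
                    const double *y, int ldy, double *c, int ldc)
{
    int i, r, s;

    for (i = 0; i < m; i++)
        for (s = 0; s < p; s++)
            for (r = 0; r < k; r++)
                c[i*ldc + s] -= x[i*ldx + r] * y[r*ldy + s];
}

/* Same loop structure as the sequential algorithm above. */
static void lu_blocked(int n, int nb, double *a, int nca, int *piv)
{
    int j, l, q, r;

    assert(nb > 0 && n % nb == 0);
    for (j = 0; j < n; j += nb) {
        int rest = (n - j) / nb - 1;    /* blocks right of/below the panel */

        pivot(j, n, nb, a, nca, piv);
        for (l = 0; l < rest; l++)
            trisolve(nb, nb, &A(j, j), nca, &A(j, (l + 1) * nb + j), nca);
        for (q = 0; q < rest; q++)
            for (r = 0; r < rest; r++)
                update2(nb, nb, nb, &A((q + 1) * nb + j, j), nca,
                        &A(j, (r + 1) * nb + j), nca,
                        &A((q + 1) * nb + j, (r + 1) * nb + j), nca);
    }
}
```

The bound "l < rest" is the same as "l <= ((n/nb-j/nb)-1)-1" whenever nb
divides n, which is why that constraint matters.  The two inner fanout loops
touch disjoint blocks, which is what lets the graph run them in parallel.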


