The traditional algorithm for QR factorization is based on the use of elementary Householder matrices of the general form

where v is a column vector and

is a scalar. This leads to an algorithm with very good vector performance, especially if coded to use Level 2 BLAS.
The key to developing a block form of this algorithm is to represent a product of b elementary Householder matrices of order n as a block form of a Householder matrix . This can be done in various ways. LAPACK uses the following form [68]:

where V is an n-by-n matrix whose columns are the individual vectors

associated with the Householder matrices

, and T is an upper triangular matrix of order b. Extra work is required to compute the elements of T, but once again this is compensated for by the greater speed of applying the block form. Table 3.6 summarizes results obtained with the LAPACK routine SGEQRF /DGEQRF .
-------------------------------------------------
No. of Block Values of n
processors size 100 1000
-------------------------------------------------
CONVEX C-4640 1 64 81 521
CONVEX C-4640 4 64 94 1204
CRAY C90 1 128 384 859
CRAY C90 16 128 390 7641
DEC 3000-500X Alpha 1 32 50 86
IBM POWER2 model 590 1 32 108 208
IBM RISC Sys/6000-550 1 32 30 61
SGI POWER CHALLENGE 1 64 61 190
SGI POWER CHALLENGE 4 64 39 342
-------------------------------------------------