From messina Tue Oct 29 08:57:33 1985
Received: by anl-mcs.ARPA (4.12/4.9) id AA13772; Tue, 29 Oct 85 08:57:23 cst
Date: Tue, 29 Oct 85 08:57:23 cst
From: messina (Paul Messina)
Message-Id: <8510291457.AA13772@anl-mcs.ARPA>
To: supers
Subject: NBS benchmarking
Status: R

Of possible interest:

>From welch@ames-vmsb.ARPA Sat Sep 14 13:06:05 1985
Received: from su-aimvax.arpa (su-aimvax.arpa.ARPA) by lbl-csam.ARPA ; Sat, 14 Sep 85 13:06:05 pdt
Message-Id: <8509142006.AA18366@lbl-csam.ARPA>
Received: from ames-vmsb.ARPA by su-aimvax.arpa with TCP; Sat, 14 Sep 85 13:05:46 pdt
Date: 14 Sep 85 12:53:00 PST
From: welch@ames-vmsb.ARPA
Subject: SIGBIG
To: bayboards@diablo.ARPA
Reply-To: welch@ames-vmsb.ARPA
Reply-To: "DYMOND, KEN"

NBS PARALLEL COMPUTER BENCHMARK COLLECTION

The National Bureau of Standards, since its founding, has been concerned with measurement, determining the precise values and metrics for physical phenomena. The NBS has also made significant contributions to metrology in numerous scientific and engineering disciplines. In this tradition, the MPC (Measurement for Parallel Computing) project at NBS is developing a set of metrics and measurement techniques to characterize the performance of parallel processing systems. As part of that effort, NBS is collecting benchmarks and code kernels that represent a variety of applications which are candidates for parallel processing.

NBS solicits benchmark codes and kernels from researchers and scientists. Programs which are computationally intensive, I/O intensive, vectorizable or not and from non-numeric as well as from numeric application areas are requested. Especially welcome are programs which have been used to produce timing or speedup data on parallel computers, whose measurement results have been or may be published in the technical literature, and which are in some fairly widely used and higher-level programming language such as FORTRAN, "C", LISP, Ada, etc.
Contributions or inquiries should be directed to:

    Measurement for Parallel Computing
    Institute for Computer Sciences and Technology
    Materials Building MS B364
    National Bureau of Standards
    Gaithersburg, MD 20899 USA

    Telephone: 301-921-3274
    ARPANET: MEASURE@NBS-VMS

From gustav.yktvmt%ibm-sj.csnet@CSNET-RELAY.ARPA Thu Nov 7 15:17:12 1985
Received: from CSNET-RELAY.ARPA (csnet-relay.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02489; Thu, 7 Nov 85 15:16:51 cst
Message-Id: <8511072116.AA02489@anl-mcs.ARPA>
Received: from ibm-sj by csnet-relay.csnet id ac01270; 7 Nov 85 16:07 EST
Date: Thu, 7 Nov 85 15:01:48 EST
From: "Fred G. Gustavson"
To: dongarra@anl-mcs.ARPA
Subject: reply of 11/7/85 morning phone call
Status: R

Jack:

Here is the info you want.

Routine    Time(ms)    MFLOPS
DGEF       10258261      65
DGES          44364      45
SGEF       10317723      65
SGES          43425      46
DPPF        5384767      62
DPPS          51997      38
SPPF        5158187      65
SPPS          54770      36

Best regards,
Fred

From VMAA5%KSUVM.BITNET@WISCVM.ARPA Thu Nov 7 14:55:40 1985
Received: from WISCVM.ARPA (wiscvm.wisc.edu.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02055; Thu, 7 Nov 85 14:55:23 cst
Message-Id: <8511072055.AA02055@anl-mcs.ARPA>
Received: from (MAILER)KSUVM.BITNET by WISCVM.ARPA on 11/07/85 at 14:54:33 CST
Return-Path: VMAA5%KSUVM.BITNET@WISCVM.ARPA
Received: by KSUVM (Mailer X1.20) id 6472; Thu, 07 Nov 85 14:37:38 CST
Date: Thu, 7 Nov 85 14:29 CST
From: Neil Erdwien
Subject: LINPACK Benchmarks
To: Jack J. Dongarra
Status: RO

I saw your benchmark programs using LINPACK available from NETLIB, so I tried them on Kansas State University's National Advanced System's 6630. Listings from the runs follow:

"Full" precision -- all VS FORTRAN 1.4.0 opt(3)

PLEASE SEND THE RESULTS OF THIS RUN TO:

JACK J. DONGARRA
MATHEMATICS AND COMPUTER SCIENCE DIVISION
ARGONNE NATIONAL LABORATORY
ARGONNE, ILLINOIS 60439

TELEPHONE: 312-972-7246

ARPANET: DONGARRA@ANL-MCS

NORM.
RESID RESID MACHEP X(1) X(N) 2.91923652E+00 1.29549149E-13 2.22044605E-16 1.00000000E+00 1.00000000E+00 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 DGEFA DGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 3.150E+00 1.000E-01 3.250E+00 2.113E-01 9.466E+00 5.804E+01 3.130E+00 1.000E-01 3.230E+00 2.126E-01 9.408E+00 5.768E+01 3.160E+00 1.000E-01 3.260E+00 2.106E-01 9.495E+00 5.821E+01 3.151E+00 9.700E-02 3.248E+00 2.114E-01 9.460E+00 5.800E+01 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 3.150E+00 9.000E-02 3.240E+00 2.119E-01 9.437E+00 5.786E+01 3.130E+00 9.999E-02 3.230E+00 2.126E-01 9.408E+00 5.768E+01 3.140E+00 9.999E-02 3.240E+00 2.119E-01 9.437E+00 5.786E+01 3.159E+00 9.800E-02 3.257E+00 2.108E-01 9.486E+00 5.816E+01 R; T=91.13/92.05 23:37:10 "Half" precision -- all VS FORTRAN 1.4.0 opt(3) PLEASE SEND THE RESULTS OF THIS RUN TO: JACK J. DONGARRA MATHEMATICS AND COMPUTER SCIENCE DIVISION ARGONNE NATIONAL LABORATORY ARGONNE, ILLINOIS 60439 TELEPHONE: 312-972-7246 ARPANET: DONGARRA@ANL-MCS NORM. RESID RESID MACHEP X(1) X(N) 3.97839260E+00 7.59065151E-04 9.53674316E-07 9.99704897E-01 9.99731898E-01 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 1.990E+00 6.000E-02 2.050E+00 3.350E-01 5.971E+00 3.661E+01 2.000E+00 6.000E-02 2.060E+00 3.333E-01 6.000E+00 3.679E+01 2.000E+00 7.000E-02 2.070E+00 3.317E-01 6.029E+00 3.696E+01 2.015E+00 6.100E-02 2.076E+00 3.308E-01 6.047E+00 3.707E+01 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 2.000E+00 6.000E-02 2.060E+00 3.333E-01 6.000E+00 3.679E+01 2.000E+00 6.001E-02 2.060E+00 3.333E-01 6.000E+00 3.679E+01 1.990E+00 6.999E-02 2.060E+00 3.333E-01 6.000E+00 3.679E+01 2.017E+00 6.400E-02 2.081E+00 3.300E-01 6.061E+00 3.716E+01 R; T=60.64/61.70 23:40:09 The use of the assembler IBM BLAS routines did not speed up the times; the above results are without the "coded" BLAS routines. 
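The MFLOPS, UNIT and RATIO columns in listings like the two above are derived from the timings alone. The following is a minimal sketch of that arithmetic, assuming the formulas and constants that appear in the Fortran driver listing later in this file (OPS = 2N**3/3 + 2N**2, UNIT = 2/MFLOPS, RATIO = TOTAL/CRAY with CRAY = .056). The function name `linpack_columns` and the dictionary keys are illustrative and are not part of the benchmark.

```python
# Sketch: recompute the derived columns of one LINPACK timing row.
# The formulas follow the driver listing in this file; the names used
# here (linpack_columns, the dict keys) are hypothetical.

CRAY = 0.056  # reference time in seconds, as set in the driver (CRAY = .056)


def linpack_columns(factor_s, solve_s, n=100):
    """Return TOTAL, MFLOPS, UNIT and RATIO for one timing row.

    factor_s -- time in seconds for the factorization (DGEFA/SGEFA)
    solve_s  -- time in seconds for the solve (DGESL/SGESL)
    n        -- order of the matrix
    """
    ops = (2.0 * n**3) / 3.0 + 2.0 * n**2  # operation count used by the driver
    total = factor_s + solve_s
    mflops = ops / (1.0e6 * total)
    return {
        "total": total,
        "mflops": mflops,
        "unit": 2.0 / mflops,
        "ratio": total / CRAY,
    }


# First full-precision row above: DGEFA = 3.150E+00, DGESL = 1.000E-01
row = linpack_columns(3.150, 0.100)
print("total=%.3f mflops=%.4f unit=%.3f ratio=%.2f"
      % (row["total"], row["mflops"], row["unit"], row["ratio"]))
```

Within rounding, this reproduces the first row of the full-precision table (TOTAL 3.250E+00, MFLOPS 2.113E-01, UNIT 9.466E+00, RATIO 5.804E+01), which suggests this run used the same driver constants.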
By the way, I think the NETLIB service is great -- a wonderful and useful idea.

From AG2%CORNELLA.BITNET@ucbvax.berkeley.edu Mon Nov 11 08:05:09 1985
Received: from ucbvax.berkeley.edu (ucbvax.berkeley.edu.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA28614; Mon, 11 Nov 85 08:05:03 cst
Received: by ucbvax.berkeley.edu (5.31/1.2) id AA03952; Mon, 11 Nov 85 06:02:48 PST
Received: from CORNELLA by ucbjade.Berkeley.Edu (4.19/4.40.2) id AA25387; Mon, 11 Nov 85 06:04:32 pst
Message-Id: <8511111404.AA25387@ucbjade.Berkeley.Edu>
Date: 11 November 85 09:02 EST
From: AG2%CORNELLA.BITNET@ucbvax.berkeley.edu
Subject: Small Benchmark
To: DONGARRA@anl-mcs.arpa
Status: R

Jack: a question on a previous topic. The performance guys in Endicott asked me to reconfirm a detail. Is it verboten for them to change the dimension statements (adding a dummy array) in the main calling routine for the LINPACK benchmarks - without touching SGEFA and SGESL at all (and letting you see the modified source)? Please understand that I have no personal stake in this - and that no one is petitioning for a change in your standard procedure - but they want to understand clearly what is kosher and what isn't.

Regards,
Alec

From sun!sunmark!jricotta@ucbvax.berkeley.edu Sat Nov 23 18:26:27 1985
Received: from ucbvax.berkeley.edu (ucbvax.berkeley.edu.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA26785; Sat, 23 Nov 85 18:26:21 cst
Received: by ucbvax.berkeley.edu (5.31/1.7) id AA08928; Thu, 21 Nov 85 19:29:54 PST
Received: from snail.sun.uucp by sun.uucp (3.0DEV4/SMI-2.0) id AA18847; Wed, 20 Nov 85 15:35:18 PST
Received: from sunmark.sun.uucp by snail.sun.uucp (3.0ALPHA/SMI-3.0DEV4) id AA15624; Wed, 20 Nov 85 15:35:44 PST
Return-Path:
Received: by sunmark.sun.uucp (2.0/SMI-2.0) id AA18528; Wed, 20 Nov 85 15:31:46 pst
Date: Wed, 20 Nov 85 15:31:46 pst
From: sun!sunmark!jricotta@ucbvax.berkeley.edu (Jim Ricotta)
Message-Id: <8511202331.AA18528@sunmark.sun.uucp>
To: DONGARRA@anl-mcs.arpa
Subject: Linpack Question
Status: R

Mr
Dongarra,

Do you have any Linpack results for some of the recently announced Masscomp systems? I am particularly interested in the MC5400 base system (68020 + 68881), and the 5400 plus "Lightning" FPA. All the performance claims they make for floating point are in KWhets, and this makes me somewhat suspicious. I'd like to find out how they actually stack up in the Linpack benchmark.

Their new FPA ("Lightning") is based on the Weitek 1164/65 chips, and they are rating it at ">3 MWhets". The FP-501 is rated at 924 KWhets. I know the Whetstone benchmark is algorithm dependent, and not well standardized, so I think Linpack will provide a better basis for comparison.

Any information you could offer would be greatly appreciated.

Thanks,
Jim Ricotta

From greenfield%dvinci.DEC@decwrl.DEC.COM Mon Dec 9 19:15:20 1985
Received: from decwrl.DEC.COM (decwrl.dec.com.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA00191; Mon, 9 Dec 85 19:15:10 cst
Received: from DEC-RHEA.ARPA (dec-rhea) by decwrl.DEC.COM (4.22.01/4.7.34) id AA08779; Mon, 9 Dec 85 16:54:44 pst
Message-Id: <8512100054.AA08779@decwrl.DEC.COM>
Date: Monday, 9 Dec 1985 16:49:09-PST
From: greenfield%dvinci.DEC@decwrl.DEC.COM (Mike Greenfield mr3-1/e13,dtn297-7481 )
To: dongarra@anl-mcs
Subject: revise linpack results for 8600 and 8650 cpus
Status: RO

Jack;

A few little changes to the 8600/blas results and some new 8650 numbers. I think that these are in a decus presentation that will be given this week.

regards,
-Mike

(Mflops are correct - other ratios have a new formula, so they need to be recomputed)

Solving a system of linear equations with LINPACK in full precision.
Computer      Compiler            Ratio   MFLOPS   Time   Unit
                                                   secs   secs
VAX 8650      VMS (coded BLAS)            .96
VAX 8650      VMS                         .704
VAX 8600      VMS (coded BLAS)            .660
VAX 8600      VMS                         .486
VAX 785       VMS (coded BLAS)            .225
VAX 785       VMS                         .196
VAX 780       VMS (coded BLAS)            .166
uVAX II       VMS (coded BLAS)            .156
uVAX II FG    VMS (coded BLAS)            .151
VAX 750       VMS (coded BLAS)            .148
VAX 780       VMS                         .138
VAX 750       VMS                         .124
uVAX II       VMS                         .126
uVAX II FG    VMS                         .119
VAX 725       VMS (coded BLAS)            .043
VAX 725 FG    VMS (coded BLAS)            .038
VAX 725       VMS                         .037
VAX 725 FG    VMS                         .033

Note: FG = floating_G processing.

======================================================================

Solving a System of Linear Equations with LINPACK in Single Precision.

Computer      Compiler            Ratio   MFLOPS   Time   Unit
                                                   secs   secs
VAX 8650      VMS (coded BLAS)            1.9
VAX 8600      VMS (coded BLAS)            1.26
VAX 8650      VMS                         1.25
VAX 8600      VMS                         .88
VAX 785       VMS (coded BLAS)            .511
VAX 785       VMS                         .398
VAX 780       VMS (coded BLAS)            .339
VAX 780       VMS                         .250
VAX 750       VMS (coded BLAS)            .242
uVAX II       VMS (coded BLAS)            .227
VAX 750       VMS                         .183
uVAX II       VMS                         .174
VAX 725       VMS (coded BLAS)            .066
VAX 725       VMS                         .052

From RKUJW%DKCCRE01.BITNET@WISCVM.WISC.EDU Mon Jan 6 13:42:20 1986
Received: from WISCVM.WISC.EDU (wiscvm.wisc.edu.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA19870; Mon, 6 Jan 86 13:42:06 cst
Message-Id: <8601061942.AA19870@anl-mcs.ARPA>
Received: from (DKCCRE01)DKUCCC11.BITNET by WISCVM.WISC.EDU on 01/06/86 at 13:42:12 CST
Date: 05 jan 86 at 14:20:10 DNT
From: RKUJW%DKCCRE01.BITNET@WISCVM.WISC.EDU
To: dongarra@anl-mcs.arpa
Subject: Sperry 1100/82 and 1100/92 Linpack results.
Status: RO

***************************************************
**** ****
*** Dongarra Linpack Test Program ***

SPERRY 1100 series, ASCII FORTRAN (FTN), Single Precision
1100/92 without SAM, FTN ZEO opt, Rolled BLAS.

NORM.
RESID RESID MACHEP X(1) X(N) 2.53175148+000 7.53998756-006 1.49011612-008 9.99996811-001 9.99997862-001 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 2.404-001 7.800-003 2.482-001 2.767+000 7.229-001 4.432+000 2.406-001 7.600-003 2.482-001 2.767+000 7.229-001 4.432+000 2.404-001 7.800-003 2.482-001 2.767+000 7.229-001 4.432+000 2.406-001 7.660-003 2.483-001 2.766+000 7.231-001 4.433+000 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 2.404-001 7.800-003 2.482-001 2.767+000 7.229-001 4.432+000 2.406-001 7.600-003 2.482-001 2.767+000 7.229-001 4.432+000 2.404-001 7.600-003 2.480-001 2.769+000 7.223-001 4.429+000 2.406-001 7.660-003 2.483-001 2.766+000 7.231-001 4.433+000 .--. TOTAL SUPS CPU SUPS I/O SUPS ER/CC SUPS CORE USED 10.28 7.28 .02 2.99 98 K **** **** *************************************************** **** **** *** Dongarra Linpack Test Program *** SPERRY 1100 series, ASCII FORTRAN (FTN), Double Precision 1100/92 without SAM, FTN ZEO opt, Rolled BLAS. NORM. RESID RESID MACHEP X(1) X(N) 9.78186593-001 6.78276879-016 3.46944695-018 1.00000000+000 1.00000000+000 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 DGEFA DGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 3.630-001 1.140-002 3.744-001 1.834+000 1.090+000 6.686+000 3.630-001 1.140-002 3.744-001 1.834+000 1.090+000 6.686+000 3.630-001 1.140-002 3.744-001 1.834+000 1.090+000 6.686+000 3.630-001 1.138-002 3.744-001 1.834+000 1.091+000 6.686+000 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 3.630-001 1.140-002 3.744-001 1.834+000 1.090+000 6.686+000 3.630-001 1.140-002 3.744-001 1.834+000 1.090+000 6.686+000 3.630-001 1.140-002 3.744-001 1.834+000 1.090+000 6.686+000 3.630-001 1.138-002 3.744-001 1.834+000 1.090+000 6.686+000 .--. 
TOTAL SUPS CPU SUPS I/O SUPS ER/CC SUPS CORE USED 16.31 10.79 .02 5.50 177 K **** **** *************************************************** **** **** *** Dongarra Linpack Test Program *** SPERRY 1100 series, ASCII FORTRAN (FTN), Single Precision 1100/82 with SAM, FTN ZEO opt, Rolled BLAS. NORM. RESID RESID MACHEP X(1) X(N) 2.53175148+000 7.53998756-006 1.49011612-008 9.99996811-001 9.99997862-001 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 8.056-001 2.540-002 8.310-001 8.263-001 2.420+000 1.484+001 8.052-001 2.520-002 8.304-001 8.269-001 2.419+000 1.483+001 8.046-001 2.540-002 8.300-001 8.273-001 2.417+000 1.482+001 8.056-001 2.540-002 8.310-001 8.264-001 2.420+000 1.484+001 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 8.070-001 2.520-002 8.322-001 8.251-001 2.424+000 1.486+001 8.054-001 2.520-002 8.306-001 8.267-001 2.419+000 1.483+001 8.070-001 2.560-002 8.326-001 8.247-001 2.425+000 1.487+001 8.062-001 2.524-002 8.314-001 8.259-001 2.422+000 1.485+001 .--. TOTAL SUPS CPU SUPS I/O SUPS ER/CC SUPS CORE USED 27.17 24.11 .04 3.02 98 K **** **** *************************************************** **** **** *** Dongarra Linpack Test Program *** SPERRY 1100 series, ASCII FORTRAN (FTN), Double Precision 1100/82 with SAM, FTN ZEO opt, Rolled BLAS. NORM. 
RESID RESID MACHEP X(1) X(N) 9.78186593-001 6.78276879-016 3.46944695-018 1.00000000+000 1.00000000+000 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 DGEFA DGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 1.148+000 3.640-002 1.184+000 5.799-001 3.449+000 2.115+001 1.148+000 3.620-002 1.184+000 5.800-001 3.449+000 2.114+001 1.148+000 3.620-002 1.184+000 5.800-001 3.449+000 2.114+001 1.148+000 3.624-002 1.184+000 5.798-001 3.449+000 2.115+001 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 1.149+000 3.620-002 1.185+000 5.793-001 3.453+000 2.117+001 1.149+000 3.620-002 1.185+000 5.793-001 3.453+000 2.117+001 1.149+000 3.640-002 1.186+000 5.792-001 3.453+000 2.117+001 1.149+000 3.624-002 1.185+000 5.793-001 3.453+000 2.117+001 .--. TOTAL SUPS CPU SUPS I/O SUPS ER/CC SUPS CORE USED 39.54 33.97 .04 5.54 177 K **** **** *************************************************** Are you interested also in the results of the Sperry AP? J.Wasniewski, RECKU, Copenhagen. From maddog!sequent!rand@lll-crg.ARPA Wed Jan 8 17:27:11 1986 Received: from lll-crg.ARPA (lll-crg.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA25729; Wed, 8 Jan 86 17:26:46 cst Received: by lll-crg.ARPA id AA29194; Wed, 8 Jan 86 15:27:45 pst id AA29194; Wed, 8 Jan 86 15:27:45 pst Received: by maddog.uucp (4.12/3.14) id AA00315; Wed, 8 Jan 86 15:12:44 pst Message-Id: <8601082312.AA00315@maddog.uucp> Date: Wed, 8 Jan 86 14:01:41 pst From: maddog!sequent!rand@lll-crg.ARPA (Randall H. Dow) To: dongarra@anl-mcs.ARPA Subject: Latest LINPACK Results Cc: gf@lll-crg.ARPA Status: RO Jack: Here are our latest results. As you can see, our latest optimizing compiler gives us about 15-20% performance improvement. 
Rand Dow ****************************************************************************** LINPACK in Full Precision Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4 ****************************************************************************** + dlp Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. resid resid machep x(1) x(n) 1.67117300E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 1.132E+01 3.667E-01 1.168E+01 5.877E-02 3.403E+01 2.086E+02 1.130E+01 3.500E-01 1.165E+01 5.894E-02 3.393E+01 2.080E+02 1.130E+01 3.333E-01 1.163E+01 5.903E-02 3.388E+01 2.077E+02 1.132E+01 3.467E-01 1.167E+01 5.885E-02 3.399E+01 2.084E+02 times for array with leading dimension of 200 1.132E+01 3.500E-01 1.167E+01 5.886E-02 3.398E+01 2.083E+02 1.132E+01 3.500E-01 1.167E+01 5.886E-02 3.398E+01 2.083E+02 1.137E+01 3.500E-01 1.172E+01 5.861E-02 3.413E+01 2.092E+02 1.132E+01 3.467E-01 1.167E+01 5.883E-02 3.400E+01 2.084E+02 Programmed STOP ****************************************************************************** LINPACK in Full Precision, Coded BLAS Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4 (coded BLAS) ****************************************************************************** + dlp.as Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.67117300E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 1.007E+01 3.167E-01 1.038E+01 6.613E-02 3.024E+01 1.854E+02 1.005E+01 3.167E-01 1.037E+01 6.624E-02 3.019E+01 1.851E+02 1.007E+01 3.167E-01 1.038E+01 6.613E-02 3.024E+01 1.854E+02 1.007E+01 3.100E-01 1.038E+01 6.616E-02 3.023E+01 1.853E+02 Programmed STOP ****************************************************************************** Caveat on the "Compiler Directive". This is essentially coded via the mechanism I reported in my "Parallel LINPACK Case Study". We are just a couple of months away from actually having the compiler directive working so that *ALL* of the handcoding that I did would be automatically generated by the compiler with one compiler directive comment. ****************************************************************************** ****************************************************************************** "Compiler Directive" to access multiprocessing, Full Precision Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 ****************************************************************************** + dlps -P30 CPUs: 30 Use: 30 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.67117300E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 9.033E+00 3.500E-01 9.383E+00 7.318E-02 2.733E+01 1.676E+02 8.333E-01 3.500E-01 1.183E+00 5.803E-01 3.447E+00 2.113E+01 8.667E-01 3.500E-01 1.217E+00 5.644E-01 3.544E+00 2.173E+01 8.617E-01 3.383E-01 1.200E+00 5.722E-01 3.495E+00 2.143E+01 times for array with leading dimension of 200 8.933E+00 3.333E-01 9.267E+00 7.410E-02 2.699E+01 1.655E+02 8.333E-01 3.333E-01 1.167E+00 5.886E-01 3.398E+00 2.083E+01 8.667E-01 3.333E-01 1.200E+00 5.722E-01 3.495E+00 2.143E+01 8.567E-01 3.400E-01 1.197E+00 5.738E-01 3.485E+00 2.137E+01 Programmed STOP ****************************************************************************** "Compiler Directive" to access multiprocessing, Full Precision, Coded BLAS Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 (coded BLAS) ****************************************************************************** + dlps.as -P30 CPUs: 30 Use: 30 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.67117300E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 9.083E+00 3.000E-01 9.383E+00 7.318E-02 2.733E+01 1.676E+02 8.167E-01 3.000E-01 1.117E+00 6.149E-01 3.252E+00 1.994E+01 8.000E-01 3.167E-01 1.117E+00 6.149E-01 3.252E+00 1.994E+01 8.100E-01 3.100E-01 1.120E+00 6.131E-01 3.262E+00 2.000E+01 times for array with leading dimension of 200 8.917E+00 3.333E-01 9.250E+00 7.423E-02 2.694E+01 1.652E+02 8.167E-01 3.167E-01 1.133E+00 6.059E-01 3.301E+00 2.024E+01 8.000E-01 3.167E-01 1.117E+00 6.149E-01 3.252E+00 1.994E+01 8.117E-01 3.117E-01 1.123E+00 6.113E-01 3.272E+00 2.006E+01 Programmed STOP ****************************************************************************** LINPACK in Half Precision Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4 ****************************************************************************** + slp Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.59605300E+00 3.80277600E-05 1.19209300E-07 9.99986200E-01 9.99992500E-01 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 8.833E+00 2.833E-01 9.117E+00 7.532E-02 2.655E+01 1.628E+02 8.817E+00 2.667E-01 9.083E+00 7.560E-02 2.646E+01 1.622E+02 8.833E+00 2.667E-01 9.100E+00 7.546E-02 2.650E+01 1.625E+02 8.833E+00 2.750E-01 9.108E+00 7.539E-02 2.653E+01 1.626E+02 times for array with leading dimension of 200 8.850E+00 2.667E-01 9.117E+00 7.532E-02 2.655E+01 1.628E+02 8.833E+00 2.833E-01 9.117E+00 7.532E-02 2.655E+01 1.628E+02 8.833E+00 2.667E-01 9.100E+00 7.546E-02 2.650E+01 1.625E+02 8.832E+00 2.750E-01 9.107E+00 7.540E-02 2.652E+01 1.626E+02 Programmed STOP ****************************************************************************** LINPACK in Half Precision, coded BLAS Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4 (coded BLAS) ****************************************************************************** + slp.as Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.59605300E+00 3.80277600E-05 1.19209300E-07 9.99986200E-01 9.99992500E-01 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 8.050E+00 2.500E-01 8.300E+00 8.273E-02 2.417E+01 1.482E+02 8.067E+00 2.500E-01 8.317E+00 8.257E-02 2.422E+01 1.485E+02 8.067E+00 2.500E-01 8.317E+00 8.257E-02 2.422E+01 1.485E+02 8.062E+00 2.517E-01 8.313E+00 8.260E-02 2.421E+01 1.485E+02 times for array with leading dimension of 200 8.050E+00 2.667E-01 8.317E+00 8.257E-02 2.422E+01 1.485E+02 8.067E+00 2.500E-01 8.317E+00 8.257E-02 2.422E+01 1.485E+02 8.050E+00 2.500E-01 8.300E+00 8.273E-02 2.417E+01 1.482E+02 8.058E+00 2.500E-01 8.308E+00 8.265E-02 2.420E+01 1.484E+02 Programmed STOP ****************************************************************************** No entry yet for the matrix-vector program, order 300 ****************************************************************************** ****************************************************************************** Note: for all of the order 1000 runs, the first pass requires page faulting of the data. After that, the pages remain valid. This explains, I believe, the lower result on the first pass of each array. ****************************************************************************** ****************************************************************************** Single precision, order 1000 Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 ****************************************************************************** + slps.1000 -P30 CPUs: 30 Use: 30 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.13179700E+01 2.70068600E-03 1.19209300E-07 1.00016200E+00 9.99933500E-01 times are reported for matrices of order 1000 sgefa sgesl total mflops unit ratio times for array with leading dimension of1001 4.211E+02 2.382E+01 4.449E+02 1.503E+00 1.331E+00 7.945E+03 2.960E+02 2.377E+01 3.198E+02 2.091E+00 9.566E-01 5.711E+03 2.961E+02 2.382E+01 3.199E+02 2.090E+00 9.569E-01 5.713E+03 2.963E+02 2.435E+01 3.206E+02 2.086E+00 9.589E-01 5.725E+03 times for array with leading dimension of1000 4.184E+02 2.382E+01 4.422E+02 1.512E+00 1.323E+00 7.896E+03 2.957E+02 2.388E+01 3.196E+02 2.092E+00 9.559E-01 5.707E+03 2.956E+02 2.402E+01 3.196E+02 2.092E+00 9.560E-01 5.707E+03 2.964E+02 2.435E+01 3.208E+02 2.085E+00 9.594E-01 5.728E+03 Programmed STOP ****************************************************************************** Single precision, order 1000, coded BLAS Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 (coded BLAS) ****************************************************************************** + slps.1000.as -P30 CPUs: 30 Use: 30 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n)
1.13179700E+01 2.70068600E-03 1.19209300E-07 1.00016200E+00 9.99933500E-01

times are reported for matrices of order 1000
sgefa sgesl total mflops unit ratio
times for array with leading dimension of1001
4.003E+02 2.282E+01 4.231E+02 1.581E+00 1.265E+00 7.555E+03
2.762E+02 2.287E+01 2.991E+02 2.236E+00 8.946E-01 5.341E+03
2.763E+02 2.283E+01 2.991E+02 2.235E+00 8.947E-01 5.341E+03
2.763E+02 2.285E+01 2.991E+02 2.236E+00 8.946E-01 5.341E+03
times for array with leading dimension of1000
3.984E+02 2.263E+01 4.210E+02 1.588E+00 1.259E+00 7.518E+03
2.744E+02 2.273E+01 2.972E+02 2.250E+00 8.888E-01 5.307E+03
2.744E+02 2.270E+01 2.971E+02 2.251E+00 8.886E-01 5.305E+03
2.744E+02 2.268E+01 2.971E+02 2.251E+00 8.885E-01 5.305E+03
Programmed STOP

******************************************************************************
Caveat on the Full Precision, order 1000. The standard driver program requires about 16 MB of memory. Since we only have 16 MB of virtual space it wouldn't fit. I commented out the second half of the run, and the second set of arrays. This should give the same results.
******************************************************************************

******************************************************************************
Double Precision, order 1000
Sequent Balance 86000 (30 processors)
DYNIX Fortran 2.4.4
******************************************************************************
+ dlps.1000 -P30
CPUs: 30 Use: 30

Please send the results of this run to:

Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439

Telephone: 312-972-7246

ARPAnet: DONGARRA@ANL-MCS

norm.
resid resid machep x(1) x(n) 9.50387011E+00 4.22017976E-12 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 1000 dgefa dgesl total mflops unit ratio times for array with leading dimension of1001 6.653E+02 3.228E+01 6.976E+02 9.585E-01 2.087E+00 1.246E+04 4.468E+02 3.230E+01 4.791E+02 1.396E+00 1.433E+00 8.556E+03 4.468E+02 3.228E+01 4.791E+02 1.396E+00 1.433E+00 8.554E+03 4.456E+02 3.233E+01 4.779E+02 1.399E+00 1.429E+00 8.534E+03 Programmed STOP ****************************************************************************** Double Precision, order 1000, coded BLAS Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 (coded BLAS) ****************************************************************************** + dlps.1000.as -P30 CPUs: 30 Use: 30 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n)
9.50387011E+00 4.22017976E-12 2.22044605E-16 1.00000000E+00 1.00000000E+00

times are reported for matrices of order 1000
dgefa dgesl total mflops unit ratio
times for array with leading dimension of1001
6.337E+02 2.847E+01 6.622E+02 1.010E+00 1.981E+00 1.182E+04
4.168E+02 2.842E+01 4.452E+02 1.502E+00 1.332E+00 7.949E+03
4.170E+02 2.838E+01 4.453E+02 1.501E+00 1.332E+00 7.952E+03
4.173E+02 2.844E+01 4.457E+02 1.500E+00 1.333E+00 7.959E+03
Programmed STOP

From unido!ztivax!schnepf@seismo.CSS.GOV Wed Jan 29 08:09:23 1986
Received: from seismo.CSS.GOV (seismo.css.gov.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA03238; Wed, 29 Jan 86 08:08:33 cst
Return-Path:
Received: from unido.UUCP by seismo.CSS.GOV with UUCP; Wed, 29 Jan 86 09:01:18 EST
From: unido!ztivax!schnepf@seismo.CSS.GOV
Received: by unido.uucp with uucp; Wed, 29 Jan 86 14:35:16 -0100
Received: by ztivax.UUCP (4.12/4.8) id AA13233; Wed, 29 Jan 86 14:30:14 GMT
Date: Wed, 29 Jan 86 14:30:14 GMT
Message-Id: <8601291430.AA13233@ztivax.UUCP>
To: dongarra@anl-mcs.arpa
Status: R

Dear Jack Dongarra,

I have tested the LINPACK benchmark on our Siemens/Fujitsu VP 200. I fetched the program from netlib and compared the results with your report, which I also fetched from netlib. I could make some improvements in the performance by replacing the BLAS routines by inline FORTRAN code and by inserting some compiler directives. Finally I reached about 33 MFLOPS on the VP 200 (cycle time 15 nsec.).

I think it might be interesting for you to see the output of the compiler and the results for full precision (64 bit) and half precision (32 bit). It is worth mentioning that no assembler has been used.

Yours,

Eric Schnepf
Siemens AG, Vector Processor Systems
Munich, Germany
Arpanet: Na.schnepf at su-score


J E S  J O B  L O G  --  S Y S T E M  M S P A  --  N O -

13.25.23 JOB 9708  KDS70001I ZT0118 LAST ACCESS AT 13:24:09 ON 86.029
13.25.23 JOB 9708  JEM373I ZT0118L STARTED - INIT 2 - CLASS A - SYS MSPA
13.25.24 JOB 9708  JDJ403I ZT0118L - STARTED - TIME=13.25.24
13.26.47 JOB 9708  JEM397I ZT0118L RE-ENQUEUED
13.26.47 JOB 9708  KDS70001I ZT0118 LAST ACCESS AT 13:25:23 ON 86.029
13.26.47 JOB 9708  JEM373I ZT0118L STARTED - INIT 2 - CLASS A - SYS VSP1
13.26.49 JOB 9708  JEM395I ZT0118L ENDED

E20 V10L20   <<< JCL STATEMENTS LIST >>>   DATE 01/29/86 T

    1  //ZT0118L JOB G0130000,
       //        ZT0118,          **JOB STATEMENT GENERATED BY SUBMIT**
       //        NOTIFY=ZT0118,USER=ZT0118,
       //        PASSWORD=,
       //        MSGLEVEL=(1,1)
       *** VP - LAUF *
       ***-----------------------------------------------------------
    2  //FORT EXEC PGM=JZK@FORT,REGION=3072K,
       //        PARM=('VP(200),VMSG(DETAIL),AD(DBL)'),
       //        SYSTEM=MSPA
    3  //SYSIN DD *
    4  //STEPLIB DD DSN=SYS4.LINKLIB,DISP=SHR
    5  //        DD DSN=SYS4.LPALIB,DISP=SHR
    6  //SYSLIN DD DSN=&&LOADSET,UNIT=SYSSQ,SPACE=(TRK,(10,5),RLSE),
       //        DISP=(MOD,PASS),DCB=BLKSIZE=23440
    7  //SYSPRINT DD SYSOUT=*
       ***-------------------------------------------------------------------
    8  //LKED EXEC PGM=LINKEDIT,REGION=2048K,COND=(4,LT),
       //        PARM='SIZE(1024K,512K)',
       //        SYSTEM=MSPA
    9  //SYSUT1 DD UNIT=SYSDA,SPACE=(TRK,(10,10))
   10  //SYSLIB DD DSN=SYS4.FORTLIB,DISP=SHR
   11  //        DD DSN=SYS4.LINKLIB,DISP=SHR
   12  //SYSLMOD DD DSN=&&GOSET(MAIN),UNIT=SYSDA,DISP=(,PASS),
       //        SPACE=(TRK,(10,10,1),RLSE),DCB=BLKSIZE=23476
   13  //SYSLIN DD DSN=&&LOADSET,DISP=(OLD,DELETE)
   14  //SYSPRINT DD SYSOUT=*
       ***------------------------------------------------------------------
   15  //GO EXEC PGM=*.LKED.SYSLMOD,COND=(4,LT),
       //        SYSTEM=VSP1
   16  //STEPLIB DD DSN=SYS4.LINKLIB,DISP=SHR
   17  //SUBSYS DD SUBSYS=(VPCS,'SIZE=(200K,800K)')
   18  //FT06F001 DD SYSOUT=*

<<< SYSTEM MESSAGES LIST >>>

KDS70001I ZT0118 LAST ACCESS AT 13:24:09 ON 86.029
JDJ142I ZT0118L FORT - STEP WAS EXECUTED - COND CODE 0000
JDJ373I STEP/FORT / START 86029.1325
JDJ374I STEP/FORT / STOP 86029.1326 CPU 0MIN 01.54SEC SRB 0MIN
JDJ142I ZT0118L LKED - STEP WAS EXECUTED - COND CODE 0000
JDJ373I STEP/LKED / START 86029.1326
JDJ374I STEP/LKED / STOP 86029.1326 CPU 0MIN 00.19SEC SRB 0MIN
KDS70001I ZT0118 LAST ACCESS AT 13:25:23 ON 86.029
JDJ142I ZT0118L GO - STEP WAS EXECUTED - COND CODE 0000
JDJ373I STEP/GO / START 86029.1326
JDJ374I STEP/GO / STOP 86029.1326 CPU 0MIN 00.87SEC SRB 0MIN
ACTR001 STEP/GO / STOP CPU 0000MIN 00.87SEC VU 0000MIN
ACTR003 JOB/GO I/O-NUM 000103 I/O-RATE 0001
JDJ375I JOB/ZT0118L / START 86029.1325
JDJ376I JOB/ZT0118L / STOP 86029.1326 CPU 0MIN 02.60SEC SRB 0MIN
ACTR002 JOB/ZT0118L / STOP CPU 0000MIN 02.60SEC VU 0000MIN
ACTR003 JOB/ZT0118L I/O-NUM 000111

FORTRAN 77/VP V10L20   MAIN   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000001          REAL AA(200,200),A(201,200),B(200),X(200)
000002          REAL TIME(8,6),CRAY,OPS,TOTAL,NORMA,NORMX
000003          REAL RESID,RESIDN,EPS,EPSLON
000004          INTEGER IPVT(200)
000005          LDA = 201
000006          LDAA = 200
          C
000007          N = 100
000008          CRAY = .056
000009          WRITE(6,1)
000010        1 FORMAT(' PLEASE SEND THE RESULTS OF THIS RUN TO:'//
               $ ' JACK J. DONGARRA'/
               $ ' MATHEMATICS AND COMPUTER SCIENCE DIVISION'/
               $ ' ARGONNE NATIONAL LABORATORY'/
               $ ' ARGONNE, ILLINOIS 60439'//
               $ ' TELEPHONE: 312-972-7246'//
               $ ' ARPANET: DONGARRA@ANL-MCS'/)
000011          OPS = (2.0E0*N**3)/3.0E0 + 2.0E0*N**2
          C
000012          CALL MATGEN(A,LDA,N,B,NORMA)
000013          T1 = SECOND()
000014          CALL SGEFA(A,LDA,N,IPVT,INFO)
000015          TIME(1,1) = SECOND() - T1
000016          T1 = SECOND()
000017          CALL SGESL(A,LDA,N,IPVT,B,0)
000018          TIME(1,2) = SECOND() - T1
000019          TOTAL = TIME(1,1) + TIME(1,2)
          C
          C     COMPUTE A RESIDUAL TO VERIFY RESULTS.
          C
000020 V        DO 10 I = 1,N
000021 V           X(I) = B(I)
000022 V     10 CONTINUE
000023          CALL MATGEN(A,LDA,N,B,NORMA)
000024 V        DO 20 I = 1,N
000025 V           B(I) = -B(I)
000026 V     20 CONTINUE
000027          CALL SMXPY(N,B,N,LDA,X,A)
000028          RESID = 0.0
000029          NORMX = 0.0
000030 V        DO 30 I = 1,N
000031 V           RESID = AMAX1( RESID, ABS(B(I)) )
000032 V           NORMX = AMAX1( NORMX, ABS(X(I)) )
000033 V     30 CONTINUE
000034          EPS = EPSLON(1.0)
000035          RESIDN = RESID/( N*NORMA*NORMX*EPS )
000036          WRITE(6,40)
000037       40 FORMAT(' NORM. RESID RESID MACHEP',
               $ ' X(1) X(N)')
000038          WRITE(6,50) RESIDN,RESID,EPS,X(1),X(N)
000039       50 FORMAT(1P5E16.8)
          C
000040          WRITE(6,60) N
000041       60 FORMAT(//' TIMES ARE REPORTED FOR MATRICES OF ORDER ',I5)
000042          WRITE(6,70)
000043       70 FORMAT(6X,'SGEFA',6X,'SGESL',6X,'TOTAL',5X,'MFLOPS',7X,'UNIT'
               $ 6X,'RATIO')
          C
000044          TIME(1,3) = TOTAL
000045          TIME(1,4) = OPS/(1.0E6*TOTAL)
000046          TIME(1,5) = 2.0E0/TIME(1,4)
000047          TIME(1,6) = TOTAL/CRAY
000048          WRITE(6,80) LDA
000049       80 FORMAT(' TIMES FOR ARRAY WITH LEADING DIMENSION OF',I4)
000050          WRITE(6,110) (TIME(1,I),I=1,6)
          C
000051          CALL MATGEN(A,LDA,N,B,NORMA)
000052          T1 = SECOND()
000053          CALL SGEFA(A,LDA,N,IPVT,INFO)
000054          TIME(2,1) = SECOND() - T1
000055          T1 = SECOND()
000056          CALL SGESL(A,LDA,N,IPVT,B,0)
000057          TIME(2,2) = SECOND() - T1
000058          TOTAL = TIME(2,1) + TIME(2,2)
000059          TIME(2,3) = TOTAL
000060          TIME(2,4) = OPS/(1.0E6*TOTAL)
000061          TIME(2,5) = 2.0E0/TIME(2,4)
000062          TIME(2,6) = TOTAL/CRAY
          C
000063          CALL MATGEN(A,LDA,N,B,NORMA)
000064          T1 = SECOND()
000065          CALL SGEFA(A,LDA,N,IPVT,INFO)
000066          TIME(3,1) = SECOND() - T1
000067          T1 = SECOND()
000068          CALL SGESL(A,LDA,N,IPVT,B,0)
000069          TIME(3,2) = SECOND() - T1
000070          TOTAL = TIME(3,1) + TIME(3,2)
000071          TIME(3,3) = TOTAL
000072          TIME(3,4) = OPS/(1.0E6*TOTAL)
000073          TIME(3,5) = 2.0E0/TIME(3,4)
000074          TIME(3,6) = TOTAL/CRAY
          C
000075          NTIMES = 10
000076          TM2 = 0
000077          T1 = SECOND()
000078          DO 90 I = 1,NTIMES
000079             TM = SECOND()
000080             CALL MATGEN(A,LDA,N,B,NORMA)
000081             TM2 = TM2 + SECOND() - TM
000082             CALL SGEFA(A,LDA,N,IPVT,INFO)
000083       90 CONTINUE
000084          TIME(4,1) = (SECOND() - T1 - TM2)/NTIMES
000085          T1 = SECOND()
000086          DO 100 I = 1,NTIMES
000087             CALL SGESL(A,LDA,N,IPVT,B,0)
000088      100 CONTINUE
000089          TIME(4,2) = (SECOND() - T1)/NTIMES
000090          TOTAL = TIME(4,1) + TIME(4,2)
000091          TIME(4,3) = TOTAL
000092          TIME(4,4) = OPS/(1.0E6*TOTAL)
000093          TIME(4,5) = 2.0E0/TIME(4,4)
000094          TIME(4,6) = TOTAL/CRAY
          C
000095          WRITE(6,110) (TIME(2,I),I=1,6)
000096          WRITE(6,110) (TIME(3,I),I=1,6)
000097          WRITE(6,110) (TIME(4,I),I=1,6)
000098      110 FORMAT(6(1PE11.3))
          C
000099          CALL MATGEN(AA,LDAA,N,B,NORMA)
000100          T1 = SECOND()
000101          CALL SGEFA(AA,LDAA,N,IPVT,INFO)
000102          TIME(5,1) = SECOND() - T1
000103          T1 = SECOND()
000104          CALL SGESL(AA,LDAA,N,IPVT,B,0)
000105          TIME(5,2) = SECOND() - T1
000106          TOTAL = TIME(5,1) + TIME(5,2)
000107          TIME(5,3) = TOTAL
000108          TIME(5,4) = OPS/(1.0E6*TOTAL)
000109          TIME(5,5) = 2.0E0/TIME(5,4)
000110          TIME(5,6) = TOTAL/CRAY
          C
000111          CALL MATGEN(AA,LDAA,N,B,NORMA)
000112          T1 = SECOND()
000113          CALL SGEFA(AA,LDAA,N,IPVT,INFO)
000114          TIME(6,1) = SECOND() - T1
000115          T1 = SECOND()
000116          CALL SGESL(AA,LDAA,N,IPVT,B,0)
000117          TIME(6,2) = SECOND() - T1
000118          TOTAL = TIME(6,1) + TIME(6,2)
000119          TIME(6,3) = TOTAL
000120          TIME(6,4) = OPS/(1.0E6*TOTAL)
000121          TIME(6,5) = 2.0E0/TIME(6,4)
000122          TIME(6,6) = TOTAL/CRAY
          C
000123          CALL MATGEN(AA,LDAA,N,B,NORMA)
000124          T1 = SECOND()
000125          CALL SGEFA(AA,LDAA,N,IPVT,INFO)
000126          TIME(7,1) = SECOND() - T1
000127          T1 = SECOND()
000128          CALL SGESL(AA,LDAA,N,IPVT,B,0)
000129          TIME(7,2) = SECOND() - T1
000130          TOTAL = TIME(7,1) + TIME(7,2)
000131          TIME(7,3) = TOTAL
000132          TIME(7,4) = OPS/(1.0E6*TOTAL)
000133          TIME(7,5) = 2.0E0/TIME(7,4)
000134          TIME(7,6) = TOTAL/CRAY
          C
000135          NTIMES = 10
000136          TM2 = 0
000137          T1 = SECOND()
000138          DO 120 I = 1,NTIMES
000139             TM = SECOND()
000140             CALL MATGEN(AA,LDAA,N,B,NORMA)
000141             TM2 = TM2 + SECOND() - TM
000142             CALL SGEFA(AA,LDAA,N,IPVT,INFO)
000143      120 CONTINUE
000144          TIME(8,1) = (SECOND() - T1 - TM2)/NTIMES
000145          T1 = SECOND()
000146          DO 130 I = 1,NTIMES
000147             CALL SGESL(AA,LDAA,N,IPVT,B,0)
000148      130 CONTINUE
000149          TIME(8,2) = (SECOND() - T1)/NTIMES
000150          TOTAL = TIME(8,1) + TIME(8,2)
000151          TIME(8,3) = TOTAL
000152          TIME(8,4) = OPS/(1.0E6*TOTAL)
000153          TIME(8,5) = 2.0E0/TIME(8,4)
000154          TIME(8,6) = TOTAL/CRAY
          C
000155          WRITE(6,140) LDAA
000156      140 FORMAT(/' TIMES FOR ARRAY WITH LEADING DIMENSION OF',I4)
000157          WRITE(6,110) (TIME(5,I),I=1,6)
000158          WRITE(6,110) (TIME(6,I),I=1,6)
000159          WRITE(6,110) (TIME(7,I),I=1,6)
000160          WRITE(6,110) (TIME(8,I),I=1,6)
000161          STOP
000162          END

FORTRAN 77/VP VECTORIZATION MESSAGES: PROGRAM NAME(MAIN )

JND201I-I ISN:00000021 - 00000022  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000025 - 00000026  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000031 - 00000033  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND267I-I ISN:00000078             THIS DO LOOP IS NOT VECTORIZED SINCE THE DO V
                                   INDUCTION VARIABLE IS NOT USED.
JND228I-I ISN:00000079             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND228I-I ISN:00000080             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND228I-I ISN:00000081             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND228I-I ISN:00000082             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND267I-I ISN:00000086             THIS DO LOOP IS NOT VECTORIZED SINCE THE DO V
                                   INDUCTION VARIABLE IS NOT USED.
JND228I-I ISN:00000087             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND267I-I ISN:00000138             THIS DO LOOP IS NOT VECTORIZED SINCE THE DO V
                                   INDUCTION VARIABLE IS NOT USED.
JND228I-I ISN:00000139             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND228I-I ISN:00000140             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND228I-I ISN:00000141             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND228I-I ISN:00000142             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO
JND267I-I ISN:00000146             THIS DO LOOP IS NOT VECTORIZED SINCE THE DO V
                                   INDUCTION VARIABLE IS NOT USED.
JND228I-I ISN:00000147             THE EXTERNAL PROCEDURE REFERENCE IS NOT VECTO

STATISTICS: 162 STEPS, PROCEDURE SIZE= 1904 BYTES, PROGRAM NAME=MAIN
            186 LINES, PROGRAM SIZE= 6208 BYTES, DIAGNOSTICS = 0
            DVT SIZE= 0 BYTES*ITERATION COUNT REMAINING SIZE= 2240K BYTES,

FORTRAN 77/VP V10L20   MATGEN   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000163          SUBROUTINE MATGEN(A,LDA,N,B,NORMA)
000164          REAL A(LDA,1),B(1),NORMA
          C
000165          INIT = 1325
000166          NORMA = 0.0
000167 S        DO 30 J = 1,N
000168 M           DO 20 I = 1,N
000169 S              INIT = MOD(3125*INIT,65536)
000170 M              A(I,J) = (INIT - 32768.0)/16384.0
000171 V              NORMA = AMAX1(A(I,J), NORMA)
000172 S     20    CONTINUE
000173 S     30 CONTINUE
000174 V        DO 35 I = 1,N
000175 V           B(I) = 0.0
000176 V     35 CONTINUE
000177 S        DO 50 J = 1,N
000178 V           DO 40 I = 1,N
000179 V              B(I) = B(I) + A(I,J)
000180 V     40    CONTINUE
000181 S     50 CONTINUE
000182          RETURN
000183          END

FORTRAN 77/VP VECTORIZATION MESSAGES: PROGRAM NAME(MATGEN)

JND278I-I ISN:00000169 - 00000169  SINCE REFERENCING OF VARIABLE INIT PRECEDES I
                                   BE VECTORIZED.
JND278I-I ISN:00000169 - 00000170  SINCE REFERENCING OF VARIABLE INIT PRECEDES I
                                   BE VECTORIZED.
JND201I-I ISN:00000170 - 00000171  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000175 - 00000176  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000179 - 00000180  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B

STATISTICS: 21 STEPS, PROCEDURE SIZE= 1256 BYTES, PROGRAM NAME=MATGEN
            22 LINES, PROGRAM SIZE= 2200 BYTES, DIAGNOSTICS = 0
            DVT SIZE= 8 BYTES*ITERATION COUNT REMAINING SIZE= 2224K BYTES,

FORTRAN 77/VP V10L20   SGEFA   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000184          SUBROUTINE SGEFA(A,LDA,N,IPVT,INFO)
000185          INTEGER LDA,N,IPVT(1),INFO
000186          REAL A(LDA,1)
          C
          C     SGEFA FACTORS A REAL MATRIX BY GAUSSIAN ELIMINATION.
          C
          C     SGEFA IS USUALLY CALLED BY DGECO, BUT IT CAN BE CALLED
          C     DIRECTLY WITH A SAVING IN TIME IF RCOND IS NOT NEEDED.
          C     (TIME FOR DGECO) = (1 + 9/N)*(TIME FOR SGEFA) .
          C
          C     ON ENTRY
          C
          C        A       REAL(LDA, N)
          C                THE MATRIX TO BE FACTORED.
          C
          C        LDA     INTEGER
          C                THE LEADING DIMENSION OF THE ARRAY A .
          C
          C        N       INTEGER
          C                THE ORDER OF THE MATRIX A .
          C
          C     ON RETURN
          C
          C        A       AN UPPER TRIANGULAR MATRIX AND THE MULTIPLIERS
          C                WHICH WERE USED TO OBTAIN IT.
          C                THE FACTORIZATION CAN BE WRITTEN A = L*U WHERE
          C                L IS A PRODUCT OF PERMUTATION AND UNIT LOWER
          C                TRIANGULAR MATRICES AND U IS UPPER TRIANGULAR.
          C
          C        IPVT    INTEGER(N)
          C                AN INTEGER VECTOR OF PIVOT INDICES.
          C
          C        INFO    INTEGER
          C                = 0 NORMAL VALUE.
          C                = K IF U(K,K) .EQ. 0.0 . THIS IS NOT AN ERROR
          C                    CONDITION FOR THIS SUBROUTINE, BUT IT DOES
          C                    INDICATE THAT SGESL OR DGEDI WILL DIVIDE BY ZERO
          C                    IF CALLED. USE RCOND IN DGECO FOR A RELIABLE
          C                    INDICATION OF SINGULARITY.
          C
          C     LINPACK. THIS VERSION DATED 08/14/78 .
          C     CLEVE MOLER, UNIVERSITY OF NEW MEXICO, ARGONNE NATIONAL LAB.
          C
          C     SUBROUTINES AND FUNCTIONS
          C
          C     BLAS SAXPY,SSCAL,ISAMAX
          C
          C     INTERNAL VARIABLES
          C
000187          REAL T,D99MAX
000188          INTEGER ISAMAX,J,K,KP1,L,NM1,I99
          C
          C
          C     GAUSSIAN ELIMINATION WITH PARTIAL PIVOTING
          C
000189          INFO = 0
000190          NM1 = N - 1
000191          IF (NM1 .LT. 1) GO TO 70
000192 S        DO 60 K = 1, NM1
000193             KP1 = K + 1
          C
          C        FIND L = PIVOT INDEX
          C
          CTUNE    HAND-INTEGRATION OF ISAMAX
          C        L = ISAMAX(N-K+1,A(K,K),1) + K - 1
000194 S           ISAMAX = 1
000195 S           D99MAX = ABS(A(K,K))
          *VOCL LOOP,REPEAT(100)
000196 V           DO 9999 I99 = 2,N-K+1
000197 V              IF(ABS(A(K+I99-1,K)).LE.D99MAX) GO TO 9999
000198 V              ISAMAX = I99
000199 V              D99MAX = ABS(A(K+I99-1,K))
000200 V   9999    CONTINUE
000201 S           L = ISAMAX + K - 1
000202 S           IPVT(K) = L
          C
          C        ZERO PIVOT IMPLIES THIS COLUMN ALREADY TRIANGULARIZED
          C
000203 S           IF (A(L,K) .EQ. 0.0E0) GO TO 40
          C
          C           INTERCHANGE IF NECESSARY
          C
000204 S              IF (L .EQ. K) GO TO 10
000205 S                 T = A(L,K)
000206 S                 A(L,K) = A(K,K)
000207 S                 A(K,K) = T
000208       10       CONTINUE
          C
          C           COMPUTE MULTIPLIERS
          C
000209 S              T = -1.0E0/A(K,K)
          CTUNE       HAND-INTEGRATION OF SSCAL
          C           CALL SSCAL(N-K,T,A(K+1,K),1)
          *VOCL LOOP,REPEAT(100)
000210 V              DO 9998 I99 = 1,N-K
000211 V                 A(K+I99,K) = T*A(K+I99,K)
000212 V   9998       CONTINUE
          C
          C           ROW ELIMINATION WITH COLUMN INDEXING
          C
000213 S              DO 30 J = KP1, N
000214 S                 T = A(L,J)
000215 S                 IF (L .EQ. K) GO TO 20
000216 S                    A(L,J) = A(K,J)
000217 S                    A(K,J) = T
000218       20          CONTINUE
          CTUNE          HAND-INTEGRATION OF SAXPY
          C              CALL SAXPY(N-K,T,A(K+1,K),1,A(K+1,J),1)
          *VOCL LOOP,REPEAT(100)
000219 V                 DO 9997 I99 = 1,N-K
000220 V                    A(K+I99,J) = A(K+I99,J) + T*A(K+I99,K)
000221 V   9997          CONTINUE
000222 S     30       CONTINUE
000223             GO TO 50
000224       40    CONTINUE
000225                INFO = K
000226 S     50    CONTINUE
000227 S     60 CONTINUE
000228       70 CONTINUE
000229          IPVT(N) = N
000230          IF (A(N,N) .EQ. 0.0E0) INFO = N
000231          RETURN
000232          END

FORTRAN 77/VP VECTORIZATION MESSAGES: PROGRAM NAME(SGEFA )

JND273I-I ISN:00000192             PARTIAL VECTORIZATION OVERHEAD IS TOO LARGE.
JND277I-I ISN:00000194             SINCE VARIABLE ISAMAX MAY BE USED IN THE INNE
                                   VECTORIZED.
JND277I-I ISN:00000195             SINCE VARIABLE A MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000195             SINCE VARIABLE D99MAX MAY BE USED IN THE INNE
                                   VECTORIZED.
JND201I-I ISN:00000197 - 00000200  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND277I-I ISN:00000201             SINCE VARIABLE L MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000201             SINCE VARIABLE ISAMAX MAY BE USED IN THE INNE
                                   VECTORIZED.
JND277I-I ISN:00000203             SINCE VARIABLE A MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000205             SINCE VARIABLE A MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000205             SINCE VARIABLE T MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000206             SINCE VARIABLE A MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000207             SINCE VARIABLE T MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000209             SINCE VARIABLE A MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND277I-I ISN:00000209             SINCE VARIABLE T MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND201I-I ISN:00000211 - 00000212  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND277I-I ISN:00000214             SINCE VARIABLE T MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND201I-I ISN:00000220 - 00000221  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B

STATISTICS: 49 STEPS, PROCEDURE SIZE= 1244 BYTES, PROGRAM NAME=SGEFA
            123 LINES, PROGRAM SIZE= 2484 BYTES, DIAGNOSTICS = 0
            DVT SIZE= 0 BYTES*ITERATION COUNT REMAINING SIZE= 2236K BYTES,

FORTRAN 77/VP V10L20   SGESL   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000233          SUBROUTINE SGESL(A,LDA,N,IPVT,B,JOB)
000234          INTEGER LDA,N,IPVT(1),JOB
000235          REAL A(LDA,1),B(1)
          C
          C     SGESL SOLVES THE REAL SYSTEM
          C     A * X = B OR TRANS(A) * X = B
          C     USING THE FACTORS COMPUTED BY DGECO OR SGEFA.
          C
          C     ON ENTRY
          C
          C        A       REAL(LDA, N)
          C                THE OUTPUT FROM DGECO OR SGEFA.
          C
          C        LDA     INTEGER
          C                THE LEADING DIMENSION OF THE ARRAY A .
          C
          C        N       INTEGER
          C                THE ORDER OF THE MATRIX A .
          C
          C        IPVT    INTEGER(N)
          C                THE PIVOT VECTOR FROM DGECO OR SGEFA.
          C
          C        B       REAL(N)
          C                THE RIGHT HAND SIDE VECTOR.
          C
          C        JOB     INTEGER
          C                = 0         TO SOLVE A*X = B ,
          C                = NONZERO   TO SOLVE TRANS(A)*X = B WHERE
          C                            TRANS(A) IS THE TRANSPOSE.
          C
          C     ON RETURN
          C
          C        B       THE SOLUTION VECTOR X .
          C
          C     ERROR CONDITION
          C
          C        A DIVISION BY ZERO WILL OCCUR IF THE INPUT FACTOR CONTAINS A
          C        ZERO ON THE DIAGONAL. TECHNICALLY THIS INDICATES SINGULARITY
          C        BUT IT IS OFTEN CAUSED BY IMPROPER ARGUMENTS OR IMPROPER
          C        SETTING OF LDA . IT WILL NOT OCCUR IF THE SUBROUTINES ARE
          C        CALLED CORRECTLY AND IF DGECO HAS SET RCOND .GT. 0.0
          C        OR SGEFA HAS SET INFO .EQ. 0 .
          C
          C     TO COMPUTE INVERSE(A) * C WHERE C IS A MATRIX
          C     WITH P COLUMNS
          C           CALL DGECO(A,LDA,N,IPVT,RCOND,Z)
          C           IF (RCOND IS TOO SMALL) GO TO ...
          C           DO 10 J = 1, P
          C              CALL SGESL(A,LDA,N,IPVT,C(1,J),0)
          C        10 CONTINUE
          C
          C     LINPACK. THIS VERSION DATED 08/14/78 .
          C     CLEVE MOLER, UNIVERSITY OF NEW MEXICO, ARGONNE NATIONAL LAB.
          C
          C     SUBROUTINES AND FUNCTIONS
          C
          C     BLAS SAXPY,SDOT
          C
          C     INTERNAL VARIABLES
          C
000236          REAL SDOT,T
000237          INTEGER K,KB,L,NM1,I99
          C
000238          NM1 = N - 1
000239          IF (JOB .NE. 0) GO TO 50
          C
          C        JOB = 0 , SOLVE A * X = B
          C        FIRST SOLVE L*Y = B
          C
000240             IF (NM1 .LT. 1) GO TO 30
000241 S           DO 20 K = 1, NM1
000242                L = IPVT(K)
000243 S              T = B(L)
000244                IF (L .EQ. K) GO TO 10
000245 S                 B(L) = B(K)
000246 S                 B(K) = T
000247       10       CONTINUE
          CTUNE       HAND-INTEGRATION OF SAXPY
          C           CALL SAXPY(N-K,T,A(K+1,K),1,B(K+1),1)
          *VOCL LOOP,REPEAT(100)
000248 V              DO 9999 I99 = 1,N-K
000249 V                 B(K+I99) = B(K+I99) + T*A(K+I99,K)
000250 V   9999       CONTINUE
000251       20    CONTINUE
000252       30    CONTINUE
          C
          C        NOW SOLVE U*X = Y
          C
000253 S           DO 40 KB = 1, N
000254                K = N + 1 - KB
000255 S              B(K) = B(K)/A(K,K)
000256 S              T = -B(K)
          CTUNE       HAND-INTEGRATION OF SAXPY
          C           CALL SAXPY(K-1,T,A(1,K),1,B(1),1)
          *VOCL LOOP,REPEAT(100)
000257 V              DO 9998 I99 = 1,K-1
000258 V                 B(I99) = B(I99) + T*A(I99,K)
000259 V   9998       CONTINUE
000260 S     40    CONTINUE
000261          GO TO 100
000262       50 CONTINUE
          C
          C        JOB = NONZERO, SOLVE TRANS(A) * X = B
          C        FIRST SOLVE TRANS(U)*Y = B
          C
000263 S           DO 60 K = 1, N
          CTUNE       HAND-INTEGRATION OF SDOT
          C           T = SDOT(K-1,A(1,K),1,B(1),1)
000264 S              SDOT = 0.0E0
          *VOCL LOOP,REPEAT(100)
000265 V              DO 9997 I99 = 1,K-1
000266 V                 SDOT = SDOT + A(I99,K)*B(I99)
000267 V   9997       CONTINUE
000268                T = SDOT
000269 S              B(K) = (B(K) - T)/A(K,K)
000270 S     60    CONTINUE
          C
          C        NOW SOLVE TRANS(L)*X = Y
          C
000271             IF (NM1 .LT. 1) GO TO 90
000272 S           DO 80 KB = 1, NM1
000273                K = N - KB
          CTUNE       HAND-INTEGRATION OF SDOT
          C           B(K) = B(K) + SDOT(N-K,A(K+1,K),1,B(K+1),1)
000274 S              SDOT = 0.0E0
          *VOCL LOOP,REPEAT(100)
000275 V              DO 9996 I99 = 1,N-K
000276 V                 SDOT = SDOT + A(K+I99,K)*B(K+I99)
000277 V   9996       CONTINUE
000278 S              B(K) = B(K) + SDOT
000279                L = IPVT(K)
000280                IF (L .EQ. K) GO TO 70
000281 S                 T = B(L)
000282 S                 B(L) = B(K)
000283 S                 B(K) = T
000284       70       CONTINUE
000285       80    CONTINUE
000286       90    CONTINUE
000287      100 CONTINUE
000288          RETURN
000289          END

FORTRAN 77/VP VECTORIZATION MESSAGES: PROGRAM NAME(SGESL )

JND273I-I ISN:00000241             PARTIAL VECTORIZATION OVERHEAD IS TOO LARGE.
JND238I-I ISN:00000243 - 00000246  ARRAY B CANNOT BE VECTORIZED BECAUSE VARIABLE
                                   EXPRESSION IS DEFINED IN THIS DO LOOP.
JND277I-I ISN:00000243             SINCE VARIABLE T MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND201I-I ISN:00000249 - 00000250  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND277I-I ISN:00000256             SINCE VARIABLE T MAY BE USED IN THE INNER DO
                                   VECTORIZED.
JND201I-I ISN:00000258 - 00000259  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND277I-I ISN:00000264             SINCE VARIABLE SDOT MAY BE USED IN THE INNER
                                   VECTORIZED.
JND201I-I ISN:00000266 - 00000267  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND277I-I ISN:00000269             SINCE VARIABLE SDOT MAY BE USED IN THE INNER
                                   VECTORIZED.
JND273I-I ISN:00000272             PARTIAL VECTORIZATION OVERHEAD IS TOO LARGE.
JND277I-I ISN:00000274             SINCE VARIABLE SDOT MAY BE USED IN THE INNER
                                   VECTORIZED.
JND201I-I ISN:00000276 - 00000277  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND238I-I ISN:00000278 - 00000283  ARRAY B CANNOT BE VECTORIZED BECAUSE VARIABLE
                                   EXPRESSION IS DEFINED IN THIS DO LOOP.
JND277I-I ISN:00000278             SINCE VARIABLE SDOT MAY BE USED IN THE INNER
                                   VECTORIZED.

STATISTICS: 57 STEPS, PROCEDURE SIZE= 1322 BYTES, PROGRAM NAME=SGESL
            141 LINES, PROGRAM SIZE= 2606 BYTES, DIAGNOSTICS = 0
            DVT SIZE= 0 BYTES*ITERATION COUNT REMAINING SIZE= 2224K BYTES,

FORTRAN 77/VP V10L20   EPSLON   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000290          REAL FUNCTION EPSLON (X)
000291          REAL X
          C
          C     ESTIMATE UNIT ROUNDOFF IN QUANTITIES OF SIZE X.
          C
000292          REAL A,B,C,EPS
          C
          C     THIS PROGRAM SHOULD FUNCTION PROPERLY ON ALL SYSTEMS
          C     SATISFYING THE FOLLOWING TWO ASSUMPTIONS,
          C        1. THE BASE USED IN REPRESENTING FLOATING POINT
          C           NUMBERS IS NOT A POWER OF THREE.
          C        2. THE QUANTITY A IN STATEMENT 10 IS REPRESENTED TO
          C           THE ACCURACY USED IN FLOATING POINT VARIABLES
          C           THAT ARE STORED IN MEMORY.
          C     THE STATEMENT NUMBER 10 AND THE GO TO 10 ARE INTENDED TO
          C     FORCE OPTIMIZING COMPILERS TO GENERATE CODE SATISFYING
          C     ASSUMPTION 2.
          C     UNDER THESE ASSUMPTIONS, IT SHOULD BE TRUE THAT,
          C            A IS NOT EXACTLY EQUAL TO FOUR-THIRDS,
          C            B HAS A ZERO FOR ITS LAST BIT OR DIGIT,
          C            C IS NOT EXACTLY EQUAL TO ONE,
          C            EPS MEASURES THE SEPARATION OF 1.0 FROM
          C                THE NEXT LARGER FLOATING POINT NUMBER.
          C     THE DEVELOPERS OF EISPACK WOULD APPRECIATE BEING INFORMED
          C     ABOUT ANY SYSTEMS WHERE THESE ASSUMPTIONS DO NOT HOLD.
          C
          C     ****************************************************************
          C     THIS ROUTINE IS ONE OF THE AUXILIARY ROUTINES USED BY EISPACK II
          C     TO AVOID MACHINE DEPENDENCIES.
          C     ****************************************************************
          C
          C     THIS VERSION DATED 4/6/83.
          C
000293          A = 4.0E0/3.0E0
000294       10 B = A - 1.0E0
000295          C = B + B + B
000296          EPS = ABS(C-1.0E0)
000297          IF (EPS .EQ. 0.0E0) GO TO 10
000298          EPSLON = EPS*ABS(X)
000299          RETURN
000300          END

FORTRAN 77/VP VECTORIZATION MESSAGES: PROGRAM NAME(EPSLON)

JND211I-I ISN:00000294             DO WHILE, DO UNTIL, OR IF/GOTO LOOP IS NOT VE

STATISTICS: 11 STEPS, PROCEDURE SIZE= 46 BYTES, PROGRAM NAME=EPSLON
            41 LINES, PROGRAM SIZE= 622 BYTES, DIAGNOSTICS = 0
            DVT SIZE= 0 BYTES*ITERATION COUNT REMAINING SIZE= 2344K BYTES,

FORTRAN 77/VP V10L20   SMXPY   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000301          SUBROUTINE SMXPY (N1, Y, N2, LDM, X, M)
000302          REAL Y(*), X(*), M(LDM,*)
          C
          C     PURPOSE:
          C       MULTIPLY MATRIX M TIMES VECTOR X AND ADD THE RESULT TO VECTOR Y.
          C
          C     PARAMETERS:
          C
          C       N1 INTEGER, NUMBER OF ELEMENTS IN VECTOR Y, AND NUMBER OF ROWS I
          C           MATRIX M
          C
          C       Y REAL(N1), VECTOR OF LENGTH N1 TO WHICH IS ADDED THE PRODUCT M*
          C
          C       N2 INTEGER, NUMBER OF ELEMENTS IN VECTOR X, AND NUMBER OF COLUMN
          C           IN MATRIX M
          C
          C       LDM INTEGER, LEADING DIMENSION OF ARRAY M
          C
          C       X REAL(N2), VECTOR OF LENGTH N2
          C
          C       M REAL(LDM,N2), MATRIX OF N1 ROWS AND N2 COLUMNS
          C
          C --------------------------------------------------------------------
          C
          C     CLEANUP ODD VECTOR
          C
000303          J = MOD(N2,2)
000304          IF (J .GE. 1) THEN
000305 V           DO 10 I = 1, N1
000306 V              Y(I) = (Y(I)) + X(J)*M(I,J)
000307 V     10    CONTINUE
000308          ENDIF
          C
          C     CLEANUP ODD GROUP OF TWO VECTORS
          C
000309          J = MOD(N2,4)
000310          IF (J .GE. 2) THEN
000311 V           DO 20 I = 1, N1
000312 V              Y(I) = ( (Y(I))
               $             + X(J-1)*M(I,J-1)) + X(J)*M(I,J)
000313 V     20    CONTINUE
000314          ENDIF
          C
          C     CLEANUP ODD GROUP OF FOUR VECTORS
          C
000315          J = MOD(N2,8)
000316          IF (J .GE. 4) THEN
000317 V           DO 30 I = 1, N1
000318 V              Y(I) = ((( (Y(I))
               $             + X(J-3)*M(I,J-3)) + X(J-2)*M(I,J-2))
               $             + X(J-1)*M(I,J-1)) + X(J) *M(I,J)
000319 V     30    CONTINUE
000320          ENDIF
          C
          C     CLEANUP ODD GROUP OF EIGHT VECTORS
          C
000321          J = MOD(N2,16)
000322          IF (J .GE. 8) THEN
000323 V           DO 40 I = 1, N1
000324 V              Y(I) = ((((((( (Y(I))
               $             + X(J-7)*M(I,J-7)) + X(J-6)*M(I,J-6))
               $             + X(J-5)*M(I,J-5)) + X(J-4)*M(I,J-4))
               $             + X(J-3)*M(I,J-3)) + X(J-2)*M(I,J-2))
               $             + X(J-1)*M(I,J-1)) + X(J) *M(I,J)
000325 V     40    CONTINUE
000326          ENDIF
          C
          C     MAIN LOOP - GROUPS OF SIXTEEN VECTORS
          C
000327          JMIN = J+16
000328 S        DO 60 J = JMIN, N2, 16
000329 V           DO 50 I = 1, N1
000330 V              Y(I) = ((((((((((((((( (Y(I))
               $             + X(J-15)*M(I,J-15)) + X(J-14)*M(I,J-14))
               $             + X(J-13)*M(I,J-13)) + X(J-12)*M(I,J-12))
               $             + X(J-11)*M(I,J-11)) + X(J-10)*M(I,J-10))
               $             + X(J- 9)*M(I,J- 9)) + X(J- 8)*M(I,J- 8))
               $             + X(J- 7)*M(I,J- 7)) + X(J- 6)*M(I,J- 6))
               $             + X(J- 5)*M(I,J- 5)) + X(J- 4)*M(I,J- 4))
               $             + X(J- 3)*M(I,J- 3)) + X(J- 2)*M(I,J- 2))
               $             + X(J- 1)*M(I,J- 1)) + X(J) *M(I,J)
000331 V     50    CONTINUE
000332 S     60 CONTINUE
000333          RETURN
000334          END

FORTRAN 77/VP VECTORIZATION MESSAGES: PROGRAM NAME(SMXPY )

JND201I-I ISN:00000306 - 00000307  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000312 - 00000313  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000318 - 00000319  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000324 - 00000325  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B
JND201I-I ISN:00000330 - 00000331  THE STATEMENTS IN THIS RANGE ARE VECTORIZED B

STATISTICS: 34 STEPS, PROCEDURE SIZE= 3708 BYTES, PROGRAM NAME=SMXPY
            85 LINES, PROGRAM SIZE= 5548 BYTES, DIAGNOSTICS = 0
            DVT SIZE= 0 BYTES*ITERATION COUNT REMAINING SIZE= 2172K BYTES,

FORTRAN 77/VP V10L20   SECOND   DATE 86.01.29   TIME 13.25.25

   ISN V  SOURCE
000335          REAL FUNCTION SECOND(DUMMY)
          C
000336          REAL TIME
          C
000337          CALL CLOCK(TIME,0,2)
          C
000338          SECOND=TIME
          C
000339          RETURN
000340          END

FORTRAN 77/VP ERROR MESSAGES: PROGRAM NAME(SECOND),FLAG(I)

JZK523I-I NAM:DUMMY                THIS NAME IS DECLARED BUT NOT USED IN THIS

STATISTICS: 6 STEPS, PROCEDURE SIZE= 34 BYTES, PROGRAM NAME=SECOND
            10 LINES, PROGRAM SIZE= 502 BYTES, DIAGNOSTICS = 1
            DVT SIZE= 0 BYTES*ITERATION COUNT REMAINING SIZE= 2344K BYTES,

SPECIFIED OPTIONS: VP(200),VMSG(DETAIL),AD(DBL)

FORTRAN 77/VP OPTION LISTS:
    AUTODBL(30000) AE NOIHALF NONAME NOSOURCE EBCDIC ERRISN(2) NOALIGNC
    NOINCLUDE NONUM NOSEQ FIXED FLAG(I) NOASTER NOINSOURCE OBJECT NOTERM
    LMSG ISN(C) NOBYNAME NOITR NOPI NOSRCMSG LANGLVL(77) NODEBUG NOLIL
    PRINT XI LINECOUNT(60) NOFIPS NOLIST NORENT NOXREF SIZE( 2534K) NOGO
    NOMAP NOSDF XOPT( IL, AMOVE,NOMSG) VP(200) VOCL VSOURCE ADVANCED(EVL)
    VMSG(DETAIL)

STATISTICS: 7 UNITS, 340 STEPS, 608 LINES, DIAGNOSTICS= 1, HIGHEST SEVE
END OF COMPILATION


 PLEASE SEND THE RESULTS OF THIS RUN TO:

 JACK J. DONGARRA
 MATHEMATICS AND COMPUTER SCIENCE DIVISION
 ARGONNE NATIONAL LABORATORY
 ARGONNE, ILLINOIS 60439

 TELEPHONE: 312-972-7246

 ARPANET: DONGARRA@ANL-MCS

 RESULTS IN FULL PRECISION (WITH AD(DBL)-OPTION COMPUTED)
 (INLINE-FORTRAN-CODING OF BLAS AND INSERTING OF COMPILER-DIRECTIVES)

     NORM. RESID      RESID           MACHEP          X(1)            X(N)
  2.65186135E+00  1.17683641E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

 TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
 TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
  2.120E-02  8.854E-04  2.208E-02  3.109E+01  6.432E-02  3.943E-01
  2.120E-02  8.594E-04  2.206E-02  3.113E+01  6.424E-02  3.939E-01
  2.117E-02  8.594E-04  2.203E-02  3.117E+01  6.417E-02  3.934E-01
  2.119E-02  7.578E-04  2.195E-02  3.129E+01  6.393E-02  3.919E-01

 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  2.120E-02  8.333E-04  2.203E-02  3.117E+01  6.417E-02  3.934E-01
  2.117E-02  8.333E-04  2.201E-02  3.120E+01  6.409E-02  3.930E-01
  2.120E-02  8.333E-04  2.203E-02  3.117E+01  6.417E-02  3.934E-01
  2.119E-02  7.552E-04  2.195E-02  3.129E+01  6.393E-02  3.919E-01


 PLEASE SEND THE RESULTS OF THIS RUN TO:

 JACK J. DONGARRA
 MATHEMATICS AND COMPUTER SCIENCE DIVISION
 ARGONNE NATIONAL LABORATORY
 ARGONNE, ILLINOIS 60439

 TELEPHONE: 312-972-7246

 ARPANET: DONGARRA@ANL-MCS

 RESULTS IN HALF PRECISION (WITHOUT AD(DBL)-OPTION COMPUTED)
 (INLINE-FORTRAN-CODING OF BLAS AND INSERTING OF COMPILER-DIRECTIVES)

     NORM. RESID      RESID           MACHEP          X(1)            X(N)
  3.68801689E+00  7.03105237E-04  9.53674316E-07  9.99935091E-01  9.99949932E-01

 TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
 TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
  2.047E-02  7.813E-04  2.125E-02  3.231E+01  6.189E-02  3.795E-01
  2.044E-02  8.073E-04  2.125E-02  3.231E+01  6.189E-02  3.795E-01
  2.042E-02  7.812E-04  2.120E-02  3.239E+01  6.174E-02  3.785E-01
  2.042E-02  6.979E-04  2.112E-02  3.251E+01  6.151E-02  3.771E-01

 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  2.026E-02  8.073E-04  2.107E-02  3.259E+01  6.136E-02  3.762E-01
  2.023E-02  7.812E-04  2.102E-02  3.267E+01  6.121E-02  3.753E-01
  2.026E-02  7.552E-04  2.102E-02  3.267E+01  6.121E-02  3.753E-01
  2.024E-02  6.953E-04  2.094E-02  3.280E+01  6.098E-02  3.739E-01

From mips!escargot.earl@su-glacier.arpa Wed Jan 22 22:47:31 1986
Received: from su-glacier.arpa (su-glacier.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA21988; Wed, 22 Jan 86 22:47:18 cst
Received: by su-glacier.arpa with Sendmail; Wed, 22 Jan 86 20:46:43 pst
Received: from escargot.UUCP (escargot.ARPA) by mips.UUCP (4.12/4.7)
        id AA10567; Wed, 22 Jan 86 14:46:19 pst
Received: by escargot.UUCP (4.12/4.7)
        id AA00222; Wed, 22 Jan 86 14:45:59 pst
Date: Wed, 22 Jan 86 14:45:59 pst
From: mips!escargot.earl@su-glacier.arpa (Earl Killian)
Message-Id: <8601222245.AA00222@escargot.UUCP>
To: DONGARRA@ANL-MCS.ARPA
Subject: linpack benchmark for new VAX Unix fortran compiler
Status: RO

I ran your full precision linpack benchmark on a 4.2bsd Unix VAX 780
w/FPA using the LLL S-1 Project fortran compiler instead of the 4.2bsd
Unix f77 compiler and got the following results. The compiler
generally does much better than f77 (2x faster on Spice), but the
times aren't much different on linpack, probably because the VAX
spends most of its time cache missing on linpack, so the code quality
is secondary.

By the way, have you considered modifying the source of DAXPY to copy
the DA parameter to a local before using it?
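
The DAXPY question just raised can be made concrete with a small sketch. It is written in Python, not Fortran, so pass-by-reference has to be imitated: the scalar DA is modeled as a (list, index) pair, which is an assumption of this example and not how the benchmark is coded. The function names are likewise made up. The first routine re-reads DA on every iteration, as compiled code must if a store into DY could alter DA; the second copies DA into a local once before the loop, which is the change being proposed.

```python
def daxpy_by_reference(n, da_cell, dx, dy):
    """dy := dy + da*dx, re-reading DA from its storage on every iteration.

    da_cell is a hypothetical (list, index) pair standing in for a scalar
    argument that is passed by reference.
    """
    store, k = da_cell
    for i in range(n):
        # DA is fetched again each time: an earlier store to dy may have changed it.
        dy[i] = dy[i] + store[k] * dx[i]


def daxpy_local_copy(n, da_cell, dx, dy):
    """Same update, but DA is copied to a local before the loop."""
    store, k = da_cell
    a = store[k]  # read once; stores to dy can no longer affect the multiplier
    for i in range(n):
        dy[i] = dy[i] + a * dx[i]


if __name__ == "__main__":
    dx = [1.0, 1.0, 1.0]

    # Aliased case: DA refers to the first element of DY.
    y1 = [2.0, 1.0, 1.0]
    daxpy_by_reference(3, (y1, 0), dx, y1)
    y2 = [2.0, 1.0, 1.0]
    daxpy_local_copy(3, (y2, 0), dx, y2)
    print(y1, y2)
```

When DA does not overlap DY the two routines give the same result; they differ only in the aliased case, which is the hazard in question.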
Fortran's pass-by-reference semantics mean that the store to DY might
change the value of DA, and thus it can't be loaded into a register
outside of the loop. If it were a local then a compiler could do
that. That would make compiler times a tiny bit closer to coded BLAS
times, which must assume there is no hazard there, and in addition
that assignments to DY don't affect DX (otherwise it can't be
vectorized). I think that would make it a better benchmark. I'll
include the times for that modification last.

 Please send the results of this run to:

 Jack J. Dongarra
 Mathematics and Computer Science Division
 Argonne National Laboratory
 Argonne, Illinois 60439

 Telephone: 312-972-7246

 ARPAnet: DONGARRA@ANL-MCS

     norm. resid      resid           machep          x(1)            x(n)
  2.10272591E+00  1.16642807E-14  2.77555756E-17  1.00000000E+00  1.00000000E+00

    times are reported for matrices of order 100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  5.100E+00  1.600E-01  5.260E+00  1.305E-01  1.532E+01  9.393E+01
  5.180E+00  1.700E-01  5.350E+00  1.283E-01  1.558E+01  9.554E+01
  5.010E+00  1.500E-01  5.160E+00  1.331E-01  1.503E+01  9.214E+01
  5.070E+00  1.540E-01  5.224E+00  1.314E-01  1.522E+01  9.329E+01

 times for array with leading dimension of 200
  5.100E+00  1.600E-01  5.260E+00  1.305E-01  1.532E+01  9.393E+01
  5.430E+00  1.500E-01  5.580E+00  1.231E-01  1.625E+01  9.964E+01
  5.260E+00  1.500E-01  5.410E+00  1.269E-01  1.576E+01  9.661E+01
  5.176E+00  1.540E-01  5.330E+00  1.288E-01  1.552E+01  9.518E+01

 Please send the results of this run to:

 Jack J. Dongarra
 Mathematics and Computer Science Division
 Argonne National Laboratory
 Argonne, Illinois 60439

 Telephone: 312-972-7246

 ARPAnet: DONGARRA@ANL-MCS

     norm. resid      resid           machep          x(1)            x(n)
  2.10272591E+00  1.16642807E-14  2.77555756E-17  1.00000000E+00  1.00000000E+00

    times are reported for matrices of order 100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  4.910E+00  1.500E-01  5.060E+00  1.357E-01  1.474E+01  9.036E+01
  4.960E+00  1.500E-01  5.110E+00  1.344E-01  1.488E+01  9.125E+01
  4.910E+00  1.500E-01  5.060E+00  1.357E-01  1.474E+01  9.036E+01
  4.945E+00  1.580E-01  5.103E+00  1.346E-01  1.486E+01  9.112E+01

 times for array with leading dimension of 200
  4.930E+00  1.600E-01  5.090E+00  1.349E-01  1.483E+01  9.089E+01
  4.910E+00  1.500E-01  5.060E+00  1.357E-01  1.474E+01  9.036E+01
  4.830E+00  1.600E-01  4.990E+00  1.376E-01  1.453E+01  8.911E+01
  4.915E+00  1.510E-01  5.066E+00  1.355E-01  1.476E+01  9.046E+01

From GR105%VTVM1.BITNET@WISCVM.WISC.EDU Mon Feb 10 16:11:00 1986
Received: from WISCVM.WISC.EDU (wiscvm.wisc.edu.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA23112; Mon, 10 Feb 86 16:10:40 cst
Message-Id: <8602102210.AA23112@anl-mcs.ARPA>
Received: from (GR105)VTVM1.BITNET by WISCVM.WISC.EDU on 02/10/86 at 16:10:47 CST
Date: Mon, 10 Feb 86 16:52:22 EST
To: DONGARRA@ANL-MCS.ARPA
From: GR105%VTVM1.BITNET@WISCVM.WISC.EDU
Subject: TIMINGS FROM A PERKIN-ELMER 3230
Status: R

Dear Dr. Dongarra,

Here are some benchmark results from a Perkin Elmer 3230 which the
Va Tech Aerospace Dept has as a departmental mini-computer. Enclosed
first are the single precision Linpack timings and then the double
precision timings. The last item enclosed is the timing routine
(REAL FUNCTION SECOND(T)) that was used to time these benchmarks.
Both of the benchmark cases were obtained from netlib (as in
SEND LINPACKS LINPACKD FROM BENCHMARK).

The computer was a Perkin Elmer 3230. The operating system was
OS/32 rel 6.2.2. The compiler was the fortran O compiler (rel 5.2).
The fortran O compiler is a highly optimizing compiler.
The particular 3230 we have has the floating point hardware so all 32 bit single precision and 64 bit double precision floating point operations were executed in hardware. Sincerely yours, David Whitaker D T 02/10/86 16:11:13 LINS ,,,,,,,,,,,,, PLEASE SEND THE RESULTS OF THIS RUN TO: JACK J. DONGARRA MATHEMATICS AND COMPUTER SCIENCE DIVISION ARGONNE NATIONAL LABORATORY ARGONNE, ILLINOIS 60439 TELEPHONE: 312-972-7246 ARPANET: DONGARRA@ANL-MCS NORM. RESID RESID MACHEP X(1) X(N) 1.05047800E+00 2.00271600E-04 9.53674300E-07 9.99929400E-01 9.99917700E-01 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 4.841E+00 1.740E-01 5.015E+00 1.369E-01 1.461E+01 8.955E+01 4.142E+00 1.240E-01 4.266E+00 1.610E-01 1.243E+01 7.618E+01 4.139E+00 1.260E-01 4.265E+00 1.610E-01 1.242E+01 7.616E+01 4.148E+00 1.258E-01 4.274E+00 1.607E-01 1.245E+01 7.633E+01 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 4.145E+00 1.260E-01 4.271E+00 1.608E-01 1.244E+01 7.627E+01 4.151E+00 1.240E-01 4.275E+00 1.606E-01 1.245E+01 7.634E+01 4.153E+00 1.240E-01 4.277E+00 1.605E-01 1.246E+01 7.637E+01 4.150E+00 1.263E-01 4.276E+00 1.606E-01 1.245E+01 7.636E+01 STOP LINS -END OF TASK CODE= 0 CPUTIME=2:00.705/0.092 D T 02/10/86 16:13:17 D A USER TIME 2:00.705 SVC TIME 0.092 WAIT TIME 0.421 ROLL TIME 0.000 I/O 30 ROLLS 0 SIGNOFF ELAPSED TIME=2:07 CPUTIME=2:00.705/0.092 TIME OFF=02/10/86 16:13:18 D T 02/10/86 16:19:57 LIND ,,,,,,,,,,,,, PLEASE SEND THE RESULTS OF THIS RUN TO: JACK J. DONGARRA MATHEMATICS AND COMPUTER SCIENCE DIVISION ARGONNE NATIONAL LABORATORY ARGONNE, ILLINOIS 60439 TELEPHONE: 312-972-7246 ARPANET: DONGARRA@ANL-MCS NORM. 
RESID RESID MACHEP X(1) X(N)
1.07075157E+00 4.75175455E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
TIMES ARE REPORTED FOR MATRICES OF ORDER 100
DGEFA DGESL TOTAL MFLOPS UNIT RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
6.109E+00 1.850E-01 6.294E+00 1.091E-01 1.833E+01 1.124E+02
6.099E+00 1.810E-01 6.280E+00 1.093E-01 1.829E+01 1.121E+02
6.099E+00 1.810E-01 6.280E+00 1.093E-01 1.829E+01 1.121E+02
6.104E+00 1.855E-01 6.290E+00 1.092E-01 1.832E+01 1.123E+02
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
6.124E+00 1.830E-01 6.307E+00 1.089E-01 1.837E+01 1.126E+02
6.109E+00 1.810E-01 6.290E+00 1.092E-01 1.832E+01 1.123E+02
6.127E+00 1.840E-01 6.311E+00 1.088E-01 1.838E+01 1.127E+02
6.118E+00 1.865E-01 6.304E+00 1.089E-01 1.836E+01 1.126E+02
STOP LIND -END OF TASK CODE= 0 CPUTIME=2:54.896/0.100
D T 02/10/86 16:22:56
D A USER TIME 2:54.896 SVC TIME 0.100 WAIT TIME 0.452 ROLL TIME 0.000 I/O 31 ROLLS 0
SIGNOFF ELAPSED TIME=3:01 CPUTIME=2:54.896/0.100 TIME OFF=02/10/86 16:22:56

      REAL FUNCTION SECOND(T)
      REAL T
      SAVE ISTART,TIME
      IF (ISTART .NE. 1) THEN
         CALL MTIME(I)
         TIME=0.0
         ISTART=1
         RETURN
      END IF
C
      CALL MTIME (I)
      TIME = TIME + FLOAT(I)/1000.
      SECOND = TIME
      CALL MTIME (I)
      RETURN
C
      END
      SUBROUTINE MTIME(I)
$ASSM
         ST    0,SAV0
         L     0,FLAG
         BNZ   $$1
         XR    0,0
         SVC   2,SETTIME
         SVC   2,GETTIME
         SVC   2,CANTIME
         L     0,ITIME
         S     0,NTIME
         ST    0,HOLDTIME
         LIS   0,1
         ST    0,FLAG
         XR    0,0
         ST    0,J
         SVC   2,SETTIME
         L     0,SAV0
$FORT
      I=J
      RETURN
$ASSM
$$1      XR    0,0
         SVC   2,GETTIME
         SVC   2,CANTIME
         ST    0,FLAG
         L     0,ITIME
         S     0,NTIME
         S     0,HOLDTIME
         ST    0,J
         L     0,SAV0
$FORT
      I=J
      RETURN
$ASSM
         IMPUR
         ALIGN 4
SETTIME  DB    X'00',23
         DC    H'0'
ITIME    DC    Y'1FFFFFFF'
         ALIGN 4
GETTIME  DB    X'20',23
         DC    H'0'
NTIME    DC    Y'10000000'
         ALIGN 4
CANTIME  DB    X'10',23
         DC    H'0'
         DC    Y'10000000'
         ALIGN 4
FLAG     DC    Y'0'
HOLDTIME DS    4
SAV0     DS    4
         PURE
$FORT
      END

From dgh%dgh@SUN.ARPA Sun Feb 16 12:10:11 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA28995; Sun, 16 Feb 86 12:04:46 cst
Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0) id AA14369; Sun, 16 Feb 86 09:31:21 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA22882; Sun, 16 Feb 86 09:30:22 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA00628; Sun, 16 Feb 86 09:33:06 PST
Date: Sun, 16 Feb 86 09:33:06 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8602161733.AA00628@dgh.sun.uucp>
To: Terry_Ratcliffe%mts%cheviot.newcastle@cs.ucl.ac.uk, Terry_Ratcliffe%newcastle.mailnet@mit-multics.arpa, alastair@ucbopal.berkeley.edu, decwrl!turtlevax!weitek!eric, dgh@dgh, dongarra@anl-mcs.arpa, hplabs!hpda!sra1!grimes, hplabs!motsj1!kjm, ihnp4!ima!pbear!peterb, ihnp4!ima!pbear!spastic!maggot!barada, ihnp4!ima!pbear!spastic!maggot!turner, ihnp4!ima!spastic!maggot!turner, ihnp4!inmet!pbear!peterb, ihnp4!inmet!pbear!spastic!maggot!barada, ihnp4!inmet!pbear!spastic!maggot!turner, ihnp4!inmet!spastic!maggot!turner, kcng@kim.berkeley.edu, oakhill!davet, oakhill!van, seismo!ukc!cheviot!robert, ucbvax!ibmpa!lmb, ucbvax!tektronix!ogcvax!inteloa!jimv, ucbvax!tektronix!ogcvax!moler, ucbvax!ucbdali.Berkeley.EDU!mcdonald, ucbvax!ucsfcgl!cca.ucsf!dick, zliu@weyl.berkeley.edu
Subject: whetstone discussion
Status: R

Terry Ratcliffe
of the University of Newcastle was kind enough to reply to my Whetstone attacks, raising some good points; my comments are interpolated:

> David,
> I've seen a copy of your "Benchmarking and the 68020 Cache"
> and the "Weitek 1164/65 FPA" papers.
>
> a) Now you mention that the cache is on the 68020, and hence I would
> expect a similar variation in KWips without the FPA. Have you
> measured this and what's the results?
>

The cache effect is only profound (20%) when the P3 routine takes an amount of time comparable to an iteration of the calling loop. For double precision, for instance, the variation is more like 10%; if a 68881 were used instead of a Sun FPA, then the variation would be even smaller and probably not interesting, so I haven't bothered to measure it.

> b) re your comments on whetstones.
>
> First my "credentials":
> I have to run the "official (Government)" benchmarks from the
> Central Computer and Telecommunications Agency whenever we get
> approval (rarely) to buy a new large mainframe as part of the
> evaluation process;
> and I have to help tune some of the big 'orrible Fortran programs that
> our number crunchers write (or more often inherit or import).
>
> So when is it a whetstone: The rules are quite simple. You can NOT alter
> any line of the code, except those changes needed to get it to COMPILE,
> (and those must be reported back and approved). (If it then won't run
> that is logged as the result). You are allowed complete freedom of
> where you place modules as part of the linking/loading process, and
> it is assumed that potential suppliers will put them in the best place
> for their box.
>

Things are far less defined here, which is why I think of Whetstone as a marketing benchmark. For instance, a customer called me to say that he had gotten Whetstone results rather different from what our marketing literature had led him to anticipate.
It turned out he wasn't using any version of the usual Whetstone program, but rather some other code of unknown origin whose output was calibrated in Whetstones. That's why I tend to disregard Whetstone results that I haven't measured myself.

> As to its appropriateness re Linpack, you really need both benchmarks
> as they are looking at two distinct sets of application areas.
> Linpack is relevant to solving lots of linear equations. Though even
> there it's not completely relevant as our really big number crunchers in
> that area have taken to pulling out the inner loops and rewriting them
> in assembler and hand tuning the assembler to the particular CPU model.

That's a commentary on the compilers rather than the benchmark. As I indicated in one of my papers, our Sun-3 compilers produce truly optimal code for the 68881 or the FPA on the inner loop of Linpack, provided it's rolled.

> Whetstone is more appropriate where you are finding minima of a
> function in multi-dimensional space (lots of CPU time used by physicists
> doing that); usually they call a library routine(s) to do the searching
> but have to provide a subroutine to generate the function, which gets
> called lots and lots of times (just like the infamous P3).
> The dreaded divide by 2.0 in P3 just represents the fact that most
> of these functions seem to end up with at least one inescapable divide.
> I've always maintained that the value of T2 should alter between calls to
> PA or P3 (to make it more difficult to produce a fiddled compiler) but
> it's far too late to change it now.
>
> Terry Ratcliffe
>
>Terry_Ratcliffe%newcastle.mailnet@mit-multics.ARPA
>Terry_Ratcliffe%mts%cheviot.newcastle@ucl-cs.ARPA

I'd still maintain that the P3-calling loop is a poor model of nonlinear optimization calculations.
In my own limited experience the nonlinear function is typically much more complicated than P3 suggests and the calling routine is much more complicated than a simple loop, although a simple loop might be a good model for an optimizer that calls an external function to compute many partial derivatives. More to the point, interprocedural optimization is becoming more common in other languages and will likely appear in Fortran compilers before long. Physicists who are calling short P3-style functions may soon figure out that by compiling their function with the source for the optimizer, they can get a noticeable improvement. To the extent interprocedural optimization is done, the Whetstone benchmark will be obsolete; so I would suggest that if you think it's worth preserving you should push your campaign to make T2 truly variable with the Whetstone authors. Jim Valerio of Intel mentioned that short routines that do complex arithmetic by calling subroutines fit the P3 mold rather well; I feel that's a compiler shortcoming since complex +-* should be compiled inline, but I must admit that Sun's compilers presently don't satisfy me in that regard.

From dgh%dgh@SUN.ARPA Tue Feb 18 12:14:34 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA03407; Tue, 18 Feb 86 12:14:15 cst
Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0) id AA17170; Tue, 18 Feb 86 10:11:58 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA02735; Tue, 18 Feb 86 10:10:57 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA02960; Tue, 18 Feb 86 10:13:54 PST
Date: Tue, 18 Feb 86 10:13:54 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8602181813.AA02960@dgh.sun.uucp>
To: dongarra@anl-mcs.arpa
Subject: do you believe this?
Status: RO

This is a Sun-3 with an FPA, single precision, unrolled loops. What surprises me is the resid and norm resid values... This was run on a 4 Meg machine, accounting for a few page faults.

norm.
resid resid machep x(1) x(n) 0.00000000e+00 0.00000000e+00 2.23517418e-07 1.00016224e+00 9.99933422e-01 times are reported for matrices of order 1000 factor solve total mflops unit ratio times for array with leading dimension of1001 1.641e+03 6.360e+00 1.647e+03 4.059e-01 4.927e+00 2.942e+04 1083.2u 615.1s 2:25:37 19% 0+139k 2+4io 111045pf+2w From dgh%dgh@SUN.ARPA Tue Feb 18 22:48:56 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA00269; Tue, 18 Feb 86 22:48:39 cst Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0) id AA00206; Tue, 18 Feb 86 17:38:36 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA05405; Tue, 18 Feb 86 15:21:41 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA03630; Tue, 18 Feb 86 15:24:38 PST Date: Tue, 18 Feb 86 15:24:38 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8602182324.AA03630@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: 1000x1000 single precision FPA results Status: RO Do these look better: S.ROLLffpa.huge.out Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. resid resid machep x(1)-1 x(n)-1 1.13179665e+01 2.70068646e-03 1.19209290e-07 1.62243843e-04 -6.65783881e-05 times are reported for matrices of order 1000 sgefa sgesl total Kflops unit ratio times for array with leading dimension of1001 1518.78 5.50 1524.28 439. 
4.56 27219.28 953.5u 622.9s 2:24:47 18% 0+139k 14+4io 111909pf+2w From dgh%dgh@SUN.ARPA Thu Feb 20 16:26:37 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA06060; Thu, 20 Feb 86 16:26:27 cst Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0) id AA10383; Thu, 20 Feb 86 14:11:20 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA17229; Thu, 20 Feb 86 13:57:03 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA07481; Thu, 20 Feb 86 13:59:18 PST Date: Thu, 20 Feb 86 13:59:18 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8602202159.AA07481@dgh.sun.uucp> To: aoki@sisyphus, carrie@lunar, crao@dgh, cwoo@dgh, dgh@dgh, dongarra@anl-mcs.arpa, edh@dragon, fetter@cygnus, jharman@dgh, jlc@maple, jricotta@dgh, katy@beluga, kmobley@dgh, mmm@dusk, psager@dgh, rcheng@dgh, weiss@genesis, wu@funshine Subject: 1000x1000 single precision Linpack test results Status: R The following output is from the new ultra-Linpack test included in the latest versions of Dongarra's paper. It is for single precision, release 3.1 f77 -O -ffpa: norm. resid resid machep x(1)-1 x(n)-1 1.13179665e+01 2.70068646e-03 1.19209290e-07 1.62243843e-04 -6.65783881e-05 times are reported for matrices of order 1000 sgefa sgesl total Kflops unit ratio times for array with leading dimension of1001 846.62 2.56 849.18 787. 
2.54 15163.93 888.1u 2.6s 14:55 99% 4+245k 14+5io 0pf+0w From dgh%dgh@SUN.ARPA Mon Feb 24 17:13:04 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA07190; Mon, 24 Feb 86 17:12:48 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA00282; Mon, 24 Feb 86 15:10:46 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA08707; Mon, 24 Feb 86 14:09:10 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA13437; Mon, 24 Feb 86 14:11:06 PST Date: Mon, 24 Feb 86 14:11:06 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8602242211.AA13437@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: 12 meg results for 1000x1000 double precision Cc: aoki@sisyphus, carrie@lunar, crao@dgh, cwoo@dgh, dgh@dgh, dlambright@dgh, fetter@cygnus, jharman@dgh, jlc@maple, jricotta@dgh, katy@beluga, kmobley@dgh, mmm@dusk, psager@dgh, rcheng@dgh, weiss@genesis, wu@funshine Status: R Using Sun FPA; you can publish these: norm. resid resid machep x(1)-1 x(n)-1 9.50387011e+00 4.22017976e-12 2.22044605e-16 1.09912079e-13 5.08926234e-13 times are reported for matrices of order 1000 sgefa sgesl total Kflops unit ratio times for array with leading dimension of1001 1443.78 5.16 1448.94 461. 
4.33 25873.93

From JCG%ibm-b.rutherford.ac.uk@cs.ucl.ac.uk Wed Feb 26 04:43:24 1986
Received: from BRL-AOS.ARPA (brl-aos.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA24871; Wed, 26 Feb 86 04:43:18 cst
Received: from ucl-cs.arpa by AOS.BRL.ARPA id a020289; 26 Feb 86 5:39 EST
Received: from ibm-b.rutherford.ac.uk by 44d.Cs.Ucl.AC.UK via Janet with NIFTP id a000432; 26 Feb 86 9:43 GMT
Message-Id: <25 February 1986 16:50:01 GMT JCG@UK.AC.RL.IB>
Date: Tuesday, 25 February 1986 16:50:01 GMT
From: John Gordon ext 6574 (JCG at RLVM370)
Address: User Support Group R27 Rutherford Appleton Lab
To: DONGARRA
Status: R

RESULTS FOR A LINPACKD RUN USING SOURCE FROM NETLIB@ANL-MCS
MACHINE = FUJITSU M-830 (WITH NO HYPERVISOR)
COMPILER = IBM VS OPT=3 REL 1.4.1
SYSTEM = VM/CMS RELEASE 3 HPO LEVEL 32
THE TIMES RETURNED BY FUNCTION SECOND ARE VIRTUAL TIMES. I NOTICED THAT THE OVERHEADS CAUSED BY THE FUNCTION SECOND COULD LEAD TO SIGNIFICANT ERRORS IF THEY WERE LARGE.
REGARDS: ANDREW BANKS.
CENTRAL COMPUTING DIVISION, RUTHERFORD APPLETON LAB, CHILTON, DIDCOT, OXON OX11 0QX ENGLAND.

PLEASE SEND THE RESULTS OF THIS RUN TO:
JACK J. DONGARRA
MATHEMATICS AND COMPUTER SCIENCE DIVISION
ARGONNE NATIONAL LABORATORY
ARGONNE, ILLINOIS 60439
TELEPHONE: 312-972-7246
ARPANET: DONGARRA@ANL-MCS
NORM.
RESID RESID MACHEP X(1) X(N) 2.91923652E+00 1.29549149E-13 2.22044605E-16 1.00000000E+00 1.00000000E+00 - TIMES ARE REPORTED FOR MATRICES OF ORDER 100 DGEFA DGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 1.194E-01 3.937E-03 1.234E-01 5.566E+00 3.593E-01 2.203E+00 1.202E-01 4.080E-03 1.243E-01 5.524E+00 3.620E-01 2.220E+00 1.194E-01 4.003E-03 1.234E-01 5.565E+00 3.594E-01 2.203E+00 1.198E-01 3.900E-03 1.237E-01 5.552E+00 3.602E-01 2.208E+00 0TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 1.194E-01 3.903E-03 1.233E-01 5.568E+00 3.592E-01 2.202E+00 1.188E-01 3.958E-03 1.228E-01 5.593E+00 3.576E-01 2.192E+00 1.193E-01 3.896E-03 1.232E-01 5.574E+00 3.588E-01 2.200E+00 1.193E-01 3.831E-03 1.231E-01 5.579E+00 3.585E-01 2.198E+00 From fae.wu@ames-vmsb.ARPA Tue Mar 4 10:40:43 1986 Received: from ames-vmsb.ARPA (ames-vmsb.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA16659; Tue, 4 Mar 86 10:40:30 cst Message-Id: <8603041640.AA16659@anl-mcs.ARPA> Date: 4 Mar 86 08:23:00 PST From: fae.wu@ames-vmsb.ARPA Subject: To: dongarra@anl-mcs Reply-To: fae.wu@ames-vmsb.ARPA Status: RO LU DECOMPOSITION TIMING SP SIZE OF THE ARRAYS 301 AND ORDER IS 50 UNROLLING DEPTH 1 TIME = 0.103E-01 mflops = 7.9555 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.772E-02 mflops = 10.6352 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.632E-02 mflops = 12.9893 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.580E-02 mflops = 14.1676 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.589E-02 mflops = 13.9461 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 100 UNROLLING DEPTH 1 TIME = 0.485E-01 mflops = 13.6331 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.345E-01 mflops = 19.1772 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.268E-01 mflops = 24.6920 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.236E-01 mflops = 28.0376 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.225E-01 mflops = 29.4309 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 150 UNROLLING DEPTH 1 TIME = 0.127E+00 mflops 
= 17.6700 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.877E-01 mflops = 25.5389 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.681E-01 mflops = 32.8638 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.596E-01 mflops = 37.5585 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.566E-01 mflops = 39.5518 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 200 UNROLLING DEPTH 1 TIME = 0.275E+00 mflops = 19.3239 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.207E+00 mflops = 25.6562 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.156E+00 mflops = 34.1686 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.130E+00 mflops = 40.9673 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.118E+00 mflops = 44.9886 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 250 UNROLLING DEPTH 1 TIME = 0.460E+00 mflops = 22.5728 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.360E+00 mflops = 28.8679 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.266E+00 mflops = 39.1114 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.215E+00 mflops = 48.3334 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.181E+00 mflops = 57.3544 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 300 UNROLLING DEPTH 1 TIME = 0.708E+00 mflops = 25.3668 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.556E+00 mflops = 32.2936 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.394E+00 mflops = 45.5153 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.349E+00 mflops = 51.4380 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.329E+00 mflops = 54.5877 CHECK = 0.100E+01 ------ From fae.wu@ames-vmsb.ARPA Tue Mar 4 10:41:08 1986 Received: from ames-vmsb.ARPA (ames-vmsb.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA16681; Tue, 4 Mar 86 10:40:50 cst Message-Id: <8603041640.AA16681@anl-mcs.ARPA> Date: 4 Mar 86 08:23:00 PST From: fae.wu@ames-vmsb.ARPA Subject: To: dongarra@anl-mcs Reply-To: fae.wu@ames-vmsb.ARPA Status: RO pLEASE SEND THE RESULTS OF THIS RUN TO: jACK j. 
dONGARRA mATHEMATICS AND cOMPUTER sCIENCE dIVISION aRGONNE nATIONAL lABORATORY aRGONNE, iLLINOIS 60439 tELEPHONE: 312-972-7246 arpaNET: dongarra@anl-mcs NORM. RESID RESID MACHEP X(1) X(N) 1.81127134E+00 2.57216470E-12 7.10542736E-15 1.00000000E+00 1.00000000E+00 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 5.190E-02 1.483E-03 5.338E-02 1.286E+01 1.555E-01 9.533E-01 5.051E-02 1.489E-03 5.200E-02 1.321E+01 1.514E-01 9.285E-01 5.056E-02 1.517E-03 5.207E-02 1.319E+01 1.517E-01 9.299E-01 5.284E-02 1.450E-03 5.429E-02 1.265E+01 1.581E-01 9.695E-01 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 5.082E-02 1.489E-03 5.231E-02 1.313E+01 1.523E-01 9.340E-01 5.229E-02 1.507E-03 5.379E-02 1.277E+01 1.567E-01 9.606E-01 5.251E-02 1.490E-03 5.400E-02 1.272E+01 1.573E-01 9.643E-01 5.036E-02 1.499E-03 5.186E-02 1.324E+01 1.510E-01 9.260E-01 ------ From fae.wu@ames-vmsb.ARPA Wed Mar 5 09:22:49 1986 Received: from ames-vmsb.ARPA (ames-vmsb.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA29070; Wed, 5 Mar 86 09:22:42 cst Message-Id: <8603051522.AA29070@anl-mcs.ARPA> Date: 5 Mar 86 07:15:00 PST From: fae.wu@ames-vmsb.ARPA Subject: Re: To: dongarra@anl-mcs.ARPA Reply-To: fae.wu@ames-vmsb.ARPA Status: RO Jack, Sorry, they are for the CRAY-2 running CFT ver 2.63. Pretty disappointing. Slower than a CRAY-1. Alex. ------ From eugene@AMES-NAS.ARPA Fri Mar 7 11:48:09 1986 Received: from ames-nas.ARPA (ames-nas.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02089; Fri, 7 Mar 86 11:48:04 cst Date: Fri, 7 Mar 86 09:52:00 pst From: eugene@AMES-NAS.ARPA (Eugene Miya) Message-Id: <8603071752.AA09206@ames-nas.ARPA> Received: by ames-nas.ARPA; Fri, 7 Mar 86 09:52:00 pst To: dongarra@anl-mcs.ARPA Status: R Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.81127134E+00 2.57216470E-12 7.10542736E-15 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 5.154E-02 1.475E-03 5.301E-02 1.295E+01 1.544E-01 9.467E-01 5.159E-02 1.485E-03 5.308E-02 1.294E+01 1.546E-01 9.478E-01 5.179E-02 1.474E-03 5.326E-02 1.289E+01 1.551E-01 9.511E-01 5.100E-02 1.485E-03 5.248E-02 1.308E+01 1.529E-01 9.372E-01 times for array with leading dimension of 200 5.168E-02 1.485E-03 5.316E-02 1.292E+01 1.548E-01 9.494E-01 5.154E-02 1.501E-03 5.305E-02 1.294E+01 1.545E-01 9.472E-01 5.156E-02 1.476E-03 5.304E-02 1.295E+01 1.545E-01 9.472E-01 5.149E-02 1.501E-03 5.299E-02 1.296E+01 1.543E-01 9.463E-01 From barton@AMES-NAS.ARPA Fri Mar 7 11:54:07 1986 Received: from ames-nas.ARPA (ames-nas.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02154; Fri, 7 Mar 86 11:54:00 cst Date: Fri, 7 Mar 86 09:57:52 pst From: barton@AMES-NAS.ARPA (John Barton) Message-Id: <8603071757.AA09278@ames-nas.ARPA> Received: by ames-nas.ARPA; Fri, 7 Mar 86 09:57:52 pst To: dongarra@anl-mcs.ARPA Subject: Re: timings on the cray-2 Cc: +out@AMES-NAS.ARPA, blaylock@AMES-NAS.ARPA, eugene@AMES-NAS.ARPA, rbailey@AMES-NAS.ARPA, stevens@mercury Status: R Jack, The NAS project is glad to be of help to you in working on the Cray-2. We are quite interested in any timing information which you have. Please let me know exactly what the results are that you would be publicizing, and I will be better able to comment on their distribution. John Barton
From Lewis_W._Kellum%UMich-MTS.MAILNET@MIT-MULTICS.ARPA Wed Mar 12 12:10:21 1986 Received: from MIT-MULTICS.ARPA (mit-multics.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA01691; Wed, 12 Mar 86 12:08:53 cst Received: from UMich-MTS.Mailnet by MIT-MULTICS.ARPA with Mailnet id <2688487467539634@MIT-MULTICS.ARPA>; 12 Mar 1986 13:04:27 est Date: Wed, 12 Mar 86 11:28:26 EST From: Lewis_W._Kellum%UMich-MTS.Mailnet@MIT-MULTICS.ARPA To: DONGARRA@ANL-MCS.ARPA Message-Id: <1183783@UMich-MTS.Mailnet> Status: RO Dr. Dongarra - Here are linpack benchmark results for the Apollo dn3000. There was a problem with the 4th and 8th timing loop in the double precision version which I have since fixed. However the dn3000 was a demo, and left before I could rerun the bench. The other times should be fine. I've included a run on a dn660 for comparison. - Woody Kellum@um-mts.
Lewis_W._Kellum%UMich-MTS.Mailnet%MIT-MULTICS.ARPA%YALE.ARPA%yal@MIT-Multics.ARPA ------------------------------------------------------------------------------ ----------------------------------------------------------------------------- Apollo dn3000, Double precision w/ -cpu 330 opt -------------------------------------------------- Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. resid resid machep x(1) x(n) 1.64502611E+00 7.30025578E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 1.072E+01 3.241E-01 1.104E+01 6.220E-02 3.215E+01 1.971E+02 1.072E+01 3.299E-01 1.105E+01 6.213E-02 3.219E+01 1.973E+02 1.072E+01 3.240E-01 1.105E+01 6.216E-02 3.217E+01 1.973E+02 1.073E+02 3.233E-01 1.076E+02 6.379E-03 3.135E+02 1.922E+03 times for array with leading dimension of 200 1.081E+01 3.242E-01 1.114E+01 6.165E-02 3.244E+01 1.989E+02 1.081E+01 3.242E-01 1.114E+01 6.165E-02 3.244E+01 1.989E+02 1.081E+01 3.247E-01 1.114E+01 6.165E-02 3.244E+01 1.989E+02 1.083E+01 3.244E+00 1.407E+01 4.879E-02 4.099E+01 2.513E+02 Fortran STOP ------------------------------------------------------------------------ ----------------------------------------------------------------------- Apollo dn3000, Single Precision w/ -cpu 330 opt ----------------------------------------------------------------------- Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.54914700E+00 3.69101800E-05 1.19209300E-07 9.99986200E-01 9.99992500E-01 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 9.439E+00 2.852E-01 9.724E+00 7.062E-02 2.832E+01 1.736E+02 9.440E+00 2.850E-01 9.725E+00 7.061E-02 2.833E+01 1.737E+02 9.437E+00 2.850E-01 9.722E+00 7.063E-02 2.832E+01 1.736E+02 9.489E+00 2.847E-01 9.774E+00 7.025E-02 2.847E+01 1.745E+02 times for array with leading dimension of 200 9.436E+00 2.850E-01 9.721E+00 7.064E-02 2.831E+01 1.736E+02 9.437E+00 2.846E-01 9.722E+00 7.063E-02 2.832E+01 1.736E+02 9.435E+00 2.846E-01 9.720E+00 7.065E-02 2.831E+01 1.736E+02 9.448E+00 2.847E-01 9.733E+00 7.055E-02 2.835E+01 1.738E+02 Fortran STOP ------------------------------------------------------------------------- ------------------------------------------------------------------------- Apollo DN660, Double precision,compiled w/ -cpu 660 option ---------------------------------------------------------- $ BENCH/DOUBLE_LINPACK_BENCH_X60 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.62238876E+00 7.19979631E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 9.731E+00 2.980E-01 1.003E+01 6.847E-02 2.921E+01 1.791E+02 9.783E+00 3.012E-01 1.008E+01 6.809E-02 2.937E+01 1.801E+02 9.814E+00 2.983E-01 1.011E+01 6.790E-02 2.945E+01 1.806E+02 9.776E+00 2.983E-01 1.007E+01 6.816E-02 2.934E+01 1.799E+02 times for array with leading dimension of 200 9.754E+00 2.983E-01 1.005E+01 6.831E-02 2.928E+01 1.795E+02 9.736E+00 3.050E-01 1.004E+01 6.839E-02 2.925E+01 1.793E+02 9.776E+00 2.985E-01 1.007E+01 6.816E-02 2.934E+01 1.799E+02 9.775E+00 3.096E-01 1.008E+01 6.809E-02 2.937E+01 1.801E+02 Fortran STOP ------------------------------------------------------------------ From dgh%dgh@SUN.ARPA Fri Apr 4 16:38:10 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02544; Fri, 4 Apr 86 16:38:04 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA28335; Fri, 4 Apr 86 14:35:11 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA08749; Fri, 4 Apr 86 14:34:09 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08920; Fri, 4 Apr 86 14:38:10 PST Date: Fri, 4 Apr 86 14:38:10 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8604042238.AA08920@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: linpack results Status: RO single precision Sun-2/50 3.0 f77 -O -fsoft (contrary to what I said earlier this week, I think the software release number should come first rather than last) norm. 
resid resid machep x(1) x(n) 1.59605336e+00 3.80277633e-05 1.19209290e-07 9.99986172e-01 9.99992490e-01 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 5.408e+01 1.600e+00 5.568e+01 1.233e-02 1.622e+02 9.943e+02 5.344e+01 1.640e+00 5.508e+01 1.247e-02 1.604e+02 9.836e+02 5.486e+01 1.660e+00 5.652e+01 1.215e-02 1.646e+02 1.009e+03 5.344e+01 1.612e+00 5.505e+01 1.247e-02 1.603e+02 9.830e+02 times for array with leading dimension of 200 5.334e+01 1.640e+00 5.498e+01 1.249e-02 1.601e+02 9.818e+02 5.340e+01 1.600e+00 5.500e+01 1.248e-02 1.602e+02 9.821e+02 5.342e+01 1.600e+00 5.502e+01 1.248e-02 1.603e+02 9.825e+02 5.517e+01 1.618e+00 5.678e+01 1.209e-02 1.654e+02 1.014e+03 From dgh%dgh@SUN.ARPA Fri Apr 4 13:30:18 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA28499; Fri, 4 Apr 86 13:28:07 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA27671; Fri, 4 Apr 86 11:24:49 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA07253; Fri, 4 Apr 86 11:22:38 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08618; Fri, 4 Apr 86 11:26:31 PST Date: Fri, 4 Apr 86 11:26:31 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8604041926.AA08618@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: linpack results Status: RO Single precision Sun-2/50 + Sky FFP f77 -O -fsky 3.0 norm. 
resid resid machep x(1) x(n) 1.43082821e+00 3.40938568e-05 1.19209290e-07 9.99960721e-01 9.99971569e-01 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 1.448e+01 4.800e-01 1.496e+01 4.590e-02 4.357e+01 2.671e+02 1.404e+01 4.400e-01 1.448e+01 4.742e-02 4.217e+01 2.586e+02 1.440e+01 5.400e-01 1.494e+01 4.596e-02 4.351e+01 2.668e+02 1.446e+01 4.380e-01 1.489e+01 4.610e-02 4.338e+01 2.660e+02 times for array with leading dimension of 200 1.412e+01 4.600e-01 1.458e+01 4.710e-02 4.247e+01 2.604e+02 1.410e+01 4.200e-01 1.452e+01 4.729e-02 4.229e+01 2.593e+02 1.452e+01 4.200e-01 1.494e+01 4.596e-02 4.351e+01 2.668e+02 1.407e+01 4.300e-01 1.450e+01 4.737e-02 4.222e+01 2.589e+02 From dgh%dgh@SUN.ARPA Fri Apr 4 14:23:00 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA00331; Fri, 4 Apr 86 14:22:49 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA27873; Fri, 4 Apr 86 12:19:47 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA07655; Fri, 4 Apr 86 12:18:52 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08686; Fri, 4 Apr 86 12:22:54 PST Date: Fri, 4 Apr 86 12:22:54 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8604042022.AA08686@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: linpack results Status: RO double precision Sun-2/50 + Sky FFP f77 -O -fsky 3.0 norm. 
resid resid machep x(1) x(n) 1.50887158e+00 6.69603262e-14 2.22044605e-16 1.00000000e+00 1.00000000e+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 2.478e+01 7.600e-01 2.554e+01 2.689e-02 7.439e+01 4.561e+02 2.468e+01 7.400e-01 2.542e+01 2.701e-02 7.404e+01 4.539e+02 2.474e+01 7.600e-01 2.550e+01 2.693e-02 7.427e+01 4.554e+02 2.537e+01 7.460e-01 2.612e+01 2.629e-02 7.607e+01 4.664e+02 times for array with leading dimension of 200 2.466e+01 7.600e-01 2.542e+01 2.701e-02 7.404e+01 4.539e+02 2.466e+01 7.600e-01 2.542e+01 2.701e-02 7.404e+01 4.539e+02 2.466e+01 7.600e-01 2.542e+01 2.701e-02 7.404e+01 4.539e+02 2.491e+01 7.480e-01 2.566e+01 2.676e-02 7.474e+01 4.582e+02 From dgh%dgh@SUN.ARPA Fri Apr 4 15:43:42 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA01754; Fri, 4 Apr 86 15:43:31 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA28134; Fri, 4 Apr 86 13:39:34 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA08292; Fri, 4 Apr 86 13:38:35 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08816; Fri, 4 Apr 86 13:42:36 PST Date: Fri, 4 Apr 86 13:42:36 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8604042142.AA08816@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: linpack results Status: RO double precision Sun-2/50 3.0 f77 -O -fsoft norm. 
           resid           resid          machep            x(1)            x(n)
  1.67117300e+00  7.41628980e-14  2.22044605e-16  1.00000000e+00  1.00000000e+00

times are reported for matrices of order 100
      dgefa      dgesl      total     mflops       unit      ratio
times for array with leading dimension of 201
  1.232e+02  3.700e+00  1.269e+02  5.412e-03  3.696e+02  2.266e+03
  1.260e+02  3.880e+00  1.298e+02  5.289e-03  3.782e+02  2.319e+03
  1.212e+02  3.780e+00  1.250e+02  5.492e-03  3.641e+02  2.232e+03
  1.209e+02  3.622e+00  1.245e+02  5.513e-03  3.628e+02  2.224e+03
times for array with leading dimension of 200
  1.192e+02  3.580e+00  1.228e+02  5.592e-03  3.577e+02  2.193e+03
  1.201e+02  3.560e+00  1.236e+02  5.554e-03  3.601e+02  2.208e+03
  1.188e+02  3.560e+00  1.224e+02  5.612e-03  3.564e+02  2.185e+03
  1.205e+02  3.620e+00  1.241e+02  5.532e-03  3.615e+02  2.217e+03

From unido!ztivax!schnepf@seismo.CSS.GOV Tue Apr 8 03:22:42 1986
Received: from seismo.CSS.GOV (seismo.css.gov.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA07950; Tue, 8 Apr 86 03:21:58 cst
Received: from unido.UUCP by seismo.CSS.GOV with UUCP; Tue, 8 Apr 86 03:28:43 EST
Received: by unido.uucp with uucp; Tue, 8 Apr 86 09:58:29 -0200
Return-Path:
Received: by ztivax.LOCAL (4.12/4.8)
        id AA14752; Tue, 8 Apr 86 08:44:55 -0100 (MET)
Date: Tue, 8 Apr 86 08:44:55 -0100
From: unido!ztivax!schnepf@seismo.CSS.GOV (Eric Schnepf)
Posted-Date: Tue, 8 Apr 86 08:44:55 -0100
Message-Id: <8604080744.AA14752@ztivax.LOCAL>
To: dongarra@anl-mcs.arpa
Status: RO

Dear Jack,

Below I'm sending you the results I obtained on the VP 100 and VP 50 for
the LINPACK benchmarks (100 equations and 300 equations). Additionally,
the result for 1000 equations on a VP 200 (peak performance) is appended.

I hope this information is satisfactory for your performance paper. If
you need any further information, please let me know.

Regards,
Eric Schnepf

Siemens AG, Vector Processor Systems
Munich, Germany
Arpanet: Na.schnepf at su-score

SOLVING A SYSTEM OF 100 LINEAR EQUATIONS WITH LINPACK IN FULL PRECISION
(FORTRAN77/VP, ROLLED BLAS):

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
     NORM. RESID           RESID          MACHEP            X(1)            X(N)
  2.65186135E+00  1.17683641E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
  4.201E-02  1.693E-03  4.370E-02  1.571E+01  1.273E-01  7.803E-01
  4.198E-02  1.693E-03  4.367E-02  1.572E+01  1.272E-01  7.799E-01
  4.195E-02  1.693E-03  4.365E-02  1.573E+01  1.271E-01  7.794E-01
  4.197E-02  1.576E-03  4.355E-02  1.577E+01  1.268E-01  7.776E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  4.203E-02  1.667E-03  4.370E-02  1.571E+01  1.273E-01  7.803E-01
  4.203E-02  1.667E-03  4.370E-02  1.571E+01  1.273E-01  7.803E-01
  4.203E-02  1.667E-03  4.370E-02  1.571E+01  1.273E-01  7.803E-01
  4.204E-02  1.576E-03  4.361E-02  1.574E+01  1.270E-01  7.788E-01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
     NORM. RESID           RESID          MACHEP            X(1)            X(N)
  2.65186135E+00  1.17683641E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
  4.883E-02  1.927E-03  5.076E-02  1.353E+01  1.478E-01  9.063E-01
  4.878E-02  1.901E-03  5.068E-02  1.355E+01  1.476E-01  9.049E-01
  4.878E-02  1.901E-03  5.068E-02  1.355E+01  1.476E-01  9.049E-01
  4.879E-02  1.797E-03  5.058E-02  1.357E+01  1.473E-01  9.033E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  4.875E-02  1.901E-03  5.065E-02  1.356E+01  1.475E-01  9.045E-01
  4.875E-02  1.901E-03  5.065E-02  1.356E+01  1.475E-01  9.045E-01
  4.875E-02  1.901E-03  5.065E-02  1.356E+01  1.475E-01  9.045E-01
  4.883E-02  1.805E-03  5.064E-02  1.356E+01  1.475E-01  9.042E-01

-----------------------------------------------------------------------

SOLVING A SYSTEM OF 100 LINEAR EQUATIONS WITH LINPACK IN FULL PRECISION
(FORTRAN77/VP, ROLLED BLAS WITH COMPILER DIRECTIVES):

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
           NORM.
           RESID           RESID          MACHEP            X(1)            X(N)
  2.65186135E+00  1.17683641E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
  3.924E-02  1.589E-03  4.083E-02  1.682E+01  1.189E-01  7.292E-01
  3.924E-02  1.562E-03  4.081E-02  1.683E+01  1.189E-01  7.287E-01
  3.922E-02  1.562E-03  4.078E-02  1.684E+01  1.188E-01  7.282E-01
  3.922E-02  1.461E-03  4.068E-02  1.688E+01  1.185E-01  7.265E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  3.932E-02  1.562E-03  4.089E-02  1.679E+01  1.191E-01  7.301E-01
  3.927E-02  1.536E-03  4.081E-02  1.683E+01  1.189E-01  7.287E-01
  3.924E-02  1.536E-03  4.078E-02  1.684E+01  1.188E-01  7.282E-01
  3.928E-02  1.453E-03  4.073E-02  1.686E+01  1.186E-01  7.273E-01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
     NORM. RESID           RESID          MACHEP            X(1)            X(N)
  2.65186135E+00  1.17683641E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
  4.237E-02  1.719E-03  4.409E-02  1.557E+01  1.284E-01  7.873E-01
  4.237E-02  1.693E-03  4.406E-02  1.558E+01  1.283E-01  7.868E-01
  4.240E-02  1.667E-03  4.406E-02  1.558E+01  1.283E-01  7.868E-01
  4.239E-02  1.581E-03  4.397E-02  1.562E+01  1.281E-01  7.852E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  4.258E-02  1.719E-03  4.430E-02  1.550E+01  1.290E-01  7.910E-01
  4.255E-02  1.693E-03  4.424E-02  1.552E+01  1.289E-01  7.901E-01
  4.253E-02  1.667E-03  4.419E-02  1.554E+01  1.287E-01  7.892E-01
  4.254E-02  1.581E-03  4.412E-02  1.556E+01  1.285E-01  7.879E-01

-----------------------------------------------------------------------

SOLVING A SYSTEM OF 100 LINEAR EQUATIONS WITH LINPACK IN HALF PRECISION
(FORTRAN77/VP, ROLLED BLAS):

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
           NORM.
           RESID           RESID          MACHEP            X(1)            X(N)
  3.68801689E+00  7.03105237E-04  9.53674316E-07  9.99935091E-01  9.99949932E-01

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 202
  4.029E-02  1.615E-03  4.190E-02  1.639E+01  1.220E-01  7.482E-01
  4.013E-02  1.615E-03  4.174E-02  1.645E+01  1.216E-01  7.454E-01
  4.013E-02  1.615E-03  4.174E-02  1.645E+01  1.216E-01  7.454E-01
  4.013E-02  1.508E-03  4.164E-02  1.649E+01  1.213E-01  7.435E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  4.016E-02  1.589E-03  4.174E-02  1.645E+01  1.216E-01  7.454E-01
  4.013E-02  1.615E-03  4.174E-02  1.645E+01  1.216E-01  7.454E-01
  4.013E-02  1.615E-03  4.174E-02  1.645E+01  1.216E-01  7.454E-01
  4.012E-02  1.505E-03  4.163E-02  1.650E+01  1.212E-01  7.433E-01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
     NORM. RESID           RESID          MACHEP            X(1)            X(N)
  3.68801689E+00  7.03105237E-04  9.53674316E-07  9.99935091E-01  9.99949932E-01

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
      SGEFA      SGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 202
  4.758E-02  1.875E-03  4.945E-02  1.389E+01  1.440E-01  8.831E-01
  4.747E-02  1.823E-03  4.930E-02  1.393E+01  1.436E-01  8.803E-01
  4.750E-02  1.823E-03  4.932E-02  1.392E+01  1.437E-01  8.808E-01
  4.749E-02  1.737E-03  4.923E-02  1.395E+01  1.434E-01  8.790E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
  4.753E-02  1.823E-03  4.935E-02  1.391E+01  1.437E-01  8.812E-01
  4.753E-02  1.823E-03  4.935E-02  1.391E+01  1.437E-01  8.812E-01
  4.755E-02  1.849E-03  4.940E-02  1.390E+01  1.439E-01  8.822E-01
  4.754E-02  1.740E-03  4.928E-02  1.393E+01  1.435E-01  8.800E-01

-------------------------------------------------------------------

SOLVING A SYSTEM OF 300 LINEAR EQUATIONS USING THE VECTOR UNROLLING
TECHNIQUE: (PROGRAM LUD)

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
  UNROLLING DEPTH  1  TIME = 0.471E-02  MFLOPS =  17.4232  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.430E-02  MFLOPS =  19.1127  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.464E-02  MFLOPS =  17.7168  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.490E-02  MFLOPS =  16.7745  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.531E-02  MFLOPS =  15.4588  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
  UNROLLING DEPTH  1  TIME = 0.163E-01  MFLOPS =  40.7230  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.142E-01  MFLOPS =  46.4556  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.152E-01  MFLOPS =  43.5871  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.158E-01  MFLOPS =  41.8636  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.170E-01  MFLOPS =  38.8550  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
  UNROLLING DEPTH  1  TIME = 0.367E-01  MFLOPS =  61.0168  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.303E-01  MFLOPS =  73.9868  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.323E-01  MFLOPS =  69.3889  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.333E-01  MFLOPS =  67.2714  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.359E-01  MFLOPS =  62.3896  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
  UNROLLING DEPTH  1  TIME = 0.686E-01  MFLOPS =  77.4339  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.533E-01  MFLOPS =  99.7255  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.564E-01  MFLOPS =  94.2876  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.587E-01  MFLOPS =  90.5630  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.627E-01  MFLOPS =  84.6984  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
  UNROLLING DEPTH  1  TIME = 0.116E+00  MFLOPS =  89.1982  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.850E-01  MFLOPS = 122.1454  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.897E-01  MFLOPS = 115.7636  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.933E-01  MFLOPS = 111.3053  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.100E+00  MFLOPS = 103.7757  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
  UNROLLING DEPTH  1  TIME = 0.183E+00  MFLOPS =  97.9239  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.129E+00  MFLOPS = 139.3180  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.134E+00  MFLOPS = 133.5431  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.140E+00  MFLOPS = 127.8239  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.151E+00  MFLOPS = 119.2251  CHECK = 0.100E+01

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
(WITH COMPILER DIRECTIVES)
--------------------------
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
  UNROLLING DEPTH  1  TIME = 0.385E-02  MFLOPS =  21.3081  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.346E-02  MFLOPS =  23.7113  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.388E-02  MFLOPS =  21.1651  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.422E-02  MFLOPS =  19.4667  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.471E-02  MFLOPS =  17.4232  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
  UNROLLING DEPTH  1  TIME = 0.135E-01  MFLOPS =  48.9617  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.116E-01  MFLOPS =  56.8481  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.129E-01  MFLOPS =  51.1291  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.140E-01  MFLOPS =  47.3206  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.157E-01  MFLOPS =  42.2815  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
  UNROLLING DEPTH  1  TIME = 0.317E-01  MFLOPS =  70.5273  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.251E-01  MFLOPS =  89.2760  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.280E-01  MFLOPS =  79.9005  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.299E-01  MFLOPS =  74.7589  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.337E-01  MFLOPS =  66.4910  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
  UNROLLING DEPTH  1  TIME = 0.617E-01  MFLOPS =  86.0559  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.445E-01  MFLOPS = 119.3207  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.498E-01  MFLOPS = 106.6589  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.530E-01  MFLOPS = 100.2153  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.597E-01  MFLOPS =  88.9831  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
  UNROLLING DEPTH  1  TIME = 0.107E+00  MFLOPS =  96.9865  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.732E-01  MFLOPS = 141.9740  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.804E-01  MFLOPS = 129.2319  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.861E-01  MFLOPS = 120.5592  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.960E-01  MFLOPS = 108.2247  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
  UNROLLING DEPTH  1  TIME = 0.172E+00  MFLOPS = 104.4982  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.113E+00  MFLOPS = 159.3439  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.122E+00  MFLOPS = 146.9791  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.131E+00  MFLOPS = 137.4838  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.145E+00  MFLOPS = 124.0296  CHECK = 0.100E+01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
  UNROLLING DEPTH  1  TIME = 0.549E-02  MFLOPS =  14.9460  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.529E-02  MFLOPS =  15.5350  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.547E-02  MFLOPS =  15.0171  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.581E-02  MFLOPS =  14.1417  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.620E-02  MFLOPS =  13.2504  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
  UNROLLING DEPTH  1  TIME = 0.207E-01  MFLOPS =  31.9236  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.189E-01  MFLOPS =  35.0017  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.191E-01  MFLOPS =  34.6673  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.199E-01  MFLOPS =  33.1739  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.208E-01  MFLOPS =  31.8836  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
  UNROLLING DEPTH  1  TIME = 0.514E-01  MFLOPS =  43.5746  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.437E-01  MFLOPS =  51.1743  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.432E-01  MFLOPS =  51.8846  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.448E-01  MFLOPS =  49.9261  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.469E-01  MFLOPS =  47.7626  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
  UNROLLING DEPTH  1  TIME = 0.103E+00  MFLOPS =  51.6030  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.827E-01  MFLOPS =  64.2236  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.813E-01  MFLOPS =  65.3760  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.833E-01  MFLOPS =  63.8016  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.867E-01  MFLOPS =  61.2729  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
  UNROLLING DEPTH  1  TIME = 0.182E+00  MFLOPS =  57.1933  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.142E+00  MFLOPS =  73.0684  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.137E+00  MFLOPS =  76.0069  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.140E+00  MFLOPS =  74.3906  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.145E+00  MFLOPS =  71.7668  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
  UNROLLING DEPTH  1  TIME = 0.294E+00  MFLOPS =  61.1622  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.225E+00  MFLOPS =  79.7273  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.215E+00  MFLOPS =  83.6649  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.218E+00  MFLOPS =  82.3360  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.225E+00  MFLOPS =  79.6352  CHECK = 0.100E+01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
(WITH COMPILER DIRECTIVES)
--------------------------
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
  UNROLLING DEPTH  1  TIME = 0.445E-02  MFLOPS =  18.4421  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.409E-02  MFLOPS =  20.0866  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.453E-02  MFLOPS =  18.1241  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.495E-02  MFLOPS =  16.5979  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.549E-02  MFLOPS =  14.9460  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
  UNROLLING DEPTH  1  TIME = 0.170E-01  MFLOPS =  39.0341  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.147E-01  MFLOPS =  45.1354  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.162E-01  MFLOPS =  40.7886  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.176E-01  MFLOPS =  37.7021  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.193E-01  MFLOPS =  34.2469  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
  UNROLLING DEPTH  1  TIME = 0.424E-01  MFLOPS =  52.7764  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.341E-01  MFLOPS =  65.7284  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.376E-01  MFLOPS =  59.4967  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.398E-01  MFLOPS =  56.1914  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.442E-01  MFLOPS =  50.6020  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
  UNROLLING DEPTH  1  TIME = 0.863E-01  MFLOPS =  61.5872  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.653E-01  MFLOPS =  81.3874  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.705E-01  MFLOPS =  75.3187  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.760E-01  MFLOPS =  69.9480  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.828E-01  MFLOPS =  64.1631  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
  UNROLLING DEPTH  1  TIME = 0.154E+00  MFLOPS =  67.6405  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.113E+00  MFLOPS =  92.2313  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.120E+00  MFLOPS =  86.2099  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.128E+00  MFLOPS =  80.8286  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.140E+00  MFLOPS =  74.3764  CHECK = 0.100E+01
DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
  UNROLLING DEPTH  1  TIME = 0.250E+00  MFLOPS =  71.7312  CHECK = 0.100E+01
  UNROLLING DEPTH  2  TIME = 0.180E+00  MFLOPS =  99.6071  CHECK = 0.100E+01
  UNROLLING DEPTH  4  TIME = 0.191E+00  MFLOPS =  94.2432  CHECK = 0.100E+01
  UNROLLING DEPTH  8  TIME = 0.202E+00  MFLOPS =  89.0342  CHECK = 0.100E+01
  UNROLLING DEPTH 16  TIME = 0.219E+00  MFLOPS =  81.9932  CHECK = 0.100E+01

-------------------------------------------------------------------

TOWARD PEAK PERFORMANCE (1000 LINEAR EQUATIONS):
(OPTIMIZED SUBROUTINE)

TIMINGS FOR FUJITSU VP 200 (15 NSEC. CYCLE TIME):
                    ------
     NORM. RESID           RESID          MACHEP            X(1)            X(N)
  1.39417418E+02  6.19081175E-11  2.22044659E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 1000
     FACTOR      SOLVE      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 1001
  1.568E+00  1.557E-02  1.584E+00  4.222E+02  4.737E-03  2.828E+01

From GREENFIELD@MARLBORO.DEC.COM Tue Apr 8 16:30:09 1986
Received: from MARLBORO.DEC.COM (marlboro.dec.com.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA19266; Tue, 8 Apr 86 16:29:53 cst
Date: 8 Apr 1986 1726-EST
From: GREENFIELD@MARLBORO.DEC.COM
To: dongarra@anl-mcs, greenfield@dvinci
Subject: ["Mike Greenfield (mr3-1/e13,dtn297-7481)" : Linpack Results for Jack Dongarra]
Message-Id: <"MS11(5146)+GLXLIB0(4)-4" 12197281171.16.279.33826 at MARLBORO.DEC.COM>
Status: RO

Not sure if the mail system took the last message - try again....

- - - - - - - Begin message from: "Mike Greenfield (mr3-1/e13,dtn297-7481)"
Sender: GREENFIELD@DVINCI
Date: 8 Apr 1986 1716-EST
From: "Mike Greenfield (mr3-1/e13,dtn297-7481)"
To: GREENFIELD@MARKET
Subject: Linpack Results for Jack Dongarra
Mailed to: MARKET::GREENFIELD

Jack;

Here are the latest VAX cpu results, including for the new VAX 8500.
Note that since the VAX 8300 and 8800 can only give up to 100% of a
processor to a process, there is no benefit for this benchmark in the
single stream case.
If you send me your driver program (mentioned in the appendix), we will
see what the multi-thread version does.

regards,
Mike

(Mflops are correct - other ratios have a new formula, so they may need
to be recomputed)

Solving a system of linear equations with LINPACK in full precision.

Computer      Compiler             Ratio  MFLOPS  Time   Unit
                                                  secs   secs
VAX 8800(UP)  VMS v4 (coded BLAS)  11     1.13    .606   1.76
VAX 8800(UP)  VMS v4               13     .970    .708   2.06
VAX 8650      VMS v4 (coded BLAS)  13     .96     .715   2.08
VAX 8500      VMS v4 (coded BLAS)  16     .763    .900   2.62
VAX 8650      VMS v4               17     .70     .975   2.84
VAX 8600      VMS v4 (coded BLAS)  19     .66     1.04   3.03
VAX 8500      VMS v4               19     .652    1.05   3.07
VAX 8600      VMS v4               25     .49     1.41   4.11
VAX 785       VMS v4 (coded BLAS)  54     .225    3.01   8.77
VAX 785       VMS v4               63     .196    3.50   10.2
VAX 8200      VMS v4 (coded BLAS)  68     .180    3.81   11.1
VAX 780       VMS v4 (coded BLAS)  74     .166    4.12   12.0
uVAX II       VMS v4 (coded BLAS)  79     .156    4.40   12.8
VAX 8200      VMS v4               81     .151    4.54   13.2
VAX 780       VMS v4               89     .138    4.96   14.4
uVAX II       VMS v4               97     .126    5.45   15.9

Note: 88UP is VAX 8800 using only a single CPU
      8300 is same as 8200 since only one cpu is used

======================================================================

Solving a System of Linear Equations with LINPACK in Single Precision.
Computer      Compiler             Ratio  MFLOPS  Time   Unit
                                                  secs   secs
VAX 8650      VMS v4 (coded BLAS)  6.4    1.9     .361   1.05
VAX 88UP      VMS v4 (coded BLAS)  7.4    1.65    .416   1.21
VAX 88UP      VMS v4               9.1    1.35    .509   1.48
VAX 8650      VMS v4               9.7    1.3     .545   1.59
VAX 8600      VMS v4 (coded BLAS)  9.8    1.3     .546   1.59
VAX 8500      VMS v4 (coded BLAS)  13     .958    .717   2.09
VAX 8600      VMS v4               14     .88     .780   2.27
VAX 8500      VMS v4               15     .800    .859   2.50
VAX 785       VMS v4 (coded BLAS)  24     .511    1.34   3.91
VAX 785       VMS v4               31     .398    1.72   5.02
VAX 780       VMS v4 (coded BLAS)  36     .339    2.02   5.88
VAX 8200      VMS v4 (coded BLAS)  40     .307    2.23   6.51
VAX 780       VMS v4               49     .250    2.74   7.98
VAX 8200      VMS v4               54     .227    3.03   8.82
uVAX II       VMS v4 (coded BLAS)  54     .227    3.04   8.81
uVAX II       VMS v4               70     .174    3.95   11.5

Note: 88UP is VAX 8800 using only a single CPU
      8300 is same as 8200 since only one cpu is used

Posted: Tue 8-Apr-1986 15:36 Eastern Standard Time
To: RHEA::DECWRL::"""dongarra@anl-mcs"""
   --------
- - - - - - - End forwarded message
   --------
.
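
The message above notes that the MFLOPS figures are correct but that the
other derived columns may need to be recomputed. A minimal sketch of that
recomputation in Python follows. It rests on assumptions that are not
stated anywhere in these messages: a nominal operation count of
2n^3/3 + 2n^2 for factoring and solving a system of order n, a "unit"
column equal to 2/MFLOPS, and a "ratio" column equal to the total time
divided by a fixed reference time of 0.056 seconds. The function and
constant names are hypothetical. These assumptions do reproduce the first
row of the Sun-2/50 + Sky FFP listing earlier in this file.

```python
# Assumed reference time (seconds) that the "ratio" column is normalized by.
# This value is an assumption; it is not given in the messages themselves.
REFERENCE_SECONDS = 0.056


def linpack_ops(n):
    """Nominal flop count for factor + solve of order n (assumed formula)."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def derived_columns(total_seconds, n=100):
    """Return (mflops, unit, ratio) recomputed from a measured total time."""
    mflops = linpack_ops(n) / (1.0e6 * total_seconds)
    unit = 2.0 / mflops                      # assumed definition of "unit"
    ratio = total_seconds / REFERENCE_SECONDS
    return mflops, unit, ratio


if __name__ == "__main__":
    # First row of the Sun-2/50 + Sky FFP listing: total = 2.554e+01 seconds.
    mflops, unit, ratio = derived_columns(25.54)
    # The listing reports 2.689e-02, 7.439e+01 and 4.561e+02 for this row.
    print(f"{mflops:.3e} {unit:.3e} {ratio:.3e}")
```

Calling derived_columns(1.584, n=1000) on the VP 200 row gives values
close to the reported MFLOPS and ratio; small differences are expected
because the listed total time is rounded to four digits.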