From messina Tue Oct 29 08:57:33 1985
Received: by anl-mcs.ARPA (4.12/4.9)
        id AA13772; Tue, 29 Oct 85 08:57:23 cst
Date: Tue, 29 Oct 85 08:57:23 cst
From: messina (Paul Messina)
Message-Id: <8510291457.AA13772@anl-mcs.ARPA>
To: supers
Subject: NBS benchmarking
Status: R

Of possible interest:

>From welch@ames-vmsb.ARPA Sat Sep 14 13:06:05 1985
Received: from su-aimvax.arpa (su-aimvax.arpa.ARPA) by lbl-csam.ARPA ;
        Sat, 14 Sep 85 13:06:05 pdt
Message-Id: <8509142006.AA18366@lbl-csam.ARPA>
Received: from ames-vmsb.ARPA by su-aimvax.arpa with TCP;
        Sat, 14 Sep 85 13:05:46 pdt
Date: 14 Sep 85 12:53:00 PST
From: welch@ames-vmsb.ARPA
Subject: SIGBIG
To: bayboards@diablo.ARPA
Reply-To: welch@ames-vmsb.ARPA
Reply-To: "DYMOND, KEN"

NBS PARALLEL COMPUTER BENCHMARK COLLECTION

The National Bureau of Standards, since its founding, has been
concerned with measurement, determining the precise values and
metrics for physical phenomena. The NBS has also made significant
contributions to metrology in numerous scientific and engineering
disciplines. In this tradition, the MPC (Measurement for Parallel
Computing) project at NBS is developing a set of metrics and
measurement techniques to characterize the performance of parallel
processing systems. As part of that effort, NBS is collecting
benchmarks and code kernels that represent a variety of applications
which are candidates for parallel processing.

NBS solicits benchmark codes and kernels from researchers and
scientists. Programs which are computationally intensive, I/O
intensive, vectorizable or not and from non-numeric as well as from
numeric application areas are requested. Especially welcome are
programs which have been used to produce timing or speedup data on
parallel computers, whose measurement results have been or may be
published in the technical literature, and which are in some fairly
widely used and higher-level programming language such as FORTRAN,
"C", LISP, Ada, etc.
Contributions or inquiries should be directed to:

    Measurement for Parallel Computing
    Institute for Computer Sciences and Technology
    Materials Building MS B364
    National Bureau of Standards
    Gaithersburg, MD 20899 USA

    Telephone: 301-921-3274
    ARPANET: MEASURE@NBS-VMS

From gustav.yktvmt%ibm-sj.csnet@CSNET-RELAY.ARPA Thu Nov 7 15:17:12 1985
Received: from CSNET-RELAY.ARPA (csnet-relay.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA02489; Thu, 7 Nov 85 15:16:51 cst
Message-Id: <8511072116.AA02489@anl-mcs.ARPA>
Received: from ibm-sj by csnet-relay.csnet id ac01270; 7 Nov 85 16:07 EST
Date: Thu, 7 Nov 85 15:01:48 EST
From: "Fred G. Gustavson"
To: dongarra@anl-mcs.ARPA
Subject: reply of 11/7/85 morning phone call
Status: R

Jack:

Here is the info you want.

Routine    Time(ms)   MFLOPS
DGEF       10258261       65
DGES          44364       45
SGEF       10317723       65
SGES          43425       46
DPPF        5384767       62
DPPS          51997       38
SPPF        5158187       65
SPPS          54770       36

Best regards,
Fred

From VMAA5%KSUVM.BITNET@WISCVM.ARPA Thu Nov 7 14:55:40 1985
Received: from WISCVM.ARPA (wiscvm.wisc.edu.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA02055; Thu, 7 Nov 85 14:55:23 cst
Message-Id: <8511072055.AA02055@anl-mcs.ARPA>
Received: from (MAILER)KSUVM.BITNET by WISCVM.ARPA on 11/07/85 at 14:54:33 CST
Return-Path: VMAA5%KSUVM.BITNET@WISCVM.ARPA
Received: by KSUVM (Mailer X1.20) id 6472; Thu, 07 Nov 85 14:37:38 CST
Date: Thu, 7 Nov 85 14:29 CST
From: Neil Erdwien
Subject: LINPACK Benchmarks
To: Jack J. Dongarra
Status: RO

I saw your benchmark programs using LINPACK available from NETLIB, so
tried them on Kansas State University's National Advanced System's
6630. Listings from the runs follow:

"Full" precision -- all VS FORTRAN 1.4.0 opt(3)

PLEASE SEND THE RESULTS OF THIS RUN TO:
JACK J. DONGARRA
MATHEMATICS AND COMPUTER SCIENCE DIVISION
ARGONNE NATIONAL LABORATORY
ARGONNE, ILLINOIS 60439
TELEPHONE: 312-972-7246
ARPANET: DONGARRA@ANL-MCS

NORM. RESID     RESID           MACHEP          X(1)            X(N)
2.91923652E+00  1.29549149E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
DGEFA      DGESL      TOTAL      MFLOPS     UNIT       RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
3.150E+00  1.000E-01  3.250E+00  2.113E-01  9.466E+00  5.804E+01
3.130E+00  1.000E-01  3.230E+00  2.126E-01  9.408E+00  5.768E+01
3.160E+00  1.000E-01  3.260E+00  2.106E-01  9.495E+00  5.821E+01
3.151E+00  9.700E-02  3.248E+00  2.114E-01  9.460E+00  5.800E+01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
3.150E+00  9.000E-02  3.240E+00  2.119E-01  9.437E+00  5.786E+01
3.130E+00  9.999E-02  3.230E+00  2.126E-01  9.408E+00  5.768E+01
3.140E+00  9.999E-02  3.240E+00  2.119E-01  9.437E+00  5.786E+01
3.159E+00  9.800E-02  3.257E+00  2.108E-01  9.486E+00  5.816E+01
R; T=91.13/92.05 23:37:10

"Half" precision -- all VS FORTRAN 1.4.0 opt(3)

PLEASE SEND THE RESULTS OF THIS RUN TO:
JACK J. DONGARRA
MATHEMATICS AND COMPUTER SCIENCE DIVISION
ARGONNE NATIONAL LABORATORY
ARGONNE, ILLINOIS 60439
TELEPHONE: 312-972-7246
ARPANET: DONGARRA@ANL-MCS

NORM. RESID     RESID           MACHEP          X(1)            X(N)
3.97839260E+00  7.59065151E-04  9.53674316E-07  9.99704897E-01  9.99731898E-01

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA      SGESL      TOTAL      MFLOPS     UNIT       RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
1.990E+00  6.000E-02  2.050E+00  3.350E-01  5.971E+00  3.661E+01
2.000E+00  6.000E-02  2.060E+00  3.333E-01  6.000E+00  3.679E+01
2.000E+00  7.000E-02  2.070E+00  3.317E-01  6.029E+00  3.696E+01
2.015E+00  6.100E-02  2.076E+00  3.308E-01  6.047E+00  3.707E+01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
2.000E+00  6.000E-02  2.060E+00  3.333E-01  6.000E+00  3.679E+01
2.000E+00  6.001E-02  2.060E+00  3.333E-01  6.000E+00  3.679E+01
1.990E+00  6.999E-02  2.060E+00  3.333E-01  6.000E+00  3.679E+01
2.017E+00  6.400E-02  2.081E+00  3.300E-01  6.061E+00  3.716E+01
R; T=60.64/61.70 23:40:09

The use of the assembler IBM BLAS routines did not speed up the times;
the above results are without the "coded" BLAS routines.
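
[Editorial note, not part of the original message.] The MFLOPS, UNIT
and RATIO columns in listings like the ones above appear to be derived
from the TOTAL time alone. The sketch below, in Python, reproduces the
first full-precision row. It assumes an operation count of
2/3 n^3 + 2 n^2 for a matrix of order n, UNIT = 2/MFLOPS, and a
reference time of 0.056 seconds for RATIO. These values are inferred
from the numbers in this archive rather than stated in it, and the
function name linpack_columns is hypothetical.

```python
# Sketch: recompute the derived LINPACK columns from a measured TOTAL time.
# Assumptions (inferred from the listings, not stated in the messages):
#   - operation count for order n is 2/3*n**3 + 2*n**2
#   - UNIT is 2 / MFLOPS
#   - RATIO is TOTAL divided by a reference time of 0.056 seconds


def linpack_columns(total_seconds, n=100, reference_seconds=0.056):
    """Return (mflops, unit, ratio) for one timing row."""
    ops = (2.0 * n**3) / 3.0 + 2.0 * n**2      # floating-point operations
    mflops = ops / (1.0e6 * total_seconds)     # millions of operations per second
    unit = 2.0 / mflops                        # UNIT column
    ratio = total_seconds / reference_seconds  # RATIO column
    return mflops, unit, ratio


# First full-precision row of the listing: TOTAL = 3.250E+00 seconds.
mflops, unit, ratio = linpack_columns(3.250)
print(f"{mflops:.3E} {unit:.3E} {ratio:.3E}")  # prints "2.113E-01 9.466E+00 5.804E+01"
```

The same arithmetic matches the half-precision listing: a TOTAL of
2.050 seconds gives about 0.335 MFLOPS, as reported there.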
By the way, I think the NETLIB service is great -- a wonderful and
useful idea.

From AG2%CORNELLA.BITNET@ucbvax.berkeley.edu Mon Nov 11 08:05:09 1985
Received: from ucbvax.berkeley.edu (ucbvax.berkeley.edu.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA28614; Mon, 11 Nov 85 08:05:03 cst
Received: by ucbvax.berkeley.edu (5.31/1.2)
        id AA03952; Mon, 11 Nov 85 06:02:48 PST
Received: from CORNELLA by ucbjade.Berkeley.Edu (4.19/4.40.2)
        id AA25387; Mon, 11 Nov 85 06:04:32 pst
Message-Id: <8511111404.AA25387@ucbjade.Berkeley.Edu>
Date: 11 November 85 09:02 EST
From: AG2%CORNELLA.BITNET@ucbvax.berkeley.edu
Subject: Small Benchmark
To: DONGARRA@anl-mcs.arpa
Status: R

Jack: a question on a previous topic. The performance guys in Endicott
asked me to reconfirm a detail. Is it verboten for them to change the
dimension statements (adding a dummy array) in the main calling routine
for the LINPACK benchmarks - without touching SGEFA and SGESL at all
(and letting you see the modified source)? Please understand that I
have no personal stake in this - and that no-one is petitioning a
change in your standard procedure - but they want to understand
clearly what is kosher and what isn't.

Regards,
Alec

From sun!sunmark!jricotta@ucbvax.berkeley.edu Sat Nov 23 18:26:27 1985
Received: from ucbvax.berkeley.edu (ucbvax.berkeley.edu.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA26785; Sat, 23 Nov 85 18:26:21 cst
Received: by ucbvax.berkeley.edu (5.31/1.7)
        id AA08928; Thu, 21 Nov 85 19:29:54 PST
Received: from snail.sun.uucp by sun.uucp (3.0DEV4/SMI-2.0)
        id AA18847; Wed, 20 Nov 85 15:35:18 PST
Received: from sunmark.sun.uucp by snail.sun.uucp (3.0ALPHA/SMI-3.0DEV4)
        id AA15624; Wed, 20 Nov 85 15:35:44 PST
Return-Path:
Received: by sunmark.sun.uucp (2.0/SMI-2.0)
        id AA18528; Wed, 20 Nov 85 15:31:46 pst
Date: Wed, 20 Nov 85 15:31:46 pst
From: sun!sunmark!jricotta@ucbvax.berkeley.edu (Jim Ricotta)
Message-Id: <8511202331.AA18528@sunmark.sun.uucp>
To: DONGARRA@anl-mcs.arpa
Subject: Linpack Question
Status: R

Mr Dongarra,

Do you have any Linpack results for some of the recently announced
Masscomp systems? I am particularly interested in the MC5400 base
system (68020 + 68881), and the 5400 plus "Lightning" FPA. All the
performance claims they make for floating point are in KWhets, and
this makes me somewhat suspicious. I'd like to find out how they
actually stack up in the Linpack benchmark.

Their new FPA ("Lightning") is based on the Weitek 1164/65 chips, and
they are rating it at ">3 MWhets". The FP-501 is rated at 924 KWhets.
I know the Whetstone benchmark is algorithm dependent, and not well
standardized, so I think Linpack will provide a better basis for
comparison.

Any information you could offer would be greatly appreciated.

Thanks,
Jim Ricotta

From greenfield%dvinci.DEC@decwrl.DEC.COM Mon Dec 9 19:15:20 1985
Received: from decwrl.DEC.COM (decwrl.dec.com.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA00191; Mon, 9 Dec 85 19:15:10 cst
Received: from DEC-RHEA.ARPA (dec-rhea) by decwrl.DEC.COM (4.22.01/4.7.34)
        id AA08779; Mon, 9 Dec 85 16:54:44 pst
Message-Id: <8512100054.AA08779@decwrl.DEC.COM>
Date: Monday, 9 Dec 1985 16:49:09-PST
From: greenfield%dvinci.DEC@decwrl.DEC.COM (Mike Greenfield mr3-1/e13,dtn297-7481 )
To: dongarra@anl-mcs
Subject: revise linpack results for 8600 and 8650 cpus
Status: RO

Jack;

A few little changes to the 8600/blas results and some new 8650
numbers. I think that these are in a DECUS presentation that will be
given this week.

regards,
-Mike

(Mflops are correct - other ratios have a new formula, so they need to
be recomputed)

Solving a system of linear equations with LINPACK in full precision.

Computer      Compiler            Ratio   MFLOPS   Time   Unit
                                                   secs   secs
VAX 8650      VMS (coded BLAS)            .96
VAX 8650      VMS                         .704
VAX 8600      VMS (coded BLAS)            .660
VAX 8600      VMS                         .486
VAX 785       VMS (coded BLAS)            .225
VAX 785       VMS                         .196
VAX 780       VMS (coded BLAS)            .166
uVAX II       VMS (coded BLAS)            .156
uVAX II FG    VMS (coded BLAS)            .151
VAX 750       VMS (coded BLAS)            .148
VAX 780       VMS                         .138
VAX 750       VMS                         .124
uVAX II       VMS                         .126
uVAX II FG    VMS                         .119
VAX 725       VMS (coded BLAS)            .043
VAX 725 FG    VMS (coded BLAS)            .038
VAX 725       VMS                         .037
VAX 725 FG    VMS                         .033

Note: FG = floating_G processing.

======================================================================

Solving a System of Linear Equations with LINPACK in Single Precision.

Computer      Compiler            Ratio   MFLOPS   Time   Unit
                                                   secs   secs
VAX 8650      VMS (coded BLAS)            1.9
VAX 8600      VMS (coded BLAS)            1.26
VAX 8650      VMS                         1.25
VAX 8600      VMS                         .88
VAX 785       VMS (coded BLAS)            .511
VAX 785       VMS                         .398
VAX 780       VMS (coded BLAS)            .339
VAX 780       VMS                         .250
VAX 750       VMS (coded BLAS)            .242
uVAX II       VMS (coded BLAS)            .227
VAX 750       VMS                         .183
uVAX II       VMS                         .174
VAX 725       VMS (coded BLAS)            .066
VAX 725       VMS                         .052

From RKUJW%DKCCRE01.BITNET@WISCVM.WISC.EDU Mon Jan 6 13:42:20 1986
Received: from WISCVM.WISC.EDU (wiscvm.wisc.edu.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA19870; Mon, 6 Jan 86 13:42:06 cst
Message-Id: <8601061942.AA19870@anl-mcs.ARPA>
Received: from (DKCCRE01)DKUCCC11.BITNET by WISCVM.WISC.EDU on 01/06/86 at 13:42:12 CST
Date: 05 jan 86 at 14:20:10 DNT
From: RKUJW%DKCCRE01.BITNET@WISCVM.WISC.EDU
To: dongarra@anl-mcs.arpa
Subject: Sperry 1100/82 and 1100/92 Linpack results.
Status: RO

***************************************************
****                                           ****
***        Dongarra Linpack Test Program        ***

SPERRY 1100 series, ASCII FORTRAN (FTN), Single Precision
1100/92 without SAM, FTN ZEO opt, Rolled BLAS.

NORM. RESID     RESID           MACHEP          X(1)            X(N)
2.53175148+000  7.53998756-006  1.49011612-008  9.99996811-001  9.99997862-001

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA      SGESL      TOTAL      MFLOPS     UNIT       RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
2.404-001  7.800-003  2.482-001  2.767+000  7.229-001  4.432+000
2.406-001  7.600-003  2.482-001  2.767+000  7.229-001  4.432+000
2.404-001  7.800-003  2.482-001  2.767+000  7.229-001  4.432+000
2.406-001  7.660-003  2.483-001  2.766+000  7.231-001  4.433+000
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
2.404-001  7.800-003  2.482-001  2.767+000  7.229-001  4.432+000
2.406-001  7.600-003  2.482-001  2.767+000  7.229-001  4.432+000
2.404-001  7.600-003  2.480-001  2.769+000  7.223-001  4.429+000
2.406-001  7.660-003  2.483-001  2.766+000  7.231-001  4.433+000
.--.
TOTAL SUPS   CPU SUPS   I/O SUPS   ER/CC SUPS   CORE USED
     10.28       7.28        .02         2.99        98 K

****                                           ****
***************************************************
****                                           ****
***        Dongarra Linpack Test Program        ***

SPERRY 1100 series, ASCII FORTRAN (FTN), Double Precision
1100/92 without SAM, FTN ZEO opt, Rolled BLAS.

NORM. RESID     RESID           MACHEP          X(1)            X(N)
9.78186593-001  6.78276879-016  3.46944695-018  1.00000000+000  1.00000000+000

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
DGEFA      DGESL      TOTAL      MFLOPS     UNIT       RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
3.630-001  1.140-002  3.744-001  1.834+000  1.090+000  6.686+000
3.630-001  1.140-002  3.744-001  1.834+000  1.090+000  6.686+000
3.630-001  1.140-002  3.744-001  1.834+000  1.090+000  6.686+000
3.630-001  1.138-002  3.744-001  1.834+000  1.091+000  6.686+000
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
3.630-001  1.140-002  3.744-001  1.834+000  1.090+000  6.686+000
3.630-001  1.140-002  3.744-001  1.834+000  1.090+000  6.686+000
3.630-001  1.140-002  3.744-001  1.834+000  1.090+000  6.686+000
3.630-001  1.138-002  3.744-001  1.834+000  1.090+000  6.686+000
.--.
TOTAL SUPS   CPU SUPS   I/O SUPS   ER/CC SUPS   CORE USED
     16.31      10.79        .02         5.50       177 K

****                                           ****
***************************************************
****                                           ****
***        Dongarra Linpack Test Program        ***

SPERRY 1100 series, ASCII FORTRAN (FTN), Single Precision
1100/82 with SAM, FTN ZEO opt, Rolled BLAS.

NORM. RESID     RESID           MACHEP          X(1)            X(N)
2.53175148+000  7.53998756-006  1.49011612-008  9.99996811-001  9.99997862-001

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA      SGESL      TOTAL      MFLOPS     UNIT       RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
8.056-001  2.540-002  8.310-001  8.263-001  2.420+000  1.484+001
8.052-001  2.520-002  8.304-001  8.269-001  2.419+000  1.483+001
8.046-001  2.540-002  8.300-001  8.273-001  2.417+000  1.482+001
8.056-001  2.540-002  8.310-001  8.264-001  2.420+000  1.484+001
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
8.070-001  2.520-002  8.322-001  8.251-001  2.424+000  1.486+001
8.054-001  2.520-002  8.306-001  8.267-001  2.419+000  1.483+001
8.070-001  2.560-002  8.326-001  8.247-001  2.425+000  1.487+001
8.062-001  2.524-002  8.314-001  8.259-001  2.422+000  1.485+001
.--.
TOTAL SUPS   CPU SUPS   I/O SUPS   ER/CC SUPS   CORE USED
     27.17      24.11        .04         3.02        98 K

****                                           ****
***************************************************
****                                           ****
***        Dongarra Linpack Test Program        ***

SPERRY 1100 series, ASCII FORTRAN (FTN), Double Precision
1100/82 with SAM, FTN ZEO opt, Rolled BLAS.

NORM. RESID     RESID           MACHEP          X(1)            X(N)
9.78186593-001  6.78276879-016  3.46944695-018  1.00000000+000  1.00000000+000

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
DGEFA      DGESL      TOTAL      MFLOPS     UNIT       RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
1.148+000  3.640-002  1.184+000  5.799-001  3.449+000  2.115+001
1.148+000  3.620-002  1.184+000  5.800-001  3.449+000  2.114+001
1.148+000  3.620-002  1.184+000  5.800-001  3.449+000  2.114+001
1.148+000  3.624-002  1.184+000  5.798-001  3.449+000  2.115+001
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
1.149+000  3.620-002  1.185+000  5.793-001  3.453+000  2.117+001
1.149+000  3.620-002  1.185+000  5.793-001  3.453+000  2.117+001
1.149+000  3.640-002  1.186+000  5.792-001  3.453+000  2.117+001
1.149+000  3.624-002  1.185+000  5.793-001  3.453+000  2.117+001
.--.
TOTAL SUPS   CPU SUPS   I/O SUPS   ER/CC SUPS   CORE USED
     39.54      33.97        .04         5.54       177 K

****                                           ****
***************************************************

Are you interested also in the results of the Sperry AP?

J.Wasniewski, RECKU, Copenhagen.

From maddog!sequent!rand@lll-crg.ARPA Wed Jan 8 17:27:11 1986
Received: from lll-crg.ARPA (lll-crg.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA25729; Wed, 8 Jan 86 17:26:46 cst
Received: by lll-crg.ARPA id AA29194; Wed, 8 Jan 86 15:27:45 pst
        id AA29194; Wed, 8 Jan 86 15:27:45 pst
Received: by maddog.uucp (4.12/3.14)
        id AA00315; Wed, 8 Jan 86 15:12:44 pst
Message-Id: <8601082312.AA00315@maddog.uucp>
Date: Wed, 8 Jan 86 14:01:41 pst
From: maddog!sequent!rand@lll-crg.ARPA (Randall H. Dow)
To: dongarra@anl-mcs.ARPA
Subject: Latest LINPACK Results
Cc: gf@lll-crg.ARPA
Status: RO

Jack:

Here are our latest results. As you can see, our latest optimizing
compiler gives us about 15-20% performance improvement.

Rand Dow

******************************************************************************
LINPACK in Full Precision
Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4
******************************************************************************
+ dlp
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.67117300E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
1.132E+01  3.667E-01  1.168E+01  5.877E-02  3.403E+01  2.086E+02
1.130E+01  3.500E-01  1.165E+01  5.894E-02  3.393E+01  2.080E+02
1.130E+01  3.333E-01  1.163E+01  5.903E-02  3.388E+01  2.077E+02
1.132E+01  3.467E-01  1.167E+01  5.885E-02  3.399E+01  2.084E+02
times for array with leading dimension of 200
1.132E+01  3.500E-01  1.167E+01  5.886E-02  3.398E+01  2.083E+02
1.132E+01  3.500E-01  1.167E+01  5.886E-02  3.398E+01  2.083E+02
1.137E+01  3.500E-01  1.172E+01  5.861E-02  3.413E+01  2.092E+02
1.132E+01  3.467E-01  1.167E+01  5.883E-02  3.400E+01  2.084E+02
Programmed STOP

******************************************************************************
LINPACK in Full Precision, Coded BLAS
Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4 (coded BLAS)
******************************************************************************
+ dlp.as
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.67117300E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
1.007E+01  3.167E-01  1.038E+01  6.613E-02  3.024E+01  1.854E+02
1.005E+01  3.167E-01  1.037E+01  6.624E-02  3.019E+01  1.851E+02
1.007E+01  3.167E-01  1.038E+01  6.613E-02  3.024E+01  1.854E+02
1.007E+01  3.100E-01  1.038E+01  6.616E-02  3.023E+01  1.853E+02
Programmed STOP

******************************************************************************
Caveat on the "Compiler Directive". This is essentially coded via the
mechanism I reported in my "Parallel LINPACK Case Study". We are just a
couple of months away from actually having the compiler directive
working so that *ALL* of the handcoding that I did would be
automatically generated by the compiler with one compiler directive
comment.
******************************************************************************

******************************************************************************
"Compiler Directive" to access multiprocessing, Full Precision
Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4
******************************************************************************
+ dlps -P30
CPUs: 30 Use: 30
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.67117300E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
9.033E+00  3.500E-01  9.383E+00  7.318E-02  2.733E+01  1.676E+02
8.333E-01  3.500E-01  1.183E+00  5.803E-01  3.447E+00  2.113E+01
8.667E-01  3.500E-01  1.217E+00  5.644E-01  3.544E+00  2.173E+01
8.617E-01  3.383E-01  1.200E+00  5.722E-01  3.495E+00  2.143E+01
times for array with leading dimension of 200
8.933E+00  3.333E-01  9.267E+00  7.410E-02  2.699E+01  1.655E+02
8.333E-01  3.333E-01  1.167E+00  5.886E-01  3.398E+00  2.083E+01
8.667E-01  3.333E-01  1.200E+00  5.722E-01  3.495E+00  2.143E+01
8.567E-01  3.400E-01  1.197E+00  5.738E-01  3.485E+00  2.137E+01
Programmed STOP

******************************************************************************
"Compiler Directive" to access multiprocessing, Full Precision, Coded BLAS
Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 (coded BLAS)
******************************************************************************
+ dlps.as -P30
CPUs: 30 Use: 30
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.67117300E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
9.083E+00  3.000E-01  9.383E+00  7.318E-02  2.733E+01  1.676E+02
8.167E-01  3.000E-01  1.117E+00  6.149E-01  3.252E+00  1.994E+01
8.000E-01  3.167E-01  1.117E+00  6.149E-01  3.252E+00  1.994E+01
8.100E-01  3.100E-01  1.120E+00  6.131E-01  3.262E+00  2.000E+01
times for array with leading dimension of 200
8.917E+00  3.333E-01  9.250E+00  7.423E-02  2.694E+01  1.652E+02
8.167E-01  3.167E-01  1.133E+00  6.059E-01  3.301E+00  2.024E+01
8.000E-01  3.167E-01  1.117E+00  6.149E-01  3.252E+00  1.994E+01
8.117E-01  3.117E-01  1.123E+00  6.113E-01  3.272E+00  2.006E+01
Programmed STOP

******************************************************************************
LINPACK in Half Precision
Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4
******************************************************************************
+ slp
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.59605300E+00  3.80277600E-05  1.19209300E-07  9.99986200E-01  9.99992500E-01

times are reported for matrices of order 100
sgefa      sgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
8.833E+00  2.833E-01  9.117E+00  7.532E-02  2.655E+01  1.628E+02
8.817E+00  2.667E-01  9.083E+00  7.560E-02  2.646E+01  1.622E+02
8.833E+00  2.667E-01  9.100E+00  7.546E-02  2.650E+01  1.625E+02
8.833E+00  2.750E-01  9.108E+00  7.539E-02  2.653E+01  1.626E+02
times for array with leading dimension of 200
8.850E+00  2.667E-01  9.117E+00  7.532E-02  2.655E+01  1.628E+02
8.833E+00  2.833E-01  9.117E+00  7.532E-02  2.655E+01  1.628E+02
8.833E+00  2.667E-01  9.100E+00  7.546E-02  2.650E+01  1.625E+02
8.832E+00  2.750E-01  9.107E+00  7.540E-02  2.652E+01  1.626E+02
Programmed STOP

******************************************************************************
LINPACK in Half Precision, coded BLAS
Sequent Balance 8000 (1 processor) DYNIX Fortran 2.4.4 (coded BLAS)
******************************************************************************
+ slp.as
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.59605300E+00  3.80277600E-05  1.19209300E-07  9.99986200E-01  9.99992500E-01

times are reported for matrices of order 100
sgefa      sgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
8.050E+00  2.500E-01  8.300E+00  8.273E-02  2.417E+01  1.482E+02
8.067E+00  2.500E-01  8.317E+00  8.257E-02  2.422E+01  1.485E+02
8.067E+00  2.500E-01  8.317E+00  8.257E-02  2.422E+01  1.485E+02
8.062E+00  2.517E-01  8.313E+00  8.260E-02  2.421E+01  1.485E+02
times for array with leading dimension of 200
8.050E+00  2.667E-01  8.317E+00  8.257E-02  2.422E+01  1.485E+02
8.067E+00  2.500E-01  8.317E+00  8.257E-02  2.422E+01  1.485E+02
8.050E+00  2.500E-01  8.300E+00  8.273E-02  2.417E+01  1.482E+02
8.058E+00  2.500E-01  8.308E+00  8.265E-02  2.420E+01  1.484E+02
Programmed STOP

******************************************************************************
No entry yet for the matrix-vector program, order 300
******************************************************************************

******************************************************************************
Note: for all of the order 1000 runs, the first pass requires page
faulting of the data. After that, the pages remain valid. This
explains, I believe, the lower result on the first pass of each array.
******************************************************************************

******************************************************************************
Single precision, order 1000
Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4
******************************************************************************
+ slps.1000 -P30
CPUs: 30 Use: 30
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.13179700E+01  2.70068600E-03  1.19209300E-07  1.00016200E+00  9.99933500E-01

times are reported for matrices of order 1000
sgefa      sgesl      total      mflops     unit       ratio
times for array with leading dimension of1001
4.211E+02  2.382E+01  4.449E+02  1.503E+00  1.331E+00  7.945E+03
2.960E+02  2.377E+01  3.198E+02  2.091E+00  9.566E-01  5.711E+03
2.961E+02  2.382E+01  3.199E+02  2.090E+00  9.569E-01  5.713E+03
2.963E+02  2.435E+01  3.206E+02  2.086E+00  9.589E-01  5.725E+03
times for array with leading dimension of1000
4.184E+02  2.382E+01  4.422E+02  1.512E+00  1.323E+00  7.896E+03
2.957E+02  2.388E+01  3.196E+02  2.092E+00  9.559E-01  5.707E+03
2.956E+02  2.402E+01  3.196E+02  2.092E+00  9.560E-01  5.707E+03
2.964E+02  2.435E+01  3.208E+02  2.085E+00  9.594E-01  5.728E+03
Programmed STOP

******************************************************************************
Single precision, order 1000, coded BLAS
Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 (coded BLAS)
******************************************************************************
+ slps.1000.as -P30
CPUs: 30 Use: 30
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
1.13179700E+01  2.70068600E-03  1.19209300E-07  1.00016200E+00  9.99933500E-01

times are reported for matrices of order 1000
sgefa      sgesl      total      mflops     unit       ratio
times for array with leading dimension of1001
4.003E+02  2.282E+01  4.231E+02  1.581E+00  1.265E+00  7.555E+03
2.762E+02  2.287E+01  2.991E+02  2.236E+00  8.946E-01  5.341E+03
2.763E+02  2.283E+01  2.991E+02  2.235E+00  8.947E-01  5.341E+03
2.763E+02  2.285E+01  2.991E+02  2.236E+00  8.946E-01  5.341E+03
times for array with leading dimension of1000
3.984E+02  2.263E+01  4.210E+02  1.588E+00  1.259E+00  7.518E+03
2.744E+02  2.273E+01  2.972E+02  2.250E+00  8.888E-01  5.307E+03
2.744E+02  2.270E+01  2.971E+02  2.251E+00  8.886E-01  5.305E+03
2.744E+02  2.268E+01  2.971E+02  2.251E+00  8.885E-01  5.305E+03
Programmed STOP

******************************************************************************
Caveat on the Full Precision, order 1000. The standard driver program
requires about 16 MB of memory. Since we only have 16 MB of virtual
space it wouldn't fit. I commented out the second half of the run, and
the second set of arrays. This should give the same results.
******************************************************************************

******************************************************************************
Double Precision, order 1000
Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4
******************************************************************************
+ dlps.1000 -P30
CPUs: 30 Use: 30
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
9.50387011E+00  4.22017976E-12  2.22044605E-16  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 1000
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of1001
6.653E+02  3.228E+01  6.976E+02  9.585E-01  2.087E+00  1.246E+04
4.468E+02  3.230E+01  4.791E+02  1.396E+00  1.433E+00  8.556E+03
4.468E+02  3.228E+01  4.791E+02  1.396E+00  1.433E+00  8.554E+03
4.456E+02  3.233E+01  4.779E+02  1.399E+00  1.429E+00  8.534E+03
Programmed STOP

******************************************************************************
Double Precision, order 1000, coded BLAS
Sequent Balance 86000 (30 processors) DYNIX Fortran 2.4.4 (coded BLAS)
******************************************************************************
+ dlps.1000.as -P30
CPUs: 30 Use: 30
Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
9.50387011E+00  4.22017976E-12  2.22044605E-16  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 1000
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of1001
6.337E+02  2.847E+01  6.622E+02  1.010E+00  1.981E+00  1.182E+04
4.168E+02  2.842E+01  4.452E+02  1.502E+00  1.332E+00  7.949E+03
4.170E+02  2.838E+01  4.453E+02  1.501E+00  1.332E+00  7.952E+03
4.173E+02  2.844E+01  4.457E+02  1.500E+00  1.333E+00  7.959E+03
Programmed STOP

From unido!ztivax!schnepf@seismo.CSS.GOV Wed Jan 29 08:09:23 1986
Received: from seismo.CSS.GOV (seismo.css.gov.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA03238; Wed, 29 Jan 86 08:08:33 cst
Return-Path:
Received: from unido.UUCP by seismo.CSS.GOV with UUCP; Wed, 29 Jan 86 09:01:18 EST
From: unido!ztivax!schnepf@seismo.CSS.GOV
Received: by unido.uucp with uucp; Wed, 29 Jan 86 14:35:16 -0100
Received: by ztivax.UUCP (4.12/4.8)
        id AA13233; Wed, 29 Jan 86 14:30:14 GMT
Date: Wed, 29 Jan 86 14:30:14 GMT
Message-Id: <8601291430.AA13233@ztivax.UUCP>
To: dongarra@anl-mcs.arpa
Status: R

Dear Jack Dongarra,

I have tested the LINPACK-benchmark on our Siemens/Fujitsu-VP 200. I
fetched the program from netlib and compared the result with your
report which I have fetched from netlib too. I could make some
improvements in the performance by replacing the BLAS-routines by
inline-FORTRAN-code and by inserting some compiler directives. Finally
I reached about 33 MFLOPS on the VP 200 (cycle time 15 nsec.). I think
it might be interesting for you to see the output of the compiler and
the results for full precision (64 bit) and half precision (32 bit).
It is worthwhile mentioning that no assembler has been used.

Yours,
Eric Schnepf
Siemens AG, Vector Processor Systems
Munich, Germany
Arpanet: Na.schnepf at su-score

From mips!escargot.earl@su-glacier.arpa Wed Jan 22 22:47:31 1986
Received: from su-glacier.arpa (su-glacier.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA21988; Wed, 22 Jan 86 22:47:18 cst
Received: by su-glacier.arpa with Sendmail; Wed, 22 Jan 86 20:46:43 pst
Received: from escargot.UUCP (escargot.ARPA) by mips.UUCP (4.12/4.7)
        id AA10567; Wed, 22 Jan 86 14:46:19 pst
Received: by escargot.UUCP (4.12/4.7)
        id AA00222; Wed, 22 Jan 86 14:45:59 pst
Date: Wed, 22 Jan 86 14:45:59 pst
From: mips!escargot.earl@su-glacier.arpa (Earl Killian)
Message-Id: <8601222245.AA00222@escargot.UUCP>
To: DONGARRA@ANL-MCS.ARPA
Subject: linpack benchmark for new VAX Unix fortran compiler
Status: RO

I ran your full precision linpack benchmark on a 4.2bsd Unix VAX 780
w/FPA using the LLL S-1 Project fortran compiler instead of the 4.2bsd
Unix f77 compiler and got the following results. The compiler
generally does much better than f77 (2x faster on Spice), but the
times aren't much different on linpack, probably because the VAX
spends most of its time cache missing on linpack, so the code quality
is secondary.

By the way, have you considered modifying the source of DAXPY to copy
the DA parameter to a local before using it? Fortran's
pass-by-reference semantics mean that the store to DY might change the
value of DA, and thus it can't be loaded into a register outside of
the loop. If it were a local then a compiler could do that. That would
make compiler times a tiny bit closer to coded BLAS times, which must
assume there is no hazard there, and in addition that assignments to
DY don't affect DX (otherwise it can't be vectorized). I think that
would make it a better benchmark. I'll include the times for that
modification last.

Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
2.10272591E+00  1.16642807E-14  2.77555756E-17  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
5.100E+00  1.600E-01  5.260E+00  1.305E-01  1.532E+01  9.393E+01
5.180E+00  1.700E-01  5.350E+00  1.283E-01  1.558E+01  9.554E+01
5.010E+00  1.500E-01  5.160E+00  1.331E-01  1.503E+01  9.214E+01
5.070E+00  1.540E-01  5.224E+00  1.314E-01  1.522E+01  9.329E+01
times for array with leading dimension of 200
5.100E+00  1.600E-01  5.260E+00  1.305E-01  1.532E+01  9.393E+01
5.430E+00  1.500E-01  5.580E+00  1.231E-01  1.625E+01  9.964E+01
5.260E+00  1.500E-01  5.410E+00  1.269E-01  1.576E+01  9.661E+01
5.176E+00  1.540E-01  5.330E+00  1.288E-01  1.552E+01  9.518E+01

Please send the results of this run to:
Jack J. Dongarra
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, Illinois 60439
Telephone: 312-972-7246
ARPAnet: DONGARRA@ANL-MCS

norm. resid     resid           machep          x(1)            x(n)
2.10272591E+00  1.16642807E-14  2.77555756E-17  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
dgefa      dgesl      total      mflops     unit       ratio
times for array with leading dimension of 201
4.910E+00  1.500E-01  5.060E+00  1.357E-01  1.474E+01  9.036E+01
4.960E+00  1.500E-01  5.110E+00  1.344E-01  1.488E+01  9.125E+01
4.910E+00  1.500E-01  5.060E+00  1.357E-01  1.474E+01  9.036E+01
4.945E+00  1.580E-01  5.103E+00  1.346E-01  1.486E+01  9.112E+01
times for array with leading dimension of 200
4.930E+00  1.600E-01  5.090E+00  1.349E-01  1.483E+01  9.089E+01
4.910E+00  1.500E-01  5.060E+00  1.357E-01  1.474E+01  9.036E+01
4.830E+00  1.600E-01  4.990E+00  1.376E-01  1.453E+01  8.911E+01
4.915E+00  1.510E-01  5.066E+00  1.355E-01  1.476E+01  9.046E+01

From GR105%VTVM1.BITNET@WISCVM.WISC.EDU Mon Feb 10 16:11:00 1986
Received: from WISCVM.WISC.EDU (wiscvm.wisc.edu.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA23112; Mon, 10 Feb 86 16:10:40 cst
Message-Id: <8602102210.AA23112@anl-mcs.ARPA>
Received: from (GR105)VTVM1.BITNET by WISCVM.WISC.EDU on 02/10/86 at 16:10:47 CST
Date: Mon, 10 Feb 86 16:52:22 EST
To: DONGARRA@ANL-MCS.ARPA
From: GR105%VTVM1.BITNET@WISCVM.WISC.EDU
Subject: TIMINGS FROM A PERKIN-ELMER 3230
Status: R

Dear Dr. Dongarra,

Here are some benchmark results from a Perkin Elmer 3230 which the Va
Tech Aerospace Dept has as a departmental mini-computer. Enclosed
first are the single precision Linpack timings and then the double
precision timings. The last item enclosed is the timing routine (REAL
FUNCTION SECOND(T)) that was used to time these benchmarks. Both of
the benchmark cases were obtained from netlib (as in SEND LINPACKS
LINPACKD FROM BENCHMARK).

The computer was a Perkin Elmer 3230. The operating system was OS/32
rel 6.2.2. The compiler was the fortran O compiler (rel 5.2). The
fortran O compiler is a highly optimizing compiler.
The particular 3230 we have has the floating point hardware so all 32 bit single precision and 64 bit double precision floating point operations were executed in hardware. Sincerely yours, David Whitaker D T 02/10/86 16:11:13 LINS ,,,,,,,,,,,,, PLEASE SEND THE RESULTS OF THIS RUN TO: JACK J. DONGARRA MATHEMATICS AND COMPUTER SCIENCE DIVISION ARGONNE NATIONAL LABORATORY ARGONNE, ILLINOIS 60439 TELEPHONE: 312-972-7246 ARPANET: DONGARRA@ANL-MCS NORM. RESID RESID MACHEP X(1) X(N) 1.05047800E+00 2.00271600E-04 9.53674300E-07 9.99929400E-01 9.99917700E-01 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 4.841E+00 1.740E-01 5.015E+00 1.369E-01 1.461E+01 8.955E+01 4.142E+00 1.240E-01 4.266E+00 1.610E-01 1.243E+01 7.618E+01 4.139E+00 1.260E-01 4.265E+00 1.610E-01 1.242E+01 7.616E+01 4.148E+00 1.258E-01 4.274E+00 1.607E-01 1.245E+01 7.633E+01 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 4.145E+00 1.260E-01 4.271E+00 1.608E-01 1.244E+01 7.627E+01 4.151E+00 1.240E-01 4.275E+00 1.606E-01 1.245E+01 7.634E+01 4.153E+00 1.240E-01 4.277E+00 1.605E-01 1.246E+01 7.637E+01 4.150E+00 1.263E-01 4.276E+00 1.606E-01 1.245E+01 7.636E+01 STOP LINS -END OF TASK CODE= 0 CPUTIME=2:00.705/0.092 D T 02/10/86 16:13:17 D A USER TIME 2:00.705 SVC TIME 0.092 WAIT TIME 0.421 ROLL TIME 0.000 I/O 30 ROLLS 0 SIGNOFF ELAPSED TIME=2:07 CPUTIME=2:00.705/0.092 TIME OFF=02/10/86 16:13:18 D T 02/10/86 16:19:57 LIND ,,,,,,,,,,,,, PLEASE SEND THE RESULTS OF THIS RUN TO: JACK J. DONGARRA MATHEMATICS AND COMPUTER SCIENCE DIVISION ARGONNE NATIONAL LABORATORY ARGONNE, ILLINOIS 60439 TELEPHONE: 312-972-7246 ARPANET: DONGARRA@ANL-MCS NORM. 
RESID      RESID           MACHEP         X(1)          X(N)
 1.07075157E+00  4.75175455E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
   DGEFA      DGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
 6.109E+00  1.850E-01  6.294E+00  1.091E-01  1.833E+01  1.124E+02
 6.099E+00  1.810E-01  6.280E+00  1.093E-01  1.829E+01  1.121E+02
 6.099E+00  1.810E-01  6.280E+00  1.093E-01  1.829E+01  1.121E+02
 6.104E+00  1.855E-01  6.290E+00  1.092E-01  1.832E+01  1.123E+02
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
 6.124E+00  1.830E-01  6.307E+00  1.089E-01  1.837E+01  1.126E+02
 6.109E+00  1.810E-01  6.290E+00  1.092E-01  1.832E+01  1.123E+02
 6.127E+00  1.840E-01  6.311E+00  1.088E-01  1.838E+01  1.127E+02
 6.118E+00  1.865E-01  6.304E+00  1.089E-01  1.836E+01  1.126E+02
STOP
LIND -END OF TASK CODE= 0 CPUTIME=2:54.896/0.100
D T
02/10/86 16:22:56
D A
USER TIME 2:54.896
SVC TIME 0.100
WAIT TIME 0.452
ROLL TIME 0.000
I/O 31
ROLLS 0
SIGNOFF
ELAPSED TIME=3:01 CPUTIME=2:54.896/0.100
TIME OFF=02/10/86 16:22:56

      REAL FUNCTION SECOND(T)
      REAL T
      SAVE ISTART,TIME
      IF (ISTART .NE. 1) THEN
         CALL MTIME(I)
         TIME=0.0
         ISTART=1
         RETURN
      END IF
C
      CALL MTIME (I)
      TIME = TIME + FLOAT(I)/1000.
      SECOND = TIME
      CALL MTIME (I)
      RETURN
C
      END
      SUBROUTINE MTIME(I)
$ASSM
         ST    0,SAV0
         L     0,FLAG
         BNZ   $$1
         XR    0,0
         SVC   2,SETTIME
         SVC   2,GETTIME
         SVC   2,CANTIME
         L     0,ITIME
         S     0,NTIME
         ST    0,HOLDTIME
         LIS   0,1
         ST    0,FLAG
         XR    0,0
         ST    0,J
         SVC   2,SETTIME
         L     0,SAV0
$FORT
      I=J
      RETURN
$ASSM
$$1      XR    0,0
         SVC   2,GETTIME
         SVC   2,CANTIME
         ST    0,FLAG
         L     0,ITIME
         S     0,NTIME
         S     0,HOLDTIME
         ST    0,J
         L     0,SAV0
$FORT
      I=J
      RETURN
$ASSM
         IMPUR
         ALIGN 4
SETTIME  DB    X'00',23
         DC    H'0'
ITIME    DC    Y'1FFFFFFF'
         ALIGN 4
GETTIME  DB    X'20',23
         DC    H'0'
NTIME    DC    Y'10000000'
         ALIGN 4
CANTIME  DB    X'10',23
         DC    H'0'
         DC    Y'10000000'
         ALIGN 4
FLAG     DC    Y'0'
HOLDTIME DS    4
SAV0     DS    4
         PURE
$FORT
      END

From dgh%dgh@SUN.ARPA Sun Feb 16 12:10:11 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA28995; Sun, 16 Feb 86 12:04:46 cst
Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0)
        id AA14369; Sun, 16 Feb 86 09:31:21 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4)
        id AA22882; Sun, 16 Feb 86 09:30:22 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3)
        id AA00628; Sun, 16 Feb 86 09:33:06 PST
Date: Sun, 16 Feb 86 09:33:06 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8602161733.AA00628@dgh.sun.uucp>
To: Terry_Ratcliffe%mts%cheviot.newcastle@cs.ucl.ac.uk,
        Terry_Ratcliffe%newcastle.mailnet@mit-multics.arpa,
        alastair@ucbopal.berkeley.edu, decwrl!turtlevax!weitek!eric,
        dgh@dgh, dongarra@anl-mcs.arpa, hplabs!hpda!sra1!grimes,
        hplabs!motsj1!kjm, ihnp4!ima!pbear!peterb,
        ihnp4!ima!pbear!spastic!maggot!barada,
        ihnp4!ima!pbear!spastic!maggot!turner,
        ihnp4!ima!spastic!maggot!turner, ihnp4!inmet!pbear!peterb,
        ihnp4!inmet!pbear!spastic!maggot!barada,
        ihnp4!inmet!pbear!spastic!maggot!turner,
        ihnp4!inmet!spastic!maggot!turner, kcng@kim.berkeley.edu,
        oakhill!davet, oakhill!van, seismo!ukc!cheviot!robert,
        ucbvax!ibmpa!lmb, ucbvax!tektronix!ogcvax!inteloa!jimv,
        ucbvax!tektronix!ogcvax!moler,
        ucbvax!ucbdali.Berkeley.EDU!mcdonald,
        ucbvax!ucsfcgl!cca.ucsf!dick, zliu@weyl.berkeley.edu
Subject: whetstone discussion
Status: R

Terry Ratcliffe
of the University of Newcastle was kind enough to reply to my
Whetstone attacks, raising some good points; my comments are
interpolated:

> David,
> I've seen a copy of your "Benchmarking and the 68020 Cache"
> and the "Weitek 1164/65 FPA" papers.
>
> a) Now you mention that the cache is on the 68020, and hence I would
> expect a similar variation in KWips without the FPA. Have you
> measured this and what are the results?
>

The cache effect is only profound (20%) when the P3 routine takes an
amount of time comparable to an iteration of the calling loop. For
double precision, for instance, the variation is more like 10%; if a
68881 were used instead of a Sun FPA, then the variation would be even
smaller and probably not interesting, so I haven't bothered to measure
it.

> b) re your comments on whetstones.
>
> First my "credentials":
> I have to run the "official (Government)" benchmarks from the
> Central Computer and Telecommunications Agency whenever we get
> approval (rarely) to buy a new large mainframe as part of the
> evaluation process;
> and I have to help tune some of the big 'orrible Fortran programs that
> our number crunchers write (or more often inherit or import).
>
> So when is it a whetstone: The rules are quite simple. You can NOT alter
> any line of the code, except those changes needed to get it to COMPILE,
> (and those must be reported back and approved). (If it then won't run
> that is logged as the result). You are allowed complete freedom of
> where you place modules as part of the linking/loading process, and
> it is assumed that potential suppliers will put them in the best place
> for their box.
>

Things are far less defined here, which is why I think of Whetstone as
a marketing benchmark. For instance, a customer called me to say that
he had gotten Whetstone results rather different from what our
marketing literature had led him to anticipate. It turned out he
wasn't using any version of the usual Whetstone program, but rather
some other code of unknown origin whose output was calibrated in
Whetstones. That's why I tend to disregard Whetstone results that I
haven't measured myself.

> As to its appropriateness re Linpack, you really need both benchmarks
> as they are looking at two distinct sets of application areas.
> Linpack is relevant to solving lots of linear equations. Though even
> there it's not completely relevant as our really big number crunchers in
> that area have taken to pulling out the inner loops and rewriting them
> in assembler and hand tuning the assembler to the particular CPU model.

That's a commentary on the compilers rather than the benchmark. As I
indicated in one of my papers, our Sun-3 compilers produce truly
optimal code for the 68881 or the FPA on the inner loop of Linpack,
provided it's rolled.

> Whetstone is more appropriate where you are finding minima of a
> function in multi-dimensional space (lots of CPU time used by physicists
> doing that); usually they call a library routine(s) to do the searching
> but have to provide a subroutine to generate the function, which gets
> called lots and lots of times (just like the infamous P3).
> The dreaded divide by 2.0 in P3 just represents the fact that most
> of these functions seem to end up with at least one inescapable divide.
> I've always maintained that the value of T2 should alter between calls to
> PA or P3 (to make it more difficult to produce a fiddled compiler) but
> it's far too late to change it now.
>
> Terry Ratcliffe
>
>Terry_Ratcliffe%newcastle.mailnet@mit-multics.ARPA
>Terry_Ratcliffe%mts%cheviot.newcastle@ucl-cs.ARPA

I'd still maintain that the P3-calling loop is a poor model of
nonlinear optimization calculations. In my own limited experience the
nonlinear function is typically much more complicated than P3 suggests
and the calling routine is much more complicated than a simple loop,
although a simple loop might be a good model for an optimizer that
calls an external function to compute many partial derivatives.

More to the point, interprocedural optimization is becoming more
common in other languages and will likely appear in Fortran compilers
before long. Physicists who are calling short P3-style functions may
soon figure out that by compiling their function with the source for
the optimizer, they can get a noticeable improvement. To the extent
interprocedural optimization is done, the Whetstone benchmark will be
obsolete; so I would suggest that if you think it's worth preserving
you should push your campaign with the Whetstone authors to make T2
truly variable.

Jim Valerio of Intel mentioned that short routines that do complex
arithmetic by calling subroutines fit the P3 mold rather well; I feel
that's a compiler shortcoming since complex +-* should be compiled
inline, but I must admit that Sun's compilers presently don't satisfy
me in that regard.

From dgh%dgh@SUN.ARPA Tue Feb 18 12:14:34 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA03407; Tue, 18 Feb 86 12:14:15 cst
Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0)
        id AA17170; Tue, 18 Feb 86 10:11:58 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4)
        id AA02735; Tue, 18 Feb 86 10:10:57 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3)
        id AA02960; Tue, 18 Feb 86 10:13:54 PST
Date: Tue, 18 Feb 86 10:13:54 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8602181813.AA02960@dgh.sun.uucp>
To: dongarra@anl-mcs.arpa
Subject: do you believe this?
Status: RO

This is a Sun-3 with an FPA, single precision, unrolled loops. What
surprises me is the resid and norm resid values... This was run on a
4 Meg machine, accounting for a few page faults.

norm.
resid resid machep x(1) x(n) 0.00000000e+00 0.00000000e+00 2.23517418e-07 1.00016224e+00 9.99933422e-01 times are reported for matrices of order 1000 factor solve total mflops unit ratio times for array with leading dimension of1001 1.641e+03 6.360e+00 1.647e+03 4.059e-01 4.927e+00 2.942e+04 1083.2u 615.1s 2:25:37 19% 0+139k 2+4io 111045pf+2w From dgh%dgh@SUN.ARPA Tue Feb 18 22:48:56 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA00269; Tue, 18 Feb 86 22:48:39 cst Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0) id AA00206; Tue, 18 Feb 86 17:38:36 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA05405; Tue, 18 Feb 86 15:21:41 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA03630; Tue, 18 Feb 86 15:24:38 PST Date: Tue, 18 Feb 86 15:24:38 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8602182324.AA03630@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: 1000x1000 single precision FPA results Status: RO Do these look better: S.ROLLffpa.huge.out Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. resid resid machep x(1)-1 x(n)-1 1.13179665e+01 2.70068646e-03 1.19209290e-07 1.62243843e-04 -6.65783881e-05 times are reported for matrices of order 1000 sgefa sgesl total Kflops unit ratio times for array with leading dimension of1001 1518.78 5.50 1524.28 439. 
4.56 27219.28 953.5u 622.9s 2:24:47 18% 0+139k 14+4io 111909pf+2w From dgh%dgh@SUN.ARPA Thu Feb 20 16:26:37 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA06060; Thu, 20 Feb 86 16:26:27 cst Received: from snail.sun.uucp by sun.arpa (3.2-/SMI-3.0) id AA10383; Thu, 20 Feb 86 14:11:20 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA17229; Thu, 20 Feb 86 13:57:03 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA07481; Thu, 20 Feb 86 13:59:18 PST Date: Thu, 20 Feb 86 13:59:18 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8602202159.AA07481@dgh.sun.uucp> To: aoki@sisyphus, carrie@lunar, crao@dgh, cwoo@dgh, dgh@dgh, dongarra@anl-mcs.arpa, edh@dragon, fetter@cygnus, jharman@dgh, jlc@maple, jricotta@dgh, katy@beluga, kmobley@dgh, mmm@dusk, psager@dgh, rcheng@dgh, weiss@genesis, wu@funshine Subject: 1000x1000 single precision Linpack test results Status: R The following output is from the new ultra-Linpack test included in the latest versions of Dongarra's paper. It is for single precision, release 3.1 f77 -O -ffpa: norm. resid resid machep x(1)-1 x(n)-1 1.13179665e+01 2.70068646e-03 1.19209290e-07 1.62243843e-04 -6.65783881e-05 times are reported for matrices of order 1000 sgefa sgesl total Kflops unit ratio times for array with leading dimension of1001 846.62 2.56 849.18 787. 
2.54 15163.93 888.1u 2.6s 14:55 99% 4+245k 14+5io 0pf+0w From dgh%dgh@SUN.ARPA Mon Feb 24 17:13:04 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA07190; Mon, 24 Feb 86 17:12:48 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA00282; Mon, 24 Feb 86 15:10:46 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA08707; Mon, 24 Feb 86 14:09:10 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA13437; Mon, 24 Feb 86 14:11:06 PST Date: Mon, 24 Feb 86 14:11:06 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8602242211.AA13437@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: 12 meg results for 1000x1000 double precision Cc: aoki@sisyphus, carrie@lunar, crao@dgh, cwoo@dgh, dgh@dgh, dlambright@dgh, fetter@cygnus, jharman@dgh, jlc@maple, jricotta@dgh, katy@beluga, kmobley@dgh, mmm@dusk, psager@dgh, rcheng@dgh, weiss@genesis, wu@funshine Status: R Using Sun FPA; you can publish these: norm. resid resid machep x(1)-1 x(n)-1 9.50387011e+00 4.22017976e-12 2.22044605e-16 1.09912079e-13 5.08926234e-13 times are reported for matrices of order 1000 sgefa sgesl total Kflops unit ratio times for array with leading dimension of1001 1443.78 5.16 1448.94 461. 
4.33 25873.93

From JCG%ibm-b.rutherford.ac.uk@cs.ucl.ac.uk Wed Feb 26 04:43:24 1986
Received: from BRL-AOS.ARPA (brl-aos.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA24871; Wed, 26 Feb 86 04:43:18 cst
Received: from ucl-cs.arpa by AOS.BRL.ARPA id a020289; 26 Feb 86 5:39 EST
Received: from ibm-b.rutherford.ac.uk by 44d.Cs.Ucl.AC.UK via Janet with NIFTP
        id a000432; 26 Feb 86 9:43 GMT
Message-Id: <25 February 1986 16:50:01 GMT JCG@UK.AC.RL.IB>
Date: Tuesday, 25 February 1986 16:50:01 GMT
From: John Gordon ext 6574 (JCG at RLVM370)
Address: User Support Group R27 Rutherford Appleton Lab
To: DONGARRA
Status: R

RESULTS FOR A LINPACKD RUN USING SOURCE FROM NETLIB@ANL-MCS

MACHINE  = FUJITSU M-830 (WITH NO HYPERVISOR)
COMPILER = IBM VS OPT=3 REL 1.4.1
SYSTEM   = VM/CMS RELEASE 3 HPO LEVEL 32

THE TIMES RETURNED BY FUNCTION SECOND ARE VIRTUAL TIMES.
I NOTICED THAT THE OVERHEADS CAUSED BY THE FUNCTION SECOND
COULD LEAD TO SIGNIFICANT ERRORS IF THEY WERE LARGE.

REGARDS: ANDREW BANKS.
CENTRAL COMPUTING DIVISION,
RUTHERFORD APPLETON LAB,
CHILTON, DIDCOT,
OXON OX11 OQX
ENGLAND.

PLEASE SEND THE RESULTS OF THIS RUN TO:

JACK J. DONGARRA
MATHEMATICS AND COMPUTER SCIENCE DIVISION
ARGONNE NATIONAL LABORATORY
ARGONNE, ILLINOIS 60439

TELEPHONE: 312-972-7246

ARPANET: DONGARRA@ANL-MCS

NORM. RESID      RESID           MACHEP         X(1)          X(N)
 2.91923652E+00  1.29549149E-13  2.22044605E-16  1.00000000E+00  1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
   DGEFA      DGESL      TOTAL     MFLOPS       UNIT      RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
 1.194E-01  3.937E-03  1.234E-01  5.566E+00  3.593E-01  2.203E+00
 1.202E-01  4.080E-03  1.243E-01  5.524E+00  3.620E-01  2.220E+00
 1.194E-01  4.003E-03  1.234E-01  5.565E+00  3.594E-01  2.203E+00
 1.198E-01  3.900E-03  1.237E-01  5.552E+00  3.602E-01  2.208E+00

TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
 1.194E-01  3.903E-03  1.233E-01  5.568E+00  3.592E-01  2.202E+00
 1.188E-01  3.958E-03  1.228E-01  5.593E+00  3.576E-01  2.192E+00
 1.193E-01  3.896E-03  1.232E-01  5.574E+00  3.588E-01  2.200E+00
 1.193E-01  3.831E-03  1.231E-01  5.579E+00  3.585E-01  2.198E+00

From fae.wu@ames-vmsb.ARPA Tue Mar 4 10:40:43 1986
Received: from ames-vmsb.ARPA (ames-vmsb.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA16659; Tue, 4 Mar 86 10:40:30 cst
Message-Id: <8603041640.AA16659@anl-mcs.ARPA>
Date: 4 Mar 86 08:23:00 PST
From: fae.wu@ames-vmsb.ARPA
Subject:
To: dongarra@anl-mcs
Reply-To: fae.wu@ames-vmsb.ARPA
Status: RO

LU DECOMPOSITION TIMING

SP SIZE OF THE ARRAYS 301 AND ORDER IS 50
UNROLLING DEPTH  1 TIME = 0.103E-01 mflops =  7.9555 CHECK = 0.100E+01
UNROLLING DEPTH  2 TIME = 0.772E-02 mflops = 10.6352 CHECK = 0.100E+01
UNROLLING DEPTH  4 TIME = 0.632E-02 mflops = 12.9893 CHECK = 0.100E+01
UNROLLING DEPTH  8 TIME = 0.580E-02 mflops = 14.1676 CHECK = 0.100E+01
UNROLLING DEPTH 16 TIME = 0.589E-02 mflops = 13.9461 CHECK = 0.100E+01

SP SIZE OF THE ARRAYS 301 AND ORDER IS 100
UNROLLING DEPTH  1 TIME = 0.485E-01 mflops = 13.6331 CHECK = 0.100E+01
UNROLLING DEPTH  2 TIME = 0.345E-01 mflops = 19.1772 CHECK = 0.100E+01
UNROLLING DEPTH  4 TIME = 0.268E-01 mflops = 24.6920 CHECK = 0.100E+01
UNROLLING DEPTH  8 TIME = 0.236E-01 mflops = 28.0376 CHECK = 0.100E+01
UNROLLING DEPTH 16 TIME = 0.225E-01 mflops = 29.4309 CHECK = 0.100E+01

SP SIZE OF THE ARRAYS 301 AND ORDER IS 150
UNROLLING DEPTH  1 TIME = 0.127E+00 mflops
= 17.6700 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.877E-01 mflops = 25.5389 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.681E-01 mflops = 32.8638 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.596E-01 mflops = 37.5585 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.566E-01 mflops = 39.5518 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 200 UNROLLING DEPTH 1 TIME = 0.275E+00 mflops = 19.3239 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.207E+00 mflops = 25.6562 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.156E+00 mflops = 34.1686 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.130E+00 mflops = 40.9673 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.118E+00 mflops = 44.9886 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 250 UNROLLING DEPTH 1 TIME = 0.460E+00 mflops = 22.5728 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.360E+00 mflops = 28.8679 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.266E+00 mflops = 39.1114 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.215E+00 mflops = 48.3334 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.181E+00 mflops = 57.3544 CHECK = 0.100E+01 SP SIZE OF THE ARRAYS 301 AND ORDER IS 300 UNROLLING DEPTH 1 TIME = 0.708E+00 mflops = 25.3668 CHECK = 0.100E+01 UNROLLING DEPTH 2 TIME = 0.556E+00 mflops = 32.2936 CHECK = 0.100E+01 UNROLLING DEPTH 4 TIME = 0.394E+00 mflops = 45.5153 CHECK = 0.100E+01 UNROLLING DEPTH 8 TIME = 0.349E+00 mflops = 51.4380 CHECK = 0.100E+01 UNROLLING DEPTH 16 TIME = 0.329E+00 mflops = 54.5877 CHECK = 0.100E+01 ------ From fae.wu@ames-vmsb.ARPA Tue Mar 4 10:41:08 1986 Received: from ames-vmsb.ARPA (ames-vmsb.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA16681; Tue, 4 Mar 86 10:40:50 cst Message-Id: <8603041640.AA16681@anl-mcs.ARPA> Date: 4 Mar 86 08:23:00 PST From: fae.wu@ames-vmsb.ARPA Subject: To: dongarra@anl-mcs Reply-To: fae.wu@ames-vmsb.ARPA Status: RO pLEASE SEND THE RESULTS OF THIS RUN TO: jACK j. 
dONGARRA mATHEMATICS AND cOMPUTER sCIENCE dIVISION aRGONNE nATIONAL lABORATORY aRGONNE, iLLINOIS 60439 tELEPHONE: 312-972-7246 arpaNET: dongarra@anl-mcs NORM. RESID RESID MACHEP X(1) X(N) 1.81127134E+00 2.57216470E-12 7.10542736E-15 1.00000000E+00 1.00000000E+00 TIMES ARE REPORTED FOR MATRICES OF ORDER 100 SGEFA SGESL TOTAL MFLOPS UNIT RATIO TIMES FOR ARRAY WITH LEADING DIMENSION OF 201 5.190E-02 1.483E-03 5.338E-02 1.286E+01 1.555E-01 9.533E-01 5.051E-02 1.489E-03 5.200E-02 1.321E+01 1.514E-01 9.285E-01 5.056E-02 1.517E-03 5.207E-02 1.319E+01 1.517E-01 9.299E-01 5.284E-02 1.450E-03 5.429E-02 1.265E+01 1.581E-01 9.695E-01 TIMES FOR ARRAY WITH LEADING DIMENSION OF 200 5.082E-02 1.489E-03 5.231E-02 1.313E+01 1.523E-01 9.340E-01 5.229E-02 1.507E-03 5.379E-02 1.277E+01 1.567E-01 9.606E-01 5.251E-02 1.490E-03 5.400E-02 1.272E+01 1.573E-01 9.643E-01 5.036E-02 1.499E-03 5.186E-02 1.324E+01 1.510E-01 9.260E-01 ------ From fae.wu@ames-vmsb.ARPA Wed Mar 5 09:22:49 1986 Received: from ames-vmsb.ARPA (ames-vmsb.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA29070; Wed, 5 Mar 86 09:22:42 cst Message-Id: <8603051522.AA29070@anl-mcs.ARPA> Date: 5 Mar 86 07:15:00 PST From: fae.wu@ames-vmsb.ARPA Subject: Re: To: dongarra@anl-mcs.ARPA Reply-To: fae.wu@ames-vmsb.ARPA Status: RO Jack, Sorry, they are for the CRAY-2 running CFT ver 2.63. Pretty disappointing. Slower than a CRAY-1. Alex. ------ From eugene@AMES-NAS.ARPA Fri Mar 7 11:48:09 1986 Received: from ames-nas.ARPA (ames-nas.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02089; Fri, 7 Mar 86 11:48:04 cst Date: Fri, 7 Mar 86 09:52:00 pst From: eugene@AMES-NAS.ARPA (Eugene Miya) Message-Id: <8603071752.AA09206@ames-nas.ARPA> Received: by ames-nas.ARPA; Fri, 7 Mar 86 09:52:00 pst To: dongarra@anl-mcs.ARPA Status: R Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid      resid           machep         x(1)          x(n)
 1.81127134E+00  2.57216470E-12  7.10542736E-15  1.00000000E+00  1.00000000E+00

times are reported for matrices of order 100
   sgefa      sgesl      total     mflops       unit      ratio
times for array with leading dimension of 201
 5.154E-02  1.475E-03  5.301E-02  1.295E+01  1.544E-01  9.467E-01
 5.159E-02  1.485E-03  5.308E-02  1.294E+01  1.546E-01  9.478E-01
 5.179E-02  1.474E-03  5.326E-02  1.289E+01  1.551E-01  9.511E-01
 5.100E-02  1.485E-03  5.248E-02  1.308E+01  1.529E-01  9.372E-01
times for array with leading dimension of 200
 5.168E-02  1.485E-03  5.316E-02  1.292E+01  1.548E-01  9.494E-01
 5.154E-02  1.501E-03  5.305E-02  1.294E+01  1.545E-01  9.472E-01
 5.156E-02  1.476E-03  5.304E-02  1.295E+01  1.545E-01  9.472E-01
 5.149E-02  1.501E-03  5.299E-02  1.296E+01  1.543E-01  9.463E-01

From barton@AMES-NAS.ARPA Fri Mar 7 11:54:07 1986
Received: from ames-nas.ARPA (ames-nas.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA02154; Fri, 7 Mar 86 11:54:00 cst
Date: Fri, 7 Mar 86 09:57:52 pst
From: barton@AMES-NAS.ARPA (John Barton)
Message-Id: <8603071757.AA09278@ames-nas.ARPA>
Received: by ames-nas.ARPA; Fri, 7 Mar 86 09:57:52 pst
To: dongarra@anl-mcs.ARPA
Subject: Re: timings on the cray-2
Cc: +out@AMES-NAS.ARPA, blaylock@AMES-NAS.ARPA, eugene@AMES-NAS.ARPA,
        rbailey@AMES-NAS.ARPA, stevens@mercury
Status: R

Jack,

The NAS project is glad to be of help to you in working on the Cray-2.
We are quite interested in any timing information which you have.
Please let me know exactly what the results are that you would be
publicizing, and I will be better able to comment on their
distribution.

John Barton

From Lewis_W._Kellum%UMich-MTS.MAILNET@MIT-MULTICS.ARPA Wed Mar 12 12:10:21 1986
Received: from MIT-MULTICS.ARPA (mit-multics.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9)
        id AA01691; Wed, 12 Mar 86 12:08:53 cst
Received: from UMich-MTS.Mailnet by MIT-MULTICS.ARPA with Mailnet
        id <2688487467539634@MIT-MULTICS.ARPA>; 12 Mar 1986 13:04:27 est
Date: Wed, 12 Mar 86 11:28:26 EST
From: Lewis_W._Kellum%UMich-MTS.Mailnet@MIT-MULTICS.ARPA
To: DONGARRA@ANL-MCS.ARPA
Message-Id: <1183783@UMich-MTS.Mailnet>
Status: RO

Dr. Dongarra -

Here are linpack benchmark results for the Apollo dn3000. There was a
problem with the 4th and 8th timing loop in the double precision
version which I have since fixed. However the dn3000 was a demo, and
left before I could rerun the bench. The other times should be fine.
I've included a run on a dn660 for comparison.

- Woody
Kellum@um-mts.
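
A note on reading the tables in this collection: the mflops, unit, and
ratio columns can all be recomputed from the total time. The sketch
below is a minimal illustration, not part of the original
correspondence. It assumes the usual LINPACK operation count of
2n^3/3 + 2n^2 for a system of order n and a fixed reference time of
0.056 seconds for the ratio column; both assumptions agree with the
rows printed in these messages but are not stated in them. The
function name linpack_columns is hypothetical.

```python
def linpack_columns(n, total_seconds, reference_seconds=0.056):
    """Recompute (mflops, unit, ratio) for one row of a LINPACK timing table.

    Assumptions (inferred from the printed rows, not stated in the mail):
      * factor + solve cost 2n^3/3 + 2n^2 floating point operations,
      * 'unit' is 2 / mflops, i.e. microseconds per multiply-add pair,
      * 'ratio' is the total time divided by a fixed reference time.
    """
    ops = 2.0 * n**3 / 3.0 + 2.0 * n**2
    mflops = ops / (1.0e6 * total_seconds)
    unit = 2.0 / mflops
    ratio = total_seconds / reference_seconds
    return mflops, unit, ratio


if __name__ == "__main__":
    # Order 100, total time 5.260 s (first VAX 780 row earlier in this file).
    mflops, unit, ratio = linpack_columns(100, 5.260)
    print(f"{mflops:.3E} {unit:.3E} {ratio:.3E}")
    # prints: 1.305E-01 1.532E+01 9.393E+01
```

Because all three columns follow from the total, an anomalous timing
such as the ones mentioned in the message above carries through to
every derived column of that row.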
Lewis_W._Kellum%UMich-MTS.Mailnet%MIT-MULTICS.ARPA%YALE.ARPA%yal@MIT-Multics.ARPA ------------------------------------------------------------------------------ ----------------------------------------------------------------------------- Apollo dn3000, Double precision w/ -cpu 330 opt -------------------------------------------------- Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. resid resid machep x(1) x(n) 1.64502611E+00 7.30025578E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 1.072E+01 3.241E-01 1.104E+01 6.220E-02 3.215E+01 1.971E+02 1.072E+01 3.299E-01 1.105E+01 6.213E-02 3.219E+01 1.973E+02 1.072E+01 3.240E-01 1.105E+01 6.216E-02 3.217E+01 1.973E+02 1.073E+02 3.233E-01 1.076E+02 6.379E-03 3.135E+02 1.922E+03 times for array with leading dimension of 200 1.081E+01 3.242E-01 1.114E+01 6.165E-02 3.244E+01 1.989E+02 1.081E+01 3.242E-01 1.114E+01 6.165E-02 3.244E+01 1.989E+02 1.081E+01 3.247E-01 1.114E+01 6.165E-02 3.244E+01 1.989E+02 1.083E+01 3.244E+00 1.407E+01 4.879E-02 4.099E+01 2.513E+02 Fortran STOP ------------------------------------------------------------------------ ----------------------------------------------------------------------- Apollo dn3000, Single Precision w/ -cpu 330 opt ----------------------------------------------------------------------- Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.54914700E+00 3.69101800E-05 1.19209300E-07 9.99986200E-01 9.99992500E-01 times are reported for matrices of order 100 sgefa sgesl total mflops unit ratio times for array with leading dimension of 201 9.439E+00 2.852E-01 9.724E+00 7.062E-02 2.832E+01 1.736E+02 9.440E+00 2.850E-01 9.725E+00 7.061E-02 2.833E+01 1.737E+02 9.437E+00 2.850E-01 9.722E+00 7.063E-02 2.832E+01 1.736E+02 9.489E+00 2.847E-01 9.774E+00 7.025E-02 2.847E+01 1.745E+02 times for array with leading dimension of 200 9.436E+00 2.850E-01 9.721E+00 7.064E-02 2.831E+01 1.736E+02 9.437E+00 2.846E-01 9.722E+00 7.063E-02 2.832E+01 1.736E+02 9.435E+00 2.846E-01 9.720E+00 7.065E-02 2.831E+01 1.736E+02 9.448E+00 2.847E-01 9.733E+00 7.055E-02 2.835E+01 1.738E+02 Fortran STOP ------------------------------------------------------------------------- ------------------------------------------------------------------------- Apollo DN660, Double precision,compiled w/ -cpu 660 option ---------------------------------------------------------- $ BENCH/DOUBLE_LINPACK_BENCH_X60 Please send the results of this run to: Jack J. Dongarra Mathematics and Computer Science Division Argonne National Laboratory Argonne, Illinois 60439 Telephone: 312-972-7246 ARPAnet: DONGARRA@ANL-MCS norm. 
resid resid machep x(1) x(n) 1.62238876E+00 7.19979631E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00 times are reported for matrices of order 100 dgefa dgesl total mflops unit ratio times for array with leading dimension of 201 9.731E+00 2.980E-01 1.003E+01 6.847E-02 2.921E+01 1.791E+02 9.783E+00 3.012E-01 1.008E+01 6.809E-02 2.937E+01 1.801E+02 9.814E+00 2.983E-01 1.011E+01 6.790E-02 2.945E+01 1.806E+02 9.776E+00 2.983E-01 1.007E+01 6.816E-02 2.934E+01 1.799E+02 times for array with leading dimension of 200 9.754E+00 2.983E-01 1.005E+01 6.831E-02 2.928E+01 1.795E+02 9.736E+00 3.050E-01 1.004E+01 6.839E-02 2.925E+01 1.793E+02 9.776E+00 2.985E-01 1.007E+01 6.816E-02 2.934E+01 1.799E+02 9.775E+00 3.096E-01 1.008E+01 6.809E-02 2.937E+01 1.801E+02 Fortran STOP ------------------------------------------------------------------ From dgh%dgh@SUN.ARPA Fri Apr 4 16:38:10 1986 Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA02544; Fri, 4 Apr 86 16:38:04 cst Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA28335; Fri, 4 Apr 86 14:35:11 PST Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA08749; Fri, 4 Apr 86 14:34:09 PST Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08920; Fri, 4 Apr 86 14:38:10 PST Date: Fri, 4 Apr 86 14:38:10 PST From: dgh%dgh@SUN.ARPA (David Hough) Message-Id: <8604042238.AA08920@dgh.sun.uucp> To: dongarra@anl-mcs.arpa Subject: linpack results Status: RO single precision Sun-2/50 3.0 f77 -O -fsoft (contrary to what I said earlier this week, I think the software release number should come first rather than last) norm. 
resid            resid            machep           x(1)             x(n)
1.59605336e+00   3.80277633e-05   1.19209290e-07   9.99986172e-01   9.99992490e-01

times are reported for matrices of order 100
sgefa       sgesl       total       mflops      unit        ratio
times for array with leading dimension of 201
5.408e+01   1.600e+00   5.568e+01   1.233e-02   1.622e+02   9.943e+02
5.344e+01   1.640e+00   5.508e+01   1.247e-02   1.604e+02   9.836e+02
5.486e+01   1.660e+00   5.652e+01   1.215e-02   1.646e+02   1.009e+03
5.344e+01   1.612e+00   5.505e+01   1.247e-02   1.603e+02   9.830e+02
times for array with leading dimension of 200
5.334e+01   1.640e+00   5.498e+01   1.249e-02   1.601e+02   9.818e+02
5.340e+01   1.600e+00   5.500e+01   1.248e-02   1.602e+02   9.821e+02
5.342e+01   1.600e+00   5.502e+01   1.248e-02   1.603e+02   9.825e+02
5.517e+01   1.618e+00   5.678e+01   1.209e-02   1.654e+02   1.014e+03

From dgh%dgh@SUN.ARPA Fri Apr 4 13:30:18 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA28499; Fri, 4 Apr 86 13:28:07 cst
Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA27671; Fri, 4 Apr 86 11:24:49 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA07253; Fri, 4 Apr 86 11:22:38 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08618; Fri, 4 Apr 86 11:26:31 PST
Date: Fri, 4 Apr 86 11:26:31 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8604041926.AA08618@dgh.sun.uucp>
To: dongarra@anl-mcs.arpa
Subject: linpack results
Status: RO

Single precision Sun-2/50 + Sky FFP f77 -O -fsky 3.0

norm.
resid            resid            machep           x(1)             x(n)
1.43082821e+00   3.40938568e-05   1.19209290e-07   9.99960721e-01   9.99971569e-01

times are reported for matrices of order 100
sgefa       sgesl       total       mflops      unit        ratio
times for array with leading dimension of 201
1.448e+01   4.800e-01   1.496e+01   4.590e-02   4.357e+01   2.671e+02
1.404e+01   4.400e-01   1.448e+01   4.742e-02   4.217e+01   2.586e+02
1.440e+01   5.400e-01   1.494e+01   4.596e-02   4.351e+01   2.668e+02
1.446e+01   4.380e-01   1.489e+01   4.610e-02   4.338e+01   2.660e+02
times for array with leading dimension of 200
1.412e+01   4.600e-01   1.458e+01   4.710e-02   4.247e+01   2.604e+02
1.410e+01   4.200e-01   1.452e+01   4.729e-02   4.229e+01   2.593e+02
1.452e+01   4.200e-01   1.494e+01   4.596e-02   4.351e+01   2.668e+02
1.407e+01   4.300e-01   1.450e+01   4.737e-02   4.222e+01   2.589e+02

From dgh%dgh@SUN.ARPA Fri Apr 4 14:23:00 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA00331; Fri, 4 Apr 86 14:22:49 cst
Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA27873; Fri, 4 Apr 86 12:19:47 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA07655; Fri, 4 Apr 86 12:18:52 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08686; Fri, 4 Apr 86 12:22:54 PST
Date: Fri, 4 Apr 86 12:22:54 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8604042022.AA08686@dgh.sun.uucp>
To: dongarra@anl-mcs.arpa
Subject: linpack results
Status: RO

double precision Sun-2/50 + Sky FFP f77 -O -fsky 3.0

norm.
resid            resid            machep           x(1)             x(n)
1.50887158e+00   6.69603262e-14   2.22044605e-16   1.00000000e+00   1.00000000e+00

times are reported for matrices of order 100
dgefa       dgesl       total       mflops      unit        ratio
times for array with leading dimension of 201
2.478e+01   7.600e-01   2.554e+01   2.689e-02   7.439e+01   4.561e+02
2.468e+01   7.400e-01   2.542e+01   2.701e-02   7.404e+01   4.539e+02
2.474e+01   7.600e-01   2.550e+01   2.693e-02   7.427e+01   4.554e+02
2.537e+01   7.460e-01   2.612e+01   2.629e-02   7.607e+01   4.664e+02
times for array with leading dimension of 200
2.466e+01   7.600e-01   2.542e+01   2.701e-02   7.404e+01   4.539e+02
2.466e+01   7.600e-01   2.542e+01   2.701e-02   7.404e+01   4.539e+02
2.466e+01   7.600e-01   2.542e+01   2.701e-02   7.404e+01   4.539e+02
2.491e+01   7.480e-01   2.566e+01   2.676e-02   7.474e+01   4.582e+02

From dgh%dgh@SUN.ARPA Fri Apr 4 15:43:42 1986
Received: from sun.arpa (sun.arpa.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA01754; Fri, 4 Apr 86 15:43:31 cst
Received: from snail.sun.uucp (snail-ptp) by sun.arpa (3.2-/SMI-3.0) id AA28134; Fri, 4 Apr 86 13:39:34 PST
Received: from dgh.sun.uucp by snail.sun.uucp (3.2-/SMI-3.0DEV4) id AA08292; Fri, 4 Apr 86 13:38:35 PST
Received: by dgh.sun.uucp (1.1/SMI-3.0DEV3) id AA08816; Fri, 4 Apr 86 13:42:36 PST
Date: Fri, 4 Apr 86 13:42:36 PST
From: dgh%dgh@SUN.ARPA (David Hough)
Message-Id: <8604042142.AA08816@dgh.sun.uucp>
To: dongarra@anl-mcs.arpa
Subject: linpack results
Status: RO

double precision Sun-2/50 3.0 f77 -O -fsoft

norm.
resid            resid            machep           x(1)             x(n)
1.67117300e+00   7.41628980e-14   2.22044605e-16   1.00000000e+00   1.00000000e+00

times are reported for matrices of order 100
dgefa       dgesl       total       mflops      unit        ratio
times for array with leading dimension of 201
1.232e+02   3.700e+00   1.269e+02   5.412e-03   3.696e+02   2.266e+03
1.260e+02   3.880e+00   1.298e+02   5.289e-03   3.782e+02   2.319e+03
1.212e+02   3.780e+00   1.250e+02   5.492e-03   3.641e+02   2.232e+03
1.209e+02   3.622e+00   1.245e+02   5.513e-03   3.628e+02   2.224e+03
times for array with leading dimension of 200
1.192e+02   3.580e+00   1.228e+02   5.592e-03   3.577e+02   2.193e+03
1.201e+02   3.560e+00   1.236e+02   5.554e-03   3.601e+02   2.208e+03
1.188e+02   3.560e+00   1.224e+02   5.612e-03   3.564e+02   2.185e+03
1.205e+02   3.620e+00   1.241e+02   5.532e-03   3.615e+02   2.217e+03

From unido!ztivax!schnepf@seismo.CSS.GOV Tue Apr 8 03:22:42 1986
Received: from seismo.CSS.GOV (seismo.css.gov.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA07950; Tue, 8 Apr 86 03:21:58 cst
Received: from unido.UUCP by seismo.CSS.GOV with UUCP; Tue, 8 Apr 86 03:28:43 EST
Received: by unido.uucp with uucp; Tue, 8 Apr 86 09:58:29 -0200
Return-Path:
Received: by ztivax.LOCAL (4.12/4.8) id AA14752; Tue, 8 Apr 86 08:44:55 -0100 (MET)
Date: Tue, 8 Apr 86 08:44:55 -0100
From: unido!ztivax!schnepf@seismo.CSS.GOV (Eric Schnepf)
Posted-Date: Tue, 8 Apr 86 08:44:55 -0100
Message-Id: <8604080744.AA14752@ztivax.LOCAL>
To: dongarra@anl-mcs.arpa
Status: RO

Dear Jack,

below I'm sending you the results I obtained on the VP 100 and VP 50
for the LINPACK benchmarks (100 equations and 300 equations).
Additionally, the result for 1000 equations on a VP 200 (peak
performance) is appended.
I hope this information is satisfactory for your performance paper.
If you need any further information please let me know.

Regards,
Eric Schnepf
Siemens AG, Vector Processor Systems
Munich, Germany
Arpanet: Na.schnepf at su-score

SOLVING A SYSTEM OF 100 LINEAR EQUATIONS WITH LINPACK IN FULL PRECISION
(FORTRAN77/VP, ROLLED BLAS):

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
NORM. RESID      RESID            MACHEP           X(1)             X(N)
2.65186135E+00   1.17683641E-13   2.22044605E-16   1.00000000E+00   1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA       SGESL       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
4.201E-02   1.693E-03   4.370E-02   1.571E+01   1.273E-01   7.803E-01
4.198E-02   1.693E-03   4.367E-02   1.572E+01   1.272E-01   7.799E-01
4.195E-02   1.693E-03   4.365E-02   1.573E+01   1.271E-01   7.794E-01
4.197E-02   1.576E-03   4.355E-02   1.577E+01   1.268E-01   7.776E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
4.203E-02   1.667E-03   4.370E-02   1.571E+01   1.273E-01   7.803E-01
4.203E-02   1.667E-03   4.370E-02   1.571E+01   1.273E-01   7.803E-01
4.203E-02   1.667E-03   4.370E-02   1.571E+01   1.273E-01   7.803E-01
4.204E-02   1.576E-03   4.361E-02   1.574E+01   1.270E-01   7.788E-01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
NORM. RESID      RESID            MACHEP           X(1)             X(N)
2.65186135E+00   1.17683641E-13   2.22044605E-16   1.00000000E+00   1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA       SGESL       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
4.883E-02   1.927E-03   5.076E-02   1.353E+01   1.478E-01   9.063E-01
4.878E-02   1.901E-03   5.068E-02   1.355E+01   1.476E-01   9.049E-01
4.878E-02   1.901E-03   5.068E-02   1.355E+01   1.476E-01   9.049E-01
4.879E-02   1.797E-03   5.058E-02   1.357E+01   1.473E-01   9.033E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
4.875E-02   1.901E-03   5.065E-02   1.356E+01   1.475E-01   9.045E-01
4.875E-02   1.901E-03   5.065E-02   1.356E+01   1.475E-01   9.045E-01
4.875E-02   1.901E-03   5.065E-02   1.356E+01   1.475E-01   9.045E-01
4.883E-02   1.805E-03   5.064E-02   1.356E+01   1.475E-01   9.042E-01
-----------------------------------------------------------------------
SOLVING A SYSTEM OF 100 LINEAR EQUATIONS WITH LINPACK IN FULL PRECISION
(FORTRAN77/VP, ROLLED BLAS WITH COMPILER DIRECTIVES):

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
NORM.
RESID            RESID            MACHEP           X(1)             X(N)
2.65186135E+00   1.17683641E-13   2.22044605E-16   1.00000000E+00   1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA       SGESL       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
3.924E-02   1.589E-03   4.083E-02   1.682E+01   1.189E-01   7.292E-01
3.924E-02   1.562E-03   4.081E-02   1.683E+01   1.189E-01   7.287E-01
3.922E-02   1.562E-03   4.078E-02   1.684E+01   1.188E-01   7.282E-01
3.922E-02   1.461E-03   4.068E-02   1.688E+01   1.185E-01   7.265E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
3.932E-02   1.562E-03   4.089E-02   1.679E+01   1.191E-01   7.301E-01
3.927E-02   1.536E-03   4.081E-02   1.683E+01   1.189E-01   7.287E-01
3.924E-02   1.536E-03   4.078E-02   1.684E+01   1.188E-01   7.282E-01
3.928E-02   1.453E-03   4.073E-02   1.686E+01   1.186E-01   7.273E-01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
NORM. RESID      RESID            MACHEP           X(1)             X(N)
2.65186135E+00   1.17683641E-13   2.22044605E-16   1.00000000E+00   1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA       SGESL       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 201
4.237E-02   1.719E-03   4.409E-02   1.557E+01   1.284E-01   7.873E-01
4.237E-02   1.693E-03   4.406E-02   1.558E+01   1.283E-01   7.868E-01
4.240E-02   1.667E-03   4.406E-02   1.558E+01   1.283E-01   7.868E-01
4.239E-02   1.581E-03   4.397E-02   1.562E+01   1.281E-01   7.852E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
4.258E-02   1.719E-03   4.430E-02   1.550E+01   1.290E-01   7.910E-01
4.255E-02   1.693E-03   4.424E-02   1.552E+01   1.289E-01   7.901E-01
4.253E-02   1.667E-03   4.419E-02   1.554E+01   1.287E-01   7.892E-01
4.254E-02   1.581E-03   4.412E-02   1.556E+01   1.285E-01   7.879E-01
-----------------------------------------------------------------------
SOLVING A SYSTEM OF 100 LINEAR EQUATIONS WITH LINPACK IN HALF PRECISION
(FORTRAN77/VP, ROLLED BLAS):

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
NORM.
RESID            RESID            MACHEP           X(1)             X(N)
3.68801689E+00   7.03105237E-04   9.53674316E-07   9.99935091E-01   9.99949932E-01

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA       SGESL       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 202
4.029E-02   1.615E-03   4.190E-02   1.639E+01   1.220E-01   7.482E-01
4.013E-02   1.615E-03   4.174E-02   1.645E+01   1.216E-01   7.454E-01
4.013E-02   1.615E-03   4.174E-02   1.645E+01   1.216E-01   7.454E-01
4.013E-02   1.508E-03   4.164E-02   1.649E+01   1.213E-01   7.435E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
4.016E-02   1.589E-03   4.174E-02   1.645E+01   1.216E-01   7.454E-01
4.013E-02   1.615E-03   4.174E-02   1.645E+01   1.216E-01   7.454E-01
4.013E-02   1.615E-03   4.174E-02   1.645E+01   1.216E-01   7.454E-01
4.012E-02   1.505E-03   4.163E-02   1.650E+01   1.212E-01   7.433E-01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
NORM. RESID      RESID            MACHEP           X(1)             X(N)
3.68801689E+00   7.03105237E-04   9.53674316E-07   9.99935091E-01   9.99949932E-01

TIMES ARE REPORTED FOR MATRICES OF ORDER 100
SGEFA       SGESL       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 202
4.758E-02   1.875E-03   4.945E-02   1.389E+01   1.440E-01   8.831E-01
4.747E-02   1.823E-03   4.930E-02   1.393E+01   1.436E-01   8.803E-01
4.750E-02   1.823E-03   4.932E-02   1.392E+01   1.437E-01   8.808E-01
4.749E-02   1.737E-03   4.923E-02   1.395E+01   1.434E-01   8.790E-01
TIMES FOR ARRAY WITH LEADING DIMENSION OF 200
4.753E-02   1.823E-03   4.935E-02   1.391E+01   1.437E-01   8.812E-01
4.753E-02   1.823E-03   4.935E-02   1.391E+01   1.437E-01   8.812E-01
4.755E-02   1.849E-03   4.940E-02   1.390E+01   1.439E-01   8.822E-01
4.754E-02   1.740E-03   4.928E-02   1.393E+01   1.435E-01   8.800E-01
-------------------------------------------------------------------
SOLVING A SYSTEM OF 300 LINEAR EQUATIONS USING THE VECTOR UNROLLING
TECHNIQUE: (PROGRAM LUD)

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
                    ------
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
UNROLLING DEPTH  1   TIME = 0.471E-02   MFLOPS =  17.4232   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.430E-02   MFLOPS =  19.1127   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.464E-02   MFLOPS =  17.7168   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.490E-02   MFLOPS =  16.7745   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.531E-02   MFLOPS =  15.4588   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
UNROLLING DEPTH  1   TIME = 0.163E-01   MFLOPS =  40.7230   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.142E-01   MFLOPS =  46.4556   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.152E-01   MFLOPS =  43.5871   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.158E-01   MFLOPS =  41.8636   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.170E-01   MFLOPS =  38.8550   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
UNROLLING DEPTH  1   TIME = 0.367E-01   MFLOPS =  61.0168   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.303E-01   MFLOPS =  73.9868   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.323E-01   MFLOPS =  69.3889   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.333E-01   MFLOPS =  67.2714   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.359E-01   MFLOPS =  62.3896   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
UNROLLING DEPTH  1   TIME = 0.686E-01   MFLOPS =  77.4339   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.533E-01   MFLOPS =  99.7255   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.564E-01   MFLOPS =  94.2876   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.587E-01   MFLOPS =  90.5630   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.627E-01   MFLOPS =  84.6984   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
UNROLLING DEPTH  1   TIME = 0.116E+00   MFLOPS =  89.1982   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.850E-01   MFLOPS = 122.1454   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.897E-01   MFLOPS = 115.7636   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.933E-01   MFLOPS = 111.3053   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.100E+00   MFLOPS = 103.7757   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
UNROLLING DEPTH  1   TIME = 0.183E+00   MFLOPS =  97.9239   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.129E+00   MFLOPS = 139.3180   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.134E+00   MFLOPS = 133.5431   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.140E+00   MFLOPS = 127.8239   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.151E+00   MFLOPS = 119.2251   CHECK = 0.100E+01

TIMINGS FOR FUJITSU VP 100 (15 NSEC. CYCLE TIME):
(WITH COMPILER DIRECTIVES)
--------------------------
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
UNROLLING DEPTH  1   TIME = 0.385E-02   MFLOPS =  21.3081   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.346E-02   MFLOPS =  23.7113   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.388E-02   MFLOPS =  21.1651   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.422E-02   MFLOPS =  19.4667   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.471E-02   MFLOPS =  17.4232   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
UNROLLING DEPTH  1   TIME = 0.135E-01   MFLOPS =  48.9617   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.116E-01   MFLOPS =  56.8481   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.129E-01   MFLOPS =  51.1291   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.140E-01   MFLOPS =  47.3206   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.157E-01   MFLOPS =  42.2815   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
UNROLLING DEPTH  1   TIME = 0.317E-01   MFLOPS =  70.5273   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.251E-01   MFLOPS =  89.2760   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.280E-01   MFLOPS =  79.9005   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.299E-01   MFLOPS =  74.7589   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.337E-01   MFLOPS =  66.4910   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
UNROLLING DEPTH  1   TIME = 0.617E-01   MFLOPS =  86.0559   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.445E-01   MFLOPS = 119.3207   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.498E-01   MFLOPS = 106.6589   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.530E-01   MFLOPS = 100.2153   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.597E-01   MFLOPS =  88.9831   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
UNROLLING DEPTH  1   TIME = 0.107E+00   MFLOPS =  96.9865   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.732E-01   MFLOPS = 141.9740   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.804E-01   MFLOPS = 129.2319   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.861E-01   MFLOPS = 120.5592   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.960E-01   MFLOPS = 108.2247   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
UNROLLING DEPTH  1   TIME = 0.172E+00   MFLOPS = 104.4982   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.113E+00   MFLOPS = 159.3439   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.122E+00   MFLOPS = 146.9791   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.131E+00   MFLOPS = 137.4838   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.145E+00   MFLOPS = 124.0296   CHECK = 0.100E+01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
                    -----
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
UNROLLING DEPTH  1   TIME = 0.549E-02   MFLOPS =  14.9460   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.529E-02   MFLOPS =  15.5350   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.547E-02   MFLOPS =  15.0171   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.581E-02   MFLOPS =  14.1417   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.620E-02   MFLOPS =  13.2504   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
UNROLLING DEPTH  1   TIME = 0.207E-01   MFLOPS =  31.9236   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.189E-01   MFLOPS =  35.0017   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.191E-01   MFLOPS =  34.6673   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.199E-01   MFLOPS =  33.1739   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.208E-01   MFLOPS =  31.8836   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
UNROLLING DEPTH  1   TIME = 0.514E-01   MFLOPS =  43.5746   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.437E-01   MFLOPS =  51.1743   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.432E-01   MFLOPS =  51.8846   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.448E-01   MFLOPS =  49.9261   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.469E-01   MFLOPS =  47.7626   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
UNROLLING DEPTH  1   TIME = 0.103E+00   MFLOPS =  51.6030   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.827E-01   MFLOPS =  64.2236   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.813E-01   MFLOPS =  65.3760   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.833E-01   MFLOPS =  63.8016   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.867E-01   MFLOPS =  61.2729   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
UNROLLING DEPTH  1   TIME = 0.182E+00   MFLOPS =  57.1933   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.142E+00   MFLOPS =  73.0684   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.137E+00   MFLOPS =  76.0069   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.140E+00   MFLOPS =  74.3906   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.145E+00   MFLOPS =  71.7668   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
UNROLLING DEPTH  1   TIME = 0.294E+00   MFLOPS =  61.1622   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.225E+00   MFLOPS =  79.7273   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.215E+00   MFLOPS =  83.6649   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.218E+00   MFLOPS =  82.3360   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.225E+00   MFLOPS =  79.6352   CHECK = 0.100E+01

TIMINGS FOR FUJITSU VP 50 (15 NSEC. CYCLE TIME):
(WITH COMPILER DIRECTIVES)
--------------------------
LU DECOMPOSITION TIMING

DP SIZE OF THE ARRAYS 301 AND ORDER IS 50
UNROLLING DEPTH  1   TIME = 0.445E-02   MFLOPS =  18.4421   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.409E-02   MFLOPS =  20.0866   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.453E-02   MFLOPS =  18.1241   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.495E-02   MFLOPS =  16.5979   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.549E-02   MFLOPS =  14.9460   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 100
UNROLLING DEPTH  1   TIME = 0.170E-01   MFLOPS =  39.0341   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.147E-01   MFLOPS =  45.1354   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.162E-01   MFLOPS =  40.7886   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.176E-01   MFLOPS =  37.7021   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.193E-01   MFLOPS =  34.2469   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 150
UNROLLING DEPTH  1   TIME = 0.424E-01   MFLOPS =  52.7764   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.341E-01   MFLOPS =  65.7284   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.376E-01   MFLOPS =  59.4967   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.398E-01   MFLOPS =  56.1914   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.442E-01   MFLOPS =  50.6020   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 200
UNROLLING DEPTH  1   TIME = 0.863E-01   MFLOPS =  61.5872   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.653E-01   MFLOPS =  81.3874   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.705E-01   MFLOPS =  75.3187   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.760E-01   MFLOPS =  69.9480   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.828E-01   MFLOPS =  64.1631   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 250
UNROLLING DEPTH  1   TIME = 0.154E+00   MFLOPS =  67.6405   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.113E+00   MFLOPS =  92.2313   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.120E+00   MFLOPS =  86.2099   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.128E+00   MFLOPS =  80.8286   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.140E+00   MFLOPS =  74.3764   CHECK = 0.100E+01

DP SIZE OF THE ARRAYS 301 AND ORDER IS 300
UNROLLING DEPTH  1   TIME = 0.250E+00   MFLOPS =  71.7312   CHECK = 0.100E+01
UNROLLING DEPTH  2   TIME = 0.180E+00   MFLOPS =  99.6071   CHECK = 0.100E+01
UNROLLING DEPTH  4   TIME = 0.191E+00   MFLOPS =  94.2432   CHECK = 0.100E+01
UNROLLING DEPTH  8   TIME = 0.202E+00   MFLOPS =  89.0342   CHECK = 0.100E+01
UNROLLING DEPTH 16   TIME = 0.219E+00   MFLOPS =  81.9932   CHECK = 0.100E+01
-------------------------------------------------------------------
TOWARD PEAK PERFORMANCE (1000 LINEAR EQUATIONS):
(OPTIMIZED SUBROUTINE)

TIMINGS FOR FUJITSU VP 200 (15 NSEC. CYCLE TIME):
                    ------
NORM. RESID      RESID            MACHEP           X(1)             X(N)
1.39417418E+02   6.19081175E-11   2.22044659E-16   1.00000000E+00   1.00000000E+00

TIMES ARE REPORTED FOR MATRICES OF ORDER 1000
FACTOR      SOLVE       TOTAL       MFLOPS      UNIT        RATIO
TIMES FOR ARRAY WITH LEADING DIMENSION OF 1001
1.568E+00   1.557E-02   1.584E+00   4.222E+02   4.737E-03   2.828E+01

From GREENFIELD@MARLBORO.DEC.COM Tue Apr 8 16:30:09 1986
Received: from MARLBORO.DEC.COM (marlboro.dec.com.ARPA) by anl-mcs.ARPA (4.12/4.9) id AA19266; Tue, 8 Apr 86 16:29:53 cst
Date: 8 Apr 1986 1726-EST
From: GREENFIELD@MARLBORO.DEC.COM
To: dongarra@anl-mcs, greenfield@dvinci
Subject: ["Mike Greenfield (mr3-1/e13,dtn297-7481)" : Linpack Results for Jack Dongarra]
Message-Id: <"MS11(5146)+GLXLIB0(4)-4" 12197281171.16.279.33826 at MARLBORO.DEC.COM>
Status: RO

Not sure if the mail system took the last message - try again....

- - - - - - - Begin message from: "Mike Greenfield (mr3-1/e13,dtn297-7481)"
Sender: GREENFIELD@DVINCI
Date: 8 Apr 1986 1716-EST
From: "Mike Greenfield (mr3-1/e13,dtn297-7481)"
To: GREENFIELD@MARKET
Subject: Linpack Results for Jack Dongarra
Mailed to: MARKET::GREENFIELD

Jack;

Here are the latest VAX cpu results, including for the new VAX 8500.
Note that since the VAX 8300 and 8800 can only give up to 100% of a
processor to a process, there is no benefit for this benchmark in the
single stream case.
If you send me your driver program (mentioned in the appendix), we will
see what the multi-thread version does.

regards,
Mike

(Mflops are correct - other ratios have a new formula, so they may need
to be recomputed)

Solving a system of linear equations with LINPACK in full precision.

Computer       Compiler               Ratio   MFLOPS   Time    Unit
                                                       secs    usecs
VAX 8800(UP)   VMS v4 (coded BLAS)    11      1.13     .606    1.76
VAX 8800(UP)   VMS v4                 13      .970     .708    2.06
VAX 8650       VMS v4 (coded BLAS)    13      .96      .715    2.08
VAX 8500       VMS v4 (coded BLAS)    16      .763     .900    2.62
VAX 8650       VMS v4                 17      .70      .975    2.84
VAX 8600       VMS v4 (coded BLAS)    19      .66      1.04    3.03
VAX 8500       VMS v4                 19      .652     1.05    3.07
VAX 8600       VMS v4                 25      .49      1.41    4.11
VAX 785        VMS v4 (coded BLAS)    54      .225     3.01    8.77
VAX 785        VMS v4                 63      .196     3.50    10.2
VAX 8200       VMS v4 (coded BLAS)    68      .180     3.81    11.1
VAX 780        VMS v4 (coded BLAS)    74      .166     4.12    12.0
uVAX II        VMS v4 (coded BLAS)    79      .156     4.40    12.8
VAX 8200       VMS v4                 81      .151     4.54    13.2
VAX 780        VMS v4                 89      .138     4.96    14.4
uVAX II        VMS v4                 97      .126     5.45    15.9

Note: 88UP is VAX 8800 using only a single CPU
      8300 is same as 8200 since only one cpu is used
======================================================================
Solving a System of Linear Equations with LINPACK in Single Precision.
Computer       Compiler               Ratio   MFLOPS   Time    Unit
                                                       secs    usecs
VAX 8650       VMS v4 (coded BLAS)    6.4     1.9      .361    1.05
VAX 88UP       VMS v4 (coded BLAS)    7.4     1.65     .416    1.21
VAX 88UP       VMS v4                 9.1     1.35     .509    1.48
VAX 8650       VMS v4                 9.7     1.3      .545    1.59
VAX 8600       VMS v4 (coded BLAS)    9.8     1.3      .546    1.59
VAX 8500       VMS v4 (coded BLAS)    13      .958     .717    2.09
VAX 8600       VMS v4                 14      .88      .780    2.27
VAX 8500       VMS v4                 15      .800     .859    2.50
VAX 785        VMS v4 (coded BLAS)    24      .511     1.34    3.91
VAX 785        VMS v4                 31      .398     1.72    5.02
VAX 780        VMS v4 (coded BLAS)    36      .339     2.02    5.88
VAX 8200       VMS v4 (coded BLAS)    40      .307     2.23    6.51
VAX 780        VMS v4                 49      .250     2.74    7.98
VAX 8200       VMS v4                 54      .227     3.03    8.82
uVAX II        VMS v4 (coded BLAS)    54      .227     3.04    8.81
uVAX II        VMS v4                 70      .174     3.95    11.5

Note: 88UP is VAX 8800 using only a single CPU
      8300 is same as 8200 since only one cpu is used

Posted: Tue 8-Apr-1986 15:36 Eastern Standard Time
To: RHEA::DECWRL::"""dongarra@anl-mcs"""
--------
- - - - - - - End forwarded message
--------
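
Editorial note on the derived columns: the mflops, unit and ratio figures in the listings above appear to follow from the total time alone. The sketch below is a minimal Python illustration, not the benchmark driver itself. It assumes an operation count of 2n^3/3 + 2n^2 for a system of order n, a unit defined as microseconds per pair of floating-point operations, and a reference time of 0.056 seconds for the ratio column. These assumptions reproduce the printed figures to the digits shown; the names linpack_ops, linpack_metrics and REFERENCE_SECONDS are hypothetical and chosen only for this example.

```python
# Assumed reference time in seconds; chosen because it reproduces
# the "ratio" column of the listings (total / 0.056).
REFERENCE_SECONDS = 0.056


def linpack_ops(n):
    """Assumed floating-point operation count for factor + solve of order n."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


def linpack_metrics(total_seconds, n=100):
    """Return (mflops, unit, ratio) derived from the total time in seconds.

    mflops: millions of floating-point operations per second
    unit:   microseconds per pair of floating-point operations (2 / mflops)
    ratio:  total time divided by the assumed reference time
    """
    mflops = linpack_ops(n) / (1.0e6 * total_seconds)
    unit = 2.0 / mflops
    ratio = total_seconds / REFERENCE_SECONDS
    return mflops, unit, ratio


if __name__ == "__main__":
    # First Apollo single-precision row: total = 9.724E+00 seconds.
    print(linpack_metrics(9.724))
    # Fujitsu VP 200 row for order 1000: total = 1.584E+00 seconds.
    print(linpack_metrics(1.584, n=1000))
```

Under the same assumptions, the Ratio and Unit columns of the VAX tables can be recomputed from the Time column, which may help with the recomputation Mike Greenfield mentions; for example, a time of .606 seconds gives roughly 1.13 MFLOPS, a unit of about 1.76 and a ratio of about 11.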