================================================== NUMERICAL ANALYSIS TITLES -- Volume 3, Number 1 April 1, 1985 (No foolin') ================================================== Note: Responding to a request by John Lewis, and for the benefit of the SIGNUM Newsletter, these titles and abstracts are now ordered alphabetically by insti- tution. ##### ARGONNE ##### All Argonne reports listed here may be obtained by writing to the person listed as submitting the report at the following address: Argonne National Laboratory Mathematics and Computer Science Division 9700 S. Cass Ave. Argonne, IL 60439 ################### [1] Proceedings for the Argonne Workshop on Programming the Next Generation of SuperComputers, October 1984 Report ANL/MCS-TM-34 Argonne National Lab On February 27-28, 1984, an Argonne National Laboratory workshop titled "Programming the Next Generation of Supercomputers" was held at the Amfac Hotel in Albuquerque, New Mexico. The purpose of the workshop was to gather together researchers from the DOE laboratories who were using multiprocessors and to have these investigators relate their experiences with the programming languages supplied with these processors. The goal was to identify those areas where additional language primitives would aid in the development of portable software. Upon the identification of such needs, extensions to the Fortran language would then be proposed to such groups as the DOE Language Working Group and the X3J3 Fortran committee for further study. The topics presented at the workshop included reviews of existing languages that address concurrency and parallel computation, presentations on portable programming methods for parallel processing, users' views of what is needed to program multiprocessors, and the supplier's views of what is needed to drive their processors. This report collects the presentations given at this workshop and records the organized discussion held after each presentation and each session. In most cases, the materials reproduced here are the slides given by each author at the workshop. In some cases, papers were prepared by the authors and included in this report. A summary of the conclusions reached by the workshop is also included. Submitted by: Brian T. Smith --------------- [2] Iain S. Duff Parallel implementation of multifrontal schemes ANL/MCS-TM-49, March, 1985 We consider the direct solution of large sparse sets of linear equations in an MIMD environment. We base our algorithm on a multifrontal approach and show how to distribute tasks among processors according to an elimination tree which can be automat- ically generated from any pivot strategy. We study organiza- tional aspects of the scheme for shared memory multiprocessor configurations and consider implications for multiprocessors with local memories and a communication network. Submitted by: Iain S. Duff --------------- [3] I.S. Duff, A.M. Erisman, C.W. Gear, and J.K. Reid Some remarks on inverses of sparse matrices ANL/MCS-TM-51, March, 1985 We discuss some of the properties of the structure of the inverse of a sparse matrix. In particular, we show that the structural inverse of an irreducible sparse matrix is always full. We first show that Gaussian elimination always yields a full structural inverse and then show the result directly. We believe that our lemmas concerning Gaussian elimination are of interest in their own right. Submitted by: Iain S. Duff --------------- [4] I.S. Duff and C.W. Gear Computing the structural index ANL/MCS-TM-50, March, 1985 The index of many differential/algebraic equations is deter- mined by the structure of the system, that is, by the pattern of nonzero entries in the Jacobians. It is important to be able to determine if the index does not exceed two because such problems can be solved numerically. In this paper we present an algorithm that determines if the index is one, two, or greater, based only on the structure. The algorithm can be exponential in its execu- tion time: we do not know whether it is possible to get an asymp- totically faster algorithm. However, in many practical problems, this algorithm will execute in polynomial time. Submitted by: Iain S. Duff --------------- [5] Jack J. Dongarra, Linda Kaufman and, Sven Hammarling, Squeezing the Most out of Eigenvalue Solvers on High-Performance Computers ANL-MCS/TM 46, January, 1985 This paper describes modifications to many of the standard algorithms used in computing eigenvalues and eigenvectors of matrices. These modifications can dramatically increase the performance of the underlying software on high performance computers without resorting to assembler language, without significantly influencing the floating point operation count, and without affecting the roundoff error properties of the algorithms. The techniques are applied to a wide variety of algorithms and are beneficial in various architectural settings. Submitted by: Jack J. Dongarra --------------- [6] Jack J. Dongarra, Jeremy Du Croz, Sven Hammarling, and Richard J. Hanson A Proposal for an Extended Set of Basic Linear Algebra Subprograms ANL-MCS/TM 41, December, 1984 This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions proposed are targeted at matrix vector operations which should provide for more efficient and portable implementations of algorithms for high performance computers. Submitted by: Jack J. Dongarra --------------- [7] J.J. Dongarra, A.H. Sameh, and D.C. Sorensen Implementation of Some Concurrent Algorithms for Matrix Factorization ANL-MCS/TM 25, October, 1984 Three parallel algorithms for computing the QR-factorization of a matrix are presented. The discussion is primarily concerned with implementation of these algorithms on a computer that supports tightly coupled parallel processes sharing a large common memory. The particular experiments were conducted on the Denelcor HEP computer. The three algorithms are a Householder method based upon high-level modules, a Windowed Householder method that avoids fork-join synchronization, and a Pipelined Givens method that is a variant of the data-flow type algorithms offering large enough granularity to mask synchronization costs. Computational results indicate that the Pipelined Givens method is preferred and that this is primarily due to the number of array references required by the various algorithms. Submitted by: Jack J. Dongarra --------------- [8] J.J. Dongarra and D.C. Sorensen A Parallel Linear Algebra Library for the Denelcor HEP ANL-MCS/TM 33, October, 1984 This paper discusses the implementation of a library of algorithms for problems in linear algebra on the Denelcor HEP. The package includes some of the most heavily used subroutines from LINPACK, that is, solution of linear systems based on LU, Cholesky, and QR factorizations as well as the appropriate triangular solvers. The concept followed is to code these routines in terms of high-level modules, which provides a vehicle to achieve both transportability and efficiency across a wide range of architectures. We discuss this concept in the context of a numerical linear algebra software library which is adaptable to highly parallel computing systems. However, the concept is expected to applicable to other libraries as well. Submitted by: Jack J. Dongarra --------------- [9] M. T. Heath and D.C. Sorensen A Pipelined Givens Method for Computing the QR Factorization of a Sparse Matrix ANL-MCS/TM 47, February, 1985 This paper discusses an extension of the Pipelined Givens method for computing the QR factorization of a real m x n matrix to the case in which the matrix is sparse. When restricted to one process, the algorithm performs the same computation as the serial sparse Givens algorithm of George and Heath. Our implementation is compatible with the data structures used in SPARSPAK. The pipelined algorithm is well suited to parallel computers having globally shared memory and low overhead synchronization primitives, such as the Denelcor HEP, for which computational results are presented. We point out certain synchronization problems that arise in the adaptation to the sparse setting and discuss the effect on parallel speedup of accessing a serial data file. Submitted by: Jack J. Dongarra --------------- ##### BELL LABS ##### All Bell Laboratories reports were submitted by Eric Grosse <{ucbvax,ihnp4}!research!ehg> and they may be obtained by writing to him at: AT&T Bell Labs 2c471 Murray Hill, NJ 07974 or by requesting them of him through electronic mail. ##################### [10] Eric Grosse AUTOMATIC CHOICE OF COLORS FOR LEVEL PLOTS TR/NAMs 85-1 AT&T Bell Labs Color level plots are becoming popular in scientific computing but often require manually adjusting the color codes. A formula, fit by Friele, MacAdam, and Chickering to ellipses of measured color discrimination, provides the basis for a spectrum of equally spaced colors. The color choice problem is formulated as a nonlinear least squares optimization and solved numerically. A table can be precomputed to adjust the hue coordinate in the hexcone model of Alvy Ray Smith, just as compensation tables are used for gamma correction of CRTs. --------------- [11] William M. Coughran, Jr. and Eric Grosse THE GROVE EDITOR TR/NAMs 83-3 We describe a structure editor, grove, which manipulates groves of trees rather than text files. Various commands for searching and modifying nodes and subtrees are described. We also document some routines for performing basic operations on trees, which facilitate the construction of program- transformation systems. --------------- [12] William M. Coughran, Jr. and Eric Grosse THE PINE PROGRAMMING LANGUAGE TR/NAMs 83-4 Pine is a language intended to bring to scientific computation convenient program transformation, array operations, and simplified calls to libraries. Pine programs can be translated into FORTRAN. The key features added to FORTRAN are array operations, heap-based dynamic memory allocation, more convenient input/output, generic functions, operators of the form +=, expression-language syntax, and control structures. The lexical form of the language, based on a screen editor of trees, is designed to simplify program transformations. This manual describes pine and its processor to those already familiar with the issues involved in compiling into FORTRAN. --------------- [13] William M. Coughran, Jr., Eric Grosse, and Donald J. Rose VARIATION DIMINISHING SPLINES IN SIMULATION TR/NAMs 84-3 Variation diminishing splines provide an effective tool for modeling active elements in circuit simulation. Using quadratic tensor product splines and maintaining uniform sampling at the boundary by linear extension of the data yields an algorithm that is smooth (unlike simple table lookup), shape preserving (unlike simple interpolation), and efficient (30 microseconds to evaluate on a Cray-1A). The rate of convergence to function and derivative values is O(h**2). --------------- [14] Brenda S. Baker, Eric Grosse, and Conor S. Rafferty NON-OBTUSE TRIANGULATION OF A POLYGON TR/NAMs 84-4 We show how to triangulate a polygon without using any obtuse triangles. Such triangulations can be used to discretize partial differential equations in a way that guarantees the resulting matrix is Stieltjes, a desirable property both for computation and for theoretical analysis. A simple divide-and-conquer approach would fail because adjacent subproblems cannot be solved independently, but this can be overcome by careful subdivision. Overlay a square grid on the polygon, preferably with the polygon vertices at grid points. Choose boundary cells so they can be triangulated without propagating irregular points to adjacent cells. The remaining interior is rectangular and easily triangulated. Small angles can also be avoided in these constructions. --------------- [15] Jack J. Dongarra and Eric Grosse DISTRIBUTION OF MATHEMATICAL SOFTWARE VIA ELECTRONIC MAIL TR/NAMs 85-2 A large collection of mathematical software is now available via electronic mail. Messages sent to "netlib@anl-mcs" (on the Arpanet/CSNET) or to "research!netlib" (on the UNIX\(tm network) wake up a server that distributes items from the collection. For example the one-line message, "send index", gets a library catalog by return mail. We describe how to use the service and some of the issues in its implementation. --------------- [15] Linda Kaufman, Jack Dongarra, and Sven Hammarling SQUEEZING THE MOST OUT OF EIGENVALUE SOLVERS ON HIGH-PERFORMANCE COMPUTERS TR/NAMs 84-7 This paper describes modifications to many of the standard algorithms used in computing eigenvalues and eigenvectors of matrices which decrease the number of vector-memory references. These modifications can dramatically increase the performance of the underlying software without resorting to assembly language and without significantly influencing the floating point operation count. --------------- ##### CORNELL ##### [16] Franklin T. Luk A Parallel Method for Computing the Generalized Singular Value Decomposition TR/EE-CEG-85-1 We describe a new parallel algorithm for computing the generalized singular value decomposition of two n-by-n matrices, one of which is nonsingular. Our procedure requires O(n) time and one triangular array of O( n**2 ) processors. Submitted by: Franklin T. Luk Obtainable from: Franklin T. Luk School of Electrical Engineering Phillips Hall Cornell University Ithaca, New York 14853 --------------- [17] Thomas F. Coleman, Burton S. Garbow, and Jorge J. More' Software for Estimating Sparse Hessian Matrices Argonne National Laboratory Cornell University MCS/TM-43 CSD/TR 85-660 January, 1985 The solution of a nonlinear optimization problem often requires an estimate of the Hessian matrix for a function f. In large scale problems the Hessian matrix is usually sparse, and then estimation by differences of gradients is attractive because the number of differences can be small compared to the dimension of the problem. In this paper we describe a set of subroutines whose purpose is to estimate the Hessian matrix with the least possible number of gradient evaluations. Submitted by: Thomas F. Coleman Obtainable from: Thomas F. Coleman Cornell University Computer Science Dept. Ithaca, NY 14853 --------------- ##### NYU: COURANT INSTITUTE ##### [18] James Demmel On the Conditioning of Pole Assignment Computer Science Dept. Technical Report, Jan 85 An algorithm for the pole assignment problem has recently been given by Kautsky,Nichols, and Van Dooren. Given the n by n matrix A, the n by m full rank matrix B (m Obtainable from: James Demmel Courant Institute 251 Mercer Str. New York, NY 10012 --------------- ##### DELFT (NETHERLANDS) ##### [19] Henk A. van der Vorst The Performance of FORTRAN Implementations for Preconditioned Conjugate Gradients on Vector Computers We will consider in detail the performance of FORTRAN implemen- tations for the conjugate gradient algorithm, for the solution of large linear systems, on a number of well-known vectorcomputers: CRAY-1, CRAY X-MP and CYBER 205. Lowerbounds on the CPU-times, required for separate parts of the algorithm, are presented and these are compared to the actually observed CPU-times. It appears that these lowerbounds are reasonably sharp. Submitted by: Henk A. van der Vorst Obtainable from: Henk A. van der Vorst Department of Mathematics University of Technology at Delft Julianalaan 132 Delft, the Netherlands --------------- ##### EMORY ##### [20] J.M. Borwein and H. Wolkowicz A Simple Constraint Qualification in Infinite Dimensional Programming A new, simple, constraint qualification for infinite dimensional programs with linear programming type constraints is used to derive the dual program. Applications include a proof of the explicit solution of the best interpolation problem of Micchelli, Smith, Swetits, and Ward. Submitted by: Henry Wolkowicz Obtainable from: J.M. Borwein Dalhousie University Halifax, Nova Scotia B3H 4H8 Canada (or from Henry, but he didn't include his address). --------------- ##### IBM ##### [21] Jane Cullum and Ralph A. Willoughby A Practical Procedure for Computing Eigenvalues of Large Sparse Nonsymmetric Matrices IBM Research Report, RC 10988, 2/14/85 We propose a Lanczos procedure with no reorthogonalization for computing eigenvalues of very large nonsymmetric matrices. (This procedure can also be used to compute corresponding eigenvectors but that issue will be dealt with in a separate paper.) Such computations are for example, central to transient stability analyses of electrical power systems and for determining parameters for iterative schemes for the numerical solution of partial differential equations. Numerical results for several large matrices are presented to demonstrate the effectiveness of this procedure. Submitted by: Jane Cullum Obtainable from: Jane Cullum and Ralph A. Willoughby IBM T.J. Watson Research Center, P.O. Box 218 Yorktown Heights, New York 10598 --------------- ##### ICASE ##### Barbara Kraft ICASE, M/S 132C NASA Langley Research Center Hampton, VA 23665 may be contacted for all ICASE reports, or a message may be sent to: emily@icase.csnet All of these reports were submitted by "Emily". (The list below has been abbreviated to the numerically-oriented reports. Only the original technical-report information is given, although some of these have appeared in proceedings and journals. Sorry. -RHB) ################## [22] Rosen, I. G. Approximation methods for inverse problems involving the vibration of beams with tip bodies. ICASE Report No. 84-51, September 26, 1984, 10 pages. Proc. 23rd IEEE Conference on Decisions and Control, Las Vegas, NV, Decmeber 12-14, 1984. We outline two cubic spline based approximation schemes for the estimation of structural parameters associated with the transverse vibration of flexible beams with tip appendages. The identification problem is formulated as a least squares fit to data subject to the system dynamics which are given by a hybrid system of coupled ordinary and partial differential equations. The first approximation scheme is based upon an abstract semigroup formulation of the state equation while a weak/variational form is the basis for the second. Cubic spline based subspaces together with a Rayleigh-Ritz- Galerkin approach was used to construct sequences of easily solved finite dimensional approximating identification problems. Convergence results are briefly discussed and a numerical example demonstrating the feasibility of the schemes and exhibiting their relative performance for purposes of comparison is provided. --------------- [23] Canuto, C. Boundary Conditions in Chebyshev and Legendre Methods ICASE Report No. 84-52, October 1, 1984, 38 pages We discuss two different ways of treating non-Dirichlet boundary conditions in Chebyshev and Legendre collocation methods for second order differential problems. An error analysis is provided. The effect of preconditioning the corresponding spectral operators by finite difference matrices is also investigated. --------------- [24] Roe, P. L. Generalized formulation of TVD Lax-Wendroff schemes ICASE Report No. 84-53, October 23, 1984, 14 pages The work of Davis which imports the concept of total-variation- diminution (TVD) into non-upwinded, Lax-Wendroff type schemes is reformulated in a way which is easier to analyze. The analysis reveals a class of TVD schemes not observed by Davis. --------------- [25] Bokhari, Shahid H. Shuffle-exchanges on augmented meshes ICASE Report No. 84-54, October 24, 1984, 12 pages Proc. 1985 Intl. Conference on Distributed Computer Systems Denver, CO. Prior research has shown how a mesh connected array of size N = 2**K an integer, can be augmented by adding at most one edge per node such that it can perform a shuffle-exchange of size N/2 in constant time. We now show how to perform a shuffle-exchange of size N on this augmented array in constant time. This is done by combining the available perfect shuffle of size N/2 with the existing nearest neighbor connections of the mesh. By carefully scheduling the different permutations that are composed in order to achieve the shuffle, the time required is reduced to 5 steps, which is shown to be optimal for this network. --------------- [26] Goodman, Jonathan B. and Randall J. LeVeque A geometric approach to high resolution TVD schemes ICASE Report No. 84-55, October 29, 1984, 35 pages We use a geometric approach, similar to Van Leer's MUSCL schemes, to construct a second-order accurate generalization of Godunov's method for solving scalar conservation laws. By making suitable approximations we obtain a scheme which is easy to implement and total variation diminishing. We also investigate the entropy condition from the standpoint of the spreading of rarefaction waves. For Godunov's method we obtain quantitative information on the rate of spreading which explain the kinks in rarefaction waves often observed at the sonic point. --------------- [27] Tin-Chee Wong, C. H. Liu, and James Geer Comparison of uniform perturbation solutions and numerical solutions for some potential flows past slender bodies ICASE Report No. 84-56, October 29, 1984, 32 pages Approximate solutions for potential flow past an axisymmetric slender body and past a thin airfoil are calculated using a uniform perturbation method and then compared with either the exact analytical solution or the solution obtained using a purely numerical method. The perturbation method is based upon a representation of the disturbance flow as the superposition of singularities distributed entirely within the body, while the numerical (panel) method is based upon a distribution of singularities on the surface of the body. It is found that the perturbation method provides very good results for small values of the slenderness ratio and for small angles of attack. Moreover, for comparable accuracy, the perturbation method is simpler to implement, requires less computer memory, and generally uses less computation time than the panel method. In particular, the uniform perturbation method yields good resolution near the regions of the leading and trailing edges where other methods fail or require special attention. --------------- [28] Salas, M. D., S. Abarbanel and D. Gottlieb Multiple steady states for characteristic initial value problems ICASE Report No. 84-57, November 1, 1984, 43 pages The time dependent, isentropic, quasi-one-dimensional equations of gas dynamics and other model equations are considered under the constraint of characteristic boundary conditions. Analysis of the time evolution shows how different initial data may lead to different steady states and how seemingly anamolous behavior of the solution may be resolved. Numerical experimentation using time consistent explicit algorithms verifies the conclusions of the analysis. The use of implicit schemes with very large time steps leads to erroneous results. --------------- [29] Rose, Milton E. A compact finite element method for elastic bodies ICASE Report No. 84-58, November 7, 1984, 38 pages A nonconforming finite element method is described for treating linear equilibrium problems, and a convergence proof showing second order accuracy is given. The close relationship to a related compact finite difference scheme due to Phillips and Rose is examined. A condensation technique is shown to preserve the compactness property and suggests an approach to a certain type of homogenization. --------------- [30] Lustman, Liviu A review of spectral methods ICASE Report No. 84-59, November 19, 1984, 25 pages This paper presents a brief outline of spectral methods for partial differential equations. The basic ideas, together with simple proofs are discussed. An application to potential transonic flow is also reviewed. --------------- [31] Hussaini, M. Y. and W. D. Lakin Existence and non-uniqueness of similarity solutions of a boundary layer problem ICASE Report No. 84-60, November 21, 1984, 17 pages This work considers a Blasius boundary value problem with inhomogeneous lower boundary conditions f(0) = 0 and f'(0) = - L with L strictly positive. The Crocco variable formulation of this problem has a key term which changes sign in the interval of interest. It is shown that solutions of the boundary value problem do not exist for values of L larger than a positive critical value LC. The existence of solutions is proved for 0 < L _ LC by considering an equivalent initial value problem. However, for 0 < L < LC, solutions of the boundary value problem are found to be nonunique. Physically, this non-uniqueness is related to multiple values of the skin friction. --------------- [32] Bayliss, Alvin, Maestrello, Lucio and Turkel, Eli On the interaction of a sound pulse with the shear layer of an axisymmetric jet, III. Nonlinear effects ICASE Report No. 84-61 November 23, 1984, 22 pages The fluctuating field of a jet excited by transient mass injection is simulated numerically. The model is developed by expanding the state vector as a mean state plus a fluctuating state. Nonlinear terms are not neglected, and the effect of nonlinearity is studied. A high order numerical method is used to compute the solution. The results show a significant spectral broadening in the flow field due to the nonlinearity. In addition, large scale structures are broken down into smaller scales. --------------- [33] Swanson, R. C. and Eli Turkel A multistage time-stepping scheme for the Navier-Stokes equations ICASE Report No. 84-62, November 23, 1984, 28 pages A class of explicit multistage time-stepping schemes is used to construct an algorithm for solving the compressible Navier-Stokes equations. Flexibility in treating arbitrary geometries is obtained with a finite-volume formulation. Numerical efficiency is achieved by employing techniques for accelerating convergence to steady state. Computer processing is enhanced through vectorization of the algorithm. The scheme is evaluated by solving laminar and turbulent flows over a flat plate and an NACA 0012 airfoil. Numerical results are compared with theoretical solutions or other numerical solutions and/or experimental data. --------------- [34] Brenier, Yann and Stanley Osher Approximate Riemann solvers and numerical flux functions ICASE Report No. 84-63, December 5, 1984, 33 pages Given a monotone function z(x) which connects two constant states, uL < uR, (uL > uR), we find the unique (up to a constant) convex (concave) flux function, F(u) such that z(x/t) is the physically correct solution to the associated Riemann problem. For z(x/t), an approximate Riemann solver to a given conservation law, we derive simple necessary and sufficient conditions for it to be consistent with any entropy inequality. Associated with any member of a general class of consistent numerical fluxes, hf(uR,uL), we have an approximate Riemann solver defined through z(Z) = (-d/dZ)hfZ(uR,uL), where fZ(u) = f(u) - Zu. We obtain the corresponding F(u) via a Legendre transform and show that it is consistent with all entropy inequalities iff hfZ(uR,uL) is an E flux for each relevant Z. Examples involving commonly used two point numerical fluxes are given, as are comparisons with related work. --------------- [35] Hall, Philip and Mujeeb R. Malik On the instability of a three-dimensional attachment line boundary layer: Weakly nonlinear theory and a numerical approach ICASE Report No. 84-64, December 6, 1984, 65 pages The instability of a three-dimensional attachment line boundary layer is considered in the nonlinear regime. Using weakly nonlinear theory, it is found that, apart from a small interval near the (linear) critical Reynolds number, finite amplitude solutions bifurcate subcritically from the upper branch of the neutral curve. The time-dependent Navier-Stokes equations for the attachment line flow have been solved using a Fourier-Chebyshev spectral method and the subcritical instability is found at wavenumbers that correspond to the upper branch. Both the theory and the numerical calculations show the existence of supercritical finite amplitude (equilibrium) states near the lower branch which explains why the observed flow exhibits a preference for the lower branch modes. The effect of blowing and suction on nonlinear stability of the attachment line boundary layer is also investigated. --------------- [36] Fix, George J. and Manil Suri Three-dimensional mass conserving elements for compressible flows ICASE Report No. 84-65, December 13, 1984, 27 pages A variety of finite element schemes has been used in the numerical approximation of compressible flows particularly in underwater acoustics. In many instances instabilities have been generated due to the lack of mass conservation. In this paper we develop new two- and three-dimensional elements which avoid these problems. --------------- [37] Banks, H. Thomas and James M. Crowley Estimation of material parameters in elastic systems ICASE Report No. 84-66, December 28, 1984, 45 pages In this paper we present theoretical and numerical results for inverse problems involving estimation of spatially varying parameters such as stiffness and damping in distributed models for elastic structures such as Euler-Bernoulli beams. An outline of algorithms we have used and a summary of some of our computational experiences are presented. --------------- [38] Ito, Kazufumi and Robert K. Powers Chansrasekhar equations for infinite dimensional systems ICASE Report No. 84-67, December 31, 1984, 36 pages In this paper we derive the Chandrasekhar equations for linear time invariant systems defined on Hilbert spaces using a functional analytic technique. An important consequence of this is that the solution to the evolutional Riccati equation is strongly differentiable in time and one can define a 'strong' solution of the Riccati differential equation. A detailed discussion on the linear quadratic optimal control problem for hereditary differential systems is also included. --------------- [39] Ortega, James M. and Robert G. Voigt Solution of partial differential equations on vector and parallel computers ICASE Report No. 85-1, January 2, 1985, 173 pages In this paper we review the present status of numerical methods for partial differential equations on vector and parallel computers. A discussion of the relevant aspects of these computers and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial-boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. A brief discussion of application areas utilizing these computers is included. --------------- [40] Voigt, Robert G. Where are the parallel algorithms? ICASE Report No. 85-2, January 16, 1985, 14 pages Four paradigms that can be useful in developing parallel algorithms are discussed. These include computational complexity analysis, changing the order of computation, asynchronous computation, and divide and conquer. Each is illustrated with an example from scientific computation, and it is shown that computational complexity must be used with great care or an inefficient algorithm may be selected. --------------- [41] Gottlieb, D. and E. Tadmor Recovering pointwise values of discontinuous data within spectral accuracy ICASE Report No. 85-3, January 31, 1985, 20 pages We show how pointwise values of a function, f(x), can be accurately recovered either from its spectral or pseudospectral approximations, so that the accuracy solely depends on the local smoothness of f in the neighborhood of the point x. Most notably, given the equidistant function grid values, its intermediate point values are recovered within spectral accuracy, despite the possible presence of discontinuities scattered in the domain. (Recall that the usual spectral convergence rate decelerates otherwise to first order, throughout). To this end we employ a highly oscillatory smoothing kernel, in contrast to the more standard positive unit-mass mollifiers. In particular, post-processing of a stable Fourier method applied to hyperbolic equations with discontinuous data, recovers the exact solution modulo a spectrally small error. Numerical examples are presented. --------------- [42] Lakin, W. D. Differentiating Matrices for Arbitrarily Spaced Grid Points ICASE Report No. 85-5, January 31, 1985, 23 pages Differentiating matrices allow the numerical differentiation of functions defined at points of a discrete grid. The present work considers a type of differentiating matrix based on local approximation on a sequence of sliding subgrids. Previous derivations of this type of matrix have been restricted to grids with uniformly spaced points, and the resulting derivative approximations have lacked precision, especially at endpoints. The new formulation allows grids which have arbitrarily spaced points. It is shown that high accuracy can be achieved through use of differentiating matrices on non-uniform grids which include "near-boundary" points. Use of the differentiating matrix as an operator to solve eigenvalue problems involving ordinary differential equations is also considered. --------------- [43] Fleming, Peter J. SUBOPT - A CAD program for suboptimal linear regulators ICASE Report No. 85-6, February 4, 1985, 16 pages. This interactive software package provides design solutions for both standard linear quadratic regulator (LQR) and suboptimal linear regulator problems. Intended for time-invariant continuous systems the package is easily modified to include sampled-data systems. LQR designs are obtained by established techniques while the large class of suboptimal problems containing controller and/or performance index options is solved using a robust gradient minimization technique. Numerical examples demonstrate features of the package and recent developments are described. --------------- [44] Banks, H. Thomas and I. Gary Rosen A Galerkin method for the estimation of parameters in hybrid systems governing the vibration of flexible beams with tip bodies ICASE Report No. 85-7, February 5, 1985, 44 pages In this report we develop an approximation scheme for the identification of hybrid systems describing the transverse vibrations of flexible beams with attached tip bodies. In particular, problems involving the estimation of functional parameters (spatially varying stiffness and/or linear mass density, temporally and/or spatially varying loads, etc.) are considered. The identification problem is formulated as a least squares fit to data subject to the coupled system of partial and ordinary differential equations describing the transverse displacement of the beam and the motion of the tip bodies respectively. A cubic spline-based Galerkin method applied to the state equations in weak form and the discretization of the admissible parameter space yield a sequence of approximting finite dimensional identification problems. We demonstrate that each of the approximating problems admits a solution and that from the resulting sequence of optimal solutions a convergent subsequence can be extracted, the limit of which is a solution to the original identification problem. The approximating identification problems can be solved using standard techniques and readily available software. Numerical results for a variety of examples are provided. --------------- [45] Tal-Ezer, Hillel The eigenvalues of the pseudospectral Fourier approximation to the operator sin(2x) (d/dx) ICASE Report No. 85-8, February 7, 1985, In this note we show that the eigenvalues of the pseudospectral Fourier approximation to the operator sin(2x) (d/dx) have real parts either equal to zero or equal to plus or minus one. Whereas this does not prove stability for the Fourier method, applied to the hyperbolic equation dU/dt = sin(2x)(dU/dx) - Pi < x < - Pi; it indicates that the growth in time of the numerical solution is essentially the same as that of the solution to the differential equation. --------------- [46] Tal-Ezer, Hillel Spectral methods in time for parabolic problems ICASE Report No. 85-9, February 8, 1985, 19 pages A pseudospectral explicit scheme for solving linear, periodic, parabolic problems is described. It has infinite accuracy both in time and in space. The high accuracy is achieved while the time resolution parameter M (M = 0(1/(delta t)) for time marching algorithm) and the space resolution parameter N (N = 0(1/(delta t)) have to satisfy M = 0(N**(1 + ep)) ep > 0, compared to the common stability condition M = 0(N**2) which has to be satisfied in any explicit finite order time algorthm. --------------- [47] Tadmor, Eitan Complex symmetric matrices with strongly stable iterates ICASE Report No. 85-10, February 14, 1985, 22 pages We study complex-valued symmetric matrices. A simple expression for the spectral norm of such matrices is obtained, by utilizing a unitarily congruent invariant form. Consequently, we provide a sharp criterion for identifying those symmetric matrices whose spectral norm is not exceeding one: such strongly stable matrices are usually sought in connection with convergent difference approximations to partial differential equations. As an example, we apply the derived criterion to conclude the strong stability of a Lax- Wendroff scheme. --------------- [48] Turkel, Eli Algorithms for the Euler and Navier-Stokes equations for supercomputers ICASE Report No. 85-11, February 14, 1985, 27 pages We consider the steady state Euler and Navier-Stokes equations for both compressible and incompressible flow. Methods are found for accelerating the convergence to a steady state. This acceleration is based on preconditioning the system so that it is no longer time consistent. In order that the acceleration technique be scheme independent this preconditioning is done at the differential equation level. Applications are presented for very slow flows and also for the incompressible equations. --------------- [49] Pratt, Terrence W. PISCES: an environment for parallel scientific computation ICASE Report No. 85-12, February 15, 1985, 30 pages PISCES (Parallel Implementation of Scientific Computing EnvironmentS) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a Fortran 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces Fortran, an extension of Fortran 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified "virtual machine" that defines the run-time environment within which Pisces Fortran programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. This paper provides an introduction to the Pisces 1 virtual computer and the Fortran 77 extensions. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several different "granularities" of parallelism in programs and the provision of a "window" mechanism for distributed access to large arrays of data. --------------- [50] Harabetian, Eduard A convergent series expansion for hyperbolic systems of conservation laws ICASE Report No. 85-13, February 20, 1985, 88 pages We consider the discontinuous piecewise analytic initial value problem for a wide class of conservation laws that includes the full three-dimensional Euler equations. The initial interaction at an arbitrary curved surface is resolved in time by a convergent series. Among other features the solution exhibits shock, contact, and expansion waves as well as sound waves propagating on characteristic surfaces. The expansion waves correspond to the one-dimensional rarefactions but have a more complicated structure. The sound waves are generated in place of zero strength shocks, and they are caused by mismatches in derivatives. --------------- [51] Abarbanel, Saul and David Gottlieb Information content in spectral calculations ICASE Report No. 85-14, February 21, 1985, 13 pages Spectral solutions of hyperbolic partial differential equations induce a Gibbs phenomenon near local discontinuities or strong gradients. A procedure is presented for extracting the piecewise smooth behavior of the solution out of the oscillatory numerical solution data. The procedure is developed from the theory of linear partial differential equations. Its application to a non-linear system (the two-dimensional Euler equations of gas dynamics) is shown to be efficacious for the particular situation. --------------- [52] Cox, Christopher L. and George J. Fix On the accuracy of least squares methods in the presence of corner singularities ICASE Report No. 85-15, February 21, 1985, 24 pages This paper treats elliptic problems with corner singularities. Finite element approximations based on variational principles of the least squares type tend to display poor convergence properties in such contexts. Moreover, mesh refinement or the use of special singular elements do not appreciably improve matters. Here we show that if the least squares formulation is done in appropriately weighted space, then optimal convergence results in unweighted spaces like L two. --------------- [53] Maday, Yvon Analysis of spectral operators in one-dimensional domains ICASE Report No. 85-17, February 28, 1985, 26 pages We prove results concerning certain projection operators on the space of all polynomials of degree less than or equal to N with respect to a class of one-dimensional weighted Sobolev spaces. These results are useful in the theory of the approximation of partial differential equations with spectral methods. --------------- [54] Roe, Philip L. Discrete models for the numerical analysis of time-dependent multidimensional gas dynamics ICASE Report No. 85-18, March 1, 1985, 36 pages This paper explores a possible technique for extending to multidimensional flows some of the upwind-differencing methods that have proved highly successful in the one-dimensional case. Attention here is concentrated on the two-dimensional case, and the flow domain is supposed to be divided into polygonal computational elements. Inside each element the flow is represented by a local superposition of elementary solutions consisting of plane waves not necessarily aligned with the element boundaries. --------------- ##### LIVERMORE ##### [55] R. C. Y. Chin, G. W. Hedstrom, F. A. Howes, and J. R. McGraw Parallel computation of multiple-scale problems Report UCRL-92007 Lawrence Livermore National Laboratory Submitted by: Gerald Hedstrom Obtainable from: Gerald Hedstrom Lawrence Livermore National Laboratory Livermore, CA 94550 --------------- ##### MARYLAND ##### All of the Maryland titles were submitted by: Rodrigo Fontecilla They may be obtained from: Rodrigo Fontecilla University of Maryland Dept. of Computer Science College Park, MD 20742 #################### [56] Rodrigo Fontecilla A globally convergent algorithm for nonlinear equality constrained optimization using a double line search. In this paper, we present a globally convergent algo- rithm based on the 2-step algorithms developed by the author. Two different line searches with Goldstein-Armijo conditions are implemented. We show that the horizontal step is a descent direction for the objective function and that the vertical step is a descent direction for the $l sub 2#- norm of the constraints. Further, we show that the total step (horizontal+vertical) is a descent direction for both, the Lagrangian and the $l sub 2#-norm of the constraints. It is shown that the algorithm generates a sequence {$x sub k#} converging in the weak sense, i.e. the gradient of the Lagrangian and the constraints converge to zero. Fast con- vergence is guaranteed close to the solution with a 2-step q-quadratic rate. The BFGS secant update is implemented, a global convergence result is also obtained, and close to the solution the convergence is 2-step q-superlinear. Numerical results using a finite difference approximation of the Hes- sian and the BFGS secant update on 30 test problems are presented. --------------- [57] Rodrigo Fontecilla SUBSPACE INVARIANCE OF BROYDEN'S $THETA$-CLASS University of Maryland Computer Science Department Technical Report 1430 Broyden's $theta#-class of updating formulae have the pro- perty that if the current approximation matrix is symmetric and positive definite, then the updated matrix maintains those same properties under certain conditions. It is shown that if the current approximation matrix is symmetric and positive definite on a subspace of $Rn#, then the updated matrix is symmetric and positive definite along the same subspace under certain conditions. Applications of these results to the implementation of quasi-Newton methods for solving nonlinear constrained optimization problems are presented. --------------- ##### MINNESOTA ##### All of the reports from Minnesota were submitted without abstracts by: Panos Pardalos They are available from him at: Computer Science Department University of Minnesota Lind Hall 136 Minneapolis MN 55455 ##################### [58] J.B.Rosen Computational Experience with Large-Scale Constrained Global Optimization Tech. Report 84-13 June 1984 --------------- [59] J.B.Rosen and P.M.Pardalos Global Minimization of Large-Scale Constrained Concave Quadratic Problems by Separable Programming Tech. Report 84-29 October 1984 --------------- [60] P.M.Pardalos and J.B.Rosen Reduction of Nonlinear Integer Separable Programming Problems to Zero-One Linear Programming Problems Tech. Report 84-25 October 1984 --------------- [61] P.M.Pardalos and J.B.Rosen Methods for Global Concave Minimization: A Bibliographic Survey Tech. Report 84-31 November 1984 --------------- [62] P.M.Pardalos and J.B.Rosen A Global Optimization Approach to the Linear Complementarity Problem December 1984 --------------- ##### OAK RIDGE ##### [63] M. T. Heath A. J. Laub C. C. Paige R. C. Ward ORNL UC-Santa Barbara McGill Univ. ORNL COMPUTING THE SINGULAR VALUE DECOMPOSITION OF A PRODUCT OF TWO MATRICES Tech. Rept. ORNL-6118, January 1985 An algorithm is developed for computing the singular value decomposition of a product of two general matrices without explicitly forming the product. The algorithm is based on an earlier Jacobi-like method due to Kogbetliantz and uses plane rotations applied to the two matrices separately. A triangular variant of the basic algorithm is developed that reduces the amount of work required. Submitted by: Bob Ward Obtainable from: Mathematical Sciences Oak Ridge National Laboratory P. O. Box Y, Bldg. 9207-A Oak Ridge, Tennessee 37831 ------------------- [64] Alan George Michael T. Heath Joseph Liu University of Waterloo ORNL York University PARALLEL CHOLESKY FACTORIZATION ON A MULTIPROCESSOR Tech. Rept. ORNL-6124, March 1985 A parallel algorithm is developed for Cholesky factorization on a multiprocessor. The algorithm is based on self-scheduling of a pool of tasks. The subtasks in several variants of the basic elimination algorithm are analyzed for potential concurrency in terms of precedence relations, work profiles, and processor utilization. This analysis is supported by simulation results. The most promising variant, which we call column-Cholesky, is identified and implemented for the Denelcor HEP multiprocessor. Experimental results are given for this machine. Submitted by: Bob Ward Obtainable from: Mathematical Sciences Oak Ridge National Laboratory P. O. Box Y, Bldg. 9207-A Oak Ridge, Tennessee 37831 ----------------- [65] R. E. Funderlic C. D. Meyer, Jr. Oak Ridge National Laboratory North Carolina State University SENSITIVITY OF THE STATIONARY DISTRIBUTION VECTOR FOR AN ERGODIC MARKOV CHAIN Technical Report ORNL-6098, January 1985 Stationary distribution vectors p for Markov chains with associated transition matrices T are important in the analysis of many models in the mathematical sciences, such as queuing networks, input-output economic models and compartmental tracer analysis models. The purpose of this paper is to provide insight into the sensitivity of p to perturbations in the transition probabilities of T and to understand some of the difficulties in computing an accurate p. The group inverse A# of I-T is shown to be of fundamental importance in understanding sensitivity or conditioning of p. The main result shows that if there is a state that is accessible from every other state and the corresponding column of T has no small off-diagonal elements, then p cannot be sensitive to small perturbations in T. Ecological examples are given. A new stable algorithm for calculating A# is described. Submitted by: Bob Ward Obtainable from: Mathematical Sciences Oak Ridge National Laboratory P. O. Box Y, Bldg. 9207-A Oak Ridge, Tennessee 37831 --------------- [66] George Ostrouchov Symbolic Givens Reduction in Large Sparse Least Squares ORNL-6102 Oak Ridge National Laboratory December 1984 Orthogonal Givens factorization is a popular method for solving large sparse least squares problems. In order to exploit sparsity and to use a fixed data structure in Givens reduction, a preliminary symbolic factorization needs to be performed. Some results on row-ordering and structure of rows in a partially reduced matrix are obtained using a graph-theoretic representation. These results provide a basis for a symbolic Givens factorization. Column-ordering is also discussed, and an algorithm for symbolic Givens reduction is developed and tested. --------------- ##### UBC ##### [67] Manfred R. Trummer Nystroem's method versus Fourier type methods for the numerical solution of integral equations. Tech. Rep. 84-23 It is shown that Nystroem's method and Fourier type methods produce the same approximation to a solution of an integral equation at the collocation points for Nystroem's method. The quadrature rule for numerical integration must have these collocation points as abscissa. Submitted by: Manfred R. Trummer Obtainable from: Department of Computer Science The University of British Columbia Vancouver, B.C., Canada V6T 1W5 --------------- [68] Manfred R. Trummer An efficient implementation of a conformal mapping method using the Szego kernel. Tech. Rep. 84-24 An implementation, based on iterative techniques, of a method to compute the Riemann mapping function is presented. The method has been recently introduced by N. Kerzman and the author. It expresses the Szego kernel as the solution of an integral equation of the second kind. It is shown how to treat symmetric regions. The algorithm is tested on five examples. The numerical results show that the method is competitive, with respect to accuracy, stability, and efficiency. Submitted by: Manfred R. Trummer Obtainable from: Department of Computer Science The University of British Columbia Vancouver, B.C., Canada V6T 1W5 --------------- [69] Uri Ascher and G. Bader STABILITY OF COLLOCATION AT GAUSSIAN POINTS Technical Report 84-13 Revised, February, 1985 Symmetric Runge-Kutta schemes are particularly useful for solving stiff two-point boundary value problems. Such A-stable schemes perform well in many cases, but it is demonstrated that in some instances the stronger property of algebraic stability is required. A characterization of symmetric, algebraically stable Runge-Kutta schemes is given. The class of schemes thus defined turns out not to be very rich: The only collocation schemes in it are those based on Gauss points, and other schemes in the class do not seem to offer any advantage over collocation at Gaussian points. Submitted by: Uri Ascher Obtainable from: Uri Ascher Department of Computer Science University of British Columbia Vancouver, British Columbia Canada V6T 1W5 --------------- [70] Uri Ascher COLLOCATION FOR TWO-POINT BOUNDARY VALUE PROBLEMS REVISITED Technical Report 84-17 November, 1984 Collocation methods for two-point boundary value problems for higher order differential equations are considered. By using appropriate monomial bases, we relate these methods to corresponding one-step schemes for 1st order systems of differential equations. This allows us to present the theory for nonstiff problems in relatively simple terms, refining at the same time some convergence results and discussing stability. No restriction is placed on the meshes used. Submitted by: Uri Ascher Obtainable from: Uri Ascher Department of Computer Science University of British Columbia Vancouver, British Columbia Canada V6T 1W5 --------------- ##### WATERLOO ##### [71] P. H. Calamai and A. R. Conn A projected Newton method for $l sub p$ location problems. The University of Waterloo Computer Science Department Tech Rept CS-85-07 This paper is concerned with the numerical solution of con- tinuous minisum multifacility location problems involving the $l sub p$ norm, where 1 Obtainable from: Andrew R. Conn The University of Waterloo Computer Science Department Waterloo, Ontario N2L 3G1 Canada --------------- [72] R. H. Bartels and J. C. Beatty Beta-Splines with a Difference University of Waterloo Computer Science Department Technical Report CS-83-40 Local control of the shape parameters beta-1 and beta-2 in a beta-spline has previously relied on quintic Hermite interpolation of the distinct beta values associated with the joints in a geometrically-continuous piecewise parametric polynomial curve. Here we introduce an alternative means of obtaining local control of geometrically continuous piecewise polynomial curves. The essential idea is to generalize the truncated power functions suitably to account for the discontinuities allowed by geometric continuity, and to generate the beta-splines from a differencing process. The resulting basis functions are piecewise cubic, partition unity, have local support, and are nonegative for "reasonable" values of beta-1 and beta-2. They accommodate variable knot spacing as well as different beta values at each knot. Submitted by: Richard H. Bartels Obtainable from: Richard H. Bartels or John C. Beatty The University of Waterloo Computer Science Department Waterloo, Ontario N2L 3G1 Canada --------------- ##### YALE ##### [73] Howard C. Elman A Stability Analysis of Incomplete LU Factorizations Report YALEU/DCS/RR-365 Submitted by: Howard C. Elman Obtainable from: Howard C. Elman Yale University New Have, Conn. --------------- .