Techniques for Improving the Performance of Sparse Matrix Factorization on Multiprocessor Workstations

Abstract

This paper examines the problem of factoring large sparse systems of equations on high-performance multiprocessor workstations. Although these workstations are capable of very high peak floating-point computation rates, most existing sparse factorization codes achieve only a small fraction of this potential. A major limiting factor is the cost of memory accesses performed during the factorization. We describe a parallel factorization code that uses the supernodal structure of the matrix to reduce the number of memory references. We also propose enhancements that significantly reduce the overall cache miss rate. The result is greatly increased factorization performance. We present experimental results from executions of our codes on the Silicon Graphics 4D/380 multiprocessor. Using eight processors, the supernodal parallel code achieves a computation rate of approximately 40 MFLOPS when factoring a range of benchmark matrices. This is more than twice as fast as the parallel nodal code developed at the Oak Ridge National Laboratory running on the SGI 4D/380.
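
The supernodal structure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition of a supernode as a run of adjacent columns of the factor L whose nonzero structures below the diagonal block are identical. The function name `find_supernodes` and the `col_rows` input layout are hypothetical and chosen for illustration; this is not the report's code.

```python
def find_supernodes(col_rows):
    """Group adjacent columns of a sparse lower-triangular factor into supernodes.

    col_rows[j] is the sorted list of row indices of the nonzeros in column j,
    including the diagonal entry j (an assumed input layout for this sketch).
    Returns a list of (first_column, last_column) pairs.
    """
    supernodes = []
    start = 0
    for j in range(1, len(col_rows)):
        # Column j joins the current supernode when column j-1's structure,
        # with its own diagonal entry dropped, matches column j's structure.
        if col_rows[j - 1][1:] != col_rows[j]:
            supernodes.append((start, j - 1))
            start = j
    if col_rows:
        supernodes.append((start, len(col_rows) - 1))
    return supernodes


# Hypothetical 5x5 factor structure, one list of row indices per column.
example = [
    [0, 1, 2, 4],
    [1, 2, 4],
    [2, 4],
    [3, 4],
    [4],
]
print(find_supernodes(example))
```

Columns in the same supernode share one set of row indices, so a factorization code can load that index list once and update with the whole block of columns, which is one way the number of memory references can fall.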

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 1990
Accession Number: ADA226420

Entities

People

  • Anoop Gupta
  • Edward Rothberg

Organizations

  • Stanford University

Tags

Communities of Interest

  • Advanced Electronics

DTIC Thesaurus Topics

  • Abstracts
  • Buildings And Structures
  • Cancellation
  • Computations
  • Computer Programming
  • Computer Science
  • Computers
  • Elimination
  • Equations
  • Floating Point Operations
  • Grain Size
  • Graphics
  • Language
  • Multiprocessors
  • Multithreading
  • Numbers
  • Sparse Matrix

Fields of Study

  • Computer science
  • Engineering

Readers

  • Finite Element Method (FEM) for solving Partial Differential Equations (PDEs)
  • Parallel and Distributed Computing