The Performance Impact of Data Reuse in Parallel Dense Cholesky Factorization

Abstract

This paper explores performance issues for several prominent approaches to parallel dense Cholesky factorization. The primary focus is on issues that arise when blocking techniques are integrated into parallel factorization approaches to improve data reuse in the memory hierarchy. We first consider panel-oriented approaches, where sets of contiguous columns are manipulated as a single unit. These methods represent natural extensions of the column-oriented methods that have been widely used previously. On machines with memory hierarchies, panel-oriented methods significantly increase the achieved performance over column-oriented methods. However, we find that panel-oriented methods do not expose enough concurrency for problems that one might reasonably expect to solve on moderately parallel machines, thus significantly limiting their performance. We then explore block-oriented approaches, where square submatrices are manipulated instead of sets of columns. These methods greatly increase the amount of available concurrency, thus alleviating the problems encountered with panel-oriented methods. However, a number of issues, including scheduling choices and block-placement issues, complicate their implementation. We discuss these issues and consider approaches that solve the resulting problems. The resulting block-oriented implementation yields high processor utilization levels over a wide range of problem sizes.
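
To make the block-oriented idea concrete, the following is a minimal sequential sketch of a right-looking Cholesky factorization that works on square blocks. It is an illustration under stated assumptions, not the report's implementation: the function name `blocked_cholesky` and the block-size parameter `bs` are hypothetical, the matrix is a plain list of lists, and no parallel scheduling or block placement is attempted. The point it shows is that in step 3 each block of the trailing submatrix is updated independently of the others, which is the kind of concurrency the abstract attributes to block-oriented methods.

```python
import math


def blocked_cholesky(a, bs):
    """Return lower-triangular L with L * L^T == a, processing bs-by-bs blocks.

    `a` is a symmetric positive definite matrix given as a list of lists.
    Hypothetical helper for illustration only; runs sequentially.
    """
    n = len(a)
    L = [list(map(float, row)) for row in a]  # work on a copy
    starts = list(range(0, n, bs))

    for k0 in starts:
        k1 = min(k0 + bs, n)

        # Step 1: factor the diagonal block with an unblocked Cholesky.
        for j in range(k0, k1):
            d = L[j][j] - sum(L[j][p] ** 2 for p in range(k0, j))
            if d <= 0.0:
                raise ValueError("matrix is not positive definite")
            L[j][j] = math.sqrt(d)
            for i in range(j + 1, k1):
                s = sum(L[i][p] * L[j][p] for p in range(k0, j))
                L[i][j] = (L[i][j] - s) / L[j][j]

        below = [s0 for s0 in starts if s0 > k0]

        # Step 2: triangular solve for every block under the diagonal block.
        # Each block row could be handled as a separate task.
        for i0 in below:
            for i in range(i0, min(i0 + bs, n)):
                for j in range(k0, k1):
                    s = sum(L[i][p] * L[j][p] for p in range(k0, j))
                    L[i][j] = (L[i][j] - s) / L[j][j]

        # Step 3: update the trailing submatrix one block at a time.
        # Every (i0, j0) block here is independent of the others.
        for j0 in below:
            j1 = min(j0 + bs, n)
            for i0 in below:
                if i0 < j0:
                    continue  # only the lower triangle is stored and used
                for i in range(i0, min(i0 + bs, n)):
                    for j in range(j0, min(j1, i + 1)):
                        L[i][j] -= sum(L[i][p] * L[j][p] for p in range(k0, k1))

    # Clear the strictly upper triangle, which still holds entries of `a`.
    for i in range(n):
        for j in range(i + 1, n):
            L[i][j] = 0.0
    return L
```

Calling `blocked_cholesky(a, bs)` with `bs` equal to 1 reduces to an element-by-element factorization, while larger values of `bs` group the work into fewer, larger block operations. A panel-oriented variant would instead treat each set of `bs` contiguous columns as one unit of work, which leaves fewer independent tasks per step.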

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1992
Accession Number
ADA322741

Entities

People

  • Anoop Gupta
  • Ed Rothberg

Organizations

  • Stanford University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Bandwidth
  • Classification
  • Computations
  • Computer Programming
  • Computer Science
  • Computers
  • Cost Models
  • Costs
  • Floating Point Operations
  • Hierarchies
  • Multithreading
  • Parallel Computing
  • Scheduling (Production)
  • Simulations
  • Simulators

Fields of Study

  • Computer science

Readers

  • Linear Algebra
  • Parallel and Distributed Computing
  • Systems Analysis and Design