Hybrid QR Factorization Algorithm for High Performance Computing Architectures

Abstract

This paper describes a novel QR decomposition algorithm that utilizes both Givens rotations and Householder reflections to give optimal performance on high performance computing architectures. Computing the matrix factorization A = QR is a mathematical step frequently encountered in many signal processing applications including adaptive nulling. Efficient algorithms for computing the QR factorization are vital to satisfying strict latency and throughput constraints in real-time implementations of these signal processing algorithms. By effectively combining the use of Givens rotations and Householder reflections on a parallel computing platform it is possible to efficiently map the computational algorithm to the available computational hardware to best make use of the memory hierarchy and to optimize performance. Furthermore by introducing adjustable software parameters that control load partitioning among the processors cache use and the number of computations done in serial fashion on one processor versus the calculations done concurrently on several processors it is possible for the user to tune the algorithm to run on different machines, without rewriting any code and still maintaining optimum performance. Results given in this abstract show that properly tuning the Hybrid QR algorithm on the SGI O3000 provided better performance than the available SCALAPACK library routine PSGEQRF for computing the QR decomposition.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: May 01, 2003
Accession Number: ADA419390

Entities

People

Gerard G. L. Meyer
Peter Vouras

Organizations

United States Naval Research Laboratory

Hybrid QR Factorization Algorithm for High Performance Computing Architectures

Abstract

Document Details

Entities

People

Organizations

Tags

DTIC Thesaurus Topics

Fields of Study

Readers