Recursive Array Layouts and Fast Matrix Multiplication

Abstract

The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditional column-major or row-major array layouts to incur high variability in memory system performance as matrix size varies. This paper investigates the use of recursive array layouts to improve performance and reduce variability. Previous work on recursive matrix multiplication is extended to examine several recursive array layouts and three recursive algorithms: standard matrix multiplication, and the more complex algorithms of Strassen and Winograd. While recursive layouts significantly outperform traditional layouts (reducing execution times by a factor of 1.2-2.5) for the standard algorithm, they offer little improvement for Strassen's and Winograd's algorithms. For a purely sequential implementation, it is possible to reorder computation to conserve memory space and improve performance between 10% and 20%. Carrying the recursive layout down to the level of individual matrix elements is shown to be counter-productive; a combination of recursive layouts down to canonically ordered matrix tiles instead yields higher performance. Five recursive layouts with successively increasing complexity of address computation are evaluated, and it is shown that addressing overheads can be kept in control even for the most computationally demanding of these layouts.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2005
Accession Number
ADA440384

Entities

People

  • Alvin R. Lebeck
  • Mithuna Thottehodi
  • Praveen K. Patnala
  • Siddhartha Chatterjee

Organizations

  • IBM Thomas J. Watson Research Center

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Addressing
  • Algebra
  • Algorithms
  • Compilers
  • Computational Complexity
  • Computations
  • Computer Programming
  • Hierarchies
  • Language
  • Linear Algebra
  • Orientation (Direction)
  • Parallel Computing
  • Parallel Processing
  • Programming Languages
  • Standards
  • Trees (Data Structures)
  • Two Dimensional

Fields of Study

  • Engineering

Readers

  • Graph Algorithms and Convex Optimization.
  • Linear Algebra
  • Parallel and Distributed Computing.

Technology Areas

  • Space
  • Space - Space Objects