Floating-Point Computations on Reconfigurable Computers

Abstract

Modern reconfigurable computers combine general-purpose processors with field programmable gate arrays (FPGAs). The FPGAs are, in effect, reconfigurable application-specific coprocessors. During one run, the FPGA might be a matrix-vector multiply coprocessor; during another run, it might be a linear equation solver. There are several issues associated with the mapping of floating-point computations onto reconfigurable computers. One is the determination of what the author terms "the FPGA design boundary," i.e., the portion of the application that is mapped onto the FPGA. Furthermore, FPGA-based kernel performance is heavily dependent upon both pipelining and parallelism. The author has coined the phrase "the three p's" to encapsulate this important relationship. In this paper, important FPGA design boundary heuristics are described, and a toroidal architecture and partitioned loop algorithm are used to maximize both pipelining and parallelism for a double-precision floating-point sparse matrix conjugate gradient solver that is mapped onto a reconfigurable computer. Wall clock run time comparisons show that the FPGA-augmented version runs more than two times faster than the software-only version.
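
For orientation, the kind of computation the abstract refers to can be sketched in software. The following is a minimal, textbook conjugate gradient solver over a sparse matrix in compressed sparse row (CSR) form, using Python's double-precision floats. It is an illustrative sketch only, not the author's implementation: the CSR storage choice, the function names (`csr_matvec`, `dot`, `conjugate_gradient`), and the `tol` and `max_iter` parameters are all assumptions, and the toroidal architecture and partitioned loop algorithm from the paper are not reproduced here.

```python
import math


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-format sparse matrix by a dense vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        # Accumulate only the stored (nonzero) entries of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(u, v):
    """Dense dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite CSR matrix A.

    tol and max_iter are hypothetical defaults chosen for this sketch.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        q = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # A = [[4, 1], [1, 3]] stored in CSR form, b = [1, 2].
    x = conjugate_gradient([4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4], [1.0, 2.0])
    print(x)
```

Each iteration is dominated by one sparse matrix-vector product and a few dot products and vector updates. Which of these operations is placed on the FPGA is the design boundary question the abstract raises; this sketch takes no position on it.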

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 2007
Accession Number
ADP023795

Entities

People

  • Gerald R. Morris

Organizations

  • Engineer Research and Development Center

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Boundaries
  • Central Processing Units
  • Clocks
  • Computations
  • Computer Programming
  • Computers
  • Department Of Defense
  • Engineering
  • Field Programmable Gate Arrays
  • High Performance Computing
  • Linear Systems
  • Object-Oriented Database Management Systems
  • Semiconductor Devices
  • Semiconductors
  • Sparse Matrix
  • Trees (Data Structures)

Readers

  • Linear Algebra
  • Parallel and Distributed Computing