Mapping the Conjugate Gradient Algorithm onto High Performance Heterogeneous Computers
Abstract
Mapping scientific kernels onto high performance heterogeneous computers (HPHC) requires adherence to certain rules of thumb, or heuristics. Previous research by Jackson State University's (JSU) HPHC research group provided anecdotal evidence illustrating some of these heuristics. The research presented in this thesis corroborates them. In particular, four versions (two pairs) of a floating-point sparse matrix conjugate gradient (CG) iterative solver are presented. The CG kernels are developed on JSU's state-of-the-art HPHC, which combines general purpose processors (GPP) with heterogeneous computational hardware, in particular a field programmable gate array (FPGA). In the first pair, the first version executes strictly on the GPP, and the second uses both the GPP and the FPGA to map the entire CG algorithm onto hardware. The second pair uses a refactored version of CG. Static analysis of CG identifies the most computationally expensive operation as the sparse matrix vector multiply (MVM) kernel. Based on this analysis, the software version of CG is refactored to call MVM as a subroutine. An FPGA version of the MVM algorithm is also developed, and a static analysis of that algorithm suggests a speedup of the MVM kernel. All four versions of CG are executed on a specially designed set of sparse matrices, and the results demonstrate that adherence to the rules of thumb and heuristics when mapping scientific kernels onto an HPHC can lead to significant speedups.
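To make the refactoring described in the abstract concrete, the following is a minimal software-only sketch of a CG solver that calls sparse MVM as a separate subroutine. It is an illustration, not the thesis code. The compressed sparse row (CSR) storage format, the function names `csr_mvm`, `dot`, and `conjugate_gradient`, the `mvm` hook parameter, and the tolerance and iteration defaults are all assumptions chosen for the example. The matrix is assumed to be symmetric positive definite, as standard CG requires.

```python
import math


def csr_mvm(values, col_idx, row_ptr, x):
    """Sparse matrix-vector multiply y = A x, with A stored in CSR form.

    CSR storage is an assumption of this sketch, not taken from the source.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(a, b):
    """Dense dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(values, col_idx, row_ptr, b,
                       tol=1e-10, max_iter=1000, mvm=csr_mvm):
    """Solve A x = b for symmetric positive definite A (CSR form).

    The `mvm` argument is a hypothetical hook: it isolates the MVM kernel
    so that a different implementation could be substituted for it.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with x = 0 initially
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        Ap = mvm(values, col_idx, row_ptr, p)   # the only MVM per iteration
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Example: A = [[4, 1], [1, 3]], b = [1, 2]; exact solution is [1/11, 7/11].
values, col_idx, row_ptr = [4.0, 1.0, 1.0, 3.0], [0, 1, 0, 1], [0, 2, 4]
solution = conjugate_gradient(values, col_idx, row_ptr, [1.0, 2.0])
```

In this sketch each iteration performs one MVM plus a few dot products and vector updates, and the MVM is confined to a single call site, which is what lets that one kernel be analyzed or replaced separately from the rest of the solver.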
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 2014
- Accession Number: ADA626686
Entities
People
- Jamory D. Hawkins
Organizations
- Jackson State University