Merge-based sparse matrix-vector multiplication (SpMV) using the CSR storage format
Abstract
We present a perfectly balanced, "merge-based" parallel method for computing sparse matrix-vector products (SpMV). Our algorithm operates directly upon the Compressed Sparse Row (CSR) sparse matrix format, a predominant in-memory representation for general-purpose sparse linear algebra computations. Our CsrMV performs an equitable multi-partitioning of the input dataset, ensuring that no single thread can be overwhelmed by assignment to (a) arbitrarily-long rows or (b) an arbitrarily-large number of zero-length rows. This parallel decomposition requires neither offline preprocessing nor specialized/ancillary data formats. We evaluate our method on both CPU and GPU microarchitecture across an enormous corpus of diverse real world matrix datasets. We show that traditional CsrMV methods are inconsistent performers subject to order-of-magnitude slowdowns, whereas the performance response of our method is substantially impervious to row-length heterogeneity.
Document Details
- Document Type
- Pub Defense Publication
- Publication Date
- Feb 27, 2016
- Source ID
- 10.1145/3016078.2851190
Entities
People
- Duane Merrill
- Michael Garland
Organizations
- Defense Advanced Research Projects Agency
- Nvidia