Performance-Portable Benchmarking Methods for Investigating Heterogeneous Computing Platforms
Abstract
Comparative benchmarking across heterogeneous computing platforms has become increasingly important for evaluating each platform's relative merit for high-performance computing applications. Producing meaningful benchmarks is difficult, and that difficulty compounds the already hard task of writing benchmark codes whose methodology is sound enough to predict performance accurately. One of the most popular benchmarks for heterogeneous platforms is the Scalable Heterogeneous Computing (SHOC) benchmark suite. We examined the benchmarking methods employed in the SHOC code and developed a generative-programming benchmark code that is more predictive and more representative of the true capabilities of a given platform. In this work, we developed benchmarking methods that support autotuning with parameterized kernels, dynamic sampling, and scaling analysis, while maintaining a single portable code base for all platforms. The end result is a benchmark code that uses autotuning to run the optimal kernel on each platform, so that the reported results are closer to each platform's optimal performance and comparisons among platforms are more accurate. Benchmark results are presented for NVIDIA Kepler K20 graphics processing units, Intel Xeon Phi accelerators, and Intel Xeon central processing units.
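The combination of autotuning over a parameterized kernel and dynamic sampling named in the abstract can be illustrated with a minimal sketch. This is not the report's code: the function names (`sample_until_stable`, `autotune`, `blocked_sum`, `wall_clock_measure`), the stopping rule (relative standard error of the mean below a tolerance), and the toy kernel are all assumptions chosen for illustration, and Python stands in for whatever language the actual benchmark uses.

```python
import statistics
import time


def sample_until_stable(measure, rel_tol=0.05, min_samples=5, max_samples=50):
    """Dynamic sampling (assumed rule): call measure() until the relative
    standard error of the mean falls below rel_tol or max_samples is reached.
    Returns (mean_time, number_of_samples)."""
    samples = []
    while len(samples) < max_samples:
        samples.append(measure())
        if len(samples) >= min_samples:
            mean = statistics.mean(samples)
            sem = statistics.stdev(samples) / len(samples) ** 0.5
            if mean > 0 and sem / mean < rel_tol:
                break
    return statistics.mean(samples), len(samples)


def autotune(candidates, make_measure, **sampling):
    """Time every candidate parameter and return (best_param, results),
    where results maps each parameter to its mean measured time."""
    results = {}
    for param in candidates:
        mean_time, _ = sample_until_stable(make_measure(param), **sampling)
        results[param] = mean_time
    best = min(results, key=results.get)
    return best, results


def blocked_sum(data, block):
    """Hypothetical parameterized kernel: sum data in chunks of size block."""
    total = 0.0
    for start in range(0, len(data), block):
        total += sum(data[start:start + block])
    return total


def wall_clock_measure(data, block):
    """Build a zero-argument timer for one kernel configuration."""
    def measure():
        t0 = time.perf_counter()
        blocked_sum(data, block)
        return time.perf_counter() - t0
    return measure


if __name__ == "__main__":
    data = [1.0] * 100_000
    best, results = autotune([16, 256, 4096],
                             lambda b: wall_clock_measure(data, b))
    print("best block size:", best)
    print("mean times (s):", results)
```

Passing the timer in as a callable keeps the tuning loop independent of the platform-specific kernel, which is one plausible way to keep a single code base while letting each platform select its own best parameters.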
Document Details
- Document Type: Technical Report
- Publication Date: Jul 01, 2018
- Accession Number: AD1055920
Entities
People
- Dale R. Shires
- David A. Richie
- James A. Ross
- Jamie K. Infantolino
- Song J. Park
- Thomas M. Kendall
Organizations
- United States Army Research Laboratory