Designing a Tunable Nested Data-Parallel Programming System
Abstract
This article describes Surge, a nested data-parallel programming system designed to simplify porting parallel applications to multiple target architectures and tuning them for each. Surge decouples the high-level specification of computations, expressed using a C++ programming interface, from low-level implementation details using two first-class constructs: schedules and policies. Schedules describe the valid ways in which data-parallel operators may be implemented, while policies encapsulate a set of parameters that govern platform-specific code generation. These two mechanisms are used to implement a code generation system that analyzes computations and automatically generates a search space of valid platform-specific implementations. An input- and architecture-adaptive autotuning system then explores this search space to find optimized implementations. We express five real-world benchmarks in Surge, drawn from domains such as machine learning and sparse linear algebra. From these high-level specifications, Surge automatically generates CPU and GPU implementations that perform on par with or better than manually optimized versions.
Document Details
- Document Type: Pub Defense Publication
- Publication Date: Dec 28, 2016
- Source ID: 10.1145/3012011
Entities
People
- Albert Sidelnik
- Mary Hall
- Michael Garland
- Saurav Muralidharan
Organizations
- Defense Advanced Research Projects Agency
- Federal Government of the United States
- Nvidia
- University of Utah