Modeling and analyzing evaluation cost of CUDA kernels

Abstract

General-purpose programming on GPUs (GPGPU) is becoming increasingly popular as applications such as machine learning and scientific computing demand high throughput for vector-parallel workloads. NVIDIA's CUDA toolkit seeks to make GPGPU programming accessible by allowing programmers to write GPU functions, called kernels, in a small extension of C/C++. However, due to CUDA's complex execution model, the performance characteristics of CUDA kernels are difficult to predict, especially for novice programmers.

Document Details

Document Type
Pub Defense Publication
Publication Date
Jan 04, 2021
Source ID
10.1145/3434306

Entities

People

  • Jan Hoffmann
  • Stefan K. Muller

Organizations

  • Carnegie Mellon University
  • Defense Advanced Research Projects Agency
  • Illinois Institute of Technology
  • National Science Foundation

Tags

Fields of Study

  • Computer science

Readers

  • Parallel and Distributed Computing
  • Systems Analysis and Design

Technology Areas

  • AI & ML