Productive High Performance Parallel Programming with Auto-tuned Domain-Specific Embedded Languages

Abstract

As machines and architectures have grown more complex, performance tuning has become more challenging, and general-purpose compilers often fail to generate the best possible optimized code. Expert performance programmers can often hand-write code that outperforms compiler-optimized low-level code by an order of magnitude. At the same time, programs themselves have grown more complex: modern programs are built on a variety of abstraction layers to manage that complexity, yet these layers hinder optimization. In fact, it is common to lose one or two additional orders of magnitude in performance when going from a low-level language such as Fortran or C to a high-level language like Python, Ruby, or MATLAB.

Document Details

Document Type: Technical Report
Publication Date: Jan 02, 2013
Accession Number: ADA575484

Entities

People

  • Shoaib A. Kamil

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Central Processing Units
  • Complementary Metal-Oxide Semiconductors
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Floating Point Operations
  • High Level Languages
  • Instruction Set Architecture
  • Lisp Programming Language
  • Machine Learning
  • Microarchitecture
  • Object-Oriented Programming Language
  • Operating Systems
  • Programming Languages
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Parallel and Distributed Computing