Multiprocessor Performance Debugging and Memory Bottlenecks

Abstract

Driven by the computational demands of scientists and engineers, computer architects are building increasingly complex multiprocessor systems. However, while the peak gigaflop ratings of such systems are often impressive, the actual performance of initial implementations of applications can be disappointing. To make the task of performance debugging manageable, tools are needed that can analyze program behavior and report sources of performance loss. This thesis offers techniques for building such tools for shared memory multiprocessors. Previous efforts to build performance debugging systems for shared memory multiprocessors had two shortcomings. First, though memory hierarchy performance is often critical to whole-program performance, most tools cannot distinguish time when the CPU is computing from time when it is stalled waiting on the memory hierarchy. Second, other tools often significantly perturb a program's execution. This dissertation addresses both of these problems. I describe software instrumentation that typically increases program execution time by less than 10%, while collecting a detailed profile of where processors are doing work, waiting for work, or stalled waiting on the memory hierarchy. A window-based user interface allows the user to interpret the profile, viewing compute, memory, and synchronization bottlenecks at increasing levels of detail, from the whole-program level down to the level of individual procedures, loops, and synchronization objects. Several multiprocessor case studies are included to illustrate the features of the tool.

Document Details

Document Type: Technical Report
Publication Date: May 01, 1992
Accession Number: ADA268387

Entities

People

  • Aaron J. Goldberg

Organizations

  • Stanford University

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Case Studies
  • Classification
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Debugging
  • Hierarchies
  • High Resolution
  • Instrumentation
  • Multiprocessors
  • Object Code
  • Operating Systems
  • Simulations
  • Theses
  • User Interface

Fields of Study

  • Computer science
  • Engineering

Readers

  • Computational Modeling and Simulation
  • Parallel and Distributed Computing