Improving Processor Performance by Dynamically Pre-Processing the Instruction Stream

Abstract

The exponentially increasing gap between processor and off-chip memory speeds, as measured in processor cycles, is rapidly turning memory latency into a major processor performance bottleneck. Traditional solutions, such as employing multiple levels of caches, are expensive and do not work well with some applications. We evaluate a technique, called runahead pre-processing, that can significantly improve processor performance. The instruction and data stream prefetches generated during runahead episodes led to a significant performance improvement for all of the benchmarks we examined. Runahead typically reduced CPI by about 30% for the four SPEC95 integer benchmarks that we simulated, and by 77% for the STREAM benchmark. These results are for a five-stage pipeline with two levels of split instruction and data caches: 8 KB each for the L1 instruction and data caches, and 1 MB each for the L2 caches. A significant result is that runahead is especially effective when the latency to off-chip memory increases, or when the caching performance for a particular benchmark is poor, because the processor then has more opportunities to pre-process instructions. Finally, runahead appears particularly well suited for use with high-clock-rate in-order processors that employ relatively inexpensive memory hierarchies.
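
The abstract's claim that runahead helps more as memory latency grows can be illustrated with a toy model. The sketch below is not the simulator used in the report. It is a hypothetical, heavily idealized in-order model written for illustration only, and the names simulate and cpi are invented here. It assumes one cycle per instruction, a fixed stall on every cache miss, an infinite cache, and that during a stall a runahead-enabled processor pre-processes one following instruction per stall cycle, with every prefetch it issues completing before the stall ends.

```python
def simulate(trace, miss_latency, runahead):
    """Return total cycles for a toy in-order pipeline.

    trace: one entry per instruction; a cache-block id for memory
           instructions, or None for non-memory instructions.
    miss_latency: stall cycles charged for each cache miss (assumed fixed).
    runahead: if True, pre-process later instructions during each stall.
    """
    cache = set()  # blocks already fetched (assumption: infinite capacity)
    cycles = 0
    for i, block in enumerate(trace):
        cycles += 1  # assumption: base cost of one cycle per instruction
        if block is None or block in cache:
            continue
        cycles += miss_latency  # the pipeline stalls on the miss
        cache.add(block)
        if runahead:
            # Assumption: one instruction is pre-processed per stall cycle,
            # and every prefetch issued this way arrives in time.
            window = trace[i + 1 : i + 1 + miss_latency]
            cache.update(b for b in window if b is not None)
    return cycles


def cpi(trace, miss_latency, runahead):
    """Cycles per instruction for the toy model."""
    return simulate(trace, miss_latency, runahead) / len(trace)


# A streaming access pattern: every instruction touches a new block.
stream = list(range(100))
for latency in (10, 50):
    base = cpi(stream, latency, runahead=False)
    ahead = cpi(stream, latency, runahead=True)
    print(f"latency={latency}: CPI {base:.2f} -> {ahead:.2f}")
```

In this toy model a longer stall lets the runahead window cover more future misses, so the relative CPI reduction grows with latency. The numbers it prints come from the stated assumptions and are not comparable to the 30% and 77% figures reported above.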

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 1999
Accession Number
ADA365210

Entities

People

  • James D. Dundas

Organizations

  • University of Michigan

Tags

DTIC Thesaurus Topics

  • Access Time
  • Accuracy
  • Classification
  • Compilers
  • Computations
  • Computer Programs
  • Computer Science
  • Data Sets
  • Electrical Engineering
  • Hierarchies
  • Instruction Set Architecture
  • Instructions
  • Instrumentation
  • Lists (Data Structures)
  • Pipelines
  • Simulations
  • Simulators

Fields of Study

  • Computer science
