A Preliminary Evaluation of Cache-Miss-Initiated Prefetching Techniques in Scalable Multiprocessors

Abstract

Prefetching is an important technique for reducing the average latency of memory accesses in scalable cache-coherent multiprocessors. Aggressive prefetching can significantly reduce the number of cache misses, but may introduce bursty network and memory traffic, and increase data sharing and cache pollution. Given that we anticipate enormous increases in both network bandwidth and latency, we examine whether aggressive prefetching triggered by a miss (cache-miss-initiated prefetching) can substantially improve the running time of parallel programs. Using execution-driven simulation of parallel programs on a scalable cache-coherent machine, we study the performance of three cache-miss-initiated prefetching techniques: large cache blocks, sequential prefetching, and hybrid prefetching. Large cache blocks (which fetch multiple words within a single block) and sequential prefetching (which fetches multiple consecutive blocks) are well-known prefetching strategies. Hybrid prefetching is a novel technique combining hardware and software support for stride-directed prefetching. Our simulation results show that large cache blocks rarely provide significant performance improvements; the improvement in the miss rate is often too small (or nonexistent) to offset a corresponding increase in the miss penalty. Our results also show that sequential and hybrid prefetching perform better than prefetching via large cache blocks, and that hybrid prefetching performs at least as well as sequential prefetching.
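
To make the difference between the two well-known strategies concrete, the following is a minimal illustrative sketch and not the simulator used in the report. It assumes an unbounded cache, a single processor, and a purely sequential word-address trace, and it counts misses only; it does not model miss penalty, network traffic, coherence, or cache pollution, which are the effects the report actually evaluates. The function name `count_misses` and its parameters are hypothetical.

```python
def count_misses(addresses, block_words=1, prefetch_degree=0):
    """Count misses for a word-address trace in an unbounded cache.

    block_words: words per cache block (a larger block brings in more
        words on each miss).
    prefetch_degree: number of consecutive following blocks fetched in
        addition to the missing block (sequential prefetching on a miss).
    Both parameters are illustrative assumptions, not values from the report.
    """
    cached_blocks = set()
    misses = 0
    for addr in addresses:
        block = addr // block_words
        if block not in cached_blocks:
            misses += 1
            # On a miss, fetch the missing block plus the next
            # `prefetch_degree` consecutive blocks.
            for offset in range(prefetch_degree + 1):
                cached_blocks.add(block + offset)
    return misses


# A purely sequential trace of 64 word addresses.
trace = list(range(64))

baseline = count_misses(trace, block_words=4)                       # 4-word blocks
large_blocks = count_misses(trace, block_words=16)                  # 16-word blocks
sequential = count_misses(trace, block_words=4, prefetch_degree=3)  # 4-word blocks + 3 prefetched

print(baseline, large_blocks, sequential)
```

On this idealized trace, both strategies remove the same number of misses relative to the baseline. The sketch therefore cannot reproduce the report's conclusions, which depend on the miss penalty and traffic effects it leaves out.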

Document Details

Document Type
Technical Report
Publication Date
May 01, 1994
Accession Number
ADA281502

Entities

People

  • Ricardo Bianchini
  • Thomas J. Leblanc

Organizations

  • University of Rochester

Tags

DTIC Thesaurus Topics

  • Application Software
  • Compilers
  • Computer Architecture
  • Computer Programming
  • Computer Science
  • Computers
  • Computing System Architectures
  • Directories
  • Information Systems
  • Instructions
  • Parallel Computing
  • Parallel Processing
  • Programming Languages
  • Simulations
  • Simulators
  • Test And Evaluation
  • Universities

Fields of Study

  • Computer science

Readers

  • Parallel and Distributed Computing