Informed Prefetching and Caching

Abstract

Disk arrays provide the raw storage throughput needed to balance rapidly increasing processor performance. Unfortunately, many important, I/O-intensive applications have serial I/O workloads that do not benefit from array parallelism. The performance of a single disk remains a bottleneck on overall performance for these applications. In this dissertation, I present aggressive, proactive mechanisms that tailor file-system resource management to the needs of I/O-intensive applications. In particular, I will show how to use application-disclosed access patterns (hints) to expose and exploit I/O parallelism, and to dynamically allocate file buffers among three competing demands: prefetching hinted blocks, caching hinted blocks for reuse, and caching recently used data for unhinted accesses. My approach estimates the impact of alternative buffer allocations on application elapsed time and applies run time cost benefit analysis to allocate buffers where they will have the greatest impact. I implemented TIP, an informed prefetching and caching manager, in the Digital UNIX operating system and measured its performance on a 175 MHz Digital Alpha workstation equipped with up to 10 disks running a range of applications. Informed prefetching on a ten-disk array reduces the wall-clock elapsed time of computational physics, text search, scientific visualization, relational database queries, speech recognition, and object linking by 10-84% with an average of 63%.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1997
Accession Number
ADA339202

Entities

People

  • Russel H. Patterson

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies
  • Engineered Resilient Systems

DTIC Thesaurus Topics

  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Data Storage Systems
  • Data Transmission
  • Databases
  • Fish
  • Information Processing
  • Multiple Access
  • Object Code
  • Operating Systems
  • Parallel Computing
  • Relational Databases
  • Systems Engineering
  • Three Dimensional
  • Two Dimensional

Fields of Study

  • Computer science

Readers

  • Computer Networking
  • Computer Programming and Software Development.
  • Computer Science/Computer Engineering/Data Science/Digital Signal Processing.

Technology Areas

  • AI & ML