Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays

Abstract

The performance of a traditional RAID Level 5 array is, for many applications, unacceptably poor while one of its constituent disks is non-functional. This paper describes and evaluates mechanisms by which this disk array failure-recovery performance can be improved. The two key issues addressed are the data layout, which is the mapping by which data and parity blocks are assigned to physical disk blocks in an array, and the reconstruction algorithm, which is the technique used to recover data that is lost when a component disk fails. The data layout techniques this paper investigates are instantiations of the declustered parity organization, a derivative of RAID Level 5 that allows a system to trade some of its data capacity for improved failure-recovery performance. We show that our instantiations of parity declustering improve the failure-mode performance of an array significantly, and that a parity-declustered architecture is preferable to an equivalent-size multiple-group RAID Level 5 organization in environments where failure-recovery performance is important. The presented analyses also include comparisons to a RAID Level 1 (mirrored disks) approach. With respect to reconstruction algorithms, this paper describes and briefly evaluates two alternatives, stripe-oriented reconstruction and disk-oriented reconstruction, and establishes that the latter is preferable because it provides faster reconstruction. The paper then revisits a set of previously proposed reconstruction optimizations, evaluating their efficacy when used in conjunction with the disk-oriented algorithm. The paper concludes with a section on the reliability versus capacity trade-off that must be addressed when designing large arrays.
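
As a rough illustration of what "recovering data that is lost when a component disk fails" involves, the sketch below rebuilds one missing unit of a parity stripe by XOR-ing the surviving units, which is the standard parity relation in RAID Level 5. It is not taken from the report: the function names `xor_blocks` and `reconstruct`, the two-byte block size, and the sample data are all assumptions made for the example, and the sketch says nothing about the stripe-oriented or disk-oriented scheduling that the report evaluates.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, failed):
    """Rebuild the unit at index `failed` from the other units of one parity stripe.

    `stripe` holds the data units followed by the parity unit (hypothetical layout).
    Because parity is the XOR of all data units, XOR-ing every surviving unit
    yields the contents of the missing one, whether it held data or parity.
    """
    survivors = [unit for i, unit in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)


# Hypothetical stripe: three data units and one parity unit, two bytes each.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)
stripe = data + [parity]

rebuilt = reconstruct(stripe, failed=1)  # pretend the disk holding unit 1 failed
```

Every surviving unit of the stripe must be read to rebuild one lost unit, which is why the choice of data layout affects how much load reconstruction places on each remaining disk.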

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 1994
Accession Number
ADA278935

Entities

People

  • Daniel P. Siewiorek
  • Garth A. Gibson
  • Mark Holland

Organizations

  • Carnegie Mellon University

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Application Software
  • Computations
  • Computer Science
  • Computers
  • Computing System Architectures
  • Data Storage Systems
  • Data Transmission
  • Databases
  • Device Drivers
  • Failure Mode And Effect Analysis
  • Operating Systems
  • Reliability
  • Simulations
  • Simulators
  • Test And Evaluation
  • Workload

Fields of Study

  • Computer science
  • Engineering

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Computational Modeling and Simulation
  • Parallel and Distributed Computing