Fault-Tolerant Wait-Free Shared Objects

Abstract

A concurrent system consists of processes and shared objects. Previous research focused on the problem of tolerating process failures. We study the complementary problem of tolerating object failures. We divide object failures into two broad classes: responsive and non-responsive. With responsive failures, a faulty object responds to every invocation, but responses may be incorrect. With non-responsive failures, a faulty object may also hang without responding. For each class, we consider crash, omission, and arbitrary types of failures. For each type of failure, we are seeking a universal implementation for fault-tolerant wait-free shared objects. We present (deterministic) implementations for all types of responsive failures, including arbitrary failures. In contrast, we show that even the most benign type of non-responsive failures requires the use of randomization. Of special interest is the problem of implementing fault-tolerant objects using only objects of the same type. We present such fault-tolerant self-implementations for many common object types.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 1992
Accession Number
ADA255499

Entities

People

  • Prasad Jayanti
  • Sam Toueg
  • Tushar D. Chandra

Organizations

  • Cornell University

Tags

Communities of Interest

  • Human Systems

DTIC Thesaurus Topics

  • Agreements
  • Algorithms
  • Automata
  • Computer Science
  • Computers
  • Contrast
  • Damage Detection
  • Degradation
  • Detection
  • Distributed Computing
  • Fault Tolerance
  • Iterations
  • Semantics
  • Sequences
  • Simulations
  • Translations
  • Universities

Fields of Study

  • Engineering

Readers

  • Database Systems and Applications
  • Facility/Structural Engineering.
  • Mathematical Modeling and Probability Theory.