Simulating Fail-Stop in Asynchronous Distributed Systems

Abstract

The fail-stop failure model appears frequently in the distributed systems literature. However, in an asynchronous distributed system, the fail-stop model cannot be implemented. In particular, it is impossible to reliably detect crash failures in an asynchronous system. In this paper, we show that it is possible to specify and implement a failure model that is indistinguishable from the fail-stop model from the point of view of any process within an asynchronous system. We give necessary conditions for a failure model to be indistinguishable from the fail-stop model, and derive lower bounds on the amount of process replication needed to implement such a failure model. We present a simple one-round protocol for implementing one such failure model, which we call simulated fail-stop.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 1994
Accession Number
ADA278101

Entities

People

  • Keith Marzullo
  • Laura Sabel

Organizations

  • Cornell University

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Asynchronous Systems
  • Computer Science
  • Computers
  • Construction
  • Damage Detection
  • Department Of Defense
  • Detection
  • Detectors
  • Distributed Computing
  • Elections
  • Fault Tolerant Computing
  • Literature
  • Operating Systems
  • Sequences
  • Universities

Fields of Study

  • Computer science
  • Engineering

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Computational Modeling and Simulation