Fail-Safe PVM: A Portable Package for Distributed Programming with Transparent Recovery

Abstract

Through fail-safe PVM we seek to explore fault tolerance for distributed computation from a practical perspective. The fail-safety features are considered to be add-ons to an existing model for distributed computing, viz. PVM in the prototype. We desire the following from the add-ons, with respect to the base model: (1) Application independence. Fail-safety is provided for arbitrary model programs; (2) Application transparency. Failures are invisible to applications that do not bypass the model; (3) Compatibility. The interfaces presented by the model and the modified model are compatible; (4) External to the operating system. The implementation requires only the standard OS interface. This makes it portable, which in turn facilitates heterogeneity and the constitution of systems containing many machines from different administrative domains; (5) Minimal overhead. Overhead during regular execution is minimized; and (6) Practicality. The whole system is considered. In particular, the existence of distributed stable storage is not assumed.

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 1993
Accession Number: ADA266594

Entities

People

  • Allan L. Fisher
  • Juan Leon
  • Peter Steenkiste

Organizations

  • Carnegie Mellon University

Tags

DTIC Thesaurus Topics

  • Communication Channels
  • Computations
  • Computer Networks
  • Computer Programming
  • Computer Science
  • Computers
  • Computing System Architectures
  • Distributed Computing
  • Engineering
  • Fail Safe
  • Fault Tolerance
  • Local Area Networks
  • Network Protocols
  • Network Topology
  • Operating Systems
  • Parallel Computing
  • Software Development

Fields of Study

  • Computer science
  • Engineering

Readers

  • Computer Science
  • Sensor Fusion and Tracking Systems
  • Structural Health Monitoring of Composite Structures