Effectiveness Evaluation of Fault-Tolerant Multiprocessor Systems.

Abstract

An important area of research is the analysis of the coverage of a fault-tolerant system, that is, the probability that the system can recover from a fault. The author has studied a variety of models, from simple phase-type models to very complex stochastic Petri net models, and has investigated solution techniques for each model type. His methodology allows consideration of external events that can interfere with recovery, such as a hard limit on recovery time or the occurrence of a second, near-coincident fault. It was discovered that a policy of attempting transient recovery upon detection of an error (as opposed to automatically reconfiguring the affected component out of the system) may actually increase the unreliability of the system. This result holds when error detectability is not nearly perfect, so that the risk of producing an undetectable error (if the transient error is present) is greater than the benefit gained by not discarding the component. Keywords: Bibliographies.
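The trade-off described in the abstract can be illustrated with a deliberately simplified sketch. This is not the report's model (the report uses phase-type and stochastic Petri net models); it is a hypothetical one-step comparison. All names and parameters are assumptions introduced here: `c` is the error detectability, `q_full` is the probability the system fails later with the component kept, and `q_less` is the probability it fails later with the component discarded.

```python
# Toy comparison of two recovery policies (illustrative assumption only,
# not the model used in the report).

def unreliability_discard(q_less: float) -> float:
    """Policy 1: reconfigure the affected component out of the system.

    The system continues with one fewer component, so its unreliability
    is simply q_less (hypothetical parameter).
    """
    return q_less


def unreliability_retry(c: float, q_full: float) -> float:
    """Policy 2: attempt transient recovery and keep the component.

    Assumption: with probability (1 - c) an error escapes detection and
    the system fails; with probability c the system continues with the
    full set of components and fails later with probability q_full.
    """
    return (1.0 - c) + c * q_full


def break_even_detectability(q_full: float, q_less: float) -> float:
    """Detectability at which both policies give equal unreliability.

    Solves (1 - c) + c * q_full = q_less for c.
    """
    return (1.0 - q_less) / (1.0 - q_full)


if __name__ == "__main__":
    q_full, q_less = 0.01, 0.05  # made-up values for illustration
    c_star = break_even_detectability(q_full, q_less)
    for c in (0.90, c_star, 0.999):
        print(c, unreliability_retry(c, q_full), unreliability_discard(q_less))
```

Under these assumed numbers, the retry policy is the less reliable one for any detectability below the break-even value, which mirrors the abstract's qualitative claim that retrying only pays off when detectability is nearly perfect.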

Document Details

Document Type
Technical Report
Publication Date
Jan 27, 1988
Accession Number
ADA191688

Entities

People

  • Kishor S. Trivedi

Organizations

  • Duke University

Tags

DTIC Thesaurus Topics

  • Applied Mathematics
  • Availability
  • Classification
  • Communication Systems
  • Computer Science
  • Computers
  • Corporations
  • Electronic Mail
  • Markov Chains
  • Markov Models
  • Multiprocessors
  • North Carolina
  • Operations Research
  • Petri Nets
  • Probability
  • Software Development
  • Test And Evaluation

Fields of Study

  • Engineering

Readers

  • Fault Tolerant Diagnosis of Black and White Balloon Isolation Tests Using ¥.
  • Mathematical Modeling and Probability Theory.
  • Systems Analysis and Design