Policy Specification for Non-Local Fault Tolerance in Large Distributed Information Systems

Abstract

The services provided by critical infrastructure systems are essential to the operation of modern society. These systems include the financial payments system, transportation systems, military command and control systems, the electric power grid, and telecommunications systems including the Internet. Widespread failure of any of these systems might result in severe financial loss or perhaps human injury. Critical infrastructure systems rely heavily on distributed information systems for operation. These information systems must therefore be dependable; that is, they must "deliver service that can justifiably be trusted." Traditional dependability alone does not provide a rich enough model to deal with the faults in large, critical distributed systems operating in hostile environments. These systems require not merely dependability but survivability. Informally, survivability is "the ability to continue to provide service (possibly degraded or different) in a given environment when various events cause major damage to the system or its operating environment." One means of achieving survivability is non-local fault tolerance, in which faults that affect significant portions of the network must be detected and handled in a coordinated fashion. Our approach is a survivability control system. This control system takes network sensor events as input, uses them to detect faults, and responds by reconfiguring the application. This thesis presents TEDL, the Time-based Event Detection Language, for formal specification of the reactive policy of this control system. A translator synthesizes an executable implementation from this specification. Results are presented from using TEDL to describe and execute several attack and failure scenarios for a simplified financial payments system.
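
The control loop described in the abstract (sensor events in, fault detection, reconfiguration out) can be illustrated with a minimal Python sketch. This is not TEDL syntax and not the thesis's implementation; the class names, the rule form (a fault is declared when a threshold number of distinct nodes report the same event kind within a time window), and the response label are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorEvent:
    """Hypothetical sensor report: when it happened, where, and what kind."""
    time: float
    node: str
    kind: str


class WindowRule:
    """Hypothetical time-based rule: fire `response` when at least `threshold`
    distinct nodes report `kind` within `window` seconds of each other."""

    def __init__(self, kind, threshold, window, response):
        self.kind = kind
        self.threshold = threshold
        self.window = window
        self.response = response
        self.recent = deque()

    def observe(self, event):
        if event.kind != self.kind:
            return None
        self.recent.append(event)
        # Drop events that have fallen out of the time window.
        while self.recent and event.time - self.recent[0].time > self.window:
            self.recent.popleft()
        distinct_nodes = {e.node for e in self.recent}
        if len(distinct_nodes) >= self.threshold:
            self.recent.clear()  # reset so one fault yields one response
            return self.response
        return None


def run_control_loop(events, rules):
    """Feed events to every rule in time order; collect (time, response) pairs."""
    actions = []
    for event in sorted(events, key=lambda e: e.time):
        for rule in rules:
            action = rule.observe(event)
            if action is not None:
                actions.append((event.time, action))
    return actions


if __name__ == "__main__":
    rule = WindowRule("intrusion", threshold=3, window=10.0,
                      response="switch_to_degraded_service")
    events = [
        SensorEvent(0.0, "bank_a", "intrusion"),
        SensorEvent(5.0, "bank_b", "intrusion"),
        SensorEvent(8.0, "bank_c", "intrusion"),
    ]
    print(run_control_loop(events, [rule]))
```

The point of the sketch is the non-local aspect: no single node's report triggers a response, only a correlated pattern across several nodes within a bounded time.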

Document Details

Document Type
Technical Report
Publication Date
May 01, 2003
Accession Number
ADA479837

Entities

People

  • Philip E. Varner

Organizations

  • University of Virginia

Tags

Communities of Interest

  • C4I
  • Cyber
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Command And Control Systems
  • Computer Programming
  • Computer Science
  • Control Systems
  • Detection
  • Detectors
  • Fault Tolerance
  • Formal Languages
  • Grammars
  • Information Systems
  • Intrusion Detection
  • Intrusion Detection Systems
  • Intrusion Detectors
  • Language
  • Load Monitoring
  • Network Architecture
  • Programming Languages

Fields of Study

  • Computer science

Readers

  • Computer Networking
  • Educational Psychology
  • Fault Tolerant Diagnosis of Black and White Balloon Isolation Tests Using ¥.

Technology Areas

  • Fully Networked C3
  • Fully Networked C3 - Command and Control