The Willow Architecture: Comprehensive Survivability for Large-Scale Distributed Applications
Abstract
The Willow architecture is a comprehensive approach to survivability in critical distributed applications. Survivability is achieved in a deployed system using a unique combination of (a) fault avoidance, by intentionally disabling vulnerable network elements when a threat is detected or predicted; (b) fault elimination, by replacing system software elements when faults are discovered; and (c) fault tolerance, by reconfiguring the system if non-maskable damage occurs. The key to the architecture is a powerful reconfiguration mechanism combined with a general control structure in which network state is sensed and analyzed and the required changes are effected. The architecture can be used to deploy software functionality enhancements as well as to provide survivability. Novel aspects include: node configuration control mechanisms; a workflow system for resolving conflicting configurations; communications based on wide-area event notification; tolerance for wide-area, hierarchic and sequential faults; and secure, scalable and delegatable trust models.
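The control structure described in the abstract (sense network state, analyze it, effect the required changes) can be illustrated with a minimal sketch. This is not the Willow implementation: the names `NodeState`, `Action`, `analyze`, and `control_step`, the state fields, and the priority order among the three responses are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # One hypothetical action per survivability mechanism in the abstract.
    DISABLE = "disable"          # fault avoidance
    REPLACE = "replace"          # fault elimination
    RECONFIGURE = "reconfigure"  # fault tolerance


@dataclass
class NodeState:
    # Hypothetical state model; the abstract does not define one.
    name: str
    threatened: bool = False
    faulty_software: bool = False
    damaged: bool = False


def analyze(states):
    """Map sensed node states to required changes (the 'analyze' step).

    Assumed priority: damage first, then faulty software, then threats.
    """
    plan = []
    for s in states:
        if s.damaged:
            plan.append((s.name, Action.RECONFIGURE))
        elif s.faulty_software:
            plan.append((s.name, Action.REPLACE))
        elif s.threatened:
            plan.append((s.name, Action.DISABLE))
    return plan


def control_step(sense, effect):
    """Run one sense -> analyze -> effect cycle and return the plan."""
    plan = analyze(sense())
    for node, action in plan:
        effect(node, action)
    return plan
```

Here `sense` is any callable returning a list of `NodeState` values and `effect` is any callable that applies one action to one node. Passing both in as parameters keeps the loop independent of how state is gathered or how changes are delivered.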
Document Details
- Document Type: Technical Report
- Publication Date: Dec 01, 2001
- Accession Number: ADA436790
Entities
People
- Alexander L. Wolf
- Antonio Carzaniga
- Dennis M. Heimbigner
- John Knight
- Jonathan Hill
- Michael Gertz
- Premkumar Devanbu
Organizations
- University of Colorado Boulder