Predictive Failure Avoidance
Abstract
Software plays a critical role in the safe, secure, and correct operation of military systems. At the same time, software is increasing in scale, in complexity, and in its use of rich data sources, such as decision trees or association rules computed by machine learning. Its critical role demands a comprehensive approach to software verification so that deployed systems can be trusted to be safe and mission capable. Its growing scale and complexity mean that, despite significant advances in software verification techniques, modern software-intensive systems cannot be shown to be free of potential failures or vulnerabilities prior to deployment. We cannot afford to simply forgo deployment of software-intensive systems that can provide significant operational advantage, but deploying systems without complete verification carries a risk of failure or vulnerability that might compromise system safety or mission capability. We propose to mitigate that risk by developing efficient "predictive failure avoidance" (PFA) techniques that (1) monitor system execution, (2) predict when failures may occur, and then (3) steer program execution to avoid the failure while minimally affecting program behavior.
Over the past decade, researchers have explored a variety of approaches to avoiding program failures or security vulnerabilities. Broadly speaking, these approaches exhibit significant shortcomings. They rely on analyses that lead to either incomplete or unnecessary failure avoidance. They rely on whole-program analyses, which limits scalability. They compute failure characterizations that are inefficient to monitor at runtime. They incur significant developer or runtime overhead. Finally, they adopt extreme failure adaptation strategies, e.g., aborting the computation, which may not be generally applicable.
Our proposed research aims to overcome these limitations by exploiting the hypothesis that, for any program execution, there exist alternative executions that yield similar results. This may seem counterintuitive, since it is well known that program behavior is discontinuous, but an increasing body of evidence suggests that the more complex the software system, the more true this hypothesis becomes. Leveraging this property will allow our PFA approach to offer faster and more effective failure avoidance than existing repair techniques. Conceptually, our predictive failure avoidance approach seeks to (a) predict when a program execution will fail, (b) diagnose the path that the failing execution will take, (c) identify non-failing paths that are similar to the failing path, and (d) perturb the program state to execute the nearest non-failing path. To achieve this, we will develop modular variants of software verification techniques that compute logical descriptions of the inputs on which a program fails. We will use a range of structural and semantic analyses to compute a "neighborhood" of execution paths that are near failing paths. We will develop algorithms that compute the distance between such neighborhoods to identify the best alternative path. Finally, we will generate functions that can be executed at runtime to adapt the state of a program that is destined to fail so that it instead follows an alternative path. We will implement our techniques in prototype tools and evaluate them on a corpus of more than 1000 C benchmark programs to assess the accuracy of fault avoidance, the acceptability of the alternative execution paths, and runtime overhead.
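As an illustration only, steps (a) through (d) can be sketched on a toy program. This is not the project's implementation: every name below (`toy_program`, `predicts_failure`, `path_of`, `steer`, `DELTAS`) is hypothetical, the failure predicate is written by hand rather than derived by verification, and the "neighborhood" is found by brute-force enumeration of small state perturbations instead of structural or semantic analysis.

```python
from itertools import product

# Hypothetical perturbation steps tried for each variable (0 = leave unchanged).
DELTAS = (0, 1, -1)


def toy_program(state):
    """Toy program under protection; raises ZeroDivisionError when d == 0."""
    x, d = state["x"], state["d"]
    if x < 0:      # branch 1
        x = -x
    if d > 10:     # branch 2
        d = 10
    return x // d


def predicts_failure(state):
    """(a) Failure predicate; assumed here to be known in advance."""
    return state["d"] == 0


def path_of(state):
    """(b) Branch outcomes the toy program takes from this state."""
    return (state["x"] < 0, state["d"] > 10)


def path_distance(p, q):
    """Hamming distance between two branch-outcome tuples."""
    return sum(a != b for a, b in zip(p, q))


def state_distance(s, t):
    """L1 distance between two program states."""
    return sum(abs(s[k] - t[k]) for k in s)


def steer(state):
    """(c) + (d): pick the nearest non-failing state, or keep the state if safe."""
    if not predicts_failure(state):
        return state
    failing_path = path_of(state)
    candidates = [
        {"x": state["x"] + dx, "d": state["d"] + dd}
        for dx, dd in product(DELTAS, repeat=2)
    ]
    safe = [c for c in candidates if not predicts_failure(c)]
    # Prefer paths closest to the failing path, then the smallest state change.
    return min(
        safe,
        key=lambda c: (path_distance(path_of(c), failing_path),
                       state_distance(c, state)),
    )


def run_protected(state):
    """Monitor, predict, and steer before executing the toy program."""
    return toy_program(steer(state))
```

In this sketch, `run_protected({"x": 7, "d": 0})` perturbs only `d` (from 0 to 1) and then runs the program, whereas calling `toy_program` directly on that state would raise `ZeroDivisionError`. States that are not predicted to fail pass through `steer` unchanged.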
Document Details
- Document Type: DoD Grant Award
- Publication Date: Feb 14, 2019
- Source ID: W911NF1910054
Entities
People
- Matthew B. Dwyer
Organizations
- Army Contracting Command
- United States Army
- University of Virginia