Finding and Fixing Fragility in Machine Learning
Abstract
This dissertation addresses the general problem that machine learning (ML) models are fragile. Fragility arises when models are composed into systems or deployed in real operational environments. It also arises when model inputs are perturbed, even by small amounts, especially when the perturbations are chosen by adversaries. This dissertation applies System Theoretic Process Analysis (STPA), an established state-of-the-art safety analysis methodology borrowed from systems safety engineering, to concrete ML applications with notable social and ethical risks, demonstrating a systematic means to argue for safe and trustworthy ML in sociotechnical systems. STPA connects high-level goals such as safety and the AI ethical principles to low-level design and implementation decisions in the ML life cycle. At the technical level, the dissertation introduces a novel defense for deep neural network (DNN) classifiers that exceeds state-of-the-art adversarial robustness against benchmark attacks on the CIFAR-10 and CIFAR-100 datasets. The best defense, LAD-SRNA, a novel stochastic none-of-the-above (NOTA) defense, achieves AutoAttack attack success rates below the natural error rate on both datasets, with near state-of-the-art accuracy and better than state-of-the-art robustness in classification systems. Finally, this dissertation introduces a total of 16 adaptive attacks, modifying 8 existing state-of-the-art attacks to overcome NOTA defenses, stochastic defenses, and combinations of the two.
Document Details
- Document Type: Technical Report
- Publication Date: Jun 01, 2023
- Accession Number: AD1213505
Entities
People
- Edgar W. Jatho III
Organizations
- Naval Postgraduate School