Attacking Neural Networks with High Entropy Input Sampling
Abstract
Deep learning is becoming central to the safety and accuracy of many types of systems. Unfortunately, attackers can create adversarial examples that manipulate Deep Neural Networks (DNNs) into making incorrect predictions. They do this by carefully crafting perturbed inputs that, to humans, look indistinguishable from examples the DNN would classify correctly. Research shows that adversarial examples exist near the decision boundary. Decision-based attacks find adversarial examples by traversing the data manifold toward the decision boundary using iterative sampling, without any knowledge of the model parameters or gradients. Because they require no such knowledge, decision-based attacks apply to many real-world attack scenarios. We propose a new decision-based attack, High Entropy Input Sampling (HEIS), that iteratively steps toward the decision boundary, using entropy over class predictions as a heuristic to find adversarial examples without any knowledge of the model gradients. Using HEIS, we produced adversarial examples that reduced the accuracy of a CIFAR-10 DNN from 91% to 11% for epsilon = 0.2 and reduced the accuracy of ResNet50, an ImageNet DNN, from 81% to 22% for epsilon = 0.4. Furthermore, we found that the adversarial examples are highly transferable to other models, causing dramatic drops in accuracy among all models tested. Finally, we use HEIS to break three state-of-the-art neural network defenses.
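The idea of stepping toward the decision boundary by following prediction entropy can be illustrated with a minimal sketch. This is not the report's implementation: the function names (`prediction_entropy`, `heis_sketch`), the uniform random proposal scheme, the parameters `steps`, `samples`, and `step_size`, and the per-coordinate (L-infinity style) epsilon bound are all assumptions made for illustration. The sketch also assumes the attacker can query a `predict` function that returns class probabilities for an input with values in [0, 1].

```python
import math
import random


def prediction_entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(values):
    """Index of the largest element (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def heis_sketch(predict, x, epsilon, steps=50, samples=20, step_size=0.05, seed=0):
    """Hypothetical entropy-guided search; assumes predict(x) returns probabilities.

    Each iteration proposes random neighbours of the current point, keeps the
    one with the highest prediction entropy, and stops as soon as a neighbour
    is assigned a label different from the original input's label.
    """
    rng = random.Random(seed)
    original_label = argmax(predict(x))
    current = list(x)
    for _ in range(steps):
        best, best_entropy = current, prediction_entropy(predict(current))
        for _ in range(samples):
            candidate = []
            for value, start in zip(current, x):
                moved = value + rng.uniform(-step_size, step_size)
                # Stay within epsilon of the original input (assumed bound).
                moved = min(max(moved, start - epsilon), start + epsilon)
                # Stay within the valid input range [0, 1].
                candidate.append(min(max(moved, 0.0), 1.0))
            probs = predict(candidate)
            if argmax(probs) != original_label:
                return candidate  # crossed the decision boundary
            entropy = prediction_entropy(probs)
            if entropy > best_entropy:
                best, best_entropy = candidate, entropy
        current = best
    return current  # no label change found within the query budget
```

Entropy is highest where the model is least certain among classes, which is why it can serve as a gradient-free signal for proximity to the decision boundary. The search uses only model outputs, never parameters or gradients.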
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2022
- Accession Number: AD1200446
Entities
People
- Daniel S. Deridder
Organizations
- Naval Postgraduate School