Intelligent Systems, Advanced Learning Theory, Methodology, and Techniques: Mapping Black-Box Attack Metrics and Parameter Spaces in Machine Learning

Abstract

Recent advances in machine learning (ML) have vastly improved computational reasoning over complex domains. From video and text classification to complex data analysis, machine learning is constantly finding new applications. Yet, when machine learning models are exposed to adversarial behavior, the systems built upon them can be fooled, evaded, and misled in ways that can have profound security implications. Such concerns are growing in the face of new classes of "black-box" attacks that create adversarial samples without direct access to the models or their parameters, architecture, or training data. We seek to address these concerns by exploring practical and actionable metrics and the attacker and defender parameter spaces within black-box attacks on machine learning. More specifically, in this proposed work, we will explore an important element of the new science of security in machine learning: measurement. The work will be organized around three research thrusts exploring (a) measures and limits of transferability, (b) measures and impacts of dimensionality and model size, and (c) the application and evaluation of transferability and dimensionality metrics across diverse domains. In the first research thrust, we will explore transferability, one of the properties that enable black-box attacks, through two sub-thrusts: exploring the relationship between adversarial effort at creating surrogate models (e.g., number of oracle queries) and attack effectiveness, and exploring the relationship between sample distortion limits and attack effectiveness. In the second research thrust, we will investigate the relationship between dimensionality and both attack effectiveness and model resiliency. Here we explore how the number of features an adversary can control and the model/dataset size impact the degree to which adversarial examples can be crafted and defended against.
The last research thrust focuses on the validation of the developed metrics within a broad range of application domains. Here, we hypothesize that models associated with different domains (e.g., image, network, sensor data, and malware) are unique in their vulnerability to adversarial action and in their ability to be robust in the face of concerted adversarial effort. The scientific approach used to carry out this work (like much of the other work in the field) will be focused on applied theory in machine learning and empirical study. Here, we will expand our efforts in open-source tools and test harnesses to allow massive-scale experimentation in a broad range of domains. These tools and results will be publicly distributed within the research and larger technical communities. This effort will produce a scientific understanding of the limits of black-box attacks on machine learning models and identify important countermeasures that increase the resilience of models to adversarial action. Our focus here is to ensure that such understanding generalizes to many important domains in intelligent systems. By going beyond the basic analysis of unconstrained images, we aim to expand the community's understanding of black-box attacks in domains that are more important to security and systems.

Document Details

Document Type
DoD Grant Award
Publication Date
Jul 02, 2019
Source ID
W911NF1910374

Entities

People

  • Patrick Drew McDaniel

Organizations

  • Army Contracting Command
  • Pennsylvania State University
  • United States Army

Tags

Fields of Study

  • Computer science

Readers

  • Cybersecurity
  • Neural Network Machine Learning
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Cyber
  • Space