Human Decisions on Targeted and Non-Targeted Adversarial Samples

Abstract

In a world that relies increasingly on large amounts of data and on powerful Machine Learning (ML) models, the veracity of decisions made by these systems is essential. Adversarial samples are inputs that have been perturbed to mislead an ML model's interpretation, and they are a dangerous vulnerability. Our research takes a first step toward what could be an important innovation in cognitive science: we analyzed human judgments and decisions when confronted with targeted adversarial samples (inputs constructed to make an ML model purposely misclassify them as something else) and non-targeted adversarial samples (noisy, perturbed inputs that try to trick the ML model). Our findings suggest that although ML models that produce non-targeted adversarial samples can be more efficient than those that produce targeted samples, non-targeted samples result in more incorrect human classifications than targeted samples do. In other words, non-targeted samples interfered more with human perception and categorization decisions than targeted samples.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2017
Accession Number: AD1157334

Entities

People

Bennett I. Bertenthal
Cleotilde Gonzalez
Prashanth Rajivan
Samuel M. Harding

Organizations

Carnegie Mellon University
