Systematically Studying Backdoor Attacks on DNNs and Developing a Detection Architecture

Abstract

Major Goals: Backdoor attacks [1][2] and Trojaning attacks [3] are two particularly threatening attacks on DNNs in the above settings. The goal of these attacks is to produce a backdoored neural network that gives normal outputs on normal data but wrong outputs on data embedded with a backdoor key. In image classification, for example, the wrong outputs can be either targeted or non-targeted. The backdoor key can be a small predefined patch overlaid on a normal input image, or even a special physical item that appears in a photo. Our preliminary experiments showed that when such a backdoored neural network is deployed in a face-recognition system or an autonomous car, attackers can easily fool the system by attaching a predefined sticker to a person's face or to a road sign. Notably, implementing such attacks does not require changing the training process or the structure of the trained model; only a small portion of the training data needs to be poisoned or perturbed. Existing defenses against backdoor and poisoning attacks, including some robust linear regression models, focus on building robust training algorithms so that the trained models can either resist or ignore the poisoned samples. However, these methods either require access to the training process or are limited to non-deep-learning algorithms. For example, Auror [4] defends against poisoning attacks during the training phase of DNNs, but it cannot detect backdoors in a pre-trained model. Jagielski et al. [5] proposed a robust defense algorithm against poisoning attacks that protects only linear regression models. In parallel to backdoor defenses, DNN verification is widely studied in contexts such as medical diagnosis and autonomous driving to find undesirable corner cases of a given DNN model.
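The data-poisoning step described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: it assumes grayscale images stored as a NumPy array of shape (N, H, W), a square patch as the backdoor key, and a targeted attack in which poisoned samples are relabeled to a single class. The function names `stamp_trigger` and `poison_dataset`, the patch position, and the 5% poisoning fraction are all hypothetical choices made for illustration.

```python
import numpy as np


def stamp_trigger(images, patch, row=0, col=0):
    """Return a copy of `images` (N, H, W) with `patch` overlaid at (row, col)."""
    stamped = images.copy()
    ph, pw = patch.shape
    stamped[:, row:row + ph, col:col + pw] = patch
    return stamped


def poison_dataset(images, labels, patch, target_label, fraction, seed=0):
    """Stamp the trigger on a random `fraction` of samples and relabel them.

    Illustrative assumption: a targeted backdoor, so every poisoned sample
    receives the same `target_label`.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(fraction * len(images)))
    idx = rng.choice(len(images), size=n_poison, replace=False)

    poisoned_images = images.copy()
    poisoned_labels = labels.copy()
    poisoned_images[idx] = stamp_trigger(images[idx], patch)
    poisoned_labels[idx] = target_label
    return poisoned_images, poisoned_labels, idx


# Toy data standing in for a real training set (hypothetical sizes).
data_rng = np.random.default_rng(1)
images = data_rng.random((100, 28, 28)).astype(np.float32)
labels = data_rng.integers(0, 10, size=100)
patch = np.ones((3, 3), dtype=np.float32)  # a small white square as the key

p_images, p_labels, idx = poison_dataset(
    images, labels, patch, target_label=7, fraction=0.05
)
```

Only the selected samples change; the training procedure and the model architecture are untouched, which is the property the abstract highlights. A model trained on `p_images` and `p_labels` would be expected to learn the association between the patch and the target label only if the poisoned fraction and patch are sufficient for the task, which this sketch does not test.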


Document Details

Document Type: Technical Report
Publication Date: Feb 03, 2022
Accession Number: AD1201331

Entities

People

Hai Helen Li

Organizations

Duke University
