Semantic Decompilation of Deep Neural Network Binaries and Its Adversarial and Defensive Implications

Abstract

(Approved for public release)

Motivation. Deep Neural Networks (DNNs), a widely adopted class of deep learning models, are often compiled into binary code using dedicated compilers and run natively on target computing platforms. From the security perspective, an important research problem is: given the DNN binary code, is it possible to infer the semantic definition of the DNN model? Addressing this problem has major security implications for both producers and consumers of DNN-based systems. Moreover, adversaries may see opportunities to steal or backdoor DNN models in commodity systems. Our research is motivated by the fact that traditional binary decompilation techniques are not sufficient to recover the semantics of compiled DNN models, and that there has been no research effort that reveals the adversarial and defensive implications of DNN decompilation.

Proposed Work. In this research, we propose to develop a novel generic approach to recovering DNN models from their compiled binary code. Furthermore, we will explore both the adversarial and defensive implications of DNN binary reverse engineering, to reveal the nature of this capability as a "double-edged sword" in the landscape of machine learning (ML) security across the software supply chain of ML-based systems. Our proposed research will be conducted in three main tasks: Task I involves the development of a DNN binary semantic decompiler, based on a novel DNN intermediate representation (IR); Task II studies how adversaries could use the DNN semantic decompiler to launch a variety of powerful attacks, such as proprietary DNN model theft, DNN backdoor injection, and white-box adversarial ML; Task III involves the development of advanced defenses to secure DNN-based systems by either leveraging or disrupting DNN decompilation.

Innovation Claim and Impacts. To the best of our knowledge, this research is the first to holistically address trusted ML from model (DNN), system (trusted execution), and software supply chain (compiler and decompiler) perspectives, synergizing classic ML security with program analysis, software protection, and trusted and confidential computing. More specifically, our research will realize the following innovations: (1) We will implement the first DNN decompiler able to fully and accurately recover the underlying DNN model from the binary. (2) The IR used by the DNN decompiler is the first semantic IR to characterize a DNN's tensor operations in a generic way, supporting binaries compiled by different compilers and for different architectures. (3) We will demonstrate backdoor injection enabled by DNN decompilation, a new attack not reported before. (4) We are the first to reveal the potential attack surface of future Neurosymbolic AI (NSAI) binaries, enabled by DNN decompilation. (5) The DNN compiler fuzzing defense, which leverages our DNN decompiler, is the first to address DNN supply chain security. (6) We are the first to propose a compiler system that obfuscates DNN binaries and partitions DNN models, leveraging hardware features to thwart malicious DNN decompilation.

Document Details

Document Type
DoD Grant Award
Publication Date
Jan 12, 2023
Source ID
N000142312157

Entities

People

  • Jing Tian

Organizations

  • Office of Naval Research
  • Purdue University
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Cybersecurity
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Neural Networks