FINAL SECOND: Feature Identification, Neutralization, and Automated de-Layering for SEcuring COde ON Demand

Abstract

Objective. We propose the design, implementation, and evaluation of a suite of novel technologies for identifying, isolating, removing, and optionally replacing or customizing behavioral functionalities (features) in binary software programs (e.g., x86/x64/ARM/MIPS native code, and Java/ECMA bytecode) without source code. This will afford code consumers a facility to security-harden and customize COTS software to better meet the security requirements of mission-critical computing environments even when the software's producers are untrusted, possibly unknown (e.g., when some components are legacy modules of uncertain provenance), and unwilling to divulge source codes, disassemblies, or debugging metadata (e.g., symbol tables or relocations). The proposed innovations will offer provably sound protection against known attack classes, and probabilistic (e.g., entropy-based) protection against attack classes that contain defender-unknown exploits (e.g., zero-days). Anticipated products of the proposed research include publication of scientific methodologies and algorithms, prototype implementations, and experimental evaluations of the effectiveness (in terms of security, performance, and precision) of discovered techniques. While there is substantial prior and ongoing research on security-hardening of binary code against exploitation of low-level software artifacts (i.e., functionalities unintended by software developers), the scientific literature on techniques for removal or neutralization of semantic features (i.e., developer-intended but consumer-unwanted functionalities) is relatively sparse and underexplored. We have therefore chosen to focus our proposed research mainly on Task Areas #1 (Feature Identification) and #2 (de-Bloating and de-Layering) of the BAA (although our proposed methodologies necessarily touch on the other task areas as well to some degree). Specifically, we propose to innovate scientific methodologies that empower recipients of closed-source, COTS software to identify undesired, unneeded, or dangerous program functionalities with sufficient precision to facilitate automated removal, replacement, and/or customization of these features without disruption of desired program functionalities.

Approach. Our approach anticipates that in practice, behavioral features of software are often not easily identifiable via a formal specification; they are more often demonstrable by human code recipients through user interaction (e.g., GUI point-and-click), informal description (e.g., a whitelist of supported machine architectures), or simulation (e.g., unit testing of whitelisted features). We therefore propose an approach that combines machine learning-based approaches for inferring which syntactic code features (e.g., code blocks or dataflows) are associated with which human-identified software behaviors, with rigorous semantic code analyses for safely removing or replacing the identified code features with security-hardened code. While some of our solutions propose virtualized environments for training models used for feature identification and isolation, the resulting de-bloated, de-layered software is intended for execution on independent, non-virtualized, commodity platforms (e.g., Linux or Windows OSes without a VM).

Impact and Merit. The proposed research will significantly advance the scientific state-of-the-art of binary code security hardening from mere artifact-neutralization to feature-neutralization and replacement. Project outcomes will facilitate Navy adaptation and integration of production-level, source-free, binary COTS software into mission-critical operating environments at low cost.
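
As an illustration only, the idea in the Approach of associating code blocks with human-demonstrated behaviors can be sketched with a simple differential-coverage heuristic. This is not the project's method: it stands in for the machine learning step with plain frequency counting, and the function names, the `threshold` parameter, and the representation of a trace as a set of basic-block addresses are all assumptions made for this sketch. A block is flagged as a removal candidate only if it appears far more often in traces of an unwanted feature than in traces of wanted features, and never appears in any wanted trace.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical representation: one trace = the set of basic-block addresses
# executed while a user demonstrated a single behavior.
Trace = Set[int]


def block_scores(wanted: List[Trace], unwanted: List[Trace]) -> Dict[int, float]:
    """Score each block as (share of unwanted traces that hit it)
    minus (share of wanted traces that hit it)."""
    if not wanted or not unwanted:
        raise ValueError("need at least one trace of each kind")
    wanted_hits = Counter(b for t in wanted for b in t)
    unwanted_hits = Counter(b for t in unwanted for b in t)
    blocks = set(wanted_hits) | set(unwanted_hits)
    return {
        b: unwanted_hits[b] / len(unwanted) - wanted_hits[b] / len(wanted)
        for b in blocks
    }


def removal_candidates(
    wanted: List[Trace], unwanted: List[Trace], threshold: float = 0.5
) -> List[int]:
    """Return blocks scoring at least `threshold` that no wanted trace executes.

    The second condition is a conservative guard: a block that any wanted
    behavior depends on is never proposed for removal.
    """
    scores = block_scores(wanted, unwanted)
    kept = set().union(*wanted)
    return sorted(b for b, s in scores.items() if s >= threshold and b not in kept)


if __name__ == "__main__":
    # Made-up addresses for demonstration; not taken from any real binary.
    wanted = [{0x1000, 0x1010, 0x1020}, {0x1000, 0x1010, 0x1030}]
    unwanted = [{0x1000, 0x2000, 0x2010}, {0x1000, 0x2000, 0x1020}]
    print([hex(b) for b in removal_candidates(wanted, unwanted)])
```

A heuristic like this yields only candidates. Observed traces cannot prove that a block is unreachable from wanted behaviors, which is why the Approach pairs the statistical inference with rigorous semantic analysis before any code is removed or replaced.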

Document Details

Document Type
DoD Grant Award
Publication Date
Sep 29, 2017
Source ID
N000141712995

Entities

People

  • Kevin W. Hamlen

Organizations

  • Office of Naval Research
  • United States Navy
  • University of Texas at Dallas

Tags

Fields of Study

  • Computer science

Readers

  • Computer Programming and Software Development
  • Distributed Systems and Data Platform Development
  • Military Logistics and Supply Chain Management

Technology Areas

  • AI & ML