Model-assisted machine-code synthesis

Abstract

Binary rewriters are tools that are used to modify the functionality of binaries lacking source code. Binary rewriters can be used to rewrite binaries for a variety of purposes including optimization, hardening, and extraction of executable components. To rewrite a binary based on semantic criteria, an essential primitive to have is a machine-code synthesizer---a tool that synthesizes an instruction sequence from a specification of the desired behavior, often given as a formula in quantifier-free bit-vector logic (QFBV). However, state-of-the-art machine-code synthesizers such as McSynth++ employ naive search strategies for synthesis: McSynth++ merely enumerates candidates of increasing length without performing any form of prioritization. This inefficient search strategy is compounded by the huge number of unique instruction schemas in instruction sets (e.g., around 43,000 in Intel's IA-32) and the exponential cost inherent in enumeration. The effect is slow synthesis: even for relatively small specifications, McSynth++ might take several minutes or a few hours to find an implementation.

Document Details

Document Type
Pub Defense Publication
Publication Date
Oct 12, 2017
Source ID
10.1145/3133885

Entities

People

  • Ara Vartanian
  • Thomas Reps
  • Venkatesh Srinivasan

Organizations

  • Defense Advanced Research Projects Agency
  • National Science Foundation
  • University of Wisconsin–Madison

Tags

Readers

  • Computer Science.
  • Mathematical Modeling and Probability Theory.
  • Operations Research