Scaling Concolic Execution of Binary Programs for Security Applications

Abstract

Concolic execution is a technique for program analysis that makes the values of certain inputs symbolic, symbolically executes a program's code, and computes a symbolic logical formula to represent a desired behavior of the program under analysis. The computed formula is then solved by a decision procedure to determine whether the desired behavior is feasible and, if so provide an example program input that satisfies the formula. Concolic execution and similar techniques have widely been applied to a variety of security-related applications including automatic test input generation, vulnerability discovery, exploit generation, signature generation, protocol reverse engineering and detecting deviations between software implementations. Although there has been a great success in applying it to various security-related applications, a basic implementation of concolic execution only works well on small programs and scaling it to real-world binary programs is difficult. One reason is that programs often contain certain code constructs that are difficult to reason about directly such as loops and encoding functions. Another reason is that the number of symbolic formulas grows drastically in proportion to the size of the program being analyzed. These observations led us to develop three scaling techniques for concolic execution. The first scaling technique, loop-extended concolic execution, focuses on improving the efficiency of concolic execution when analyzing program portions that involve loops. The second technique, decomposition and re-stitching of concolic execution, addresses the issue that arose from the presence of encoding functions, which are difficult to reason about automatically. The third technique uses the state model of the program under analysis to guide concolic execution. Our techniques work on program binaries and do not require the presence of source code or debugging information in the binaries.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2013
Accession Number
ADA590533

Entities

People

  • Pongsin Poosankam

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • C4I
  • Cyber
  • Energy and Power Technologies
  • Engineered Resilient Systems

DTIC Thesaurus Topics

  • Authentication
  • Coding
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Cryptography
  • Cybersecurity
  • Debugging
  • Detection
  • Engineering
  • Intrusion Detectors
  • Message Encoding
  • Military Research
  • Operating Systems
  • Software Development
  • Software Testing

Fields of Study

  • Computer science
  • Engineering

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design.
  • Computer Programming and Software Development.
  • Parallel and Distributed Computing.