Petablox: Large-Scale Software Analysis and Analytics Using Datalog

Abstract

Program analysis encompasses techniques and tools that analyze code to predict program behavior, with important benefits to programmer productivity and software quality. But widespread adoption of this technology is hindered by fundamental challenges in accuracy, scalability, and usability. This project developed foundational techniques and open-source artifacts, specifically: 1) a framework to effectively balance different analysis tradeoffs by combining logical and probabilistic reasoning; 2) methodologies to enable analysis designers to leverage massive code corpora to automatically learn features, weights, and probability distributions directly from code; 3) solver techniques for efficient inference and learning; and 4) a system for integrating results of program analysis tools into prevalent developer workflows. The key findings were: 1) the discovery of 100 new bugs in large, widely used C/C++ programs such as Linux and OpenSSL; 2) accurate analysis of malicious Android apps and enterprise Java programs for information leaks and concurrency safety, respectively; and 3) a demonstration of a continuous integration tool on the GitHub platform to find bugs in C/C++ projects.

Document Details

Document Type
Technical Report
Publication Date
May 01, 2020
Accession Number
AD1098764

Entities

People

  • Mayur Naik

Organizations

  • Georgia Tech Research Corporation

Tags

Communities of Interest

  • Autonomy
  • Cyber
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Accuracy
  • Air Force
  • Air Force Research Laboratories
  • Artificial Intelligence
  • Bayesian Networks
  • Computer Languages
  • Computer Programming
  • Computer Programs
  • Information Retrieval
  • Information Science
  • Machine Learning
  • Neural Networks
  • Probability
  • Probability Distributions
  • Programming Languages
  • Reasoning
  • Software Development

Fields of Study

  • Computer science
  • Engineering

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML