Petablox: Large-Scale Software Analysis and Analytics Using Datalog
Abstract
Program analysis encompasses techniques and tools that analyze code to predict program behavior, with important benefits to programmer productivity and software quality. But widespread adoption of this technology is hindered by fundamental challenges in accuracy, scalability, and usability. This project developed foundational techniques and open-source artifacts, specifically: 1) a framework to effectively balance different analysis tradeoffs by combining logical and probabilistic reasoning; 2) methodologies that enable analysis designers to leverage massive code corpora to automatically learn features, weights, and probability distributions directly from code; 3) solver techniques for efficient inference and learning; and 4) a system for integrating the results of program analysis tools into prevalent developer workflows. The key findings were: 1) the discovery of 100 new bugs in large, widely used C/C++ programs such as Linux and OpenSSL; 2) accurate analysis of malicious Android apps and enterprise Java programs for information leaks and concurrency safety, respectively; and 3) a demonstration of a continuous integration tool on the GitHub platform to find bugs in C/C++ projects.
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 2020
- Accession Number: AD1098764
Entities
People
- Mayur Naik
Organizations
- Georgia Tech Research Corporation