Simpler: Hybrid Dynamic and Static Techniques for Trustworthy Data Analytics
Abstract
Statement of Work: This research project will develop SIMPLER, a limited sublanguage of R in which high-level features and abstraction layers have been removed, along with the tools required to transform programs written in R into SIMPLER. SIMPLER is a more efficient, more restrictive, and more secure version of R, a dynamic programming language. The R programming environment provides a vast library of statistics and machine learning algorithms and is normally used for data analysis in big-data environments.
Objective: The objective of this research is to develop hybrid static and dynamic program analysis and program transformation techniques that automatically remove, or desugar, both language features and layers of abstraction from programs written in high-level dynamic languages. The PI has selected the R programming language as the initial target for the investigation, but the techniques apply equally well to JavaScript, Python, and even the dynamic parts of Java.
Approach: The project's basic approach is to start with a program written in full R, a higher-order, lazy, functional, object-oriented, reflective vector language, and then perform static and dynamic analysis to systematically rewrite that program into an equivalent program written in SIMPLER.
In particular, the investigators aim to: remove promises and name lookup; modify arithmetic code to erase the distinction between missing values (denoted NA in R) and NaN values; remove side effects in higher-order functions (such as arguments to map and apply); restrict redefinition of functions in general, and of built-in functions in particular; restrict reflective operations and replace them with equivalent but less powerful operations; bind methods and functions statically and remove higher-order functions through inlining; and remove implicit type conversions, which make it harder to infer types. The transformed programs can either be returned to the developer in the form of patches, which can be applied to the original source if appropriate, or be used internally by the R runtime to optimize code or to annotate results with desugared origin-traces.
Overall Merit and ONR Mission/Relevance: This research project strives to achieve simple and lean software. It focuses specifically on programs written in dynamic programming languages, with R as the initial target. If successful, the project will enable developers to write their applications in R, which provides a rich environment for improved productivity, while enhancing the efficiency and security of the resulting code by transforming it into a SIMPLER implementation using the tools developed in this project. Once proven effective for the R programming language, the techniques can be applied or ported to other dynamic programming languages such as JavaScript, Python, and the dynamic parts of Java. Dynamic programming languages are widely used in Navy applications, especially in web and cloud environments. Improving the efficiency and security of these applications will enhance the trustworthiness of the Navy cyber environment and will be an important contributor to the success of future Navy missions.
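As a rough illustration of two of the transformations listed in the approach (removing a higher-order function through inlining, and making an implicit type conversion explicit), the sketch below shows a before-and-after pair. It is written in Python, one of the languages the abstract names as a further target, rather than in R. The function names are hypothetical, and nothing here is taken from the SIMPLER tooling itself; it only shows the general shape such a rewrite might take.

```python
# Hypothetical illustration only: these names do not come from the SIMPLER project.

def count_positive_before(values):
    """High-level form: a higher-order `map` plus an implicit bool-to-int conversion."""
    # `v > 0` yields a bool; sum() silently treats True as 1 and False as 0.
    return sum(map(lambda v: v > 0, values))


def count_positive_after(values):
    """Desugared form: the lambda is inlined into an explicit loop, and the
    bool-to-int conversion is replaced by an explicit integer increment."""
    count = 0
    for v in values:
        if v > 0:
            count = count + 1
    return count


if __name__ == "__main__":
    sample = [3, -1, 0, 7]
    # Both forms are intended to be observably equivalent on the same input.
    print(count_positive_before(sample), count_positive_after(sample))
```

The second form has no function-valued arguments and no hidden conversions, which is the kind of property that should make types and control flow easier for a static analysis to infer.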
Document Details
- Document Type: DoD Grant Award
- Publication Date: Aug 12, 2016
- Source ID: N000141512332
Entities
People
- Jan Vitek
Organizations
- Northeastern University
- Office of Naval Research
- United States Navy