De-bloating Managed Runtimes for Scalable Data-Intensive Systems

Abstract

Statement of Work: The proposed research will exploit programming language, compiler, and systems support to improve the performance and scalability of memory-managed programs in highly parallel applications. For this purpose, the PI will investigate the following: (i) a new execution model that can statically bound the number of data objects in the heap; (ii) compiler and runtime system support that can automatically enforce the object-bounded property by statically transforming and dynamically optimizing existing programs; (iii) a novel language, ScaleJ, that uses dataflow semantics to support memory- and thread-oblivious development of data processing functions; and (iv) an auto-tuning framework for ScaleJ that can safely and adaptively adjust the degree of parallelism by considering a variety of parameters, including data partitioning, runtime bloat, memory availability, GC effort, and parallelism. This work aims to substantially improve the performance of software applications written in memory-managed languages such as Java, C#, and Scala.

Objective: The PI proposes to explore an alternative direction: optimizing the managed runtime itself to improve the performance and scalability of data processing. Typically, distributed applications add more machines and processors as the workload increases, but the gains diminish because of software inefficiencies. This research instead proposes to improve performance on each individual machine.

Approach: The proposed research involves rearchitecting how data is handled by massively parallel applications written in object-oriented languages. Typically, heap space is used for both data storage and data manipulation, but this practice causes problems when computation is parallelized across multiple CPUs, and especially when it is distributed across distinct systems. Results from several motivating experiments presented in the proposal show that memory inefficiencies and runtime bloat are widespread: one task crashed with an Out-of-Memory error (OME) while processing a 60GB graph, even though only 380MB of data was partitioned to each machine out of 10GB of available heap space. One of the primary approaches proposed to deal with this problem is to statically bound the number of heap objects so that heap consumption grows only with the number of threads, not with the size of the data.

Overall Merit and ONR Mission/Relevance: The proposed research will help solve the widespread and recurring problem of runtime bloat, which is exacerbated by inefficient execution and memory usage in software developed with higher-level languages. As software development has evolved, layers of abstraction have been continually added, resulting in increased complexity and inefficiency. This work aims to reverse that trend by ensuring that data is handled more deliberately in managed languages. The research integrates well with an important thrust within ONR's cyber research program: performing late-stage software transformation to de-bloat and de-layer exceedingly complex modern software. The proposed approach is critical to making data management for type-safe interpreted languages more efficient without having to rewrite source code or redevelop any applications.
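The object-bounding idea described under Approach can be illustrated with a small Java sketch. It is not taken from the proposal and does not represent ScaleJ or the PI's actual execution model; the class `ObjectBoundedDemo`, the `EdgeFacade` view, and the fixed record layout are all hypothetical names and assumptions chosen for illustration. Records are packed into one flat buffer, and a single reusable facade object reads them, so the number of record-shaped heap objects stays constant regardless of how many records are processed. In a multi-threaded setting one would allocate one such facade per worker thread.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: records live in one flat buffer and are read through a reusable facade. */
public class ObjectBoundedDemo {

    /** Fixed-width record layout (an assumption for this sketch): int id followed by double weight. */
    static final int RECORD_BYTES = Integer.BYTES + Double.BYTES;

    /** A reusable view over one record; it holds only an offset, never a copy of the data. */
    static final class EdgeFacade {
        private final ByteBuffer store;
        private int offset;

        EdgeFacade(ByteBuffer store) {
            this.store = store;
        }

        /** Re-targets this facade at another record instead of allocating a new object. */
        EdgeFacade pointAt(int recordIndex) {
            this.offset = recordIndex * RECORD_BYTES;
            return this;
        }

        int id() {
            return store.getInt(offset);
        }

        double weight() {
            return store.getDouble(offset + Integer.BYTES);
        }
    }

    /** Packs all records into a single buffer rather than one heap object per record. */
    static ByteBuffer pack(int[] ids, double[] weights) {
        ByteBuffer store = ByteBuffer.allocate(ids.length * RECORD_BYTES);
        for (int i = 0; i < ids.length; i++) {
            store.putInt(ids[i]).putDouble(weights[i]);
        }
        return store;
    }

    /** Sums the weights using exactly one facade, independent of the record count. */
    static double sumWeights(ByteBuffer store, int count) {
        EdgeFacade facade = new EdgeFacade(store);
        double total = 0.0;
        for (int i = 0; i < count; i++) {
            total += facade.pointAt(i).weight();
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer store = pack(new int[] {1, 2, 3}, new double[] {0.5, 1.5, 2.0});
        System.out.println(sumWeights(store, 3));
    }
}
```

The design choice here is that the garbage collector sees one buffer and one facade rather than one object per record, which is the sense in which heap object count is decoupled from data size in this sketch.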

Document Details

Document Type
DoD Grant Award
Publication Date
Sep 30, 2016
Source ID
N000141612913

Entities

People

  • Guoqing Xu

Organizations

  • Naval Information Warfare Center Pacific
  • Office of Naval Research
  • United States Navy

Tags

Fields of Study

  • Computer science
  • Engineering

Readers

  • Distributed Systems and Data Platform Development
  • Parallel and Distributed Computing

Technology Areas

  • Cyber
  • Space