Redefining Analytics for Small High-Performance Computing Clusters

Abstract

Our core contributions under this grant include (a) the development of the Network-Attached-Memory database architecture, (b) the first scalable RDMA-based transaction protocol, (c) a novel RDMA-based replication protocol, (d) the concept of learned index structures, (e) the first techniques to estimate the impact of unknown unknowns on aggregated query results, and (f) novel UDF compilation techniques. Overall, we addressed all of the research challenges (RCs) outlined in the proposal. We analyzed the RDMA performance gains (RC I) and developed an RDMA-based storage manager (RC II); we developed modern query execution techniques for UDFs and reinvented the way indexing is done through our learned indexing approach (RC III); we extended our work on data integration in heterogeneous environments (RC IV); we studied the impact of data replication for RDMA-enabled networks (RC V); we significantly advanced the area of UDF and query compilation for complex analytics (RC VI); and we developed a novel language to describe ML pipelines (RC VII).

Document Details

Document Type
Technical Report
Publication Date
Jul 15, 2019
Accession Number
AD1096530

Entities

People

  • Tim Kraska

Organizations

  • Brown University

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Computer Science
  • Data Analysis
  • Data Integration
  • Data Management
  • Data Science
  • Data Sets
  • Engineering
  • Graphics Processing Unit
  • Information Science
  • Machine Learning
  • Mobile Phones
  • Network Science
  • Neural Networks
  • Social Media
  • Trees (Data Structures)
  • Uninterruptible Power Supplies

Fields of Study

  • Computer science
  • Engineering

Readers

  • Aerial Delivery - Logistics and Supply Chain Management
  • Distributed Systems and Data Platform Development
  • Microwave Engineering