Redefining Analytics for Small High-Performance Computing Clusters
Abstract
Our core contributions under this grant include (a) the development of the Network-Attached-Memory database architecture, (b) the first scalable RDMA-based transaction protocol, (c) a novel RDMA-based replication protocol, (d) the concept of learned index structures, (e) the first techniques to estimate the impact of Unknown Unknowns on aggregated query results, and (f) novel UDF compilation techniques. Overall, we were able to address all research challenges (RCs) outlined in the proposal. We analyzed the performance gains of RDMA (RC I) and developed an RDMA-based storage manager (RC II); we developed modern query execution techniques for UDFs and reinvented the way indexing is done through our learned indexing approach (RC III); we extended our work on data integration in heterogeneous environments (RC IV); we studied the impact of data replication for RDMA-enabled networks (RC V); we significantly advanced the area of UDF and query compilation for complex analytics (RC VI); and we developed a novel language to describe ML pipelines (RC VII).
Document Details
- Document Type: Technical Report
- Publication Date: Jul 15, 2019
- Accession Number: AD1096530
Entities
People
- Tim Kraska
Organizations
- Brown University