THE BERKELEY DATA ANALYSIS SYSTEM (BDAS): AN OPEN SOURCE PLATFORM FOR BIG DATA ANALYTICS

Abstract

The goal of this proposal was to deliver a modular open-source software stack that can support a new generation of large-scale analytic tools that provide answers over arbitrarily large datasets. This work was carried out by Berkeley's AMPLab, a research lab consisting of eleven faculty members and over 40 students. In addition to this grant, AMPLab (which ended in December 2016) was supported by industry affiliates and an NSF Expeditions grant. This grant was instrumental in improving our software stack, the Berkeley Data Analytic System (BDAS), so that it can serve as a platform for the broader community. In particular, this grant enabled us to implement significant portions of the codebases, integrate BDAS with commonly used tools, and make BDAS much easier to manage. In addition, it allowed us to extend the functionality of BDAS in several key areas, including streaming and query processing. Thanks to xData, BDAS has enjoyed great success in both academia and industry. Today, Apache Spark is used in production by thousands of companies and counts over 400K meetup members worldwide, while Apache Mesos and Alluxio (formerly known as Tachyon) are used by hundreds of companies around the world.

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 2017
Accession Number: AD1039023

Entities

People

  • Anthony Joseph
  • Armando Fox
  • David Patterson
  • Ion Stoica
  • Michael Franklin
  • Michael I. Jordan
  • Michael Mahoney
  • Randy Katz
  • Scott Shenker

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Autonomy
  • Engineered Resilient Systems
  • Space

DTIC Thesaurus Topics

  • Air Force
  • Big Data
  • Computational Science
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Data Analysis
  • Data Centers
  • Data Management
  • Data Mining
  • Data Science
  • Databases
  • Information Science
  • Network Science
  • Open Source Software
  • Social Networking Services

Readers

  • Academic Conference Management
  • Aerospace Test and Evaluation
  • Distributed Systems and Data Platform Development