Apache Spark Cluster for Fast and Scalable Machine Learning with Big Data

Abstract

This proposal requests funds from the DURIP program to acquire an Apache Spark cluster of compute nodes to support fast and scalable machine learning research with big data. The PI has been collaborating with the U.S. Naval Research Laboratory (NRL), Marine Meteorology Division, on the development of efficient weather radar data compression techniques to tackle the challenge of transmitting a large amount of radar data from ships to shore over communication links with severely limited bandwidth. The research has resulted in the development of the UFZIP radar data compression software, which has been successfully transferred to ship operations. However, it is anticipated that a significantly increased amount of data will be generated by NRL's recent move to the "COAMPS-OS Ship-Following Infosphere", which is a 4DVAR radar data assimilation framework incorporating all available conventional and forward-deployed observations. Therefore, near real-time big data compression and analysis techniques will be highly relevant. To this end, we propose to employ deep machine learning techniques to achieve an in-depth analysis of data, which holds the promise of allowing for highly efficient reduction and compression of large-scale data. Deep machine learning is also expected to bring significant improvement in sea clutter suppression based on radar echo classification, an important data-quality control problem on which the PI has collaborated with the NRL team. The acquired cluster provides a scalable solution to the high-performance storage and compute challenges faced by deep machine learning on a single compute node, in terms of significantly reduced time for training multiple layers of learning models with massive datasets. The cluster will foster new capabilities in pursuing research programs related to data science and technology of interest to ONR and DoD.

Document Details

Document Type
DoD Grant Award
Publication Date
Sep 29, 2017
Source ID
N000141712911

Entities

People

  • David Z Pan

Organizations

  • Office of Naval Research
  • United States Navy
  • University of Alabama in Huntsville

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning
  • Research Science/Academic Research

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks