Sketching Methods for High Dimensional Data Analysis

Abstract

The PI aims to develop new algorithms, with rigorously provable guarantees, to cope with large datasets. Sample applications include network traffic monitoring, trend detection, processing sensor network data, and large-scale machine learning. Such massive data is often streamed and must be processed in real time using memory sublinear in the data size. Sublinear-memory algorithms may fit in fast CPU cache and so keep up with high-throughput streams, or sublinear space may be required by a memory-constrained device. Furthermore, as data becomes more massive, data analysts naturally expect to extract more, and higher-precision, knowledge, and thus consider more features (i.e., dimensions) of the data. A natural consequence of the increased abundance of massive datasets is therefore an abundance of high-dimensional datasets. This is typical in machine learning, where many features of the data are considered when making predictions or training classifiers. For example, consumer recommendation systems treat customers as high-dimensional vectors with many features derived from their product ratings and other behavior, then train classifiers on those vectors. A key component of the PI's proposed research is thus the development of algorithms for the efficient processing of datasets that are not only massive but also high-dimensional. Applications span many domains of interest to the Office of Naval Research, including reducing required storage, more efficient signal acquisition (compressed sensing), and speeding up algorithms that compute approximate answers (by running them on lower-dimensional data). More specifically, the PI will develop new methods for fundamental problems in the areas of dimensionality reduction, streaming, and compressed sensing.

Document Details

Document Type
DoD Grant Award
Publication Date
Nov 26, 2019
Source ID
N000142012006

Entities

People

  • Jelani Nelson

Organizations

  • Office of Naval Research
  • United States Navy
  • University of California Regents

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Space