Sketching methods for high-dimensional data analysis

Abstract

The PI aims to develop new algorithms, with rigorously provable guarantees, to cope with large datasets. Sample applications include network traffic monitoring, trend detection, processing sensor network data, and large-scale machine learning. Often such massive data is streamed and must be processed in real time using memory sublinear in the data size. Sublinear-memory algorithms may fit in fast CPU cache, allowing them to keep up with high-throughput streams, or sublinear space may be required by a memory-constrained device.

Furthermore, as data becomes more and more massive, data analysts naturally expect to extract more, and higher-precision, knowledge, and thus consider more features (i.e., dimensions) of the data. A natural consequence of the increased abundance of massive datasets is therefore an abundance of high-dimensional datasets. This is typical in machine learning, where many features of the data are considered when making predictions or training classifiers. For example, consumer recommendation systems treat customers as high-dimensional vectors with many features derived from their product ratings and other behavior, then train classifiers. Thus a key component of the PI's proposed research is the development of algorithms for the efficient processing of datasets that are not only massive but also high-dimensional. Applications span many domains of interest to the Office of Naval Research, including reducing required storage, more efficient signal acquisition (compressed sensing), and speeding up algorithms that compute approximate answers (by running them on lower-dimensional data).

More specifically, the PI will develop new methods for fundamental problems in the areas of dimensionality reduction, streaming, and compressed sensing.
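
As an illustration of the sublinear-memory streaming setting described in the abstract, the following is a minimal sketch of a Count-Min sketch, a standard technique from the streaming literature. It is not taken from the award and is not claimed to be the PI's method. The class name, the parameters (width, depth, seed), the restriction to non-negative integer items, and the synthetic stream are all assumptions made for this example.

```python
import random
from collections import Counter


class CountMinSketch:
    """Approximate per-item counts using depth * width counters."""

    PRIME = (1 << 61) - 1  # modulus for the illustrative hash family

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each row gets its own hash h(x) = ((a*x + b) mod PRIME) mod width.
        self.hashes = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                       for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def _bucket(self, a, b, item):
        return ((a * item + b) % self.PRIME) % self.width

    def update(self, item, count=1):
        # One counter per row is incremented for every arriving item.
        for (a, b), row in zip(self.hashes, self.rows):
            row[self._bucket(a, b, item)] += count

    def estimate(self, item):
        # Collisions only add to a counter, so the minimum over rows
        # is the tightest available upper bound on the true count.
        return min(row[self._bucket(a, b, item)]
                   for (a, b), row in zip(self.hashes, self.rows))


# Hypothetical stream: 10,000 integer IDs with a skewed distribution.
rng = random.Random(1)
stream = [int(rng.paretovariate(1.2)) for _ in range(10_000)]

sketch = CountMinSketch(width=256, depth=4)
for item in stream:
    sketch.update(item)

# Exact counts are kept here only to compare against the sketch.
exact = Counter(stream)
heavy, true_count = exact.most_common(1)[0]
print(heavy, true_count, sketch.estimate(heavy))
```

The sketch stores a fixed 4 x 256 table of counters regardless of how long the stream is or how many distinct items it contains. With non-negative updates an estimate never falls below the true count; the size of the overestimate depends on the chosen width and depth.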

Document Details

Document Type
DoD Grant Award
Publication Date
Jan 04, 2017
Source ID
N000141712127

Entities

People

  • Jelani Nelson

Organizations

  • Office of Naval Research
  • President and Fellows of Harvard College
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • Space