Efficient and Coherent Data Selection and Summarization

Abstract

Today we are capturing far more data than our computational, processing and storage resources can handle. Data selection and reduction techniques can make large data substantially more efficient to browse and disseminate, and have found numerous applications, including summarization, feature and model selection, clustering, product recommendation, network routing and sensor placement, among others. Sequential and spatial data, including video, speech, text, biomedical signals, health records, smart grid data and sensor measurements, form a large and important part of modern datasets and require effective data selection and summarization techniques. Such datasets often contain important structural relationships among items, imposed by underlying models, spatial/temporal constraints or side information, which must play a vital role in the selection of representative data. At the same time, humans perform remarkably well at summarizing video, speech and text, which motivates learning structured data selection and summarization from ground-truth data summaries. This project develops a unified mathematical framework for unsupervised and supervised structured data selection and summarization by incorporating structural relationships among data and taking advantage of the high-level reasoning and cognitive capabilities of humans. The proposed research brings together tools from sparse and low-rank recovery, convex and submodular optimization, active learning and dynamical systems, as well as representation and metric learning, to tackle these problems. More specifically, we propose novel objective functions that model and incorporate structural relationships among data to select a set of high-quality, diverse and compatible representatives, and we develop efficient optimization methods for maximizing these objective functions.
In addition to the unsupervised setting, we propose a supervised learning framework for structured data selection and summarization that integrates bottom-up correlations with high-level reasoning to learn to summarize data with minimal supervision. We will study theoretical performance guarantees for the proposed algorithms and apply the proposed techniques to several important problems in computer vision.
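As a rough illustration of what selecting representatives by maximizing an objective function can look like, the sketch below uses the standard facility-location objective with greedy submodular maximization. It is not the objective or algorithm proposed in this project, and it ignores the structural relationships the abstract emphasizes. The function name `greedy_facility_location`, the toy one-dimensional `points` and the exponential similarity are all assumptions made for the example.

```python
import numpy as np


def greedy_facility_location(similarity, k):
    """Greedily pick k representatives that approximately maximize
    sum_i max_{j in S} similarity[i, j] (facility-location objective).

    Assumes `similarity` is a square array of non-negative values.
    """
    n = similarity.shape[0]
    selected = []
    chosen = np.zeros(n, dtype=bool)
    # best[i] = similarity of item i to its best representative so far
    best = np.zeros(n)
    for _ in range(k):
        # Marginal gain of candidate j: how much it improves each item's coverage
        gains = np.maximum(similarity - best[:, None], 0.0).sum(axis=0)
        gains[chosen] = -np.inf  # never pick the same item twice
        j = int(np.argmax(gains))
        selected.append(j)
        chosen[j] = True
        best = np.maximum(best, similarity[:, j])
    return selected


# Hypothetical toy data: two well-separated groups of points on a line.
points = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
similarity = np.exp(-np.abs(points[:, None] - points[None, :]))
reps = greedy_facility_location(similarity, k=2)
print("selected representatives:", reps)
```

On this toy input the two selected indices fall in different groups, since a second representative from an already covered group adds almost nothing to the objective. The greedy rule is a common choice here because the facility-location objective is monotone submodular, for which greedy selection has a known constant-factor approximation guarantee.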

Document Details

Document Type
DoD Grant Award
Publication Date
Feb 14, 2019
Source ID
W911NF1810300

Entities

People

  • Ehsan Elhamifar

Organizations

  • Army Contracting Command
  • Northeastern University
  • United States Army

Tags

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning
  • Operations Research

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • Biotechnology