Information content of big data

Abstract

Every day, an enormous volume of data is generated at an unprecedented rate, and modern decision-making feels like “swimming in a sea of sensors and drowning in data”. One of the primary challenges in current data science research is to identify the most valuable information in a large data set. Historically, this problem was approached using tools from information theory; however, these methods no longer scale to modern, massive, multi-modal data sets. Our recent research shows that many information gathering problems satisfy submodularity, an intuitive diminishing returns condition that we can exploit to develop efficient algorithms with strong theoretical guarantees. In this proposal, we lay out a basic research program for information gathering from massive data, centered on the development of novel methods. First, we propose the study of deep submodular functions for information gathering problems. This newly discovered class of functions, reminiscent of deep neural networks, naturally handles multi-modal data and has potential for efficient learning. We also propose to develop fast algorithms for submodular optimization that scale to large data sets and handle situations where the data is dynamic, may be corrupted, or may be queried. Furthermore, we describe algorithms that balance prediction accuracy with user privacy and data ownership in submodular learning frameworks. Finally, we describe experiments to test our methods on a variety of challenging real-world applications, including information source veracity and crowd teaching.
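
As an illustration of the diminishing returns condition mentioned in the abstract, the following is a minimal sketch and not the method proposed in the award. It assumes a simple coverage objective (the number of distinct items observed by the chosen sources) and applies the classical greedy rule, which for monotone submodular objectives under a cardinality budget is known to reach at least a (1 - 1/e) fraction of the optimum. The source names, the data, and the helper names `coverage` and `greedy_select` are hypothetical.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], sources: Dict[str, Set[int]]) -> int:
    """Number of distinct items observed by the selected sources."""
    covered: Set[int] = set()
    for name in selected:
        covered |= sources[name]
    return len(covered)


def greedy_select(sources: Dict[str, Set[int]], budget: int) -> List[str]:
    """Pick up to `budget` sources, each time taking the largest marginal gain."""
    selected: List[str] = []
    remaining = set(sources)
    for _ in range(budget):
        base = coverage(selected, sources)
        best_name, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic.
        for name in sorted(remaining):
            gain = coverage(selected + [name], sources) - base
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # no remaining source adds anything new
            break
        selected.append(best_name)
        remaining.remove(best_name)
    return selected


# Hypothetical data: each source observes a set of item ids.
sources = {
    "sensor_a": {1, 2, 3, 4},
    "sensor_b": {3, 4, 5},
    "sensor_c": {5, 6},
    "sensor_d": {1, 2},
}

chosen = greedy_select(sources, budget=2)

# Diminishing returns: sensor_b is worth less once sensor_a is already chosen.
gain_b_alone = coverage(["sensor_b"], sources) - coverage([], sources)
gain_b_after_a = coverage(["sensor_a", "sensor_b"], sources) - coverage(["sensor_a"], sources)
print(chosen, gain_b_alone, gain_b_after_a)
```

In this toy instance the marginal value of `sensor_b` drops from 3 to 1 once `sensor_a` has been selected, which is the diminishing returns behavior the greedy rule exploits.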

Document Details

Document Type
DoD Grant Award
Publication Date
Apr 09, 2018
Source ID
FA95501810160

Entities

People

  • Amin Karbasi

Organizations

  • Air Force Office of Scientific Research
  • United States Air Force
  • Yale University

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Educational Psychology
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms