Exploring the Power of Heterogeneous Information Sources

Abstract

The big data challenge is a unique opportunity for both data mining and database research and engineering. A vast ocean of data is collected from trillions of connected devices in real time on a daily basis, and useful knowledge is usually buried in data of multiple genres, from different sources, in different formats, and with different types of representation. Many interesting patterns cannot be extracted from a single data collection, but have to be discovered through the integrative analysis of all available heterogeneous data sources. Although many algorithms have been developed to analyze multiple information sources, real applications continuously pose new challenges: data can be gigantic, noisy, unreliable, dynamically evolving, highly imbalanced, and heterogeneous. Meanwhile, users provide limited feedback, have growing privacy concerns, and ask for actionable knowledge. In this thesis, we propose to explore the power of multiple heterogeneous information sources in such challenging learning scenarios. There are two interesting perspectives in learning from the correlations among multiple information sources: exploring their similarities (consensus combination) or exploring their differences (inconsistency detection).
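The two perspectives named at the end of the abstract can be illustrated with a toy example. The sketch below is not the method developed in the thesis; it assumes the simplest possible setting, in which each source assigns one label to each object. A majority vote stands in for consensus combination, and the fraction of sources that disagree with the majority stands in for an inconsistency score. The function name, source names, and labels are all hypothetical.

```python
from collections import Counter


def consensus_and_disagreement(predictions):
    """Combine per-source labels by majority vote and score disagreement.

    predictions: dict mapping a source name to a list of labels,
    one label per object, with all lists the same length.
    Returns (consensus_labels, disagreement_scores).
    """
    sources = list(predictions)
    n_objects = len(predictions[sources[0]])
    consensus, disagreement = [], []
    for i in range(n_objects):
        # Count how many sources voted for each label on object i.
        votes = Counter(predictions[s][i] for s in sources)
        label, count = votes.most_common(1)[0]
        consensus.append(label)
        # Fraction of sources that did not agree with the majority label.
        disagreement.append(1 - count / len(sources))
    return consensus, disagreement


# Hypothetical toy data: three sources labeling four objects.
predictions = {
    "source_a": ["spam", "ham", "ham", "spam"],
    "source_b": ["spam", "ham", "spam", "spam"],
    "source_c": ["spam", "ham", "ham", "ham"],
}
consensus, disagreement = consensus_and_disagreement(predictions)
```

Objects with a disagreement score of zero are those on which all sources agree; objects with a higher score are candidates for closer inspection under the inconsistency-detection view.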

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2011
Accession Number: ADA553613

Entities

People

  • Jing Gao

Organizations

  • University of Illinois Urbana–Champaign

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Cyber
  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence Software
  • Automata Theory
  • Bayesian Networks
  • Change Detection
  • Climate Change
  • Computational Science
  • Computer Languages
  • Computers
  • Data Mining
  • Detectors
  • Information Science
  • Machine Learning
  • Network Science
  • Probabilistic Models
  • Social Media
  • Supervised Machine Learning
  • Two Dimensional

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Database Systems and Applications
  • Systems Analysis and Design

Technology Areas

  • AI & ML