Information Integration Seedling for Data Integration and Exploitation System that Learns (DIESEL)

Abstract

The goal of the University of Washington effort under DIESEL is to develop a unified approach to entity, schema, and concept matching. Entity matching (entity resolution) is the problem of determining which mentions in the data correspond to the same object (e.g., "J. Smith" and "Jane Smith" may be the same person). Schema matching is the problem of determining which fields in a database or other structure correspond to the same attributes (e.g., "Contact" and "Telephone" may be the same attribute). Concept matching (a.k.a. ontology alignment) is the problem of determining which concepts in two taxonomies correspond to each other (e.g., "Faculty" in one taxonomy may mean the same as "Staff" in another). To date, each of these problems has been addressed separately, assuming that the other two have been solved a priori (e.g., schema matching may be performed assuming that objects and concepts have already been resolved). In most cases, however, all three problems are present at once, and a truly robust and widely applicable information integration system therefore needs to solve them jointly.
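
To make the dependency between these problems concrete, the following is a minimal sketch of the conventional sequential pipeline that the abstract contrasts with a unified approach. It is not the method of the report: the two toy sources, the helper names (`similarity`, `match_schema`, `resolve_entities`), the 0.7 threshold, and the use of plain string similarity from Python's `difflib` are all assumptions made for illustration. Fields are aligned first by comparing their values, and records are then paired using that alignment, so any error in the schema step carries straight into entity resolution.

```python
from difflib import SequenceMatcher

# Hypothetical toy sources; field names and values are invented for illustration.
source_a = [
    {"Name": "Jane Smith", "Telephone": "555-0101"},
    {"Name": "Robert Jones", "Telephone": "555-0199"},
]
source_b = [
    {"FullName": "J. Smith", "Contact": "555-0101"},
    {"FullName": "Maria Lopez", "Contact": "555-0142"},
]

def similarity(x, y):
    """String similarity in [0, 1]; a stand-in for a learned matcher."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()

def match_schema(rows_a, rows_b):
    """Align each field of A with the field of B whose values look most similar."""
    mapping = {}
    for field_a in rows_a[0]:
        def best_value_match(field_b):
            return max(similarity(ra[field_a], rb[field_b])
                       for ra in rows_a for rb in rows_b)
        mapping[field_a] = max(rows_b[0], key=best_value_match)
    return mapping

def resolve_entities(rows_a, rows_b, mapping, threshold=0.7):
    """Pair records whose mean similarity over aligned fields reaches the threshold."""
    pairs = []
    for ra in rows_a:
        for rb in rows_b:
            scores = [similarity(ra[fa], rb[fb]) for fa, fb in mapping.items()]
            if sum(scores) / len(scores) >= threshold:
                pairs.append((ra["Name"], rb["FullName"]))
    return pairs

# Step 1 assumes nothing about entities; step 2 assumes step 1 is already correct.
mapping = match_schema(source_a, source_b)
matches = resolve_entities(source_a, source_b, mapping)
print(mapping)
print(matches)
```

In this sketch the schema step only succeeds because some values happen to be shared across sources, which is itself an implicit entity-level signal. That circularity is one way to see why the abstract argues for solving the problems together instead of in a fixed order.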

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2009
Accession Number
ADA498388

Entities

People

  • Pedro Domingos

Organizations

  • SRI International

Tags

Communities of Interest

  • Autonomy
  • C4I

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Computational Science
  • Computer Languages
  • Computer Science
  • Data Integration
  • Databases
  • Government Procurement
  • Governments
  • Information Exchange
  • Information Science
  • Machine Learning
  • Natural Language Processing
  • Ontologies
  • Unsupervised Machine Learning

Readers

  • Artificial Intelligence
  • Distributed Systems and Data Platform Development
  • Educational Psychology