Information Integration Seedling for Data Integration and Exploitation System that Learns (DIESEL)
Abstract
The goal of the University of Washington effort under DIESEL is to develop a unified approach to entity, schema and concept matching. Entity resolution is the problem of determining which mentions in the data correspond to the same object (e.g., "J. Smith" and "Jane Smith" may be the same person). Schema matching is the problem of determining which fields in a database or other structure correspond to the same attributes (e.g., "Contact" and "Telephone" may be the same attribute). Concept matching (a.k.a. ontology alignment) is the problem of determining which concepts in two taxonomies correspond to each other (e.g., "Faculty" in one taxonomy may mean the same as "Staff" in another). To date, each of these problems has been addressed separately, assuming that the other two have been solved a priori (e.g., schema matching may be performed assuming that objects and concepts have already been resolved). In most cases, however, all three problems arise together, so a truly robust and widely applicable information integration system needs to solve them jointly.
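The interdependence described in the abstract can be illustrated with a toy sketch. It is not the DIESEL method, which this record does not describe. The function names, the shared "name" field, the sample records, and the initial-matching heuristic are all assumptions made for illustration. Candidate entity matches vote on which fields correspond, and the resulting field correspondence is then used to confirm entity matches.

```python
from collections import Counter


def name_compatible(a: str, b: str) -> bool:
    """Hypothetical heuristic: same last token, first tokens agree up to an initial."""
    ta = a.replace(".", "").lower().split()
    tb = b.replace(".", "").lower().split()
    if not ta or not tb or ta[-1] != tb[-1]:
        return False
    return ta[0].startswith(tb[0]) or tb[0].startswith(ta[0])


def align_fields(records_a, records_b) -> Counter:
    """Count value agreements between non-name fields over name-compatible record pairs."""
    votes = Counter()
    for ra in records_a:
        for rb in records_b:
            if not name_compatible(ra["name"], rb["name"]):
                continue
            for fa, va in ra.items():
                for fb, vb in rb.items():
                    if fa != "name" and fb != "name" and va == vb:
                        votes[(fa, fb)] += 1
    return votes


# Made-up sample data; the "name" field is assumed to be aligned already.
records_a = [
    {"name": "J. Smith", "Contact": "555-0100"},
    {"name": "A. Jones", "Contact": "555-0199"},
]
records_b = [
    {"name": "Jane Smith", "Telephone": "555-0100"},
    {"name": "John Smith", "Telephone": "555-0142"},
    {"name": "Alan Jones", "Telephone": "555-0199"},
]

# Schema matching informed by tentative entity matches.
votes = align_fields(records_a, records_b)
field_pair = votes.most_common(1)[0][0]

# Entity resolution informed by the inferred field correspondence.
matches = [
    (ra["name"], rb["name"])
    for ra in records_a
    for rb in records_b
    if name_compatible(ra["name"], rb["name"])
    and ra[field_pair[0]] == rb[field_pair[1]]
]
```

In this sketch the name heuristic alone cannot tell "Jane Smith" from "John Smith" as a match for "J. Smith". The ambiguity is resolved only after the "Contact" and "Telephone" fields have been paired, which is the kind of coupling the abstract argues for handling jointly.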
Document Details
- Document Type: Technical Report
- Publication Date: Apr 01, 2009
- Accession Number: ADA498388
Entities
People
- Pedro Domingos
Organizations
- SRI International