Entity Bases: Large-Scale Knowledgebases for Intelligence Data

Abstract

This report describes new technology for rapidly integrating information from heterogeneous sources. First, we consider the problem of integrating records about entities harvested from multiple sources. We address this problem by developing technology for building massive entity knowledgebases, which we call EntityBases. The key idea is to create a comprehensive knowledgebase for the entities of interest. Building such a knowledgebase requires linking entities with noisy, multi-valued attributes obtained from heterogeneous sources and providing a virtual repository that can be queried efficiently. This report describes how we have addressed these issues and shows how an EntityBase can be used for understanding and linking text documents. We also consider the problem of on-demand information integration and describe a novel smart copy and paste (SCP) architecture that seamlessly combines the design-time and run-time aspects of data integration.

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2009
Accession Number
ADA494978

Entities

People

  • Ching-chien Chen
  • Craig Knoblock
  • Steven Minton

Organizations

  • University of Southern California

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Computer Languages
  • Computer Programming
  • Data Integration
  • Data Mining
  • Databases
  • Information Systems
  • Machine Learning
  • Network Science
  • Storage
  • Supervised Machine Learning
  • User Interface
  • Web Browsers
  • Websites
  • Word Processors

Fields of Study

  • Computer science
  • Engineering

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development
  • Geospatial Intelligence and Artificial Intelligence Analytics