Monitoring Entities in an Uncertain World: Entity Resolution and Referential Integrity

Abstract

This paper describes a system to help intelligence analysts track and analyze information being published in multiple sources, particularly open sources on the Web. The system integrates technology for Web harvesting, natural language extraction, and network analytics, and allows analysts to view and explore the results via a Web application. One of the difficult problems we address is the entity resolution problem, which occurs when there are multiple, differing ways to refer to the same entity. The problem is particularly complex when noisy data is being aggregated over time, there is no clean master list of entities, and the entities under investigation are intentionally being deceptive. Our system must not only perform entity resolution with noisy data, but must also gracefully recover when entity resolution mistakes are subsequently corrected. We present a case study in arms trafficking that illustrates the issues, and describe how they are addressed.
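As a rough illustration of the recovery requirement in the abstract, the sketch below shows one way to make entity resolution decisions reversible. It is not the report's method. The class `EntityStore`, its methods, and the sample names and facts are all hypothetical. It assumes that facts attach to individual mentions rather than to merged entities, and that entities are derived on demand from recorded "same entity" links, so that withdrawing a mistaken link needs no repair of stored facts.

```python
from collections import defaultdict


class EntityStore:
    """Toy store (hypothetical): facts attach to mentions; entities are derived from links."""

    def __init__(self):
        self.mentions = {}               # mention_id -> surface name as published
        self.links = set()               # frozenset({a, b}): "a and b are the same entity"
        self.facts = defaultdict(list)   # mention_id -> facts extracted with that mention

    def add_mention(self, mention_id, name, fact=None):
        self.mentions[mention_id] = name
        if fact is not None:
            self.facts[mention_id].append(fact)

    def link(self, a, b):
        # Record a resolution decision; self-links carry no information.
        if a != b:
            self.links.add(frozenset((a, b)))

    def unlink(self, a, b):
        # Withdraw a resolution decision that turned out to be wrong.
        self.links.discard(frozenset((a, b)))

    def entities(self):
        """Return connected components of the link graph as sorted lists of mention ids."""
        adjacency = defaultdict(set)
        for pair in self.links:
            a, b = tuple(pair)
            adjacency[a].add(b)
            adjacency[b].add(a)
        seen, clusters = set(), []
        for start in sorted(self.mentions):
            if start in seen:
                continue
            stack, cluster = [start], []
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                cluster.append(node)
                stack.extend(adjacency[node] - seen)
            clusters.append(sorted(cluster))
        return clusters

    def facts_for(self, mention_id):
        """Collect facts from every mention currently resolved to the same entity."""
        for cluster in self.entities():
            if mention_id in cluster:
                return [fact for m in cluster for fact in self.facts[m]]
        return []


# Hypothetical mentions harvested from three different sources.
store = EntityStore()
store.add_mention("m1", "A. Smith", fact="named in shipping record")
store.add_mention("m2", "Alex Smith", fact="listed as company director")
store.add_mention("m3", "Alan Smith", fact="quoted in news article")

store.link("m1", "m2")
store.link("m1", "m3")            # a resolution mistake, corrected below
merged = store.entities()

store.unlink("m1", "m3")          # the correction: only the bad link is withdrawn
corrected = store.entities()
```

Because no fact is ever rewritten to point at a merged entity, undoing a link leaves each fact attached to the mention it came from. The cost is that clusters are recomputed on every query, which a larger system would likely cache.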

Document Details

Document Type
Technical Report
Publication Date
Jul 01, 2011
Accession Number
ADA562189

Entities

People

  • Craig Knoblock
  • Greg Barish
  • Kane See
  • Matthew Michelson
  • Peter Lamonica
  • Raymond Liuzzi
  • Sofus A. Macskassy
  • Steven N. Minton

Organizations

  • Air Force Research Laboratory

Tags

Communities of Interest

  • Air Platforms
  • Autonomy
  • C4I

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Aircrafts
  • Analysts
  • Artificial Intelligence
  • Case Studies
  • Contracts
  • Data Management
  • Databases
  • Extraction
  • Intelligence Analysts
  • Knowledge Management
  • Language
  • Monitoring
  • Natural Languages
  • Social Networks
  • Web Applications

Fields of Study

  • Computer science
  • Engineering

Readers

  • Computer Vision
  • Database Systems and Applications
  • Systems Analysis and Design