Entity Resolution Workflow Installation Process and User Guide

Abstract

Entity resolution, in the context of text processing and information extraction, is the process of uniquely disambiguating a specific person or object that appears in a text. For instance, if John Smith appears in a document, entity resolution seeks to identify which John Smith the text refers to from the available choices in a database. This report describes the setup and configuration of the U.S. Army Research Laboratory's (ARL) software implementation of an entity resolution algorithm called Relationship-based Data Cleaning (RelDC), which systematically exploits not only features but also relationships among entities for the purpose of disambiguation. The main concept is that RelDC views the database as a graph of entities that are linked to each other via relationships. It first uses a feature-based method to identify a set of candidate entities (choices) for a reference to be disambiguated. Graph-theoretic techniques are then used to discover and analyze relationships that exist between the entity containing the reference and the set of candidates. To demonstrate the RelDC entity resolution algorithm in an intuitive and seamless way, ARL developed an Entity Resolution Workflow (ERW).
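
The two stages named in the abstract (feature-based candidate selection, then graph analysis of relationships) can be illustrated with a minimal Python sketch. This is not ARL's implementation and not the RelDC scoring itself: the exact-name matching, the inverse hop-count score, and every identifier and data value below (`find_candidates`, `resolve`, `ENTITIES`, `EDGES`, `max_hops`) are assumptions made for illustration only.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) relationship pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_candidates(reference, entities):
    """Feature-based step (assumed here to be case-insensitive exact name match)."""
    ref = reference.strip().lower()
    return [eid for eid, name in entities.items() if name.lower() == ref]


def shortest_path_length(graph, start, goal, max_hops):
    """Breadth-first search; return the hop count, or None if not within max_hops."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for nbr in graph.get(node, ()):
            if nbr == goal:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


def resolve(reference, context_entity, entities, graph, max_hops=3):
    """Rank candidates by a stand-in connection score: 1 / (1 + hops), 0 if unreachable."""
    scored = []
    for cand in find_candidates(reference, entities):
        hops = shortest_path_length(graph, context_entity, cand, max_hops)
        score = 0.0 if hops is None else 1.0 / (1 + hops)
        scored.append((cand, score))
    # Highest score first; ties broken by entity id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical toy database: two people named John Smith, one document, two labs.
ENTITIES = {
    "p1": "John Smith",
    "p2": "John Smith",
    "p3": "Jane Doe",
    "d1": "Report A",
    "o1": "Lab X",
    "o2": "Lab Y",
}
EDGES = [("d1", "p3"), ("p3", "o1"), ("p1", "o1"), ("p2", "o2")]
GRAPH = build_graph(EDGES)

# The reference "John Smith" occurs in document d1.
RANKED = resolve("John Smith", "d1", ENTITIES, GRAPH)
```

In this toy data, candidate `p1` is linked to the document through a co-author's affiliation, while `p2` has no path to it, so `p1` ranks first. A fuller treatment would weigh multiple paths and relationship types rather than a single shortest path.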

Document Details

Document Type
Technical Report
Publication Date
Jul 01, 2013
Accession Number
ADA586761

Entities

People

  • Michael H. Lee

Organizations

  • United States Army Research Laboratory

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Computer Program Documentation
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Data Processing
  • Databases
  • Directories
  • Graphical User Interface
  • Identification
  • Information Science
  • Military Research
  • Named Entity Recognition
  • Network Protocols
  • Operating Systems
  • Web Applications
  • Web Browsers

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Database Systems and Applications
  • Theoretical Analysis

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval