Expanding the Recall of Relation Extraction by Bootstrapping

Abstract

Most work on relation extraction assumes considerable human effort, either to build an annotated corpus or for knowledge engineering. Generic patterns employed in KnowItAll achieve unsupervised, high-precision extraction, but often result in low recall. This paper compares two bootstrapping methods for expanding recall, both of which start from seeds extracted automatically by KnowItAll. The first method is string pattern learning, which learns string contexts adjacent to a seed tuple. The second method learns less restrictive patterns that include bags of words and relation-specific named entity tags. Both methods improve the recall of the generic pattern method. In particular, the less restrictive pattern learning method can achieve a 250% increase in recall at 0.87 precision, compared to the generic pattern method.
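To make the first method concrete, the following is a minimal sketch of string pattern bootstrapping, not the report's implementation. It assumes that a pattern is simply the text between the two arguments of a seed tuple and that a run of capitalized words stands in for a named entity. The function names, the toy sentences, and the seed tuple are all hypothetical.

```python
import re
from collections import Counter

# Assumption: a run of capitalized words is a crude stand-in for a named entity.
ENTITY = r"([A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*)"


def learn_patterns(sentences, seeds, min_count=1):
    """Collect the string found between the two arguments of each seed tuple."""
    counts = Counter()
    for sentence in sentences:
        for arg1, arg2 in seeds:
            start = sentence.find(arg1)
            if start == -1:
                continue
            end = sentence.find(arg2, start + len(arg1))
            if end == -1:
                continue
            middle = sentence[start + len(arg1):end].strip()
            if middle:
                counts[middle] += 1
    return {pattern for pattern, count in counts.items() if count >= min_count}


def apply_patterns(sentences, patterns):
    """Extract every entity pair joined by one of the learned string patterns."""
    found = set()
    for pattern in patterns:
        regex = re.compile(ENTITY + " " + re.escape(pattern) + " " + ENTITY)
        for sentence in sentences:
            for match in regex.finditer(sentence):
                found.add((match.group(1), match.group(2)))
    return found


def bootstrap(sentences, seeds, rounds=2):
    """Alternate between learning patterns and extracting new tuples."""
    tuples = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, tuples)
        tuples |= apply_patterns(sentences, patterns)
    return tuples


# Hypothetical toy corpus and a single seed tuple.
sentences = [
    "Alice Smith is the CEO of Acme Corp .",
    "Bob Jones is the CEO of Globex .",
    "Carol White founded Initech .",
]
seeds = {("Alice Smith", "Acme Corp")}
extracted = bootstrap(sentences, seeds)
```

On this toy corpus the seed yields the pattern "is the CEO of", which in turn recovers the second tuple, while the unrelated third sentence is left alone. A real system would also need to score patterns and filter noisy ones, which this sketch omits.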

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2006
Accession Number: ADA456774

Entities

People

  • Junji Tomita
  • Oren Etzioni
  • Stephen Soderland

Organizations

  • University of Washington

Tags

Communities of Interest

  • Cyber

DTIC Thesaurus Topics

  • Acquisition
  • Algorithms
  • Base Lines
  • Computational Linguistics
  • Computer Science
  • Data Sets
  • Engineering
  • Errors
  • Executives
  • Extraction
  • Language
  • Linguistics
  • Models
  • Named Entity Recognition
  • Precision
  • Probabilistic Models
  • Training

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Systems Analysis and Design