Paraphrase Acquisition for Information Extraction

Abstract

We aim to acquire paraphrases from Japanese news articles for use in Information Extraction. Our approach builds on the observation that a single event is often reported in more than one article, in different ways. Certain kinds of noun phrases, however, such as names, dates, and numbers, act as anchors that are unlikely to change across articles. Our key idea is to identify these anchors among comparable articles and extract the portions of expressions that share them. In this way we can extract expressions that convey the same information. The paraphrases obtained are generalized as templates and stored for future use.
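
The anchor idea in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: it works on English toy sentences rather than Japanese text, pairs whole sentences rather than portions of expressions, and replaces a real named-entity tagger with a crude heuristic (numbers and non-initial capitalized tokens count as anchors). The function names `find_anchors`, `to_template`, and `extract_paraphrases`, and the `min_shared` threshold, are hypothetical and chosen only for illustration.

```python
from itertools import product


def find_anchors(sentence):
    """Toy anchor detector (assumption: stands in for a named-entity tagger).

    Treats digit-only tokens and capitalized tokens that are not
    sentence-initial as anchors.
    """
    tokens = sentence.rstrip(".").split()
    anchors = set()
    for i, tok in enumerate(tokens):
        if tok.isdigit() or (i > 0 and tok[0].isupper()):
            anchors.add(tok)
    return anchors


def to_template(sentence, shared):
    """Replace each shared anchor with a numbered slot (X1, X2, ...).

    Slots are numbered by the sorted order of the shared anchors, so the
    same anchor gets the same slot in both sentences of a pair.
    """
    slots = {a: f"X{i}" for i, a in enumerate(sorted(shared), start=1)}
    tokens = sentence.rstrip(".").split()
    return " ".join(slots.get(t, t) for t in tokens)


def extract_paraphrases(article_a, article_b, min_shared=2):
    """Pair sentences from two comparable articles that share enough anchors."""
    pairs = []
    for sa, sb in product(article_a, article_b):
        shared = find_anchors(sa) & find_anchors(sb)
        if len(shared) >= min_shared:
            pairs.append((to_template(sa, shared), to_template(sb, shared)))
    return pairs


if __name__ == "__main__":
    # Two invented articles reporting the same event.
    article_a = ["Police arrested Smith in Tokyo on 12 May.", "The weather was mild."]
    article_b = ["Officers took Smith into custody in Tokyo."]
    for left, right in extract_paraphrases(article_a, article_b):
        print(left, "<=>", right)
```

Requiring at least two shared anchors is a design choice of this sketch: a single shared name is weak evidence that two sentences describe the same fact, while several shared anchors make a coincidental match less likely.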

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2003
Accession Number: ADA460236

Entities

People

  • Satoshi Sekine
  • Yusuke Shinyama

Organizations

  • New York University

Tags

Communities of Interest

  • Biomedical
  • C4I
  • Weapons Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Acquisition
  • Computational Linguistics
  • Computer Science
  • Detection
  • Extraction
  • Governments
  • Hong Kong
  • Language
  • Linguistics
  • Machine Translation
  • Natural Languages
  • New York
  • Newspapers
  • North Korea
  • Template Patterns
  • Vector Spaces

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation