Syntax-based Alignment of Multiple Translations: Extracting Paraphrases and Generating New Sentences

Abstract

We describe a syntax-based algorithm that automatically builds finite-state automata (FSAs, or word lattices) from sets of semantically equivalent translations. These FSAs are good representations of paraphrases. They can be used to extract lexical and syntactic paraphrase pairs and to generate new, unseen sentences that express the same meaning as the sentences in the input sets. Our FSAs can also predict the correctness of alternative semantic renderings, a capability that may be used to evaluate the quality of translations.
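
To make the word-lattice idea concrete, the following is a minimal sketch, not the report's syntax-based construction algorithm. It assumes a lattice has already been built and shows only the two uses named in the abstract: reading paraphrase pairs off parallel edges and generating sentences by enumerating paths. The lattice contents, the `LATTICE` layout, and the function names `sentences` and `paraphrase_pairs` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations

# Hypothetical hand-built lattice (an assumption, not output of the report's
# algorithm). Each node maps to a list of (phrase, next_node) edges.
# It could arise from merging two input translations:
#   "twelve people were killed" and "a dozen people died".
LATTICE = {
    0: [("twelve people", 1), ("a dozen people", 1)],
    1: [("were killed", 2), ("died", 2)],
    2: [],
}
START, END = 0, 2


def sentences(lattice, node, end):
    """Return every phrase sequence on a path from `node` to `end`."""
    if node == end:
        return [""]
    results = []
    for phrase, nxt in lattice[node]:
        for rest in sentences(lattice, nxt, end):
            results.append((phrase + " " + rest).strip())
    return results


def paraphrase_pairs(lattice):
    """Pair up phrases on parallel edges that join the same two nodes."""
    pairs = []
    for edges in lattice.values():
        by_target = {}
        for phrase, nxt in edges:
            by_target.setdefault(nxt, []).append(phrase)
        for phrases in by_target.values():
            pairs.extend(combinations(sorted(phrases), 2))
    return pairs


if __name__ == "__main__":
    for s in sentences(LATTICE, START, END):
        print(s)
    print(paraphrase_pairs(LATTICE))
```

In this toy lattice, the paths include combinations that appear in neither hypothetical input sentence, which is the sense in which a lattice can generate new sentences. How well such generated paths preserve meaning depends entirely on how the lattice was merged, which this sketch does not address.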

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2003
Accession Number: ADA459391

Entities

People

  • Bo Pang
  • Daniel Marcu
  • Kevin Knight

Organizations

  • Cornell University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Automated Text Summarization
  • Computational Science
  • Computer Languages
  • Computer Science
  • Data Mining
  • Information Retrieval
  • Information Science
  • Language
  • Linguistics
  • Machine Translation
  • Natural Language Processing
  • Statistics
  • Translations

Fields of Study

  • Computer science

Readers

  • Computational Linguistics