Large-Scale Paraphrasing for Natural Language Understanding

Abstract

In this project, we researched and developed technologies to automatically extract large volumes of paraphrases to aid in natural language understanding (NLU) tasks. We developed three core algorithms to (1) generate extremely large paraphrase databases, (2) adapt paraphrase databases to new domains, and (3) augment paraphrase rules with fine-grained semantic entailment relations. Our work introduced the Paraphrase Database (PPDB), the largest paraphrase resource developed to date, which contains over 100 million paraphrases for English. We also generated paraphrase databases for 23 foreign languages.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2018
Accession Number
AD1050977

Entities

People

  • Benjamin Van Durme
  • Chris Callison-Burch

Organizations

  • Johns Hopkins University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Automated Text Summarization
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Information Systems
  • Language
  • Machine Learning
  • Natural Language Processing
  • Natural Language Understanding
  • Natural Languages
  • Ontologies
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Database Systems and Applications