Anaphoric Annotation in the ARRAU Corpus

Abstract

ARRAU is a new corpus annotated for anaphoric relations. It records information about agreement and explicitly represents both multiple antecedents for ambiguous anaphoric expressions and discourse antecedents for expressions that refer to abstract entities such as events, actions, and plans. The corpus contains texts from different genres: task-oriented dialogues from the Trains-91 and Trains-93 corpora, narratives from the English Pear Stories corpus, newspaper articles from the Wall Street Journal portion of the Penn Treebank, and mixed text from the Gnome corpus.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2008
Accession Number: AD1157983

Entities

People

  • Massimo Poesio
  • Ron Artstein

Organizations

  • University of Southern California

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Agreements
  • Ambiguity
  • Computational Linguistics
  • Computational Modeling
  • Computer Science
  • Language
  • Linguistics
  • Military Research
  • Natural Language Processing
  • Natural Languages
  • New York
  • Newspapers
  • Teamwork
  • United States
  • United States Government
  • Universities
  • Workshops

Readers

  • Computational Linguistics