Meeting Recorder Project: Dialog Act Labeling Guide

Abstract

This labeling guide is adapted from work on the Switchboard recordings and the accompanying manual (Jurafsky et al. 1997). The Switchboard-DAMSL (SWBD-DAMSL) manual for labeling one-on-one phone conversations provided a useful starting point for the types of dialog acts (DAs) that arose in the ICSI meeting corpus. However, the tagset presented here has been modified as necessary to better reflect the types of interaction we observed in multiparty face-to-face meetings. This guide consists of five major sections: Quick Reference Information, Segmentation, How to Label, Adjacency Pairs, and Tag Descriptions. The first section, Quick Reference Information, defines terms used throughout this guide and maps the Meeting Recorder DA (MRDA) tagset, the tagset detailed in this guide, to the SWBD-DAMSL tagset. It also presents the entire MRDA tagset, organized into groups according to syntactic, semantic, pragmatic, and functional similarities of the utterances the tags mark. The second section, Segmentation, details the rules and guidelines governing what constitutes an utterance and how to determine utterance boundaries. The third section, How to Label, provides instruction on label construction, the handling of utterances that require additional DAs or contain quotes, and the use of the annotation software. The fourth section, Adjacency Pairs, details how adjacency pairs are constructed and the rules governing their usage. The fifth section, Tag Descriptions, explains each tag in the MRDA tagset. The guide also includes two appendices: the first provides a labeled portion of a meeting, and the second contains information on tags used for a select number of meetings.

Document Details

Document Type
Technical Report
Publication Date
Feb 09, 2004
Accession Number
ADA607947

Entities

People

  • Elizabeth Shriberg
  • Hannah Carvey
  • Rajdip Dhillon
  • Sonali Bhagat

Organizations

  • International Computer Science Institute

Tags

Communities of Interest

  • C4I

DTIC Thesaurus Topics

  • Artificial Intelligence Software
  • Audio Files
  • Automated Speech Recognition
  • Bayesian Networks
  • Boundaries
  • Cognitive Science
  • Computer Languages
  • Computer Science
  • Energy Levels
  • Machine Learning
  • Materials
  • Nets
  • Neural Networks
  • Probability
  • Recognition
  • Recording Systems
  • Supervised Machine Learning

Readers

  • Computational Linguistics
  • Library and Information Science
  • Speech Processing/Speech Recognition