Toward Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings

Abstract

The authors present baseline results for the joint segmentation and classification of dialog acts (DAs) of the International Computer Science Institute (ICSI) Meeting Corpus. Two simple approaches based on word information are investigated and compared with previous work on the same task. The first approach is based on a Hidden-Event Language Model (HE-LM), and the second relies on a Hidden Markov Model (HMM) based tagger. The HE-LM is frequently used for detection of sentence boundaries, where after each word the model predicts either a nonboundary or a sentence boundary event. In contrast, the authors use the HE-LM to predict not only a DA boundary or a nonboundary event, but also the type of the DA boundary at the same time. The second technique relies on the concept of disambiguation of words, which is widely used in the form of HMM-based Part-of-Speech (POS) taggers. The authors also describe several metrics to assess the quality of the segmentation alone and the joint performance of segmentation and classification: NIST-SU, Lenient, Strict, DA Error Rate (DER), and DSER (DA Segmented Error Rate). As the investigated methods do not take into account prosodic features, it comes as no surprise that the overall performance of these systems is not always as good as that of previous work. Based on the experiments, the authors suggest that the lenient metric should not be used alone but in combination with other metrics that take into account the quality of the segmentation as well. The results provided in this paper serve as a baseline against which the authors will measure the results of future work on joint segmentation and classification.
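
The joint labeling idea in the abstract (one event after each word, either a nonboundary or a DA boundary that also carries the DA type) can be illustrated with a small Python sketch. This is not the authors' implementation: the function names, the use of None as the nonboundary marker, the "unlabeled" placeholder, and the DA type labels are all hypothetical. The boundary error function is only an assumed approximation in the spirit of a boundary-based metric such as NIST-SU (missed plus falsely inserted boundaries, divided by the number of reference boundaries); the report itself should be consulted for the exact metric definitions.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: None means "no DA boundary after this word";
# any string means "a DA ends here, and this is its type".
Event = Optional[str]


def events_to_segments(words: List[str], events: List[Event]) -> List[Tuple[str, List[str]]]:
    """Group words into (da_type, words) segments from per-word events."""
    if len(words) != len(events):
        raise ValueError("words and events must have the same length")
    segments: List[Tuple[str, List[str]]] = []
    current: List[str] = []
    for word, event in zip(words, events):
        current.append(word)
        if event is not None:
            # A typed boundary closes the current segment.
            segments.append((event, current))
            current = []
    if current:
        # Trailing words without a closing boundary (assumed placeholder label).
        segments.append(("unlabeled", current))
    return segments


def boundary_error_rate(ref: List[Event], hyp: List[Event]) -> float:
    """Assumed boundary-only error: (misses + false alarms) / reference boundaries.

    DA types are ignored here; only the presence of a boundary counts.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same length")
    n_ref = sum(r is not None for r in ref)
    if n_ref == 0:
        raise ValueError("reference contains no boundaries")
    misses = sum(r is not None and h is None for r, h in zip(ref, hyp))
    false_alarms = sum(r is None and h is not None for r, h in zip(ref, hyp))
    return (misses + false_alarms) / n_ref


# Toy example with made-up words and DA type labels.
words = ["yeah", "so", "what", "is", "next"]
ref_events = ["backchannel", None, None, None, "question"]
hyp_events = ["backchannel", None, "statement", None, "question"]

hyp_segments = events_to_segments(words, hyp_events)
error = boundary_error_rate(ref_events, hyp_events)
```

In this toy case the hypothesis inserts one extra boundary after "what", so the segmentation differs from the reference even though the two reference boundaries are found with the correct types. This is the kind of segmentation error that a metric looking only at word-level DA types could understate.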

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2005
Accession Number
ADA444859

Entities

People

  • Andreas Stolcke
  • Elizabeth Shriberg
  • Matthias Zimmermann
  • Yang Liu

Organizations

  • International Computer Science Institute

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Applied Computer Science
  • Artificial Intelligence
  • Boundaries
  • Classification
  • Computational Processes
  • Computer Science
  • Computer Vision
  • Errors
  • False Alarms
  • Hidden Markov Models
  • Markov Models
  • Models
  • Natural Language Processing
  • Probability
  • Segmented
  • Sequences

Readers

  • Computational Modeling and Simulation
  • Speech Processing/Speech Recognition