Improving Automatic Sentence Boundary Detection with Confusion Networks

Abstract

We extend existing methods for automatic sentence boundary detection by leveraging multiple recognizer hypotheses to provide robustness to speech recognition errors. For each hypothesized word sequence, an HMM is used to estimate the posterior probability of a sentence boundary at each word boundary. The hypotheses are combined using confusion networks to determine the overall most likely events. Experiments show improved detection of sentences for conversational telephone speech, though results are mixed for broadcast news.
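
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation. It assumes the hypotheses have already been aligned to a common set of confusion-network slots and that each hypothesis carries a posterior weight and a per-word boundary posterior (produced elsewhere, for example by an HMM). The names `combine_boundary_posteriors` and `detect_boundaries`, the input layout, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed input layout (hypothetical, not taken from the report):
# each hypothesis is (weight, slots), where slots[i] is either
# (word, P(sentence boundary after word)) or None when the hypothesis
# has no word aligned to slot i.
Slot = Optional[Tuple[str, float]]
Hypothesis = Tuple[float, Sequence[Slot]]


def combine_boundary_posteriors(hyps: List[Hypothesis]) -> List[float]:
    """Return a weighted average of boundary posteriors for each slot."""
    if not hyps:
        return []
    n_slots = len(hyps[0][1])
    if any(len(slots) != n_slots for _, slots in hyps):
        raise ValueError("all hypotheses must be aligned to the same slots")
    total = sum(weight for weight, _ in hyps)
    if total <= 0:
        raise ValueError("hypothesis weights must sum to a positive value")

    combined = []
    for i in range(n_slots):
        score = 0.0
        for weight, slots in hyps:
            slot = slots[i]
            # A hypothesis with no word in this slot contributes nothing.
            if slot is not None:
                score += (weight / total) * slot[1]
        combined.append(score)
    return combined


def detect_boundaries(hyps: List[Hypothesis], threshold: float = 0.5) -> List[int]:
    """Return the slot indices whose combined posterior meets the threshold."""
    posteriors = combine_boundary_posteriors(hyps)
    return [i for i, p in enumerate(posteriors) if p >= threshold]


if __name__ == "__main__":
    example = [
        (0.6, [("so", 0.1), ("yeah", 0.9), ("i", 0.2)]),
        (0.4, [("so", 0.2), None, ("i", 0.4)]),
    ]
    print(combine_boundary_posteriors(example))
    print(detect_boundaries(example))
```

Weighting each hypothesis by its posterior means a boundary that is strongly predicted only in an unlikely hypothesis has limited influence on the combined decision, which is one plausible reading of how multiple hypotheses could add robustness to recognition errors.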

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2004
Accession Number
ADA460954

Entities

People

  • A. Stolcke
  • D. Hillard
  • E. Shriberg
  • M. Ostendorf
  • Y. Liu

Organizations

  • University of Washington

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • Automatic
  • Boundaries
  • Detection
  • Feature Extraction
  • Hypotheses
  • Language
  • Networks
  • Neurobehavioral Manifestations
  • Probability
  • Recognition
  • Test And Evaluation
  • Test Sets
  • Training
  • Trees
  • Waveforms
  • Word Recognition

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Speech Processing/Speech Recognition.
  • Statistical inference.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Translation