A Study in Machine Learning from Imbalanced Data for Sentence Boundary Detection in Speech
Abstract
Enriching speech recognition output with sentence boundaries improves its human readability and enables further processing by downstream language processing modules. We have constructed a hidden Markov model (HMM) system to detect sentence boundaries that uses both prosodic and textual information.
Document Details
- Document Type: Technical Report
- Publication Date: Oct 01, 2006
- Accession Number: AD1002399
Entities
People
- Andreas Stolcke
- Elizabeth Shriberg
- Mary P. Harper
- Nitesh Chawla
- Yang Liu
Organizations
- SRI International