Improving Speech Recognition for Children Using Acoustic Adaptation and Pronunciation Modeling
Abstract
Developing a robust Automatic Speech Recognition (ASR) system for children is a challenging task because of the increased variability in acoustic and linguistic correlates as a function of young age. The acoustic variability is mainly due to the developmental changes associated with vocal tract growth. On the linguistic side, the variability is associated with limited knowledge of vocabulary, pronunciations, and other linguistic constructs. This paper presents a preliminary study towards better acoustic modeling, pronunciation modeling, and front-end processing for children's speech. Results are presented as a function of age. Speaker adaptation significantly reduces mismatch and variability, improving recognition results across age groups. In addition, the introduction of pronunciation modeling shows promising performance improvements.
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2014
- Accession Number: AD1171103
Entities
People
- Alexandros Potamianos
- Prashanth G. Shivakumar
- Shrikanth Narayanan
- Sungbok Lee
Organizations
- University of Southern California