Improving Speech Recognition for Children Using Acoustic Adaptation and Pronunciation Modeling

Abstract

Developing a robust Automatic Speech Recognition (ASR) system for children is a challenging task because of increased variability in acoustic and linguistic correlates as a function of young age. The acoustic variability is mainly due to the developmental changes associated with vocal tract growth. On the linguistic side, the variability is associated with limited knowledge of vocabulary, pronunciations, and other linguistic constructs. This paper presents a preliminary study towards better acoustic modeling, pronunciation modeling, and front-end processing for children's speech. Results are presented as a function of age. Speaker adaptation significantly reduces mismatch and variability, improving recognition results across age groups. In addition, the introduction of pronunciation modeling shows promising performance improvements.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2014
Accession Number
AD1171103

Entities

People

  • Alexandros Potamianos
  • Prashanth G. Shivakumar
  • Shrikanth Narayanan
  • Sungbok Lee

Organizations

  • University of Southern California

Tags

DTIC Thesaurus Topics

  • Adaptive Training
  • Age Distribution
  • Age Groups
  • Automated Speech Recognition
  • Computers
  • Databases
  • Decoding
  • Dictionaries
  • Errors
  • Frequency
  • Hidden Markov Models
  • Language
  • Markov Models
  • Models
  • Signal Processing
  • Standards
  • Training
  • Vocabulary

Readers

  • Acoustical Oceanography
  • Child and Adolescent Substance Abuse Science in Autism Spectrum Disorders
  • Computational Linguistics

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation