Cepstral Domain Talker Stress Compensation for Robust Speech Recognition

Abstract

Automatic speech recognition algorithms generally rely on the assumption that for the distance measure used, intraword variabilities are smaller than interword variabilities so that appropriate separation in the measurement space is possible. As evidenced by degradation of recognition performance, the validity of such an assumption decreases from simple tasks to complex tasks, from cooperative talkers to casual talkers, and from laboratory talking environments to practical talking environments. This report presents a study of talker-stress-induced intraword variability, and an algorithm that compensates for the systematic changes observed. The study is based on Hidden Markov Models trained by speech tokens spoken in various talking styles. The talking styles include normal speech, fast speech, loud speech, soft speech, and talking with noise injected through earphones; the styles are designed to simulate speech produced under real stressful conditions. Cepstral coefficients are used as the parameters in the Hidden Markov Models. The stress compensation algorithm compensates for the variations in the cepstral coefficients in a hypothesis-driven manner. The functional form of the compensation is shown to correspond to the equalization of spectral tilts. Preliminary experiments indicate that a substantial reduction in recognition error rate can be achieved with relatively little increase in computation and storage requirements.

Document Details

Document Type: Technical Report
Publication Date: Nov 10, 1986
Accession Number: ADA176068

Entities

People

Yunhui Chen

Organizations

Massachusetts Institute of Technology
