What's Wrong With Automatic Speech Recognition (ASR) and How Can We Fix It?

Abstract

This seedling effort, sponsored by IARPA, investigated the sources of speech recognition errors associated with noisy or unusual acoustic conditions. Research was conducted through two separate mechanisms: an in-depth study of the sources of error in the acoustic model, using a novel sampling process to quantify the effects that the two major Hidden Markov Model (HMM) assumptions have on recognition accuracy; and a broader study of problems in speech recognition, relying on a survey of area experts and the relevant literature. The in-depth study demonstrates that a lack of robustness to mismatched training/test conditions is a significant source of error, and that the sensitivity to such mismatches in the acoustic representations is a prominent source of errors. The results also show that in the case of matched conditions, one of the incorrect assumptions inherent to the standard statistical models is the dominant source of errors. A survey of automatic speech recognition (ASR) researchers and of the ASR literature provides a further sense of the community's perspective on the topic. The report concludes with some speculations on fruitful directions for future research. The authors also suggest some extensions of this line of inquiry to other prediction and classification problems beyond speech recognition.

Document Details

Document Type: Technical Report
Publication Date: Mar 01, 2013
Accession Number: ADA590075

Entities

People

Jordan Cohen
Nelson Morgan
Steven Wegmann

Organizations

International Computer Science Institute
