What's Wrong With Automatic Speech Recognition (ASR) and How Can We Fix It?

Abstract

This seedling effort, sponsored by IARPA, investigated the sources of speech recognition errors associated with noisy or unusual acoustic conditions. Research was conducted through two separate mechanisms: an in-depth study of the sources of error in the acoustic model, using a novel sampling process to quantify the effects that the two major Hidden Markov Model (HMM) assumptions have on recognition accuracy; and a broader study of problems in speech recognition, relying on surveys of area experts and the relevant literature. The in-depth study demonstrates that a lack of robustness to mismatched training and test conditions is a significant source of error, and that the sensitivity of the acoustic representations to such mismatches is a prominent source of these errors. The results also show that, under matched conditions, one of the incorrect assumptions inherent in the standard statistical models is the dominant source of errors. A survey of automatic speech recognition (ASR) researchers and of the ASR literature provides a further sense of the community's perspective on the topic. The report concludes with some speculations on fruitful directions for future research. The authors also suggest some extensions of this line of inquiry to other prediction and classification problems beyond speech recognition.

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2013
Accession Number
ADA590075

Entities

People

  • Jordan Cohen
  • Nelson Morgan
  • Steven Wegmann

Organizations

  • International Computer Science Institute

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Cyber

DTIC Thesaurus Topics

  • Air Force
  • Artificial Intelligence
  • Automated Speech Recognition
  • Computational Science
  • Computer Science
  • Computers
  • Hidden Markov Models
  • Information Science
  • Literature Surveys
  • Markov Models
  • Natural Language Processing
  • Network Science
  • Neural Networks
  • Probabilistic Models
  • Signal Processing
  • Standards
  • Training

Readers

  • Speech Processing/Speech Recognition
  • Theoretical Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference