Analysis of Factors Affecting System Performance in the ASpIRE Challenge

Abstract

This paper presents an analysis of factors affecting system performance in the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge. In particular, the overall word error rate (WER) of the solver systems is analyzed as a function of room, distance between talker and microphone, and microphone type. We also analyze the speech activity detection performance of the solver systems and investigate its relationship to WER. The primary goal of the paper is to provide insight into the factors affecting performance on the ASpIRE evaluation set across many systems, using annotations and metadata that are not available to the solvers. This analysis will inform the design of future challenges and provide insight into the efficacy of current solutions for noisy, reverberant speech in mismatched conditions.
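To make the kind of breakdown described in the abstract concrete, here is a minimal sketch of computing WER per condition (for example per microphone type). It is not the authors' scoring pipeline: the function names, the dictionary keys (`ref`, `hyp`, and the metadata key), and the plain whitespace tokenization are all assumptions made for illustration. WER is taken in its usual form, word-level edit distance divided by the number of reference words, pooled within each condition.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words).

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] = edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def wer_by_condition(utterances, key):
    """Pool errors and reference words per value of utterances[i][key].

    `utterances` is a hypothetical list of dicts with "ref", "hyp",
    and a metadata field named by `key` (e.g. "room" or "mic").
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for utt in utterances:
        e, n = word_errors(utt["ref"], utt["hyp"])
        errors[utt[key]] += e
        words[utt[key]] += n
    # Skip conditions with no reference words to avoid dividing by zero.
    return {c: errors[c] / words[c] for c in errors if words[c] > 0}
```

Pooling errors and word counts before dividing weights each condition by its total reference length, rather than averaging per-utterance rates; whether the report pools in this way is not stated in the abstract.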

Document Details

Document Type: Technical Report
Publication Date: Dec 13, 2015
Accession Number: AD1034768

Entities

People

  • Jennifer T. Melot
  • Jessica M. Ray
  • Nicolas Malyska
  • Wade Shen

Organizations

  • MIT Lincoln Laboratory

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Accuracy
  • Attenuation
  • Audio Files
  • Automated Speech Recognition
  • Calibration
  • Coefficients
  • Detection
  • Directional
  • False Alarms
  • Measurement
  • Microphones
  • Omnidirectional
  • Orientation (Direction)
  • Power Measurement
  • Test And Evaluation
  • United States Government
  • Warning Systems

Fields of Study

  • Computer science

Readers

  • Computational Fluid Dynamics (CFD)
  • Speech Processing/Speech Recognition
  • Systems Analysis and Design

Technology Areas

  • AI & ML