Multilingual Data Selection for Low Resource Speech Recognition

Abstract

Feature representations extracted from deep neural network-based multilingual frontends provide significant improvements to speech recognition systems in low resource settings. To effectively train these frontends, we introduce a data selection technique that discovers language groups from an available set of training languages. This data selection method reduces the required amount of training data and training time by approximately 40%, with minimal performance degradation. We present speech recognition results on 7 very limited language pack (VLLP) languages from the second option period of the IARPA Babel program using multilingual features trained on up to 10 languages. The proposed multilingual features provide up to 15% relative improvement over baseline acoustic features on the VLLP languages.

Document Details

Document Type
Technical Report
Publication Date
Sep 12, 2016
Accession Number
AD1038283

Entities

People

  • Bhuvana Ramabhadran
  • Brian Kingsbury
  • Jia Cui
  • Kartik Audhkhasi
  • Samuel Thomas

Organizations

  • IBM Thomas J. Watson Research Center

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Automated Speech Recognition
  • Data Sets
  • Department Of Defense
  • Discriminant Analysis
  • Identification
  • Index Terms
  • Information Science
  • Language
  • Military Research
  • Neural Networks
  • Neurobehavioral Manifestations
  • Probability
  • Recognition
  • Training

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Joint Military Operations and Doctrine
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks