Multi-Lingual Deep Neural Networks for Language Recognition

Abstract

Multi-lingual feature extraction using bottleneck layers in deep neural networks (BN-DNNs) has proven effective for low-resource speech recognition and, more recently, for language recognition. In this work, we investigate how the multi-lingual BN-DNN architecture and training configuration affect language recognition performance on the NIST 2011 and 2015 language recognition evaluations (LRE11 and LRE15). The best-performing multi-lingual BN-DNN configuration yields relative performance gains of 50% on LRE11 and 40% on LRE15 over a standard MFCC/SDC baseline system, and of 17% on LRE11 and 7% on LRE15 over a single-language BN-DNN system. Detailed performance analysis using data from all 24 Babel languages, Fisher Spanish, and Switchboard English shows the impact of language selection and the amount of training data on overall BN-DNN performance.

Document Details

Document Type
Technical Report
Publication Date
Aug 08, 2016
Accession Number
AD1032978

Entities

People

  • Frederick S. Richardson
  • Luis M. Marcos

Organizations

  • MIT Lincoln Laboratory

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence Computing
  • Artificial Intelligence Software
  • Automated Speech Recognition
  • Computer Languages
  • Dimensionality Reduction
  • French Language
  • Information Science
  • Language
  • Natural Language Processing
  • Neural Networks
  • Order Statistics
  • Recognition
  • Standards
  • Switchboards
  • Test And Evaluation
  • Training

Fields of Study

  • Computer science

Readers

  • Educational Psychology
  • Neural Network Machine Learning
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks