Mixture Input Transformations for Adaptation of Hybrid Connectionist Speech Recognizers

Abstract

We extend the input transformation approach for adapting hybrid connectionist speech recognizers to allow multiple transformations to be trained. Previous work has shown the efficacy of the linear input transformation approach for speaker adaptation [1][2][3], but has focused only on training global transformations. A global transformation is clearly suboptimal, since it assumes that a single transformation is appropriate for every region of the acoustic feature input space, that is, for every phonetic class, microphone, and noise level. In this paper, we propose a new algorithm to train mixtures of transformation networks (MTNs) in the hybrid connectionist recognition framework. The algorithm partitions the acoustic feature space into R regions and trains an input transformation for each region. The transformations are combined probabilistically according to the degree to which the acoustic features belong to each region, with the combination weights derived from a separate acoustic gating network (AGN). We apply the new algorithm to nonnative speaker adaptation and present recognition results for the 1994 WSJ Spoke 3 development set. The MTN technique can also be used for noise-robust or microphone-robust recognition, or for other nonspeech neural network pattern recognition problems.
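
The probabilistic combination described in the abstract can be illustrated with a minimal forward-pass sketch. It is not taken from the report: it assumes each of the R regional transformations is affine, that the gating network is a single linear layer followed by a softmax, and that all names (`mixture_transform`, `gate_W`, `gate_c`) and dimensions are hypothetical. Training of the transformations and of the gating network is not shown.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture_transform(x, A, b, gate_W, gate_c):
    """Combine R affine input transformations with gating weights.

    x:      (T, D) acoustic feature frames
    A, b:   (R, D, D) and (R, D), one assumed affine transform per region
    gate_W: (D, R), gate_c: (R,), an assumed linear-softmax stand-in
            for the acoustic gating network
    """
    weights = softmax(x @ gate_W + gate_c)            # (T, R), rows sum to 1
    per_region = np.einsum("rij,tj->tri", A, x) + b   # (T, R, D)
    y = np.einsum("tr,tri->ti", weights, per_region)  # (T, D) weighted sum
    return y, weights

# Illustrative sizes only: 5 frames, 4 features, 3 regions.
rng = np.random.default_rng(0)
T, D, R = 5, 4, 3
x = rng.normal(size=(T, D))
A = np.tile(np.eye(D), (R, 1, 1))   # identity start: features pass unchanged
b = np.zeros((R, D))
gate_W = rng.normal(size=(D, R))
gate_c = np.zeros(R)
y, weights = mixture_transform(x, A, b, gate_W, gate_c)
```

With every regional transformation initialized to the identity, the transformed features equal the input regardless of the gating weights, which is one plausible starting point for adaptation; with R = 1 the sketch reduces to a single global transformation.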

Document Details

Document Type
Technical Report
Publication Date
Sep 25, 1997
Accession Number
AD1002452

Entities

People

  • Victor Abrash

Organizations

  • SRI International

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence Computing
  • Artificial Intelligence Software
  • Classification
  • Computing System Architectures
  • Errors
  • Hidden Markov Models
  • Machine Learning
  • Markov Models
  • Microphones
  • Models
  • Neural Networks
  • Probability
  • Recognition
  • Recurrent Neural Networks
  • Training

Fields of Study

  • Computer science

Readers

  • Neural Network Machine Learning
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Space