Vocal Tract Length Normalization for Large Vocabulary Continuous Speech Recognition

Abstract

Generally speaking, the speaker dependence of a speech recognition system stems from speaker-dependent speech features. Variation in vocal tract length and/or shape is one of the major sources of inter-speaker variation. In this paper, we address several methods of vocal tract length normalization (VTLN) for large vocabulary continuous speech recognition: (1) we explore bilinear warping VTLN in the frequency domain; (2) we propose a speaker-specific Bark/Mel scale VTLN in the Bark/Mel domain; (3) we investigate adaptation of the normalization factor. Our experimental results show that the speaker-specific Bark/Mel scale VTLN performs better than piecewise/bilinear warping VTLN in the frequency domain. It reduces the word error rate by up to 12% on our Spanish and English spontaneous speech scheduling task database. For adaptation of the normalization factor, our experimental results show that promising results can be obtained by using no more than three utterances from a new speaker to estimate his/her normalization factor, and that the unsupervised adaptation mode works as well as the supervised one. Therefore, the computational complexity of VTLN can be avoided by learning the normalization factor from very few utterances of a new speaker.
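
As an illustration of the bilinear warping mentioned in the abstract, the sketch below applies the textbook all-pass bilinear transform to a normalized frequency axis. It is a minimal sketch under stated assumptions, not the report's implementation: the function names, the warping factor `alpha`, its candidate grid, and the scoring function are all hypothetical, and the report may parameterize the warp differently.

```python
import numpy as np

def bilinear_warp(omega, alpha):
    """Bilinear (all-pass) warp of normalized angular frequency omega in [0, pi].

    alpha is a hypothetical per-speaker warping factor with |alpha| < 1;
    alpha = 0 leaves the frequency axis unchanged.
    """
    omega = np.asarray(omega, dtype=float)
    return omega + 2.0 * np.arctan(
        alpha * np.sin(omega) / (1.0 - alpha * np.cos(omega))
    )

def pick_warp_factor(score, candidates):
    """Grid search: return the candidate factor with the highest score.

    `score` is a placeholder for whatever criterion is used, for example the
    acoustic likelihood of a few utterances after warping (an assumption here).
    """
    return max(candidates, key=score)

# Warp a coarse frequency grid with a small positive factor.
omega = np.linspace(0.0, np.pi, 5)
warped = bilinear_warp(omega, 0.1)
```

The warp keeps the endpoints 0 and pi fixed and is monotonic for |alpha| < 1, so it stretches or compresses the spectrum without reordering frequencies. Estimating the factor from a handful of utterances would then amount to calling `pick_warp_factor` with a score computed on those utterances only.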

Document Details

Document Type
Technical Report
Publication Date
May 01, 1997
Accession Number
ADA333514

Entities

People

  • Alex Waibel
  • Puming Zhan

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • Bandwidth
  • Computer Programs
  • Computer Science
  • Databases
  • Decoding
  • Frequency
  • Frequency Domain
  • Hidden Markov Models
  • Information Science
  • Language
  • Males
  • Markov Models
  • Natural Language Processing
  • Probability
  • Recognition
  • Test Sets

Fields of Study

  • Computer science

Readers

  • Speech Processing/Speech Recognition.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Machine Translation