Iterated Class-Specific Subspaces for Speaker-Dependent Phoneme Classification

Abstract

Features based on the MEL cepstrum have long dominated probabilistic methods in automatic speech recognition (ASR). This feature set has evolved to maximize general ASR performance within a Bayesian classifier framework that uses a common feature space. With the advent of the PDF projection theorem (PPT) and the class-specific method (CSM), however, it is now possible to design features separately for each phoneme and still compare log-likelihood values fairly across the different feature sets. In this paper, class-dependent features are found by optimizing, individually for each class, a set of frequency-band functions used to project the spectral vectors, analogous to the MEL frequency-band functions. Using this method, we show significant improvement over standard MEL cepstrum methods in speaker- and phoneme-specific recognition.
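To illustrate how the PPT makes likelihoods from different feature sets comparable, the sketch below evaluates the projected PDF in its usual form, log p(x|H0) - log p(z|H0) + log p(z|class), where z is the class's own feature vector and H0 is a reference hypothesis. Everything concrete here is an assumption made for illustration and is not taken from the report: the reference hypothesis is white Gaussian noise, the features are linear (z = A.T @ x), the class feature PDFs are Gaussian, and the names `gauss_logpdf`, `projected_loglik`, and `classify` are hypothetical. The report's optimized frequency-band functions are not reproduced.

```python
import numpy as np

def gauss_logpdf(v, mean, cov):
    """Log-density of a multivariate Gaussian N(mean, cov) evaluated at v."""
    d = v - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = d @ np.linalg.solve(cov, d)
    return -0.5 * (v.size * np.log(2 * np.pi) + logdet + quad)

def projected_loglik(x, A, feat_mean, feat_cov):
    """Log of the projected PDF of raw data x for one class.

    Illustrative assumptions: H0 is x ~ N(0, I), the class feature is
    linear (z = A.T @ x, A of full column rank), and the class feature
    PDF is Gaussian with the given mean and covariance.
    """
    n = x.size
    z = A.T @ x
    log_px_h0 = gauss_logpdf(x, np.zeros(n), np.eye(n))
    # Under H0, z = A.T @ x is Gaussian with covariance A.T @ A.
    log_pz_h0 = gauss_logpdf(z, np.zeros(z.size), A.T @ A)
    log_pz_class = gauss_logpdf(z, feat_mean, feat_cov)
    return log_px_h0 - log_pz_h0 + log_pz_class

def classify(x, models):
    """Return the index of the class with the largest projected log-likelihood.

    Each entry of models is a tuple (A, feat_mean, feat_cov); the feature
    dimension may differ from class to class.
    """
    scores = [projected_loglik(x, A, m, c) for (A, m, c) in models]
    return int(np.argmax(scores))

# Two toy classes with feature spaces of different dimension.
eye4 = np.eye(4)
models = [
    (eye4[:, :2], np.array([3.0, 3.0]), np.eye(2)),  # class 0: 2-D feature
    (eye4[:, 3:], np.array([3.0]), np.eye(1)),       # class 1: 1-D feature
]
x = np.array([3.0, 3.0, 0.0, 0.0])
label = classify(x, models)
```

Because each score is a log-density on the same raw data space, the two classes can be compared directly even though their features have different dimensions. When A is the identity, the correction terms cancel and the score reduces to the ordinary class log-likelihood.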

Document Details

  • Document Type: Technical Report
  • Publication Date: Jan 01, 2008
  • Accession Number: ADA494622

Entities

People

  • Paul Baggenstoss

Organizations

  • Naval Undersea Warfare Center

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Automatic
  • Classification
  • Computing-Related Activities
  • Data Science
  • Data Sets
  • Dimensionality Reduction
  • Equations
  • Factor Analysis
  • Frequency
  • Frequency Bands
  • Information Science
  • Machine Learning
  • Signal Processing
  • Training
  • Undersea Warfare

Readers

  • Mathematical Modeling and Probability Theory
  • Speech Processing/Speech Recognition
  • Statistical Inference

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • Space
  • Space - Space Objects