Modelling Speaker Variability and Imposing Speaker Constraints in Phonetic Classification

Abstract

This thesis analyzes intraspeaker correlations among speech sounds and explores how these correlations might be used in speech recognition. Current approaches to phonetic classification, whether they use context-dependent or context-independent models, classify each sound according to locally optimal criteria. They do not build in the fact that the same vocal tract produces every phoneme in an utterance. Thus, for example, a system may classify one sound at the beginning of an utterance as an /s/ belonging to a long vocal tract while, inappropriately, classifying another sound in the same utterance as an /ʃ/ belonging to a short vocal tract. Clearly, the different phonemes of an utterance are correlated. Hence there is a set of speaker-specific constraints that can be imposed among all the sounds in an utterance, and phonetic decoding should exploit these constraints.
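
The contrast the abstract draws between locally optimal classification and decoding under a shared speaker constraint can be illustrated with a toy sketch. This is not the method developed in the thesis: the two discrete vocal-tract classes, the phoneme labels "s" and "sh", the function names, and all scores are hypothetical values chosen only to show how a per-sound decision can yield inconsistent speaker assumptions within one utterance, and how forcing a single speaker hypothesis removes the inconsistency.

```python
# Toy illustration only. Labels, speaker classes, and scores are made up;
# none of them come from the thesis.

PHONEMES = ["s", "sh"]       # hypothetical phoneme labels
TRACTS = ["long", "short"]   # hypothetical discrete speaker classes


def decode_locally(scores):
    """Pick the best (phoneme, tract) pair for each sound independently."""
    return [max(sound, key=sound.get) for sound in scores]


def decode_with_speaker_constraint(scores):
    """Assume one tract for the whole utterance, then pick the best phoneme
    for each sound under that tract; keep the tract with the best total."""
    best_total, best_path = float("-inf"), None
    for tract in TRACTS:
        path = [
            max(((p, tract) for p in PHONEMES), key=sound.get)
            for sound in scores
        ]
        total = sum(sound[label] for sound, label in zip(scores, path))
        if total > best_total:
            best_total, best_path = total, path
    return best_path


# Made-up log-likelihoods for two sounds in one utterance.
scores = [
    {("s", "long"): -1.0, ("s", "short"): -3.0,
     ("sh", "long"): -4.0, ("sh", "short"): -2.5},
    {("s", "long"): -3.0, ("s", "short"): -3.5,
     ("sh", "long"): -2.0, ("sh", "short"): -1.8},
]

local_path = decode_locally(scores)
joint_path = decode_with_speaker_constraint(scores)
print(local_path)   # the two sounds are assigned different tracts
print(joint_path)   # both sounds share one tract
```

With these scores, the local decoder assigns the first sound to a long vocal tract and the second to a short one, while the constrained decoder keeps a single tract hypothesis across the utterance. A real system would model speaker characteristics as continuous and correlated rather than as two discrete classes.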

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 1992
Accession Number
ADA256801

Entities

People

  • Partha Niyogi

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • C4I
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Automated Speech Recognition
  • Computational Complexity
  • Computer Science
  • Computers
  • Correlation Analysis
  • Data Science
  • Dimensionality Reduction
  • Discriminant Analysis
  • Electrical Engineering
  • Information Science
  • Machine Learning
  • Pattern Recognition
  • Random Variables
  • Recognition
  • Signal Processing
  • Statistical Analysis

Readers

  • Regression Analysis
  • Speech Processing/Speech Recognition
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks