Modelling Speaker Variability and Imposing Speaker Constraints in Phonetic Classification
Abstract
This thesis analyzes intraspeaker correlation among speech sounds and explores how this correlation might be used in speech recognition. Current approaches to phonetic classification, whether they use context-dependent or context-independent models, classify each sound according to locally optimal criteria. They do not take into account the fact that the same vocal tract produces all the phonemes in an utterance. Thus, for example, a system may classify one sound at the beginning of an utterance as an /s/ belonging to a long vocal tract while inappropriately classifying another sound in the same utterance as an /Sigma/ belonging to a short vocal tract. Clearly, the different phonemes of an utterance are correlated. Hence there is a set of speaker-specific constraints that can be imposed on all sounds in an utterance, and phonetic decoding should exploit these constraints.
Document Details
- Document Type: Technical Report
- Publication Date: Feb 01, 1992
- Accession Number: ADA256801
Entities
People
- Partha Niyogi
Organizations
- Massachusetts Institute of Technology