Stochastic Modeling as a Means of Automatic Speech Recognition
Abstract
Automatic recognition of continuous speech involves estimation of a sequence X(1), X(2), X(3), ..., X(T) which is not directly observed (such as the words of a spoken utterance), based on a sequence Y(1), Y(2), Y(3), ..., Y(T) of related observations (such as the sequence of acoustic parameter values) and a variety of sources of knowledge. Formally, the author wishes to find the sequence x(1:T) which maximizes the a posteriori probability Pr(X(1:T) = x(1:T) | Y(1:T) = y(1:T), A, L, P, S), where A, L, P, and S represent the acoustic-phonetic, lexical, phonological, and syntactic-semantic knowledge. A speech recognition system must attempt to approximate a solution to this problem, whether or not the system uses a formal stochastic model. The DRAGON speech recognition system models the knowledge sources as probabilistic functions of Markov processes. The assumption of the Markov property allows the use of an optimal search strategy. A simplified implementation of the DRAGON system has been developed using knowledge A and L, and some of the knowledge from S.
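To illustrate why the Markov property makes an optimal search tractable, the sketch below finds the hidden sequence x(1:T) that maximizes the posterior probability for a small hidden Markov model, using Viterbi-style dynamic programming. The abstract does not say which search procedure DRAGON uses, so this is an assumed, standard technique and not the report's method. The function name `viterbi`, the two states, the two observation symbols, and all probabilities are hypothetical values chosen for the example.

```python
import math

def viterbi(observations, states, log_init, log_trans, log_emit):
    """Return (best_path, best_log_prob) for a first-order hidden Markov model.

    All model parameters are log-probabilities stored in nested dicts.
    This is an illustrative sketch, not the DRAGON implementation.
    """
    if not observations:
        return [], 0.0
    # best[t][s]: log-probability of the best state path ending in s at time t
    best = [{s: log_init[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for obs in observations[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            # Markov property: only the previous state matters, so one
            # maximization per state replaces a search over whole histories.
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            scores[s] = prev[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: two hidden states, two observation symbols.
states = ["a", "b"]
log_init = logs({"a": 0.6, "b": 0.4})
log_trans = {"a": logs({"a": 0.7, "b": 0.3}), "b": logs({"a": 0.4, "b": 0.6})}
log_emit = {"a": logs({"lo": 0.9, "hi": 0.1}), "b": logs({"lo": 0.2, "hi": 0.8})}

path, logp = viterbi(["lo", "lo", "hi"], states, log_init, log_trans, log_emit)
print(path, math.exp(logp))
```

The work grows linearly with T and quadratically with the number of states, whereas scoring every candidate sequence directly would grow exponentially with T. Maximizing the joint probability of states and observations gives the same sequence as maximizing the posterior, because the probability of the observations does not depend on x(1:T).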
Document Details
- Document Type: Technical Report
- Publication Date: Apr 01, 1975
- Accession Number: ADA013808
Entities
People
- James K. Baker
Organizations
- Carnegie Mellon University