Continuous Speech Recognition Using Segmental Neural Nets
Abstract
We present the concept of a "Segmental Neural Net" (SNN) for phonetic modeling in continuous speech recognition. The SNN takes as input all the frames of a phonetic segment and gives as output an estimate of the probability of each of the phonemes, given the input segment. By taking into account all the frames of a phonetic segment simultaneously, the SNN overcomes the well-known conditional-independence limitation of hidden Markov models (HMM). However, the problem of automatic segmentation with neural nets is a formidable computing task compared to HMMs. Therefore, to take advantage of the training and decoding speed of HMMs, we have developed a novel hybrid SNN/HMM system that combines the advantages of both types of approaches. In this hybrid system, use is made of the N-best paradigm to generate likely phonetic segmentations, which are then scored by the SNN. The HMM and SNN scores are then combined to optimize performance. In this manner, the recognition accuracy is guaranteed to be no worse than the HMM system alone.
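The N-best rescoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the HMM and SNN scores are combined, so the log-linear combination with a single weight used here is an assumption, as are the names `Hypothesis`, `combined_score`, and `rescore` and all of the numeric scores, which are invented for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One entry of an N-best list (all fields are illustrative)."""
    words: str            # hypothesized word string
    hmm_log_score: float  # log score assigned by the HMM
    snn_log_score: float  # summed log SNN scores over the phonetic segments


def combined_score(h: Hypothesis, snn_weight: float) -> float:
    # Assumed combination rule: HMM log score plus a weighted SNN log score.
    return h.hmm_log_score + snn_weight * h.snn_log_score


def rescore(nbest: List[Hypothesis], snn_weight: float) -> Hypothesis:
    # Return the hypothesis with the highest combined score.
    return max(nbest, key=lambda h: combined_score(h, snn_weight))


# Made-up N-best list; the scores are not taken from the report.
nbest = [
    Hypothesis("show all ships", -120.0, -14.0),
    Hypothesis("show old ships", -119.5, -19.0),
    Hypothesis("show all chips", -123.0, -15.5),
]

hmm_only = rescore(nbest, snn_weight=0.0)  # weight 0 reproduces the HMM ranking
hybrid = rescore(nbest, snn_weight=0.5)    # SNN evidence can reorder the list

print(hmm_only.words)
print(hybrid.words)
```

Under this assumed rule, a weight of zero recovers the HMM-only choice, so a weight tuned on the data it is evaluated on can do no worse than the HMM alone. That is one plausible reading of the guarantee stated in the abstract, not a description of the authors' actual procedure.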
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 1991
- Accession Number: ADA460342
Entities
People
- G. Zavaliagkos
- J. Makhoul
- Robert E. Schwartz
- S. Austin
Organizations
- BBN Technologies