A Survey of Temporal Techniques Applied Toward Neural Network Based Continuous Speech Recognition
Abstract
Neural network (NN) architectures for the recognition of continuous speech are reviewed in this report. Historically, NNs were developed for the recognition of static patterns. Using such networks for speech recognition required that the speech be segmented into chunks, such as words or phonemes, that could be recognized individually as static patterns. In real speech, the execution of a word or a phoneme depends to some extent on the words or phonemes that precede it. These coarticulation effects cause problems unless prior history is used to aid the recognition process. New architectures are being developed that treat the speech stream as the continuous stream that it is. Segmenting still occurs, which is legitimate, since humans do identify individual words, syllables and phonemes, but the segmentation may be intrinsic to the recognition process. Alternatively, the segmentation may be done by a front-end process that preserves coarticulation effects. Hierarchic structures that recognize events of increasing temporal scale seem to provide the most promising path toward effective recognition of continuous speech.
Document Details
- Document Type: Technical Report
- Publication Date: Jul 01, 1992
- Accession Number: ADA392725
Entities
People
- Chris D. Love
Organizations
- Defence Research and Development Canada