Modeling and Classification of Acoustic Transients by Speech Recognition Techniques
Abstract
Techniques from automatic speech recognition are applied to the problem of modeling and classifying acoustic transients. Linear Predictive Coding (LPC), Vector Quantization (VQ), and Hidden Markov Models (HMMs) are three popular techniques which, when combined, are called the structural-parametric approach to the recognition of speech sounds. The same approach is applied first in modeling and then in identifying three classes of brief, wideband sounds similar to underwater passive sonar transients. An LPC analysis-synthesis system operating below 9000 bits per second can produce high-quality synthetic transients. The data rate necessary to maintain high quality can be further reduced to about 1100 bits per second by LPC followed by VQ, using the Itakura-Saito (IS) class of distortion measures. The high fidelity achievable at low rates is evidence that LPC is a good spectral representation and that the IS distortion measure is meaningful in the comparison of transient spectra. Classification decisions based solely on averaged VQ distortion or entropy result in a classification accuracy of over 97%. Classification decisions based on VQ followed by HMMs result in a classification performance of the HMM structures. The product code HMM consists of two independent HMMs per class; a classification decision is made by combining the results of the two independent HMMs.
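The classification rule based on averaged VQ distortion can be illustrated with a minimal sketch. It is not the report's implementation: it assumes one codebook of power spectra per class, uses the plain Itakura-Saito distortion rather than any particular member of the IS class, and the class labels, codebook entries, and function names are all hypothetical.

```python
import math

def itakura_saito(p, q):
    """Plain Itakura-Saito distortion between two power spectra
    sampled on the same frequency grid (all values must be > 0)."""
    if len(p) != len(q):
        raise ValueError("spectra must have the same length")
    total = 0.0
    for pk, qk in zip(p, q):
        ratio = pk / qk
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)

def average_vq_distortion(frames, codebook):
    """Quantize each frame to its nearest codeword and return the
    distortion averaged over all frames."""
    nearest = [min(itakura_saito(f, c) for c in codebook) for f in frames]
    return sum(nearest) / len(nearest)

def classify(frames, codebooks):
    """Pick the class whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda label: average_vq_distortion(frames, codebooks[label]))

# Toy data (hypothetical): two classes, two codewords each, three frequency bins.
codebooks = {
    "click": [[4.0, 1.0, 0.5], [3.0, 1.5, 0.5]],
    "chirp": [[0.5, 1.0, 4.0], [0.5, 2.0, 3.0]],
}
frames = [[3.8, 1.1, 0.5], [3.1, 1.4, 0.6]]
label = classify(frames, codebooks)
```

In this sketch a transient is assigned to the class whose codebook represents its frames with the least average distortion; an HMM stage, as described in the abstract, would instead score the sequence of codeword indices.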
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 1989
- Accession Number: ADA218070
Entities
People
- Jeffrey P. Woodard