A Perception and Nonlinear PDE Based Approach to Processing Spoken Words
Abstract
From June 1999 to June 2000, supported by ARO grant DAAD 19-99-1-0248, we developed a novel nonlinear transformation to process spoken words in noisy environments, based on human hearing perception and the properties of a focusing partial differential equation (PDE). The transformation was applied to the short-term Fourier spectra of speech signals. It was designed to reduce noise through time adaptation and to enhance spectral peaks (formants) by evolving a focusing quadratic Cahn-Hilliard equation. Time adaptation and peak focusing (a.k.a. lateral inhibition) are essential processing mechanisms in the human cochlea. Numerical results on noisy spoken words indicated that the transformed spectral pattern of the spoken words was insensitive to noise (signal-to-noise ratio (SNR) ranging from 0 to 20 dB). The spectral distances between noisy and original words decreased after the transformation. A numerical experiment on eleven spoken words at SNR = 5 dB, for example, reached a recognition rate as high as 100%. These very encouraging results showed the success of our nonlinear transformation and the need for its further development within our framework. In this final report, we state the problem studied, summarize the main results, and point out future directions.
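The evaluation setting named in the abstract (noise added at a given SNR, and distances between short-term Fourier spectra of noisy and original signals) can be illustrated with a minimal sketch. This is not the report's transformation and does not reproduce its results: the white Gaussian noise model, the synthetic two-tone test signal, the frame and hop sizes, the log-magnitude Euclidean distance, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise, scaled so the mixture has the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise = rng.standard_normal(signal.shape)
    noise_power = np.mean(noise ** 2)
    # Choose the noise power so that 10*log10(signal_power / noise_power) == snr_db.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + noise * np.sqrt(target_noise_power / noise_power)

def short_term_spectra(signal, frame_len=256, hop=128):
    """Magnitude spectra of Hann-windowed overlapping frames (one row per frame)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_distance(spec_a, spec_b):
    """Mean per-frame Euclidean distance between log-magnitude spectra (assumed metric)."""
    eps = 1e-10  # avoids log(0) on empty bins
    diff = np.log(spec_a + eps) - np.log(spec_b + eps)
    return float(np.mean(np.linalg.norm(diff, axis=1)))

# Synthetic stand-in for a spoken word: two tones sampled at 8 kHz (assumption).
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)

noisy = add_noise_at_snr(clean, 5.0, rng)
measured_snr = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
dist = spectral_distance(short_term_spectra(clean), short_term_spectra(noisy))
print(f"measured SNR: {measured_snr:.2f} dB, spectral distance: {dist:.3f}")
```

In a setup like this, a noise-robust transformation would be applied to both sets of spectra before calling `spectral_distance`, and the distance after the transformation would be compared with the distance before it.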
Document Details
- Document Type: Technical Report
- Publication Date: Nov 14, 2000
- Accession Number: ADA385830
Entities
People
- Jack Xin
- Yingyong Qi
Organizations
- University of Arizona