Speech Analysis/Synthesis Based on Perception.
Abstract
This dissertation describes the development of a speech system based on a combination of physiological and psychoacoustic results. The system contains a nonuniform Filter/Detector bank. A new relationship between Filter/Detectors and the Short-Time Fourier Transform magnitude is derived, and a generalized version of the Short-Time Fourier Transform magnitude is used to implement the analysis system. The new relationship is also applied to a discussion of channel vocoders, spectrograms, the sliding Discrete Fourier Transform, average power spectrum estimation, and nonuniform bandwidth analysis. Next, a new synthesis approach is used to reconstruct signals from the magnitude data produced by the nonuniform analysis. Apart from an overall sign factor, the analysis/synthesis system achieves exact reconstruction in the absence of data modification. The ability of the system to reconstruct signals from modified data is also demonstrated. Suggestions for further research, including data reduction and automatic speech recognition applications, are given. Keywords include: Auditory modeling, Short-time Fourier transform, Magnitude-only reconstruction, Power spectrum estimation, Perception, Filter banks, Speech recognition, Spectrograms, and Vocoders.
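As a rough illustration of the kind of link the abstract refers to, the sketch below checks the standard textbook identity that a bandpass filter followed by a magnitude detector gives the same value as the Short-Time Fourier Transform magnitude at that frequency. It is not the dissertation's derivation or its generalized, nonuniform version; the function names, the Hann window, and the test signal are all assumptions made for this example. It also shows why a magnitude-only representation cannot distinguish a signal from its negation, which is consistent with the overall sign factor mentioned above.

```python
import cmath
import math


def stft_magnitude(x, window, omega, n):
    """Magnitude of sum_m x[m] * w[n - m] * exp(-j * omega * m) at time n."""
    total = 0j
    for k, w in enumerate(window):
        m = n - k  # sample of x seen through window tap k
        if 0 <= m < len(x):
            total += x[m] * w * cmath.exp(-1j * omega * m)
    return abs(total)


def filter_detector(x, window, omega, n):
    """Bandpass filter h[k] = w[k] * exp(j * omega * k), then a magnitude detector."""
    total = 0j
    for k, w in enumerate(window):
        m = n - k
        if 0 <= m < len(x):
            total += x[m] * w * cmath.exp(1j * omega * k)
    return abs(total)


# Hypothetical test signal and analysis window (assumptions for illustration).
signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(64)]
hann = [0.5 - 0.5 * math.cos(2 * math.pi * k / 15) for k in range(16)]
omega = 0.3  # analysis frequency in radians per sample

# Largest disagreement between the two views over all time indices.
max_diff = max(
    abs(stft_magnitude(signal, hann, omega, n) - filter_detector(signal, hann, omega, n))
    for n in range(len(signal))
)
print("max |STFT magnitude - filter/detector output|:", max_diff)

# Negating the signal leaves every magnitude unchanged (sign ambiguity).
negated = [-v for v in signal]
sign_diff = max(
    abs(stft_magnitude(signal, hann, omega, n) - stft_magnitude(negated, hann, omega, n))
    for n in range(len(signal))
)
print("max magnitude change after negating the signal:", sign_diff)
```

The two functions differ only by the unit-magnitude factor exp(j * omega * n), which the detector removes, so their outputs should agree to within floating-point rounding.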
Document Details
- Document Type: Technical Report
- Publication Date: Nov 05, 1984
- Accession Number: ADA151320
Entities
People
- J. C. Anderson
Organizations
- Massachusetts Institute of Technology