Speaker-Machine Interaction in Automatic Speech Recognition
Abstract
This study examines the feasibility and limitations of speaker adaptation as a means of improving the performance of a fixed (speaker-independent) automatic speech recognition system. The system recognizes a fixed vocabulary of 55 syllables containing eleven stops and fricatives and five tense vowels. An experiment on speaker adaptation, performed with six male and six female adult speakers, shows that speakers can learn to change their articulations to improve recognition scores. The recognition scheme is based on the extraction of several acoustic features from the speech signal. Extraction is accomplished by a hierarchy of decisions made on carefully selected parameters that are computed from a spectral description of the speech signal by means of a set of energoids (energy centroids). Each energoid represents the center of energy concentration in a particular spectral energy band. Short-time spectra are obtained either from a bank of 36 bandpass filters covering the range 150-7025 Hz or by directly computing the fast Fourier transform of portions of the sampled speech signal. (Author)
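As an illustration of the energoid idea described in the abstract, the following is a minimal sketch, not the report's implementation. It assumes an energoid can be approximated as the energy-weighted mean frequency of a short-time power spectrum within one band. The function names, the test signal, the 16 kHz sampling rate, and the band edges are all hypothetical; the abstract does not specify the report's bands or exact formula.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Naive DFT power spectrum; returns (frequencies_hz, energies) up to Nyquist."""
    n = len(samples)
    freqs, energies = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        freqs.append(k * sample_rate / n)
        energies.append(abs(coeff) ** 2)
    return freqs, energies


def energoid(freqs, energies, band_lo, band_hi):
    """Energy-weighted mean frequency (Hz) in [band_lo, band_hi), or None if no energy."""
    in_band = [(f, e) for f, e in zip(freqs, energies) if band_lo <= f < band_hi]
    total = sum(e for _, e in in_band)
    if total == 0:
        return None
    return sum(f * e for f, e in in_band) / total


# Hypothetical test frame: sinusoids at 500 Hz and 2000 Hz, sampled at 16 kHz.
SAMPLE_RATE = 16000
FRAME = [
    math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2000 * t / SAMPLE_RATE)
    for t in range(320)
]
FREQS, ENERGIES = power_spectrum(FRAME, SAMPLE_RATE)

# Hypothetical band edges in Hz (assumed for illustration only).
BANDS = [(150, 1000), (1000, 3000)]
ENERGOIDS = [energoid(FREQS, ENERGIES, lo, hi) for lo, hi in BANDS]
```

With this synthetic frame, each band contains a single spectral peak, so the two energoids fall close to 500 Hz and 2000 Hz respectively. On real speech, a band's energoid would instead track where the energy in that band is concentrated, for example near a formant.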
Document Details
- Document Type: Technical Report
- Publication Date: Dec 15, 1970
- Accession Number: AD0718255
Entities
People
- John I. Makhoul
Organizations
- Massachusetts Institute of Technology