Speaker-Machine Interaction in Automatic Speech Recognition.

Abstract

This study examines the feasibility and limitations of speaker adaptation as a means of improving the performance of a fixed (speaker-independent) automatic speech recognition system. The recognition system uses a fixed vocabulary of 55 syllables containing eleven stops and fricatives and five tense vowels. An experiment on speaker adaptation, performed with 6 male and 6 female adult speakers, shows that speakers can learn to change their articulations to improve recognition scores. The recognition scheme is based on the extraction of several acoustic features from the speech signal. Extraction proceeds through a hierarchy of decisions made on carefully selected parameters. These parameters are computed from a spectral description of the speech signal by means of a set of energoids (energy centroids), each energoid representing the center of energy concentration in a particular spectral energy band. Short-time spectra were obtained either from a bank of 36 bandpass filters covering the range 150-7025 Hz or by directly computing the fast Fourier transform of portions of the sampled speech signal. (Author)
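
The energoid idea in the abstract can be illustrated with a minimal sketch. It assumes that an energoid is the energy-weighted mean frequency of the short-time power spectrum within one band; the abstract only says "center of energy concentration", so the report's exact definition may differ. The band edges in BANDS, the 16 kHz sample rate in the usage note, and all function names are hypothetical choices made for illustration and are not taken from the report. A naive discrete Fourier transform stands in for the fast Fourier transform to keep the example dependency-free.

```python
import cmath
import math


def power_spectrum(frame, sample_rate):
    """Return (frequencies, power) for the non-negative bins of a naive DFT."""
    n = len(frame)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        power.append(abs(acc) ** 2)
    return freqs, power


def energoid(freqs, power, f_lo, f_hi):
    """Energy-weighted mean frequency and total energy in the band [f_lo, f_hi).

    Assumed definition: centroid = sum(f * P(f)) / sum(P(f)) over the band.
    Returns (None, 0.0) when the band holds no energy at all.
    """
    total = 0.0
    weighted = 0.0
    for f, p in zip(freqs, power):
        if f_lo <= f < f_hi:
            total += p
            weighted += f * p
    if total == 0.0:
        return None, 0.0
    return weighted / total, total


# Hypothetical band edges in Hz. Only the overall 150-7025 Hz range comes
# from the abstract; the split into three bands is an assumption.
BANDS = [(150.0, 1000.0), (1000.0, 3000.0), (3000.0, 7025.0)]


def energoids(frame, sample_rate, bands=BANDS):
    """Compute one (centroid, energy) pair per band for a single frame."""
    freqs, power = power_spectrum(frame, sample_rate)
    return [energoid(freqs, power, lo, hi) for lo, hi in bands]
```

Calling energoids(frame, 16000) on a short frame of samples returns one (centroid, energy) pair per band. A decision hierarchy like the one the abstract describes could then compare those values against thresholds, but the specific parameters and decisions used in the report are not given here.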

Document Details

Document Type
Technical Report
Publication Date
Dec 15, 1970
Accession Number
AD0718255

Entities

People

  • John I. Makhoul

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • Automatic
  • Bandpass Filters
  • Coverings
  • Energy Bands
  • Extraction
  • Fast Fourier Transforms
  • Filters
  • Hierarchies
  • Recognition
  • Spectra
  • Syllables
  • Vocabulary

Readers

  • Approximation Theory
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation