VOICE SOUND RECOGNITION.

Abstract

The report examines the merits of a new speech perception theory and its application to the voice sound recognition problem. Most conventional speech recognition systems require seven important parameters to activate the recognition logic: the frequencies of the first three formants, the amplitudes of the first three formants, and a voiced-unvoiced decision. The theory tested uses just three important parameters: the frequency of a 'single equivalent formant' (SEF), the SEF amplitude, and a voicing decision. This decrease of more than two to one in input parameters should mean significantly more than a two-to-one reduction in the complexity of the recognition logic. Statistics were compiled on a set of 20 words uttered by an ensemble of 5 speakers (3 male and 2 female). Some recognition confusions were encountered among phonetically similar sounds; these were not unexpected, since the statistics were compiled on segmented phonemes (sans transient cues). Other confusions were the result of imperfect parameter extractors and, it is hoped, will be corrected as improved circuits are developed. Recognition rates as high as 98 percent were measured in this initial phase of the program. (Author)
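
The parameter reduction described in the abstract can be illustrated with a minimal sketch. The class and field names below (`ConventionalFrame`, `SEFFrame`, `f1_hz`, `sef_hz`, and so on) are hypothetical and chosen only for illustration; the report itself describes hardware parameter extractors, not software, and nothing here reproduces its recognition logic.

```python
from dataclasses import dataclass, fields


@dataclass
class ConventionalFrame:
    """Seven inputs of a conventional recognizer (names are illustrative)."""
    f1_hz: float   # frequency of the first formant
    f2_hz: float   # frequency of the second formant
    f3_hz: float   # frequency of the third formant
    a1: float      # amplitude of the first formant
    a2: float      # amplitude of the second formant
    a3: float      # amplitude of the third formant
    voiced: bool   # voiced-unvoiced decision


@dataclass
class SEFFrame:
    """Three inputs of the single-equivalent-formant (SEF) approach."""
    sef_hz: float         # frequency of the single equivalent formant
    sef_amplitude: float  # amplitude of the single equivalent formant
    voiced: bool          # voicing decision


def parameter_count(frame_type) -> int:
    """Count the input parameters a frame type carries."""
    return len(fields(frame_type))


conventional_n = parameter_count(ConventionalFrame)
sef_n = parameter_count(SEFFrame)
reduction = conventional_n / sef_n

print(f"{conventional_n} parameters versus {sef_n}: ratio {reduction:.2f} to 1")
```

The ratio of seven to three is what the abstract refers to as a decrease of "more than two to one" in input parameters.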

Document Details

Document Type
Technical Report
Publication Date
Jul 01, 1965
Accession Number
AD0619964

Entities

People

  • Casimir F. Piotrowski
  • Charles F. Teacher

Tags

DTIC Thesaurus Topics

  • Amplitude
  • Automated Speech Recognition
  • Computing-Related Activities
  • Data Science
  • Frequency
  • Information Science
  • Neurobehavioral Manifestations
  • Perception
  • Recognition
  • Segmented
  • Statistics

Readers

  • Speech Processing/Speech Recognition.
  • Theoretical Analysis.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference