Speech Recognition Using Multiple Features and Multiple Recognizers

Abstract

The purpose of this thesis is to demonstrate the feasibility of using multiple features and multiple recognizers to perform isolated word recognition. This is accomplished by running multiple independent recognition tests and fusing their results into a single recognition result. The speech data is recorded, and each word is extracted into a separate file. Eight features are calculated for each word. The features are calculated on 512-sample time slices and produce 16-component output vectors. Three recognizers each use the eight features, producing a total of 24 error distance lists. These lists are then fused by adding the error values corresponding to each word. The word with the smallest fused error value is declared the recognition winner. Talker-dependent and talker-independent tests were run on a word set consisting of the digits zero through nine and the letters A through Z. The talker-dependent tests achieved accuracies between 87% and 100%, depending on the talker. The talker-independent tests achieved accuracies between 81% and 97%.
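
The fusion step described in the abstract (add the error values for each word across all lists, then pick the word with the smallest total) can be sketched roughly as follows. This is an illustrative sketch only, not the thesis code: the function names, the dictionary layout, and the sample error values are assumptions, and it further assumes that every list scores the same vocabulary on a comparable scale, which the abstract does not state. Three short lists stand in for the 24 lists produced by the three recognizers and eight features.

```python
from typing import Dict, List


def fuse_error_lists(error_lists: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum the error value for each word across all error distance lists."""
    fused: Dict[str, float] = {}
    for errors in error_lists:
        for word, error in errors.items():
            fused[word] = fused.get(word, 0.0) + error
    return fused


def recognize(error_lists: List[Dict[str, float]]) -> str:
    """Return the word with the smallest fused error value."""
    fused = fuse_error_lists(error_lists)
    return min(fused, key=fused.get)


# Hypothetical error distance lists (made-up values, not thesis data).
# Each list maps a candidate word to its error distance for one
# recognizer/feature pair.
example_lists = [
    {"zero": 0.9, "one": 0.4, "two": 0.7},
    {"zero": 0.8, "one": 0.5, "two": 0.3},
    {"zero": 0.6, "one": 0.2, "two": 0.9},
]

if __name__ == "__main__":
    print(fuse_error_lists(example_lists))
    print(recognize(example_lists))
```

In this made-up example the second list on its own would favor "two", but the summed errors favor "one", which shows how fusing the lists can override a single recognizer's mistake.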

Document Details

Document Type
Technical Report
Publication Date
Dec 03, 1991
Accession Number
ADA243791

Entities

People

  • Thomas F. Rathbun

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Accuracy
  • Acoustic Signals
  • Automated Speech Recognition
  • Computer Programming
  • Computer Programs
  • Computers
  • Ear
  • Hidden Markov Models
  • Image Recognition
  • Larynx
  • Lists (Data Structures)
  • Markov Models
  • Neural Networks
  • Operating Systems
  • Pattern Recognition
  • Recognition
  • Word Recognition

Readers

  • Acoustics
  • Computational Linguistics
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference