Word Recognition in Continuous Speech Using Linear Prediction Analysis

Abstract

A promising method of automatic word recognition in continuous speech, recently designated 'word spotting', has been demonstrated. The method uses error residual ratios from LPC (Linear Predictive Coding) vocoder analysis for waveform comparison and a dynamic programming procedure for time registration between the incoming speech and a template of the key word. The incoming speech is compared against several templates, using a similarity threshold, to account for variability in spectral shape. The system can run in real time when driven by a real-time vocoder. The multiple templates are used in such a way that a small number of them, 3 or 4, is made to behave like several hundred or more. This is accomplished by dynamically constructing a composite template from parts of each single template while the incoming speech is processed, so that a particular composite template is constructed for each word being recognized. An accuracy of 99% with no false alarms was achieved using 205 key words, 5 different speakers, and approximately 10 minutes of speech text. In the presence of additive white Gaussian noise at a signal-to-noise ratio of approximately 11 dB, accuracy was 66%; when the speech was processed to account for the noise, accuracy improved to 85%-90%.
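The composite-template idea in the abstract can be illustrated with a small sketch. This is not the report's implementation: it assumes the templates have already been time-aligned to a common length, it reads "composite template" as choosing, frame by frame, whichever template is closest to the incoming speech, and it substitutes a squared Euclidean distance for the LPC error residual ratio, whose exact form the abstract does not give. The names `frame_distance`, `composite_dtw_score`, and `spot_word` are hypothetical.

```python
from math import inf


def frame_distance(a, b):
    """Squared Euclidean distance between two feature frames.

    Assumption: a stand-in for the LPC error residual ratio used in the
    report, which the abstract does not define.
    """
    return sum((x - y) ** 2 for x, y in zip(a, b))


def composite_dtw_score(speech, templates):
    """Dynamic-programming time registration against a composite template.

    speech:    list of feature frames for a candidate speech segment.
    templates: list of templates of equal length (an assumption), each a
               list of frames.
    Returns the accumulated distance along the best warping path,
    normalized by the combined number of frames.
    """
    n_t = len(templates[0])
    n_s = len(speech)

    # Local cost for template frame i versus speech frame j: the minimum
    # over all templates, so the best path can draw on a different
    # template at each frame (the "composite" in this sketch).
    local = [
        [min(frame_distance(t[i], speech[j]) for t in templates)
         for j in range(n_s)]
        for i in range(n_t)
    ]

    # Accumulated cost with the basic three-way step pattern.
    acc = [[inf] * n_s for _ in range(n_t)]
    for i in range(n_t):
        for j in range(n_s):
            if i == 0 and j == 0:
                best_prev = 0.0
            else:
                best_prev = min(
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            acc[i][j] = local[i][j] + best_prev

    return acc[-1][-1] / (n_t + n_s)


def spot_word(speech, templates, threshold):
    """Declare a detection when the normalized score is within threshold."""
    return composite_dtw_score(speech, templates) <= threshold


# Two toy one-dimensional templates that differ in their middle frame.
templates = [[[0.0], [1.0], [2.0]], [[0.0], [5.0], [2.0]]]
segment = [[0.0], [5.0], [5.0], [2.0]]
print(spot_word(segment, templates, threshold=0.5))
```

With T templates of N frames each, a per-frame choice like this one implicitly covers up to T^N frame combinations, which is one way a handful of templates could behave like several hundred or more. A real word spotter would also need to search over start and end points in the continuous speech stream, which this sketch leaves out.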

Document Details

Document Type: Technical Report
Publication Date: Aug 01, 1976
Accession Number: ADA039009

Entities

People

  • Richard Wesley Christiansen

Organizations

  • University of Utah

Tags

Communities of Interest

  • C4I
  • Energy and Power Technologies
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Automated Speech Recognition
  • Computer Programming
  • Computer Programs
  • Computers
  • Detection
  • Detectors
  • False Alarms
  • Filters
  • Frequency Bands
  • Frequency Domain
  • Gaussian Noise
  • Literature Surveys
  • Pattern Recognition
  • Signal Processing
  • Statistics
  • Time Domain

Readers

  • Computer Vision
  • Speech Processing/Speech Recognition