Multiclassifier Fusion of an Ultrasonic Lip Reader in Automatic Speech Recognition.

Abstract

This thesis investigates the use of two active ultrasonic devices in collecting lip information for performing and enhancing automatic speech recognition. The two devices explored are called the 'Ultrasonic Mike' and the 'Lip Lock Loop.' The devices are tested in a speaker-dependent, isolated-word recognition task with a vocabulary consisting of the spoken digits from zero to nine. Two automatic lip readers are designed and tested based on the output of the ultrasonic devices. The automatic lip readers use template matching and dynamic time warping to determine the best candidate for a given test utterance. The automatic lip readers alone achieve accuracies of 65-89%, depending on the number of reference templates used. Next, the automatic lip reader is combined with a conventional automatic speech recognizer. Both classifier-level fusion and feature-level fusion are investigated. Feature fusion is based on combining the feature vectors prior to dynamic time warping. Classifier fusion is based on a pseudo-probability mass function derived from the dynamic time warping distances. The combined systems are tested with various levels of acoustic noise added. In one typical test, at a signal-to-noise ratio of 0 dB, the acoustic recognizer's accuracy alone was 78%, the automatic lip reader's accuracy was 69%, but the combined accuracy was 93%. This experiment demonstrates that a simple ultrasonic lip motion detector, whose output data rate is 12,500 times lower than that of a typical video camera, can significantly improve the accuracy of automatic speech recognition in noise.
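
The recognition and fusion steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the local distance measure, the path constraints, or the exact form of the pseudo-probability mass function, so the Euclidean frame distance, the length normalization, the exponential mapping from distances to scores, and the product rule for combining classifiers are all assumptions made for illustration. All function names and the `scale` parameter are hypothetical.

```python
import numpy as np


def _as_frames(x):
    """Return x as a (frames, dims) float array; 1-D input becomes one dim per frame."""
    x = np.asarray(x, dtype=float)
    return x[:, None] if x.ndim == 1 else x


def dtw_distance(a, b):
    """Length-normalized dynamic time warping distance between two sequences.

    Assumption: Euclidean distance between frames and the basic
    (insert, delete, match) step pattern.
    """
    a, b = _as_frames(a), _as_frames(b)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)


def class_distances(test, templates):
    """Template matching: best (smallest) DTW distance per word label.

    templates maps each label to a list of reference sequences.
    """
    return {label: min(dtw_distance(test, ref) for ref in refs)
            for label, refs in templates.items()}


def pseudo_pmf(distances, scale=1.0):
    """Map per-class distances to scores that sum to one.

    Assumption: an exponential of the negative distance; the thesis's
    actual pseudo-probability mass function may differ.
    """
    labels = sorted(distances)
    d = np.array([distances[k] for k in labels])
    w = np.exp(-(d - d.min()) / scale)
    return dict(zip(labels, w / w.sum()))


def fuse_classifiers(pmf_a, pmf_b):
    """Classifier-level fusion (assumed product rule, renormalized)."""
    labels = sorted(pmf_a)
    p = np.array([pmf_a[k] * pmf_b[k] for k in labels])
    return dict(zip(labels, p / p.sum()))


def fuse_features(acoustic, lip):
    """Feature-level fusion: concatenate per-frame vectors before DTW.

    Assumes both streams are already frame-aligned (same number of frames).
    """
    return np.hstack([_as_frames(acoustic), _as_frames(lip)])
```

In this sketch, a recognizer for one modality would call `class_distances` on a test utterance, convert the result with `pseudo_pmf`, and pick the label with the largest score; `fuse_classifiers` combines the acoustic and lip scores before that final choice, while `fuse_features` instead builds a joint feature sequence that is passed to `dtw_distance` directly.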

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1994
Accession Number
ADA289207

Entities

People

  • David L. Jennings

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies
  • Sensors

DTIC Thesaurus Topics

  • Acoustic Signals
  • Automated Speech Recognition
  • Computer Programming
  • Computers
  • Detectors
  • Electrical Engineering
  • Feature Extraction
  • Hidden Markov Models
  • Identification
  • Pattern Recognition
  • Probability
  • Recognition
  • Signal Processing
  • Sound Waves
  • Standing Waves
  • Ultrasounds
  • Word Recognition

Readers

  • Sensor Fusion and Tracking Systems
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML