Speech Recognition Using the Mellin Transform

Abstract

The purpose of this research was to improve performance in speech recognition. Specifically, a new approach was investigated: applying an integral transform known as the Mellin transform (MT) to the output of an auditory model, with the aim of improving the recognition rate of phonemes through the scale-invariance property of the transform. Scale invariance means that when a time-domain signal is subjected to dilations, the distribution of the signal in the MT domain remains unaffected. An auditory model was used to transform speech waveforms into images representing how the brain "sees" a sound. The MT was then applied and features were extracted. These features were used in a speech recognizer based on Hidden Markov Models. The results from speech recognition experiments showed an increase in recognition rates for some phonemes compared to traditional methods.
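
The scale-invariance property described above can be checked numerically. The following is a minimal sketch, not the method used in the report: it assumes the Mellin transform M(s) = ∫ f(t) t^(s-1) dt is evaluated on the imaginary axis s = jω, where a dilation f(at) only multiplies M by a phase factor and so leaves the magnitude unchanged. The test signal, the dilation factor, the integration grid, and the helper name mellin_magnitude are all illustrative assumptions.

```python
import cmath
import math


def mellin_magnitude(f, omega, u_min=-12.0, u_max=12.0, n=4001):
    """Approximate |M(j*omega)| for M(s) = integral of f(t) * t**(s-1) dt, t > 0.

    Substituting t = exp(u) turns the integral into
    integral of f(exp(u)) * exp(j*omega*u) du, which is summed on a uniform grid.
    The grid limits are an assumption that suits the test signal below.
    """
    du = (u_max - u_min) / (n - 1)
    total = 0j
    for k in range(n):
        u = u_min + k * du
        total += f(math.exp(u)) * cmath.exp(1j * omega * u)
    return abs(total * du)


def f(t):
    # Hypothetical test signal: a Gaussian in log-time, chosen because it
    # decays quickly at both ends of the integration grid.
    return math.exp(-math.log(t) ** 2)


def dilated(t, a=2.0):
    # The same signal with its time axis dilated by an assumed factor a.
    return f(a * t)


omegas = [0.0, 0.5, 1.0, 2.0, 3.0]
mag_original = [mellin_magnitude(f, w) for w in omegas]
mag_dilated = [mellin_magnitude(dilated, w) for w in omegas]

# Largest gap between the two magnitude profiles; it should be near zero.
max_difference = max(abs(p - q) for p, q in zip(mag_original, mag_dilated))
print(max_difference)
```

In this sketch the dilation only shifts the signal along the log-time axis, which is why the magnitude profile stays the same; that is the property the abstract relies on for features that tolerate time scaling.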

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2006
Accession Number
ADA451292

Entities

People

  • Jesse R. Hornback

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Engineered Resilient Systems
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Automated Speech Recognition
  • Databases
  • Ear
  • Electrical Engineering
  • Feature Extraction
  • Hidden Markov Models
  • Integral Transforms
  • Integrals
  • Markov Models
  • Models
  • Pattern Recognition
  • Probability
  • Recognition
  • Time Domain

Readers

  • Calculus or Mathematical Analysis
  • Computational Linguistics
  • Radar Systems Engineering

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Translation