Speech Coding and Phoneme Classification Using a Back-Propagation Neural Network

Abstract

Speech is a natural, unspecialized method of communication that is perhaps the ultimate machine interface. Previous attempts to provide such an interface, however, have been limited to predefined vocabularies of an artificial syntax. This paper presents a method for speaker-dependent speech identification that uses a back-propagation neural network to determine the phonemes present within a voice signal. The vocal tract changes slowly over time and can be modeled using the pitch and formant frequencies of the voice. These frequencies relate the position of the vocal tract to specific pronunciations and are obtained by using a homomorphic filtering process that separates the vocal tract's impulse response from the excitation source. The frequency representation of this response is concatenated with the excitation containing the pitch frequency and applied to the input layer of the neural network. From this information, the network selects combinations of features that identify the phonemes that are present. This network was trained on a set of speaker-dependent phonemes, and now phonetically classifies new speech input. This classification scheme could be used to translate linguistic messages into machine code with a very high data rate. This benefit would allow for real-time interaction with machines with no specialized training. Applications could be as simple as providing quick voice-to-text processing or as diverse as implementing a control system with response time tied to specified voice patterns.
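The homomorphic filtering step mentioned in the abstract is commonly realized with the real cepstrum, and the following is a minimal Python/NumPy sketch of that general idea, not the report's actual implementation. The function name `homomorphic_features`, the Hamming window, the lifter cutoff of 30 samples, and the 60 to 400 Hz pitch search range are all assumptions chosen for illustration; none of them comes from the report.

```python
import numpy as np

def homomorphic_features(frame, sample_rate, lifter_cutoff=30,
                         pitch_range_hz=(60.0, 400.0)):
    """Split one speech frame into a smooth log-spectral envelope
    (vocal-tract part) and a pitch estimate (excitation part).

    All parameter defaults are illustrative assumptions.
    """
    n = len(frame)
    windowed = frame * np.hamming(n)

    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    # The small constant guards against log(0).
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real

    # Low-quefrency lifter: keep the slowly varying part, which is
    # attributed to the vocal tract. The mask is symmetric because the
    # real cepstrum of a real signal is symmetric.
    lifter = np.zeros(n)
    lifter[:lifter_cutoff] = 1.0
    lifter[n - lifter_cutoff + 1:] = 1.0
    envelope = np.fft.fft(cepstrum * lifter).real[: n // 2 + 1]

    # High-quefrency peak: its position is the pitch period in samples.
    q_lo = int(sample_rate / pitch_range_hz[1])
    q_hi = int(sample_rate / pitch_range_hz[0])
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi]))
    pitch_hz = sample_rate / peak

    return envelope, pitch_hz
```

In this sketch, `envelope` is the log-magnitude spectral envelope over the non-negative frequency bins and `pitch_hz` is a single pitch estimate. A network input vector in the spirit of the abstract could then be formed by concatenating the two, for example with `np.concatenate([envelope, [pitch_hz]])`, although the report's exact input layout is not stated here.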

Document Details

Document Type
Technical Report
Publication Date
May 07, 1997
Accession Number
ADA418472

Entities

People

  • Brett A. St. George

Organizations

  • United States Naval Academy

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Automated Speech Recognition
  • Computational Science
  • Computer Programming
  • Electrical Engineering
  • Filtration
  • Frequency
  • Frequency Domain
  • Language
  • Neural Networks
  • Pattern Recognition
  • Recognition
  • Signal Processing
  • Speech Compression
  • Statistical Analysis
  • United States Naval Academy

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Control Systems Engineering
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks