Frequency Domain Speech Coding

Abstract

The major goal of this research was to investigate speech coding techniques in an attempt to achieve high quality speech transmittable at 4800 bits per second. The approach taken to achieve this goal was to code the frequency domain representation of speech. Speech was represented by a sparse set of frequency components. Four frequency selection schemes were implemented, and the resulting frequency coefficients (magnitude and phase) were coded in an efficient manner for transmission. Specific techniques involved in the speech coder included: (1) a recurrent neural architecture to make a periodic/noiselike decision, (2) the use of variable length windows for analysis and synthesis, and (3) a representation of noiselike speech using frequency banded energy information. The quality of the reconstructed speech was evaluated in listening tests that compared the different frequency selection schemes, along with original and sampled speech. The system did not achieve 'toll quality' speech; however, the resulting speech was highly intelligible. Specific quality degradation was noted at window transitions.
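
The idea of representing a speech frame by a sparse set of frequency components (magnitude and phase) can be illustrated with a minimal sketch. The abstract does not specify the report's frequency selection schemes, so the rule used here (keep the largest-magnitude bins) is an assumption chosen only for illustration, and the function names `dft`, `select_sparse`, and `resynthesize`, the frame length, and the test signal are all hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def select_sparse(frame, keep):
    """Keep the `keep` largest-magnitude bins of the half-spectrum.

    ASSUMPTION: largest-magnitude selection is an illustrative rule,
    not one of the selection schemes from the report.
    Returns (bin, magnitude, phase) triples sorted by bin index.
    """
    n = len(frame)
    spectrum = dft(frame)
    half = range(n // 2 + 1)  # a real signal's upper bins mirror the lower ones
    ranked = sorted(half, key=lambda k: abs(spectrum[k]), reverse=True)[:keep]
    return [(k, abs(spectrum[k]), cmath.phase(spectrum[k])) for k in sorted(ranked)]


def resynthesize(components, n):
    """Rebuild a real frame of length n from (bin, magnitude, phase) triples."""
    out = [0.0] * n
    for k, mag, phase in components:
        # DC and Nyquist bins occur once; every other bin has a conjugate mirror.
        single = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = (1.0 if single else 2.0) * mag / n
        for t in range(n):
            out[t] += scale * math.cos(2 * math.pi * k * t / n + phase)
    return out


# Hypothetical test frame: two sinusoids that fall exactly on bins 5 and 12.
N = 64
frame = [
    1.0 * math.cos(2 * math.pi * 5 * t / N + 0.3)
    + 0.5 * math.cos(2 * math.pi * 12 * t / N - 1.0)
    for t in range(N)
]
components = select_sparse(frame, keep=2)
rebuilt = resynthesize(components, N)
```

In this toy case the two retained components reproduce the frame almost exactly, because the signal contains only two sinusoids aligned with DFT bins. Real speech spreads energy across many bins, so a sparse selection is lossy, and a coder would still need to quantize the retained magnitudes and phases to fit the bit budget.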

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1991
Accession Number
ADA243757

Entities

People

  • Shane Switzer

Organizations

  • Air Force Institute of Technology

Tags

DTIC Thesaurus Topics

  • Coding
  • Computer Programming
  • Computer Programs
  • Computers
  • Decoding
  • Detection
  • Digital Signal Processing
  • Ear
  • Energy Bands
  • Frequency
  • Frequency Bands
  • Frequency Domain
  • Larynx
  • Processing Equipment
  • Speech Compression
  • Standards
  • Time Domain

Fields of Study

  • Engineering

Readers

  • Neural Network Machine Learning
  • Radio Communications and Signal Processing
  • Speech Processing/Speech Recognition