Frequency Domain Speech Coding

Abstract

The major goal of this research was to investigate speech coding techniques in an attempt to achieve high quality speech transmittable at 4800 bits per second. The approach taken was to code the frequency domain representation of speech, with speech represented by a sparse set of frequency components. Four frequency selection schemes were implemented, and the resulting frequency coefficients (magnitude and phase) were coded efficiently for transmission. Specific techniques involved in the speech coder included: (1) a recurrent neural architecture to make a periodic/noiselike decision, (2) the use of variable length windows for analysis and synthesis, and (3) a representation of noiselike speech using frequency banded energy information. The quality of the reconstructed speech was evaluated with listening tests that compared the different frequency selection schemes against one another and against original and sampled speech. The system did not achieve 'toll quality' speech; however, the resulting speech was highly intelligible. Specific quality degradation was noted at window transitions.
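
As a rough illustration of representing a frame by a sparse set of frequency components, the sketch below keeps only the largest-magnitude bins of one analysis frame, stores each as a (bin, magnitude, phase) triple, and resynthesizes the frame from those triples. It is not taken from the report: the abstract does not describe the four selection schemes, so the largest-magnitude rule, the fixed frame length, and the names `dft_half`, `select_components`, and `synthesize` are all assumptions made for this example.

```python
import cmath
import math


def dft_half(frame):
    """Naive DFT of a real-valued frame, bins 0..N/2 only."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def select_components(frame, count):
    """Keep the `count` largest-magnitude bins as (bin, magnitude, phase).

    Assumption: a simple largest-magnitude rule stands in for the
    report's selection schemes, which the abstract does not specify.
    """
    spectrum = dft_half(frame)
    ranked = sorted(range(len(spectrum)), key=lambda k: abs(spectrum[k]), reverse=True)
    kept = sorted(ranked[:count])
    return [(k, abs(spectrum[k]), cmath.phase(spectrum[k])) for k in kept]


def synthesize(components, n):
    """Rebuild a real frame of length n from the kept components."""
    out = [0.0] * n
    for k, mag, phase in components:
        # DC and Nyquist bins occur once in the full spectrum;
        # every other bin has a conjugate partner, hence the factor 2.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        for t in range(n):
            out[t] += scale * mag / n * math.cos(2 * math.pi * k * t / n + phase)
    return out


if __name__ == "__main__":
    n = 64
    # Hypothetical test frame: two sinusoids that fall exactly on bins 5 and 12.
    frame = [
        math.cos(2 * math.pi * 5 * t / n + 0.3)
        + 0.5 * math.cos(2 * math.pi * 12 * t / n - 1.1)
        for t in range(n)
    ]
    components = select_components(frame, 2)
    rebuilt = synthesize(components, n)
    print([k for k, _, _ in components])
    print(max(abs(a - b) for a, b in zip(frame, rebuilt)))
```

Real speech frames are not exactly sparse, so a reconstruction from a few bins would be approximate, and a practical coder would also quantize the magnitudes and phases to meet the bit rate.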

Document Details

Document Type: Technical Report
Publication Date: Dec 01, 1991
Accession Number: ADA243757

Entities

People

Shane Switzer

Organizations

Air Force Institute of Technology
