FACP Speech Recognition/Transmission System.

Abstract

This report describes a phoneme vocoder capable of transmitting compressed speech data over bandlimited communication channels at rates lower than 200 bits per second. Using linear prediction analysis for parameter extraction, and sophisticated segmentation and labeling techniques, the vocoder analyzer codes the incoming speech signal into a sequence of discrete sound units, or phonemes. At the receiving end of the channel, the phoneme sequence is input to a digital speech synthesizer. An area function dyad synthesis procedure is described which is based on an area function model representing the vocal tract as a set of 14 cross-sectional areas. Area functions representing phoneme steady states (nuclei) and transitions between any ordered pair of phonemes (dyads) are stored in a dyad table. Given an input phoneme string, the synthesizer selects the corresponding sequence of nuclei and transitions and interpolates between each of the 14 cross-sectional areas, producing a model of the shape of the vocal tract changing in time. The filtering process of this vocal tract model is identical to the optimum inverse filter of linear prediction analysis, allowing direct conversion to linear predictive coding (LPC) synthesis. A terminal analog synthesizer is also described. Diagnostic Rhyme Tests (DRT) of vocoder performance for two male speakers yielded scores of 70.6% using area function dyad synthesis and 83.5% for terminal analog synthesis. (Author)
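
The interpolation step in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the abstract does not say which interpolation rule is used, so linear interpolation is assumed here, and the function names and the sample area values are hypothetical. The second function shows one common convention for turning adjacent tube areas into reflection coefficients, which is the usual link between a lossless-tube area model and LPC synthesis; sign conventions differ between texts.

```python
from typing import List

N_SECTIONS = 14  # number of cross-sectional areas, as stated in the abstract

AreaFunction = List[float]


def interpolate_areas(start: AreaFunction, end: AreaFunction,
                      steps: int) -> List[AreaFunction]:
    """Return `steps` area functions moving from `start` to `end`.

    Assumption: plain linear interpolation, applied independently to
    each of the 14 sections. The report may use a different rule.
    """
    if len(start) != N_SECTIONS or len(end) != N_SECTIONS:
        raise ValueError("each area function needs 14 sections")
    if steps < 2:
        raise ValueError("need at least a start frame and an end frame")
    frames = []
    for k in range(steps):
        t = k / (steps - 1)  # 0.0 at the first frame, 1.0 at the last
        frames.append([a + t * (b - a) for a, b in zip(start, end)])
    return frames


def reflection_coefficients(areas: AreaFunction) -> List[float]:
    """Reflection coefficient at each junction between adjacent sections.

    Assumed convention: r_i = (A_{i+1} - A_i) / (A_{i+1} + A_i).
    Some references use the opposite sign.
    """
    return [(b - a) / (b + a) for a, b in zip(areas, areas[1:])]


if __name__ == "__main__":
    # Hypothetical area functions (arbitrary units), not data from the report.
    nucleus_a = [1.0] * N_SECTIONS
    nucleus_b = [3.0] * N_SECTIONS
    for frame in interpolate_areas(nucleus_a, nucleus_b, steps=3):
        print(frame[0], reflection_coefficients(frame)[0])
```

Each interpolated frame is one snapshot of the vocal tract shape; a sequence of such frames, converted to reflection coefficients, could drive a lattice-style LPC synthesis filter.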

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 1978
Accession Number
ADA060115

Entities

People

  • B. T. Oshika

Organizations

  • System Development Corporation

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes
  • Weapons Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Analyzers
  • Automated Speech Recognition
  • Classification
  • Command And Control
  • Communication Channels
  • Communication Systems
  • Filters
  • Filtration
  • Frequency
  • Plastic Explosives
  • Recognition
  • Speech Analysis
  • Speech Compression
  • Speech Transmission
  • Steady State
  • Voice Communications

Readers

  • Approximation Theory.
  • Speech Processing/Speech Recognition.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms