Speech Compression and Synthesis

Abstract

This report concludes our work for the past two years on speech compression and synthesis. A real-time variable-frame-rate LPC vocoder was implemented operating at an average rate of 2000 bits/s. We also tested our mixed-source model as part of the vocoder. To improve the reliability of the extraction of LPC parameters, we implemented and tested a range of adaptive lattice and autocorrelation algorithms. For data rates above 5000 bits/s, we developed and tested a new high-frequency regeneration technique, spectral duplication, which reduces the roughness in the synthesized speech. As the first part of our effort toward a very-low-rate (VLR) vocoder, we implemented a phonetic synthesis program that would be compatible with our initial design for a phonetic recognition program. We also recorded and partially labeled a large database of diphone templates. During the second year we continued our work toward a VLR vocoder, and also developed a multirate embedded-coding speech compression program that could transmit speech at rates varying from 9600 to 2400 bits/s. The phonetic synthesis program and the labeling of the diphone template network were completed. There are currently 2845 diphone templates. We also implemented an initial version of a phonetic recognizer based on a network representation of diphone templates. The recognizer allows for incremental training of the network by modification of existing templates or addition of new templates.

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 1980
Accession Number
ADA092578

Entities

People

  • John Makhoul
  • John Sorensen
  • Michael Berouti
  • Richard Schwartz

Organizations

  • BBN Technologies

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • Bandwidth
  • Coding
  • Computer Programming
  • Computer Programs
  • Computers
  • Data Rate
  • Databases
  • Debugging
  • Electrical Engineering
  • Frequency
  • Frequency Bands
  • Signal Processing
  • Speech Compression
  • Speech Quality
  • Time Domain
  • Two Dimensional

Fields of Study

  • Computer science

Readers

  • Speech Processing/Speech Recognition