Speech Compression and Synthesis

Abstract

This report concludes our work for the past two years on speech compression and synthesis. A real-time variable-frame-rate LPC vocoder was implemented, operating at an average rate of 2000 bits/s. We also tested our mixed-source model as part of the vocoder. To improve the reliability of the extraction of LPC parameters, we implemented and tested a range of adaptive lattice and autocorrelation algorithms. For data rates above 5000 bits/s, we developed and tested a new high-frequency regeneration technique, spectral duplication, which reduces the roughness in the synthesized speech. As the first part of our effort toward a very-low-rate (VLR) vocoder, we implemented a phonetic synthesis program compatible with our initial design for a phonetic recognition program. We also recorded and partially labeled a large database of diphone templates.

During the second year we continued our work toward a VLR vocoder and also developed a multirate embedded-coding speech compression program that can transmit speech at rates varying from 9600 down to 2400 bits/s. The phonetic synthesis program and the labeling of the diphone template network were completed. There are currently 2845 diphone templates. We also implemented an initial version of a phonetic recognizer based on a network representation of diphone templates. The recognizer allows for incremental training of the network by modification of existing templates or addition of new templates.

Document Details

Document Type: Technical Report
Publication Date: Oct 01, 1980
Accession Number: ADA092578

Entities

People

John Makhoul
John Sorensen
Michael Berouti
Richard Schwartz

Organizations

BBN Technologies
