Speech Compression and Synthesis
Abstract
This report concludes our work over the past two years on speech compression and synthesis. A real-time variable-frame-rate LPC vocoder was implemented, operating at an average rate of 2000 bits/s. We also tested our mixed-source model as part of the vocoder. To improve the reliability of LPC parameter extraction, we implemented and tested a range of adaptive lattice and autocorrelation algorithms. For data rates above 5000 bits/s, we developed and tested a new high-frequency regeneration technique, spectral duplication, which reduces the roughness in the synthesized speech. As the first part of our effort toward a very-low-rate (VLR) vocoder, we implemented a phonetic synthesis program that would be compatible with our initial design for a phonetic recognition program. We also recorded and partially labeled a large database of diphone templates. During the second year, we continued our work toward a VLR vocoder and also developed a multirate embedded-coding speech compression program that could transmit speech at rates varying from 9600 to 2400 bits/s. The phonetic synthesis program and the labeling of the diphone template network were completed. There are currently 2845 diphone templates. We also implemented an initial version of a phonetic recognizer based on a network representation of diphone templates. The recognizer allows for incremental training of the network by modification of existing templates or addition of new templates.
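The abstract mentions autocorrelation algorithms for extracting LPC parameters without describing them. As a rough illustration only, the sketch below shows the textbook autocorrelation method solved with the Levinson-Durbin recursion. It is not the report's implementation: the function names `autocorrelation` and `levinson_durbin`, the sign convention (prediction-error filter A(z) = 1 + a1 z^-1 + ...), and the silent-frame guard are all assumptions made for this example.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one speech frame.

    Hypothetical helper: the frame is assumed to be already windowed.
    """
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations with the Levinson-Durbin recursion.

    Returns (a, k, err):
      a   - predictor polynomial a[0..order] with a[0] = 1
      k   - reflection coefficients, one per completed recursion step
      err - prediction error energy after the last completed step
    """
    a = [1.0] + [0.0] * order
    k = []
    err = float(r[0])
    for m in range(1, order + 1):
        if err <= 0.0:
            # Assumed guard: a silent or degenerate frame stops the recursion
            # early and leaves the remaining coefficients at zero.
            break
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        km = -acc / err
        k.append(km)
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + km * prev[m - j]
        a[m] = km
        err *= 1.0 - km * km
    return a, k, err


# Usage: the autocorrelation sequence of a first-order process
# x[n] = 0.5 * x[n-1] + e[n] (normalized so that r[0] = 1).
a, k, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

The reflection coefficients `k` are the same quantities a lattice formulation works with, which is one reason the two families of algorithms are often discussed together.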
Document Details
- Document Type: Technical Report
- Publication Date: Oct 01, 1980
- Accession Number: ADA092578
Entities
People
- John Makhoul
- John Sorensen
- Michael Berouti
- Richard Schwartz
Organizations
- BBN Technologies