Speech Analysis and Synthesis Based on Pitch-Synchronous Segmentation of the Speech Waveform.
Abstract
This report describes a new speech analysis/synthesis method. The technique does not attempt to model the human speech production mechanism. Instead, it represents the speech waveform directly in terms of the waveform within each pitch period. A significant merit of this approach is the complete elimination of pitch interference, because each pitch-synchronously segmented waveform contains no waveform discontinuity. One application of the method is the alteration of speech characteristics directly on raw speech. With the increased use of man-made speech in tactical voice message systems and virtual reality environments, such a speech generation tool is highly desirable. Another application is speech encoding at low data rates (2400 b/s or less). According to speech intelligibility tests, our new 2400-b/s encoder outperforms the current 2400-b/s LPC encoder, and this also holds in noisy environments. Because most tactical platforms are noisy (e.g., helicopter, high-performance aircraft, tank, destroyer), our 2400-b/s speech encoding technique will make tactical voice communication more effective; it will become an indispensable capability for future C4I.
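The idea of cutting a waveform at pitch-period boundaries can be illustrated with a minimal sketch. This is not the report's algorithm: the pitch marks are assumed to be known in advance, the test signal is a toy sinusoid, and the names `synth_voiced`, `segment_at_pitch_marks`, and `concatenate` are hypothetical helpers introduced only for illustration.

```python
import math


def synth_voiced(period_samples, num_periods):
    """Toy periodic signal: one sine cycle per pitch period (illustrative only)."""
    total = period_samples * num_periods
    return [math.sin(2 * math.pi * n / period_samples) for n in range(total)]


def segment_at_pitch_marks(signal, pitch_marks):
    """Cut the signal into one segment per pitch period.

    pitch_marks: strictly increasing sample indices where each period starts
    (assumed given here; estimating them is a separate problem).
    """
    bounds = list(pitch_marks) + [len(signal)]
    return [signal[bounds[i]:bounds[i + 1]] for i in range(len(pitch_marks))]


def concatenate(segments):
    """Rebuild a waveform by joining whole pitch-period segments end to end."""
    out = []
    for seg in segments:
        out.extend(seg)
    return out


# Assumed values: an 80-sample period, i.e. 100 Hz pitch at 8 kHz sampling.
period = 80
signal = synth_voiced(period, 5)
marks = list(range(0, len(signal), period))

segments = segment_at_pitch_marks(signal, marks)
rebuilt = concatenate(segments)
```

In this toy case every segment spans exactly one period, so joining the segments reproduces the original samples, and each cut falls on a period boundary rather than inside a period.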
Document Details
- Document Type: Technical Report
- Publication Date: Nov 09, 1994
- Accession Number: ADA288824
Entities
People
- George S. Kang
- Lawrence J. Fransen
Organizations
- United States Naval Research Laboratory