Speech Transformations Based on a Sinusoidal Representation

Abstract

This report presents a new speech analysis/synthesis technique that provides the basis for a general class of speech transformations, including time-scale modification, frequency scaling, and pitch modification. These modifications can be time-varying, permitting continuous adjustment of a speaker's fundamental frequency and rate of articulation. The method is based on a sinusoidal representation of the speech production mechanism that has been shown to produce synthetic speech that preserves the waveform shape and is essentially perceptually indistinguishable from the original. Although the analysis/synthesis system was originally designed for single-speaker signals, it is equally capable of recovering and modifying nonspeech signals such as music, multiple speakers, marine biologic sounds, and speakers in the presence of interference such as noise and musical backgrounds. Keywords: Sinusoidal models; Sine waves; Time-varying modifications; Joint time-frequency modifications.
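
The idea behind the transformations named in the abstract can be illustrated with a minimal sketch, which is not the report's actual system. It assumes the signal has already been analyzed into a set of sine-wave tracks, each given as a pair of hypothetical functions returning amplitude and frequency (in Hz) at a point in original time. The function name `synthesize` and the parameters `rate` and `pitch` are illustrative choices, and the sketch ignores the phase handling a full analysis/synthesis system would need to preserve waveform shape.

```python
import math


def synthesize(tracks, duration, sample_rate=8000, rate=1.0, pitch=1.0):
    """Sum-of-sinusoids synthesis with simple time-scale and pitch changes.

    tracks:   list of (amp_fn, freq_fn) pairs; each function maps original
              time in seconds to an amplitude or a frequency in Hz.
    duration: length of the original signal in seconds.
    rate:     articulation-rate factor (0.5 = half speed, twice as long).
    pitch:    frequency-scaling factor applied to every track.
    """
    n_out = int(round(duration / rate * sample_rate))
    phases = [0.0] * len(tracks)  # running phase of each sine wave
    out = []
    for n in range(n_out):
        # Map the output sample index back to a position in original time.
        t = n / sample_rate * rate
        sample = 0.0
        for k, (amp_fn, freq_fn) in enumerate(tracks):
            sample += amp_fn(t) * math.cos(phases[k])
            # Phase is the running integral of frequency at the OUTPUT
            # sample rate, so changing `rate` does not shift frequencies.
            phases[k] += 2.0 * math.pi * pitch * freq_fn(t) / sample_rate
        out.append(sample)
    return out


# Example: one steady 200 Hz partial, 0.1 s long in the original.
tone = [(lambda t: 1.0, lambda t: 200.0)]
slowed = synthesize(tone, 0.1, rate=0.5)   # twice as long, still 200 Hz
raised = synthesize(tone, 0.1, pitch=2.0)  # same length, 400 Hz
```

Because `rate` and `pitch` are applied sample by sample, they could just as well be functions of time, which is one way to picture the time-varying modifications the abstract mentions.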

Document Details

Document Type
Technical Report
Publication Date
May 16, 1986
Accession Number
ADA169740

Entities

People

  • Robert J. McAulay
  • Thomas E. Quatieri

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Air Platforms
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Adaptive Systems
  • Algorithms
  • Background Noise
  • Bandwidth
  • Computational Complexity
  • Frequency
  • Frequency Domain
  • High Resolution
  • Integrals
  • Larynx
  • Production
  • Signal Processing
  • Sine Waves
  • Speech Analysis
  • Three Dimensional
  • Waveforms
  • Waves

Fields of Study

  • Engineering

Readers

  • Atmospheric Science / Meteorology, specifically Wind Wave Turbulence
  • Speech Processing / Speech Recognition
  • Systems Analysis and Design