Design of a Robust Maximum Likelihood Pitch Estimator for Speech in Additive Noise.

Abstract

Using the maximum likelihood technique, an algorithm is developed for extracting pitch from speech that has been corrupted by additive noise. The speech model includes the effects of pitch periodicity and the spectral envelope, which results in a processing structure consisting of a noise suppression prefilter in cascade with a comb filter bank estimator-correlator. The prefilter attenuates those frequency bands where the speech signal-to-noise ratio is low; hence most of the deleterious noise is rejected before the comb filter bank correlator determines the pitch. The comb filter interpretation leads to an implementation of the correlation function that avoids anomalous pitch errors due to windowing and formant sidelobe interaction, and this obviates the need for any type of spectral flattening. Pitch ambiguities are resolved using a majority logic scoring algorithm and a carefully designed pitch tracker that can adapt rapidly to gross pitch variations. The voiced/unvoiced decision is based on an adaptive minimum energy threshold, a high/low band energy measurement, a normalized pitch correlation coefficient, and a pitch track continuity coefficient. A time domain implementation of the algorithm that runs in real time in conjunction with an LPC analysis/synthesis system at 2400 bps is described. (Author)
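
As a rough illustration of the kind of quantity the abstract calls a normalized pitch correlation coefficient, the sketch below scores each candidate pitch lag by the normalized correlation between a frame and its delayed copy, then picks the best-scoring lag. This is not the report's algorithm: it has no noise suppression prefilter, comb filter bank, majority logic scoring, or pitch tracker. The function names, the 8 kHz sampling rate, the lag range, and the synthetic test frame are all assumptions made for the example.

```python
import math


def normalized_correlation(x, lag):
    """Normalized correlation between x[n] and x[n - lag] over the overlap."""
    a = x[lag:]
    b = x[:len(x) - lag]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0.0 else 0.0


def estimate_pitch_period(x, min_lag, max_lag):
    """Return (lag, score) for the candidate lag with the highest score."""
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        score = normalized_correlation(x, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Hypothetical test frame: a noise-free periodic signal with a fundamental
# and one harmonic, period 50 samples (160 Hz at the assumed 8 kHz rate).
fs = 8000
period = 50
frame = [
    math.sin(2 * math.pi * n / period) + 0.5 * math.sin(4 * math.pi * n / period)
    for n in range(400)
]

# The lag range 20..80 is an assumption; it excludes the double period (100).
lag, score = estimate_pitch_period(frame, 20, 80)
print(lag, fs / lag, round(score, 3))
```

On this clean synthetic frame the search should recover the 50-sample period. A score close to 1 indicates strong periodicity, which is why a coefficient of this kind can serve as one input to a voiced/unvoiced decision; any threshold applied to it would have to be tuned and is not specified in the abstract.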

Document Details

Document Type
Technical Report
Publication Date
Jun 11, 1979
Accession Number
ADA077159

Entities

People

  • Robert J. McAulay

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Additives (Chemicals)
  • Airborne
  • Background Noise
  • Bandwidth
  • Comb Filters
  • Correlators
  • Detection
  • Detectors
  • Filters
  • Frequency
  • Frequency Bands
  • Measurement
  • Sequences
  • Signal Processing
  • Time Domain
  • Waveforms
  • White Noise

Fields of Study

  • Engineering

Readers

  • Phased Array Antenna Design.
  • Regression Analysis.
  • Speech Processing/Speech Recognition.