A Tandem Algorithm for Pitch Estimation and Voiced Speech Segregation

Abstract

Considerable effort in computational auditory scene analysis (CASA) has gone into segregating speech from monaural mixtures. The performance of current CASA systems on voiced speech segregation is limited by the lack of a robust pitch estimation algorithm. We propose a tandem algorithm that performs pitch estimation of a target utterance and segregation of the voiced portions of target speech jointly and iteratively. The algorithm first obtains a rough estimate of the target pitch and then uses this estimate to segregate target speech based on harmonicity and temporal continuity. It then iteratively improves both pitch estimation and voiced speech segregation. Systematic evaluation shows that the tandem algorithm extracts a majority of target speech without including much interference, and that it performs substantially better than previous systems for either pitch extraction or voiced speech segregation.
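
The alternation described in the abstract can be illustrated with a toy sketch. It is not the report's implementation: it assumes each time-frequency unit has already been reduced to a single local period estimate in samples (the hypothetical `periods` input), and the function names, the median-based pitch estimate, the 3-frame smoothing window, and the 5% tolerance are all illustrative assumptions. The loop starts from a rough pitch estimate over all units, keeps the units consistent with that pitch (a stand-in for harmonicity), re-estimates the pitch from the kept units with smoothing across neighboring frames (a stand-in for temporal continuity), and repeats until the mask stops changing.

```python
from statistics import median


def estimate_pitch(periods, mask):
    """Per-frame pitch period: median of the local periods of the masked-in units."""
    track = []
    for frame, frame_mask in zip(periods, mask):
        selected = [p for p, keep in zip(frame, frame_mask) if keep]
        track.append(median(selected) if selected else None)
    return track


def smooth(track):
    """Crude temporal continuity: median over each frame and its immediate neighbors."""
    out = []
    for i in range(len(track)):
        window = [v for v in track[max(0, i - 1):i + 2] if v is not None]
        out.append(median(window) if window else None)
    return out


def estimate_mask(periods, track, tol=0.05):
    """Crude harmonicity test: keep units whose period is within tol of the frame pitch."""
    mask = []
    for frame, pitch in zip(periods, track):
        if pitch is None:
            mask.append([False] * len(frame))
        else:
            mask.append([abs(p - pitch) <= tol * pitch for p in frame])
    return mask


def tandem(periods, max_iter=10):
    """Alternate pitch estimation and mask estimation until the mask is stable."""
    mask = [[True] * len(frame) for frame in periods]  # rough start: use every unit
    track = []
    for _ in range(max_iter):
        track = smooth(estimate_pitch(periods, mask))
        new_mask = estimate_mask(periods, track)
        if new_mask == mask:
            break
        mask = new_mask
    return track, mask


# Made-up data: 4 frames x 5 channels; the target period is about 100 samples,
# and the interference period is about 60 samples.
periods = [
    [100, 101, 100, 60, 61],
    [100, 100, 60, 99, 60],
    [101, 60, 100, 100, 100],
    [100, 100, 100, 61, 100],
]
track, mask = tandem(periods)
```

In this made-up example the target dominates most units, so the rough first estimate is already close; the second pass re-estimates the pitch from the retained units only and confirms that the mask no longer changes.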

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2008
Accession Number
AD1001206

Entities

People

  • DeLiang Wang
  • Guoning Hu

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Automated Speech Recognition
  • Cognitive Science
  • Cognitive Systems Engineering
  • Computer Science
  • Computer Vision
  • Databases
  • Detection
  • Engineering
  • Frequency
  • Machine Learning
  • Noise
  • Noise Reduction
  • Recognition
  • Supervised Machine Learning
  • Validation

Fields of Study

  • Computer science
  • Engineering

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Speech Processing/Speech Recognition.