Segregation of Unvoiced Speech from Nonspeech Interference

Abstract

Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure and has weaker energy and is hence more susceptible to interference. We propose a new approach to the problem of segregating unvoiced speech from nonspeech interference. We first address the question of how much speech is unvoiced. The segregation process occurs in two stages: segmentation and grouping. In segmentation, our model decomposes an input mixture into contiguous time-frequency segments by a multiscale analysis of event onsets and offsets. Grouping of unvoiced segments is based on Bayesian classification of acoustic-phonetic features. Systematic evaluation shows that the proposed system extracts a majority of unvoiced speech without including much interference, and it performs substantially better than spectral subtraction.
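
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the Gaussian smoothing, the derivative threshold, the diagonal-Gaussian class models, and all function and parameter names (`gaussian_kernel`, `onsets_offsets`, `is_speech`, `sigma`, `threshold`, `prior_speech`) are assumptions chosen for illustration. The first part marks onset and offset candidates in a single channel's intensity envelope at a chosen smoothing scale; the second applies a Bayes decision rule to a segment's feature vector.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian smoothing kernel, truncated at 3 sigma (assumed choice)."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def onsets_offsets(intensity, sigma, threshold):
    """Find onset and offset candidates at one smoothing scale.

    Onsets are local peaks of the time derivative of the smoothed
    intensity above `threshold`; offsets are local valleys below
    `-threshold`. A larger `sigma` gives a coarser scale.
    """
    smoothed = np.convolve(intensity, gaussian_kernel(sigma), mode="same")
    d = np.diff(smoothed)
    i = np.arange(1, len(d) - 1)
    peak = (d[i] > d[i - 1]) & (d[i] >= d[i + 1]) & (d[i] > threshold)
    valley = (d[i] < d[i - 1]) & (d[i] <= d[i + 1]) & (d[i] < -threshold)
    return i[peak], i[valley]

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (assumed class model)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def is_speech(features, speech, noise, prior_speech=0.5):
    """Bayes decision: label a segment as speech if its posterior is larger.

    `speech` and `noise` are hypothetical (mean, variance) pairs that
    would be estimated from training data.
    """
    log_s = log_gaussian(features, *speech) + np.log(prior_speech)
    log_n = log_gaussian(features, *noise) + np.log(1 - prior_speech)
    return log_s > log_n
```

Running `onsets_offsets` over several values of `sigma` gives a coarse-to-fine set of candidate boundaries; how candidates are matched across scales and across frequency channels to form time-frequency segments is left out of this sketch.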

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2007
Accession Number
AD1001218

Entities

People

  • DeLiang Wang
  • Guoning Hu

Organizations

  • Ohio State University

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Automated Speech Recognition
  • Cognitive Science
  • Computer Science
  • Computer Vision
  • Databases
  • Detection
  • Detectors
  • Electrical Engineering
  • Feature Extraction
  • Language
  • Pattern Recognition
  • Recognition
  • Signal Processing
  • Speech
  • Standards
  • Two Dimensional

Readers

  • Computer Vision
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML