Segregation of Unvoiced Speech from Nonspeech Interference
Abstract
Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure and has weaker energy and is hence more susceptible to interference. We propose a new approach to the problem of segregating unvoiced speech from nonspeech interference. We first address the question of how much speech is unvoiced. The segregation process occurs in two stages: segmentation and grouping. In segmentation, our model decomposes an input mixture into contiguous time-frequency segments by a multiscale analysis of event onsets and offsets. Grouping of unvoiced segments is based on Bayesian classification of acoustic-phonetic features. Systematic evaluation shows that the proposed system extracts a majority of unvoiced speech without including much interference, and it performs substantially better than spectral subtraction.
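The segmentation idea in the abstract, detecting event onsets and offsets at several scales, can be illustrated with a minimal sketch. This is not the report's implementation: it assumes a single frequency channel, Gaussian smoothing over time only, and a fixed derivative threshold. The names `gaussian_kernel`, `onset_offset_candidates`, `multiscale_candidates`, and the parameters `sigma` and `threshold` are hypothetical and chosen for illustration.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalized Gaussian smoothing kernel truncated at about 3 sigma."""
    radius = int(3 * sigma) + 1
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()


def onset_offset_candidates(intensity, sigma, threshold=0.05):
    """Return (onsets, offsets) for one channel at one smoothing scale.

    Index n marks the change between frame n and frame n + 1.
    Onsets are peaks of the smoothed derivative above +threshold;
    offsets are valleys below -threshold (assumed criterion).
    """
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    # Replicate edge values so smoothing adds no spurious boundary events.
    padded = np.pad(np.asarray(intensity, dtype=float), r, mode="edge")
    smooth = np.convolve(padded, k, mode="valid")
    d = np.diff(smooth)

    interior = np.arange(1, len(d) - 1)
    mid, left, right = d[1:-1], d[:-2], d[2:]
    is_peak = (mid > left) & (mid >= right) & (mid > threshold)
    is_valley = (mid < left) & (mid <= right) & (mid < -threshold)
    return interior[is_peak], interior[is_valley]


def multiscale_candidates(intensity, sigmas, threshold=0.05):
    """Run the detector at each scale; coarse scales suppress small fluctuations."""
    return {s: onset_offset_candidates(intensity, s, threshold) for s in sigmas}


if __name__ == "__main__":
    # Synthetic channel: silence, one event spanning frames 30..59, silence.
    x = np.zeros(100)
    x[30:60] = 1.0
    for sigma, (on, off) in multiscale_candidates(x, [2, 4]).items():
        print(sigma, on.tolist(), off.tolist())
```

A full system would also smooth across frequency, link onset and offset candidates across channels, and reconcile the results from coarse to fine scales to form time-frequency segments; those steps are omitted here.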
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2007
- Accession Number: AD1001218
Entities
People
- DeLiang Wang
- Guoning Hu
Organizations
- Ohio State University