A Supervised Learning Approach to Monaural Segregation of Reverberant Speech

Abstract

A major source of signal degradation in real environments is room reverberation. Monaural speech segregation in reverberant environments is a particularly challenging problem. Although inverse filtering has been proposed to partially restore the harmonicity of reverberant speech before segregation, this approach is sensitive to specific source/receiver and room configurations. This study proposes a supervised learning approach to monaural segregation of reverberant voiced speech, which learns to map from a set of pitch-based auditory features to a grouping cue encoding the posterior probability of a time-frequency (T-F) unit being target dominant given observed features. We devise a novel objective function for the learning process, which directly relates to the goal of maximizing signal-to-noise ratio. The models trained using this objective function yield significantly better T-F unit labeling. A segmentation and grouping framework is utilized to form reliable segments under reverberant conditions and organize them into streams. Systematic evaluations show that our approach produces very promising results.

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 2008
Accession Number: AD1001153

Entities

People

DeLiang Wang
Zhaozhang Jin

Organizations

Ohio State University
