Sequential Organization and Room Reverberation for Speech Segregation

Abstract

Research inspired by the perceptual account of auditory scene analysis has produced significant advances in speech segregation in recent years. Despite these advances, two major challenges remained: sequential organization and room reverberation. This project aimed to address both challenges, and substantial progress was made along the following directions. First, a tandem algorithm was developed that performs pitch tracking and voiced speech segregation iteratively. Second, a multipitch tracking algorithm was proposed for noisy and reverberant speech, which was then used in a novel supervised learning approach to the segregation of voiced speech in reverberant environments. Third, a method was proposed for unvoiced speech segregation that first removes voiced speech and periodic components and then groups unvoiced speech segments by analyzing their spectral characteristics. Fourth, two algorithms were proposed for sequential organization: an unsupervised clustering algorithm applicable to monaural recordings and a binaural algorithm that integrates monaural and binaural analyses. In addition, speech intelligibility tests were conducted, and their results firmly establish the effectiveness of binary masking for improving human speech recognition in noisy backgrounds.
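
To illustrate the binary masking idea mentioned at the end of the abstract, the following is a minimal sketch of an ideal binary mask over time-frequency units. It is not taken from the report: the function names `ideal_binary_mask` and `apply_mask`, the `lc_db` threshold parameter, and the toy energy values are all assumptions chosen for illustration, and the sketch presumes the premixed speech and noise energies are known.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit 1 if its local SNR (in dB) exceeds
    lc_db, else 0. Inputs are equally shaped lists of rows of energies.
    The name lc_db (a local SNR threshold) is a hypothetical parameter."""
    tiny = 1e-12  # guards against log(0) and division by zero
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10((s + tiny) / (n + tiny))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask

def apply_mask(mixture_energy, mask):
    """Keep mixture units where the mask is 1 and zero out the rest."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture_energy, mask)]

# Toy 2x2 example (made-up values): rows are frequency channels,
# columns are time frames.
speech = [[4.0, 0.1], [0.5, 9.0]]
noise = [[1.0, 1.0], [2.0, 3.0]]
mixture = [[5.0, 1.1], [2.5, 12.0]]

mask = ideal_binary_mask(speech, noise)
masked = apply_mask(mixture, mask)
```

In this toy case only the units where speech energy exceeds noise energy are retained. In practice the premixed signals are unavailable, so a segregation system has to estimate the mask from the mixture alone.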


Document Details

Document Type
Technical Report
Publication Date
Feb 28, 2012
Accession Number
ADA567198

Entities

People

  • DeLiang Wang

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automated Speech Recognition
  • Background Noise
  • Computer Science
  • Hidden Markov Models
  • Human Factors Engineering
  • Identification
  • Intelligibility
  • Language
  • Machine Learning
  • Neural Networks
  • Probability
  • Recognition
  • Speech
  • Supervised Machine Learning

Readers

  • Distributed Systems and Data Platform Development
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms