Techniques for Preprocessing Speech Signals for More Effective Audio Interfaces

Abstract

A user receiving instructions from several speech sources may be overwhelmed when these sources speak simultaneously in a noisy, acoustically unfavorable environment. In such a case, the user must separate the individual sources from the mixture in order to make the speech intelligible. If no single source dominates, or if the mixing persists for a sustained period, the human user may become mentally and physically overloaded, resulting in fatigue and a failure to separate the speech sources into intelligible signals. For a speech recognizer, recognition accuracy may degrade to unacceptable levels. In this final report on our research into methods for blind enhancement and separation of mixtures of speech signals, we present results from the further development and refinement of frequency-domain, second-order-statistics-based decorrelation algorithms, in particular the Multi-resolution Frequency-Domain algorithm. These results include modifications to the algorithm for better performance at reduced cost, its implementation, and a performance evaluation under a wide set of noise and acoustic environments. Finally, we report on the development of a publicly available database specifically designed for evaluating speech enhancement and separation algorithms.


Document Details

Document Type: Technical Report
Publication Date: Dec 01, 2001
Accession Number: ADA412195

Entities

People

Phillip L. Deleon

Organizations

New Mexico State University
