Deep Neural Networks for Speech Separation With Application to Robust Speech Recognition

Abstract

This project will investigate the speech separation problem and apply the results of speech separation to robust automatic speech recognition (ASR). Speech separation has recently been formulated as a time-frequency masking problem, which shifts the research focus to supervised learning. The proposed effort will employ deep neural networks (DNNs) as the learning machine for supervised separation. The proposed research has three objectives. The first objective is the separation of speech from background noise, which will be accomplished by training DNN classifiers on extracted acoustic-phonetic features. The second objective is the integration of spectrotemporal context for improved separation performance; conditional random fields will be used to encode contextual constraints. The third objective is to achieve robust ASR in the DNN framework through integrated acoustic modeling and separation. The performance of the proposed system will be systematically evaluated using the recently constructed CHiME-2 corpus.
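
The time-frequency masking formulation mentioned in the abstract can be illustrated with an ideal binary mask, one common training target in supervised separation. The following is only a minimal sketch under stated assumptions, not the report's method: the function name `ideal_binary_mask`, the 0 dB threshold `lc_db`, and the toy power values are all hypothetical, and the mixture power is approximated as the sum of speech and noise power.

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Label each time-frequency unit 1 if its local SNR exceeds lc_db, else 0.

    lc_db is an assumed local criterion (0 dB here), not a value from the report.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(np.float32)

# Toy power spectrograms (2 frequency channels x 3 time frames); values are made up.
speech = np.array([[4.0, 0.1, 1.0],
                   [0.5, 9.0, 0.2]])
noise = np.array([[1.0, 1.0, 2.0],
                  [0.25, 1.0, 0.8]])

mask = ideal_binary_mask(speech, noise)

# Applying the mask keeps speech-dominant units and zeroes noise-dominant ones.
# Mixture power is approximated as speech + noise (assumes uncorrelated sources).
masked_mixture = mask * (speech + noise)
```

In a supervised setting, a classifier would be trained to predict such a mask from features of the noisy mixture alone, since the clean speech and noise are not available at test time.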

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 2018
Accession Number
AD1054337

Entities

People

  • DeLiang Wang

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Artificial Intelligence Computing
  • Artificial Intelligence Software
  • Automated Speech Recognition
  • Decoding
  • Deep Learning
  • Frequency
  • Frequency Bands
  • Frequency Domain
  • Information Science
  • Neural Networks
  • Noise
  • Recognition
  • Recurrent Neural Networks
  • Training

Fields of Study

  • Computer science

Readers

  • Speech Processing/Speech Recognition.
  • Theoretical Analysis.

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks