On Ideal Binary Mask as the Computational Goal of Auditory Scene Analysis (Preprint)

Abstract

This chapter examines the goal of computational auditory scene analysis (CASA). After analyzing the advantages and disadvantages of different computational objectives, I propose the ideal time-frequency (T-F) binary mask as the computational goal of auditory scene analysis. The remainder of the chapter is organized as follows. The next section reviews different CASA evaluation criteria. Section 3 is devoted to a general discussion of the CASA goal, including an analysis of several alternative CASA objectives. Section 4 introduces the ideal binary mask, analyzes its properties, and argues for its use as the CASA goal. Section 5 describes two models that explicitly estimate the ideal binary mask. Finally, Section 6 concludes the chapter.

Document Details

Document Type
Technical Report
Publication Date
May 01, 2014
Accession Number
AD1151363

Entities

People

  • DeLiang Wang

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Amplitude Modulation
  • Automated Speech Recognition
  • Cognitive Science
  • Computer Science
  • Computers
  • Ear
  • Electrical Engineering
  • Frequency
  • Frequency Bands
  • Hearing Loss
  • Human Factors Engineering
  • Information Processing
  • Information Systems
  • Modulation
  • Neural Networks
  • Psychology
  • Recognition
  • Signal Processing
  • Spatial Filtering

Readers

  • Business Analytics
  • Speech Processing/Speech Recognition
  • Systems Analysis and Design