Auditory Segmentation Based on Onset and Offset Analysis

Abstract

A typical auditory scene in a natural environment contains multiple sources. Auditory scene analysis (ASA) is the process by which the auditory system segregates an auditory scene into streams corresponding to different sources. Segmentation is a major stage of ASA in which an auditory scene is decomposed into segments, each containing signal mainly from one source. We propose a system for auditory segmentation based on analyzing onsets and offsets of auditory events. The proposed system first detects onsets and offsets, and then generates segments by matching corresponding onset and offset fronts. This is achieved through a multiscale approach based on scale-space theory. A quantitative measure is suggested for segmentation evaluation. Systematic evaluation shows that most of the target speech, including unvoiced speech, is correctly segmented, and that target speech and interference are well separated into different segments. Our approach performs much better than a cross-channel correlation method.
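
The detect-then-match idea in the abstract can be illustrated with a minimal single-channel sketch. This is not the report's implementation: it assumes a one-dimensional intensity envelope, Gaussian smoothing as the scale-space operator, and a simple "next offset before the next onset" matching rule. The function names, the `sigma` scale parameter, and the `threshold` value are all hypothetical. The report itself works with onset and offset fronts across frequency channels, which this sketch does not attempt.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at about 3 standard deviations."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Smooth a sequence with a Gaussian; larger sigma means a coarser scale."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # replicate edge samples
            acc += w * signal[idx]
        out.append(acc)
    return out


def detect_onsets_offsets(intensity, sigma, threshold):
    """Onsets are peaks, offsets are valleys, of the smoothed first difference."""
    smoothed = smooth(intensity, sigma)
    deriv = [smoothed[i + 1] - smoothed[i] for i in range(len(smoothed) - 1)]
    onsets, offsets = [], []
    for i in range(1, len(deriv) - 1):
        d = deriv[i]
        if d > threshold and d >= deriv[i - 1] and d > deriv[i + 1]:
            onsets.append(i)
        if d < -threshold and d <= deriv[i - 1] and d < deriv[i + 1]:
            offsets.append(i)
    return onsets, offsets


def match_segments(onsets, offsets):
    """Pair each onset with the first offset that precedes the next onset."""
    segments = []
    for k, on in enumerate(onsets):
        next_on = onsets[k + 1] if k + 1 < len(onsets) else float("inf")
        candidates = [off for off in offsets if on < off <= next_on]
        if candidates:
            segments.append((on, candidates[0]))
    return segments


if __name__ == "__main__":
    # Synthetic envelope: silence, one acoustic event, silence.
    envelope = [0.0] * 20 + [1.0] * 30 + [0.0] * 20
    # Assumed scales; a coarser sigma suppresses short intensity fluctuations.
    for sigma in (1.0, 2.0, 4.0):
        ons, offs = detect_onsets_offsets(envelope, sigma=sigma, threshold=0.05)
        print(sigma, match_segments(ons, offs))
```

Running the detector at several values of `sigma` gives a rough feel for the multiscale idea: coarse scales keep only the dominant intensity changes, while fine scales localize them more precisely.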

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2005
Accession Number
AD1001126

Entities

People

  • DeLiang Wang
  • Guoning Hu

Organizations

  • Ohio State University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Acoustic Signals
  • Cognitive Science
  • Cognitive Systems Engineering
  • Computer Science
  • Computer Vision
  • Computers
  • Detection
  • Differential Equations
  • Event Detection
  • Filtration
  • Frequency
  • Frequency Bands
  • Hidden Markov Models
  • Image Segmentation
  • Markov Models
  • Models
  • Recognition

Readers

  • Radar Systems Engineering
  • Speech Processing/Speech Recognition

Technology Areas

  • Space