Exploration of Behavioral, Physiological, and Computational Approaches to Auditory Scene Analysis
Abstract
We present an overview of the study of auditory perception and scene analysis through the three main approaches researchers have used to study perception in general: behavioral, physiological, and computational. At the behavioral level, we discuss the principles and origins of auditory scene analysis, and establish the relationship between auditory scene analysis and auditory masking. Within auditory masking, we note the coexistence of informational and energetic masking, and use ideal time-frequency binary masks in a series of speech intelligibility experiments to isolate the energetic component of speech-on-speech masking. At the physiological level, we review several theories and experiments in neurophysiology in search of support for the two-dimensional time-frequency oscillatory correlation representation, and propose its adoption as a main representation in auditory perception. Finally, at the computational level, we extend an existing implementation of oscillatory correlation, LEGION [144], to simulate the major behavioral principles in alternating-tone sequences. Most notably, the temporal coherence boundary (TCB) and fission boundary (FB) first observed by Van Noorden [135] are generated automatically by the model. The results are compared to several existing implementations designed to simulate alternating-tone sequences [11, 104, 139]. Throughout this thesis, we use the three levels of analysis proposed by Marr in vision [89]. We emphasize the importance of balance among the levels of analysis, and their relationship with the three approaches in the study of auditory perception.
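As a rough illustration of the ideal time-frequency binary mask mentioned in the abstract, the sketch below keeps a time-frequency unit when the target-to-masker energy ratio exceeds a threshold and discards it otherwise. The abstract does not give the thesis's exact formulation, so everything here is an assumption made for illustration: the function names `ideal_binary_mask` and `apply_mask`, the threshold parameter `lc_db` (set to 0 dB), the list-of-lists energy layout, and the toy energy values.

```python
import math

def ideal_binary_mask(target_energy, masker_energy, lc_db=0.0):
    """Return a 0/1 mask with one entry per time-frequency unit.

    Assumption for this sketch: a unit is kept (1) when the local
    target-to-masker ratio in dB is strictly greater than lc_db,
    and discarded (0) otherwise.
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if m <= 0.0:
                keep = t > 0.0          # no masker energy: keep any target energy
            elif t <= 0.0:
                keep = False            # no target energy: nothing to keep
            else:
                keep = 10.0 * math.log10(t / m) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask

def apply_mask(mixture_energy, mask):
    """Zero out the mixture units that the mask discards."""
    return [[e * k for e, k in zip(e_row, k_row)]
            for e_row, k_row in zip(mixture_energy, mask)]

# Hypothetical toy energies, indexed as [frequency channel][time frame].
target = [[4.0, 0.5], [1.0, 2.0]]
masker = [[1.0, 2.0], [1.0, 0.0]]
mixture = [[t + m for t, m in zip(t_row, m_row)]
           for t_row, m_row in zip(target, masker)]

mask = ideal_binary_mask(target, masker)
masked_mixture = apply_mask(mixture, mask)
```

The mask is called ideal because it is computed from the premixed target and masker, which are available only in a controlled experiment. Removing the units dominated by the masker is what allows the energetic component of masking to be separated from the informational component.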
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2004
- Accession Number: ADA611902
Entities
People
- Peter S. Chang
Organizations
- Ohio State University