Test Token Driven Acoustic Balancing for Sparse Enrollment Data in Cohort GMM Speaker Recognition
Abstract
In this study, we address the problem of in-set/out-of-set speaker recognition with sparse enrollment data. Sparse enrollment data presents a unique challenge because it leaves gaps in acoustic space coverage. The proposed algorithm focuses on filling these acoustic holes and fortifying the phone expectation in the test stage. The scheme is made possible by using the GMM to classify speaker phone information at the feature level. The parallel training on the most frequently occurring (top) and less frequently occurring (bottom) rank-ordered mixture classification (speaker phone class) information is called "Sweet-16", and employing a test data mixture histogram with Sweet-16 is called "Sweet-16 On-The-Fly (OTF)". The Sweet-16 OTF method is evaluated on telephone conversation speech from the FISHER corpus. Sweet-16 OTF improves EER by an average of 2.17% absolute over the previous Sweet-16, and by an average of 4.03% absolute over the GMM-UBM baseline using 2 sec of test data. The proposed algorithm is a noteworthy step toward compensating for both sparse enrollment data and limited test data.
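The test data mixture histogram mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the procedure, so everything here is an assumption: each test frame is assigned to its best-scoring mixture of a diagonal-covariance GMM, the assignments are counted, and the mixtures are rank ordered by count. The function names `mixture_histogram` and `rank_mixtures`, the toy 3-mixture model, and the frame values are hypothetical and are not taken from the report.

```python
import numpy as np

def mixture_histogram(frames, weights, means, variances):
    """Count how often each mixture is the best-scoring one (assumed scheme)."""
    # Per-frame, per-mixture log density of a diagonal Gaussian, shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    # Add log mixture weights and pick the top-scoring mixture per frame.
    best = np.argmax(np.log(weights)[None, :] + log_gauss, axis=1)
    return np.bincount(best, minlength=len(weights))

def rank_mixtures(hist, n):
    """Return the n most-occurred (top) and n least-occurred (bottom) mixtures."""
    order = np.argsort(-hist, kind="stable")  # descending count, ties by index
    return order[:n], order[-n:]

# Hypothetical toy model: three 1-D mixtures centred at -5, 0 and 5.
weights = np.array([1 / 3, 1 / 3, 1 / 3])
means = np.array([[-5.0], [0.0], [5.0]])
variances = np.ones((3, 1))
frames = np.array([[-5.1], [-4.9], [-5.0], [0.1], [5.2], [-4.8]])

hist = mixture_histogram(frames, weights, means, variances)
top, bottom = rank_mixtures(hist, 1)
```

In this toy case four of the six frames fall nearest the first mixture, so it ranks as the most-occurred one. How the report actually uses the top and bottom sets during scoring is not stated in the abstract and is not modeled here.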
Document Details
- Document Type: Technical Report
- Publication Date: Apr 08, 2009
- Accession Number: ADA517221
Entities
People
- John H. Hansen
- Jun-won Suh