The 2005 AFRL/HEC One-Speaker Detection Systems
Abstract
This paper describes the one-speaker detection systems submitted by AFRL/HEC for several of the training and testing conditions in the 2005 NIST Speaker Recognition Evaluation. For each condition, the overall system score was a weighted combination of scores from several component systems. The component systems were based on: mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs); MFCCs and phoneme-specific GMMs (PS-GMMs); linear-prediction-based cepstral coefficients (LPCCs) from closed-phase analysis; formant center frequencies, formant bandwidths, and fundamental frequency (FMBWFO); and word language modeling (WLM). The scores were combined using single-layer perceptrons, with the grouping of the component systems depending on the lengths of the training and testing files. For some of the training and/or testing conditions involving ten-second speech files, system performance improved with the inclusion of the FMBWFO and LPCC systems, while the MFCC/PS-GMM system provided additional benefits in the one-conversation testing conditions involving larger amounts of training data.
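The score combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the perceptron's activation function, training procedure, or data, so everything below is an assumption made for illustration: a sigmoid output trained by stochastic gradient descent on cross-entropy, synthetic scores from three hypothetical component systems, and the function names `train_fusion`, `fuse`, and `make_toy_trials`. It is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_fusion(scores, labels, epochs=200, lr=0.1, seed=0):
    """Fit one weight per component system plus a bias.

    Assumption: sigmoid output trained with SGD on cross-entropy;
    the source does not specify how its perceptrons were trained.
    """
    rng = random.Random(seed)
    w = [0.0] * len(scores[0])
    b = 0.0
    order = list(range(len(scores)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = b + sum(wj * xj for wj, xj in zip(w, scores[i]))
            err = sigmoid(z) - labels[i]  # gradient of cross-entropy w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, scores[i])]
            b -= lr * err
    return w, b


def fuse(w, b, trial_scores):
    # Overall system score: weighted combination of the component scores.
    return b + sum(wj * xj for wj, xj in zip(w, trial_scores))


def make_toy_trials(n=200, seed=1):
    """Synthetic trials (hypothetical, not evaluation data).

    Label 1 is a target trial, 0 a non-target trial. Each of the three
    made-up component systems scores near +1 or -1 with different noise.
    """
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        scores.append([rng.gauss(2.0 * y - 1.0, s) for s in (0.5, 1.0, 2.0)])
        labels.append(y)
    return scores, labels


toy_scores, toy_labels = make_toy_trials()
weights, bias = train_fusion(toy_scores, toy_labels)
fused = [fuse(weights, bias, s) for s in toy_scores]
accuracy = sum((f > 0.0) == (y == 1) for f, y in zip(fused, toy_labels)) / len(fused)
```

In this sketch a separate set of weights would be trained for each grouping of component systems, which mirrors the abstract's statement that the grouping depended on the lengths of the training and testing files. The learned weights also give a rough indication of how much each component contributes to the fused score.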
Document Details
- Document Type: Technical Report
- Publication Date: Feb 01, 2006
- Accession Number: ADA445157
Entities
People
- Brian M. Ore
- Eric G. Hansen
- Raymond E. Slyh
Organizations
- General Dynamics