The 2005 AFRL/HEC One-Speaker Detection Systems

Abstract

This paper describes the one-speaker detection systems submitted by AFRL/HEC for several of the training and testing conditions in the 2005 NIST Speaker Recognition Evaluation. For each condition, the overall system score was the weighted combination of scores from several component systems. The component systems were based on: mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs); MFCCs and phoneme-specific GMMs (PS-GMMs); linear-prediction-based cepstral coefficients (LPCCs) from closed-phase analysis; formant center frequencies, formant bandwidths, and fundamental frequency (FMBWFO); and word language modeling (WLM). The score combination was done using single-layer perceptrons, with the grouping of the component systems depending on the lengths of the training and testing files. For some of the testing and/or training conditions involving ten-second speech files, system performance improved with the inclusion of the FMBWFO and LPCC systems, while the MFCC/PS-GMM system provided additional benefits in the one-conversation testing conditions involving larger amounts of training data.
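The score combination described in the abstract can be illustrated with a minimal sketch of a single-layer perceptron that fuses component scores into one detection score. The abstract states only that single-layer perceptrons were used; the sigmoid output, the gradient-descent training loop, the function names `fuse` and `train_fusion`, and the toy scores below are all assumptions made for illustration and are not taken from the report.

```python
import math

def fuse(scores, weights, bias):
    """Weighted combination of component scores, squashed by a sigmoid (assumed output unit)."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))

def train_fusion(trials, labels, epochs=500, lr=0.1):
    """Fit one weight per component system plus a bias by stochastic gradient descent.

    The update rule is the cross-entropy gradient for a sigmoid unit;
    this training recipe is an assumption, not the report's procedure.
    """
    n = len(trials[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(trials, labels):
            err = fuse(x, weights, bias) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Made-up scores: each row is one trial, each column one hypothetical
# component system (for example MFCC/GMM, LPCC, FMBWFO).
trials = [
    [2.1, 1.5, 0.3],     # target-speaker trial
    [1.8, 0.9, 0.7],     # target-speaker trial
    [-1.2, -0.4, 0.1],   # impostor trial
    [-2.0, -1.1, -0.5],  # impostor trial
]
labels = [1, 1, 0, 0]

weights, bias = train_fusion(trials, labels)
fused = [fuse(x, weights, bias) for x in trials]
```

In this sketch a separate set of weights would be trained for each grouping of component systems, which mirrors the abstract's statement that the grouping depended on the lengths of the training and testing files.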

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 2006
Accession Number: ADA445157

Entities

People

Brian M. Ore
Eric G. Hansen
Raymond E. Slyh

Organizations

General Dynamics
