A Noise-Robust System for NIST 2012 Speaker Recognition Evaluation

Abstract

The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges, including noisy data, varying test-sample length and number of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a system consisting of six subsystems that use different low- and high-level features, some specifically designed for noise robustness, fused at the score and iVector levels. This paper presents SRI's submission along with a careful analysis of the approaches that provided gains for this challenging evaluation, including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata.

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2013
Accession Number
ADA614010

Entities

People

  • Luciana Ferrer
  • Martin Graciarena
  • Mitchell McLaren
  • Nicolas Scheffer
  • Vikramjit Mitra
  • Yun Lei

Organizations

  • SRI International

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Amplitude Modulation
  • Coefficients
  • Data Sets
  • Detection
  • Discriminant Analysis
  • False Alarms
  • Filters
  • Frequency
  • Hidden Markov Models
  • Metadata
  • Models
  • Probability
  • Recognition
  • Standards
  • Test And Evaluation
  • Training

Readers

  • Instructional Design and Training Evaluation
  • Neural Network Machine Learning
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML