Integrated Robust Open-Set Speaker Identification System (IROSIS)

Abstract

This report summarizes our effort towards building a robust open-set speaker recognition system. It reviews the techniques we have used for acoustic feature extraction, speaker modeling, scoring, and score normalization, and presents experimental results. We have worked on all the modules of a speaker recognition system. At the front end, we have studied a variety of acoustic features and pre-/post-processing techniques, and have developed a PPMD feature that combines the benefits of multitaper MFCC, DSCC, pre-emphasis, and short-time feature Gaussianization. At the speaker modeling and scoring stages, we have investigated GMM speaker modeling, SVM speaker modeling, and joint factor analysis (JFA). We have demonstrated that, compared to GMM modeling, SVM modeling and scoring are not only better but also faster. We have also shown that T-norm score normalization improves speaker identification performance on the ROSSI database.
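As an illustration of the score normalization step mentioned in the abstract, the following is a minimal sketch of test normalization (T-norm) in its conventional form, followed by an open-set decision. It is not taken from the report: the function names, the cohort scores, and the rejection threshold are all hypothetical, and the report's actual cohort selection and threshold settings are not stated in this record.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """Normalize one raw score using the scores that the same test
    utterance obtained against a cohort of impostor models."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


def identify_open_set(raw_scores, cohort_scores, threshold):
    """Return (speaker, normalized_scores).

    raw_scores maps each enrolled speaker to the raw score of the test
    utterance against that speaker's model. The speaker is None when
    the best normalized score falls below the (hypothetical) threshold,
    i.e. the utterance is rejected as coming from an unknown speaker.
    """
    normalized = {spk: t_norm(s, cohort_scores) for spk, s in raw_scores.items()}
    best = max(normalized, key=normalized.get)
    if normalized[best] < threshold:
        return None, normalized
    return best, normalized


# Hypothetical scores for one test utterance.
raw = {"alice": 5.0, "bob": 2.5}
cohort = [1.0, 3.0]  # scores against impostor (cohort) models
speaker, normalized = identify_open_set(raw, cohort, threshold=2.0)
```

Because the cohort statistics are computed per test utterance, the normalized scores are comparable across utterances, which is what allows a single rejection threshold to be applied in the open-set decision.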

Document Details

Document Type
Technical Report
Publication Date
May 01, 2012
Accession Number
ADA562148

Entities

People

  • Qin Jin
  • Yun Wang

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Biometric Security
  • Databases
  • Factor Analysis
  • Feature Extraction
  • Identification
  • Identification Systems
  • Information Science
  • Kernel Functions
  • Power Spectra
  • Probability
  • Probability Distributions
  • Recognition
  • Statistics
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML