Integrated Robust Open-Set Speaker Identification System (IROSIS)
Abstract
This report summarizes our effort towards building a robust open-set speaker recognition system. It reviews the techniques we have used for acoustic feature extraction, speaker modeling, scoring, and score normalization, and presents experimental results. We have worked on all the modules of a speaker recognition system. At the front end, we have studied a variety of acoustic features and pre-/post-processing techniques, and have developed a PPMD feature that combines the benefits of multitaper MFCC, DSCC, pre-emphasis, and short-time feature Gaussianization. At the speaker modeling and scoring stages, we have investigated GMM speaker modeling, SVM speaker modeling, and joint factor analysis (JFA). We have demonstrated that, compared to GMM modeling, SVM modeling and scoring are not only better but also faster. We have also shown that T-norm of the scores improves speaker identification performance on the ROSSI database.
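As an illustration of the score normalization step mentioned in the abstract, the following is a minimal sketch of T-norm followed by an open-set decision. It is not the report's implementation: the function names `t_norm` and `identify_open_set`, the `eps` guard, the use of the population standard deviation, and the threshold-based rejection rule are all assumptions made for this example. The sketch assumes that each test utterance has already been scored against the enrolled speaker models and against a separate cohort of impostor models.

```python
import numpy as np


def t_norm(raw_scores, cohort_scores, eps=1e-8):
    """Normalize the scores of one test utterance with T-norm.

    raw_scores:    scores of the utterance against the enrolled speaker models.
    cohort_scores: scores of the same utterance against impostor (cohort) models.
    eps:           hypothetical guard against a zero standard deviation.
    """
    raw_scores = np.asarray(raw_scores, dtype=float)
    cohort_scores = np.asarray(cohort_scores, dtype=float)
    mu = cohort_scores.mean()
    sigma = cohort_scores.std()  # population standard deviation (assumption)
    return (raw_scores - mu) / (sigma + eps)


def identify_open_set(norm_scores, threshold):
    """Return the index of the best-scoring speaker, or -1 for 'unknown'.

    The rejection threshold is a hypothetical tuning parameter.
    """
    norm_scores = np.asarray(norm_scores, dtype=float)
    best = int(np.argmax(norm_scores))
    return best if norm_scores[best] >= threshold else -1


if __name__ == "__main__":
    normalized = t_norm([3.0, 1.0], cohort_scores=[0.0, 2.0])
    print(normalized)
    print(identify_open_set(normalized, threshold=1.5))
```

Because the cohort statistics are computed per test utterance, the normalized scores become comparable across utterances, which is what allows a single rejection threshold to be applied in the open-set setting.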
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 2012
- Accession Number: ADA562148
Entities
People
- Qin Jin
- Yun Wang
Organizations
- Carnegie Mellon University