Speaker Verification Using a Dynamic, 'Articulatory' Segmental Hidden Markov Model
Abstract
This is the final report for EOARD project #033060, "Speaker verification using a dynamic, 'articulatory' segmental hidden Markov model". A segmental HMM is an HMM whose states are associated with sequences of acoustic feature vectors rather than with individual vectors. This report describes the results of experiments in which such a model is applied to text-dependent and text-independent speaker detection on the YOHO and Switchboard corpora, respectively. Text-dependent speaker verification results on YOHO using a simple segmental HMM show a 44% reduction in false acceptances compared with a conventional HMM. A type of 'segmental GMM' is then described for text-independent speaker detection. In order to apply this model to the NIST 2003 single-speaker test set, various techniques are developed to reduce its computational load. A range of experiments is then reported that investigates the utility of different aspects of this model for text-independent speaker detection. From these experiments we have been unable to demonstrate a benefit, in terms of text-independent speaker-detection accuracy, from the use of dynamic segment models corresponding to linear trajectories with non-zero slope. Consequently, we have also been unable to demonstrate any benefit from the use of longer segments. Thus there is little evidence from these experiments that non-stationary sections of a speech signal contain important individual differences that can be exploited for speaker detection. If this is true, it goes some way towards explaining the success of GMM-based approaches. We conclude that further work is needed to determine definitively the contribution of non-stationary segments to speaker detection.
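To illustrate what a segment model "corresponding to a linear trajectory" might look like, the following is a minimal sketch and not the report's implementation. It assumes a one-dimensional feature, a fixed variance, and a trajectory parameterised by a midpoint and a slope. The function name `linear_segment_loglik` and all of its parameters are hypothetical. With `slope=0.0` every frame in the segment shares the same mean, which corresponds to the stationary case; a non-zero slope gives the dynamic case discussed in the abstract.

```python
import math


def linear_segment_loglik(frames, midpoint, slope, variance):
    """Log-likelihood of a scalar segment under an assumed linear-trajectory
    Gaussian model: frame t has mean midpoint + slope * (t - centre)."""
    centre = (len(frames) - 1) / 2.0  # time index of the segment midpoint
    total = 0.0
    for t, y in enumerate(frames):
        mean_t = midpoint + slope * (t - centre)
        # Gaussian log density of frame y around the trajectory value mean_t
        total += -0.5 * (math.log(2.0 * math.pi * variance)
                         + (y - mean_t) ** 2 / variance)
    return total


# Toy segment that rises steadily over five frames (illustrative data only).
rising = [0.0, 1.0, 2.0, 3.0, 4.0]
static_score = linear_segment_loglik(rising, midpoint=2.0, slope=0.0, variance=1.0)
dynamic_score = linear_segment_loglik(rising, midpoint=2.0, slope=1.0, variance=1.0)
```

On this toy segment the non-zero-slope model scores higher because it follows the rise exactly. Whether such a gain carries over to speaker-detection accuracy is the question the reported experiments address.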
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2004
- Accession Number: ADA445441
Entities
People
- Martin Russell
- Ying Liu
Organizations
- University of Birmingham