Speaker Verification Using a Dynamic, 'Articulatory' Segmental Hidden Markov Model

Abstract

This is the final report for EOARD project #033060, "Speaker verification using a dynamic, 'articulatory' segmental hidden Markov model". A segmental HMM is an HMM whose states are associated with sequences of acoustic feature vectors rather than individual vectors. This report describes the results of experiments in which such a model is applied to text-dependent and text-independent speaker detection on the YOHO and Switchboard corpora, respectively. Text-dependent speaker verification results on YOHO using a simple segmental HMM show a 44% reduction in false acceptances compared with a conventional HMM. A type of 'segmental GMM' is then described for text-independent speaker detection. In order to apply this model to the NIST 2003 single-speaker test set, various techniques are developed to reduce its computational load. A range of experiments is then reported which investigates the utility of different aspects of this model for text-independent speaker detection. From these experiments we have been unable to demonstrate a benefit, in terms of text-independent speaker-detection accuracy, from the use of dynamic segment models corresponding to linear trajectories with non-zero slope. Consequently, we have also been unable to demonstrate any benefit from the use of longer segments. Thus there is little evidence from these experiments that non-stationary sections of a speech signal contain important individual differences which can be exploited for speaker detection. If this is true, it goes some way towards explaining the success of GMM-based approaches. We conclude that further work is needed to determine definitively the contribution of non-stationary segments to speaker detection.
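
To illustrate the distinction the abstract draws between stationary segments and linear trajectories with non-zero slope, the following is a minimal sketch, not the report's implementation. It assumes a one-dimensional feature, a fixed Gaussian variance around the trajectory, and a trajectory parameterised by a midpoint and a slope; the function name and all parameter names are hypothetical.

```python
import math

def segment_log_likelihood(segment, midpoint, slope, variance):
    """Log-likelihood of a 1-D feature sequence under a linear trajectory.

    Assumed (illustrative) model: the trajectory value at frame t is
    midpoint + slope * (t - centre), where centre is the middle of the
    segment, and each frame is scored by a Gaussian with the given
    variance around that value.
    """
    n = len(segment)
    centre = (n - 1) / 2.0
    total = 0.0
    for t, y in enumerate(segment):
        mean_t = midpoint + slope * (t - centre)
        total += -0.5 * (math.log(2.0 * math.pi * variance)
                         + (y - mean_t) ** 2 / variance)
    return total

# A steadily rising (non-stationary) segment of made-up feature values.
rising = [1.0, 2.0, 3.0, 4.0, 5.0]

# Slope 0 reduces to a stationary model: every frame shares one mean.
static_score = segment_log_likelihood(rising, midpoint=3.0, slope=0.0, variance=1.0)

# Slope 1 lets the mean follow the movement of the features.
dynamic_score = segment_log_likelihood(rising, midpoint=3.0, slope=1.0, variance=1.0)

print(dynamic_score - static_score)
```

With a zero slope the segment is scored as if every frame came from the same Gaussian, which is the stationary case; a non-zero slope fits the rising data better in this toy example. Whether such a gain carries over to speaker-detection accuracy is the question the report's experiments address.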

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2004
Accession Number
ADA445441

Entities

People

  • Martin Russell
  • Ying Liu

Organizations

  • University of Birmingham

Tags

Communities of Interest

  • Cyber

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Automated Speech Recognition
  • Decoding
  • Detection
  • Dynamics
  • False Alarms
  • Frequency
  • Hidden Markov Models
  • Language
  • Markov Models
  • Probability
  • Recognition
  • Signal Processing
  • Test Sets
  • Trajectories
  • Universities
  • Verification

Readers

  • Computational Modeling and Simulation
  • Speech Processing/Speech Recognition