Speaker Segmentation and Clustering Using Gender Information

Abstract

This paper considers the segmentation and clustering of conversational speech for the two-wire training (3conv2w) and two-wire testing (1conv2w) conditions of the NIST 2005 Speaker Recognition Evaluation. A notable feature of the system described is that each file is labeled as containing either opposite- or same-gender speakers. The speech segments for opposite-gender files are clustered by gender, while those for same-gender files are processed by agglomerative clustering. By using gender information in the clustering of the opposite-gender files, the equal error rate in the 3conv2w training condition was reduced from 15.2% to 9.9%. For the 1conv2w testing condition, clustering opposite-gender files by gender did not improve performance over agglomerative clustering; however, it was over 100 times faster than agglomerative clustering on the opposite-gender files.
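
The routing described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the segment representation (a dict with a `feat` vector and a per-segment `gender` label), the `file_label` values, the Euclidean distance, and the average-linkage criterion are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist


def cluster_by_gender(segments):
    """Group segment indices by their (assumed) per-segment gender label."""
    clusters = {}
    for i, seg in enumerate(segments):
        clusters.setdefault(seg["gender"], []).append(i)
    return sorted(clusters.values())


def agglomerative(segments, n_clusters=2):
    """Average-linkage agglomerative clustering on the feature vectors."""
    # Start with every segment in its own cluster.
    clusters = [[i] for i in range(len(segments))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        pairs = [(i, j) for i in a for j in b]
        total = sum(dist(segments[i]["feat"], segments[j]["feat"]) for i, j in pairs)
        return total / len(pairs)

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return sorted(clusters)


def cluster_file(segments, file_label):
    """Route a file: 'opposite' uses gender labels, anything else is clustered."""
    if file_label == "opposite":
        return cluster_by_gender(segments)
    return agglomerative(segments)
```

The gender path is a single pass over the segments, while the agglomerative path re-evaluates pairwise cluster distances at every merge. That difference in work is consistent with the large speedup the abstract reports for opposite-gender files, although the sketch makes no claim about the actual factor.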

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2006
Accession Number
ADA444863

Entities

People

  • Brian M. Ore
  • Eric G. Hansen
  • Raymond E. Slyh

Organizations

  • General Dynamics

Tags

Communities of Interest

  • Human Systems

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Automated Speech Recognition
  • Change Detection
  • Computer Vision
  • Detection
  • Detectors
  • Digital Signal Processing
  • False Alarms
  • Hidden Markov Models
  • Identification
  • Language
  • Recognition
  • Signal Processing
  • Test And Evaluation
  • Training

Readers

  • Gender and Food Studies
  • Neural Network Machine Learning
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML