Speaker Segmentation and Clustering Using Gender Information

Abstract

This paper considers the segmentation and clustering of conversational speech for the two-wire training (3conv2w) and two-wire testing (1conv2w) conditions of the NIST 2005 Speaker Recognition Evaluation. A notable feature of the system described is that each file is labeled as containing either opposite- or same-gender speakers. The speech segments for opposite-gender files are clustered by gender, while those for same-gender files are processed by agglomerative clustering. By using gender information in the clustering of the opposite-gender files, the equal error rate in the 3conv2w training condition was reduced from 15.2% to 9.9%. For the 1conv2w testing condition, clustering opposite-gender files by gender did not improve performance over agglomerative clustering; however, it was over 100 times faster than agglomerative clustering on the opposite-gender files.
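The routing described in the abstract (gender-based clustering for opposite-gender files, agglomerative clustering for same-gender files) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the report: the per-segment gender labels ("m"/"f") are presumed to come from some upstream gender classifier, the `min_fraction` threshold used to label a file as opposite-gender is hypothetical, and the average-linkage Euclidean clustering stands in for whatever agglomerative method and distance measure the authors actually used.

```python
from typing import List, Sequence


def is_opposite_gender(genders: Sequence[str], min_fraction: float = 0.2) -> bool:
    """Hypothetical file-level rule: label the file opposite-gender when each
    gender accounts for at least min_fraction of the segments."""
    if not genders:
        return False
    frac_male = sum(g == "m" for g in genders) / len(genders)
    return min_fraction <= frac_male <= 1.0 - min_fraction


def _dist(a: Sequence[float], b: Sequence[float]) -> float:
    # Euclidean distance between two feature vectors (an assumed metric).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerative_two_clusters(features: Sequence[Sequence[float]]) -> List[int]:
    """Average-linkage agglomerative clustering, merged down to two clusters."""
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    _dist(features[a], features[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge the closest pair
        del clusters[j]
    labels = [0] * len(features)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


def cluster_file(genders: Sequence[str],
                 features: Sequence[Sequence[float]]) -> List[int]:
    """Return one speaker label (0 or 1) per segment."""
    if is_opposite_gender(genders):
        # Opposite-gender file: the gender label itself is the cluster label.
        return [0 if g == "m" else 1 for g in genders]
    # Same-gender file: fall back to agglomerative clustering.
    return agglomerative_two_clusters(features)
```

The sketch also suggests why the gender path can be much cheaper: it is a single pass over the segments, whereas the agglomerative path repeatedly compares every pair of clusters.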

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 2006
Accession Number: ADA444863

Entities

People

Brian M. Ore
Eric G. Hansen
Raymond E. Slyh

Organizations

General Dynamics
