Channel-Mismatch Compensation in Speaker Identification: Feature Selection and Adaptation with Artificial Neural Networks

Abstract

We develop and evaluate an artificial neural network (ANN) based compensation technique for speaker identification (SID) when classifier training and testing conditions are mismatched. One ANN per feature per speaker is trained to map that feature from a corrupted condition to an undistorted condition, so a classifier trained under one condition can be used to classify data collected under a different condition. Speech utterances from 168 speakers, collected in a studio and re-recorded after transmission over telephone networks, are used to develop and test the method. Peak formant resonant frequencies, their bandwidths, and pitch are used as features. These features from the studio speech are used to train Gaussian Mixture Model (GMM) classifiers. Portions of the studio and telephone speech are used to train the compensation ANNs. Under mismatched training and testing conditions, features from telephone speech are modified by the trained ANNs and applied to the GMMs trained with features from studio speech. Without compensation, SID accuracy is 6%. The compensation method developed in this work provides a mismatched-condition SID accuracy of 58.3%. Previous research on the same data, using the commonly used Mel Frequency Cepstral Coefficients as features and a typical compensation method of Cepstral Mean Subtraction with Band Limiting, gives SID accuracy of 27.4% with the same type of classifiers.
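
The feature-mapping idea in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the network size, learning rate, synthetic feature values, and the toy "channel" (attenuation plus offset) are all assumptions chosen for illustration, and the names `train_mapper`, `mse`, and `demo` are hypothetical. The sketch trains one small tanh network to map a single distorted scalar feature back toward its clean value, then compares the mismatch on held-out values before and after mapping.

```python
import math
import random


def train_mapper(distorted, clean, hidden=4, epochs=2000, lr=0.05, seed=0):
    """Fit a small tanh network (1 input, 1 output) mapping distorted -> clean."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # input-to-hidden weights
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]  # hidden-to-output weights
    b2 = 0.0
    for _ in range(epochs):
        for x, target in zip(distorted, clean):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - target
            for j in range(hidden):
                # Backpropagate the squared-error gradient through unit j.
                grad_pre = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_pre * x
                b1[j] -= lr * grad_pre
            b2 -= lr * err

    def mapper(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    return mapper


def mse(predicted, reference):
    """Mean squared error between two equal-length sequences."""
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)


def demo():
    rng = random.Random(1)
    # Hypothetical normalized feature values (e.g. a scaled formant frequency).
    clean = [rng.uniform(-1.0, 1.0) for _ in range(40)]
    # Assumed toy channel: attenuation plus offset, not the report's telephone channel.
    distorted = [0.5 * c + 0.3 for c in clean]
    mapper = train_mapper(distorted[:20], clean[:20])    # train on one portion
    held_out_d, held_out_c = distorted[20:], clean[20:]  # evaluate on the rest
    before = mse(held_out_d, held_out_c)
    after = mse([mapper(d) for d in held_out_d], held_out_c)
    return before, after
```

Calling `demo()` returns the held-out mean squared error without and with the learned mapping. In the scheme the abstract describes, one such mapper would be trained per feature per speaker, and the mapped telephone features would then be scored against GMMs trained on studio features; that classification stage is omitted here.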

Document Details

Document Type: Technical Report
Publication Date: Mar 01, 1998
Accession Number: ADA342401

Entities

People

  • Edmund A. Fitzgerald

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Human Systems
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Accuracy
  • Air Force
  • Algorithms
  • Data Science
  • Databases
  • Electrical Engineering
  • Engineering
  • Families (Human)
  • Feature Selection
  • Frequency
  • Identification
  • Information Science
  • Larynx
  • Neural Networks
  • Probability
  • Resonant Frequency
  • Shell Scripts

Fields of Study

  • Engineering

Readers

  • Speech Processing/Speech Recognition.

Technology Areas

  • AI & ML
  • AI & ML - Neural Networks