Frequency Domain Speech Compression Using the Karhunen-Loeve Transform

Abstract

The purpose of this study was to test the influence of phase on the quality of speech reproduced by a speaker-dependent compression system. The tests consisted of compressing frequency-domain speech vectors using the Karhunen-Loeve Transform, with and without phase, then making subjective judgements of the reproduced quality. Error metrics were then tested for their suitability as predictors of reproduced quality. The compression software transformed each speech vector into a vector of complex Fourier coefficients (only half of the coefficients are needed because the Fourier transform of a real signal is Hermitian). Phase was preserved by using the real frequency components to form one vector and the corresponding imaginary components to form a second vector of real numbers; the two vectors were then compressed separately. The expanded vectors were recombined and the speech was reconstructed by inverse Fourier transformation. Compression ratios of 8:1 could be achieved without any perceivable difference between the original and reconstructed speech by minimizing the mean squared error (MSE) of each vector of the pair. The 8:1 compression ratio corresponded to a covariance matrix condition number of 200. Recommendations are made for further study into voice characterization and an optimal transform for speech.
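The phase-preserving split described in the abstract can be illustrated with a minimal sketch. This is not the report's software: the function names (`dft`, `split_half_spectrum`, `reconstruct`) and the 16-sample test frame are assumptions made for illustration, and the Karhunen-Loeve compression step itself is omitted. The sketch only shows that the lower half of the spectrum of a real frame, stored as one vector of real parts and one vector of imaginary parts, is enough to rebuild the frame by inverse Fourier transformation.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def split_half_spectrum(x):
    """Return (real_parts, imag_parts) for bins 0..N/2 of a real frame.

    The remaining bins are redundant because the spectrum of a real
    signal is Hermitian: X[N-k] is the complex conjugate of X[k].
    """
    n = len(x)
    half = dft(x)[: n // 2 + 1]
    return [c.real for c in half], [c.imag for c in half]

def reconstruct(real_parts, imag_parts, n):
    """Rebuild the full spectrum by conjugate symmetry, then invert it."""
    half = [complex(r, i) for r, i in zip(real_parts, imag_parts)]
    # Upper bins are mirrored conjugates of the stored lower bins.
    full = half + [half[n - k].conjugate() for k in range(n // 2 + 1, n)]
    return [sum(full[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

# Hypothetical 16-sample frame standing in for one speech vector.
frame = [math.sin(2 * math.pi * 3 * t / 16) + 0.5 * math.cos(2 * math.pi * 5 * t / 16)
         for t in range(16)]

re_vec, im_vec = split_half_spectrum(frame)   # two real-valued vectors
rebuilt = reconstruct(re_vec, im_vec, len(frame))
max_err = max(abs(a - b) for a, b in zip(frame, rebuilt))
```

In a system of the kind the abstract describes, `re_vec` and `im_vec` would each be compressed and expanded separately before the `reconstruct` step; here they are passed through unchanged, so `max_err` reflects only floating-point rounding.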

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 1993
Accession Number
ADA262613

Entities

People

  • Donald W. Dryley

Organizations

  • Air Force Institute of Technology

Tags

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • C Programming Language
  • Compression Ratio
  • Computer Programming
  • Covariance
  • Electrical Engineering
  • Frequency
  • Frequency Domain
  • Intelligibility
  • Numbers
  • Real Numbers
  • Recognition
  • Signal Processing
  • Simulations
  • Speech Compression
  • Training
  • Wavelet Transforms

Fields of Study

  • Engineering

Readers

  • Approximation Theory
  • Computational Modeling and Simulation
  • Radio Communications and Signal Processing