Segmentation and Labeling of Speech: A Comparative Performance Evaluation

Abstract

This thesis studies speech recognition at the parametric level. It attempts to evaluate and understand the relative merits of a number of alternative design choices at that level. In particular, it investigates segmentation and labeling techniques and the use of parametric representations for the acoustic signal. Every speech recognition system employs some parametric representation and some initial signal-to-symbol transformation. The author shows the performance currently available for these initial processes, and asserts that such performance is comparable to human performance. After presenting the relative merits of some typical parametric representations, the author develops a methodology for such comparative evaluation. Simple, parameter-independent schemes for segmenting, labeling, and training are also developed. The role of pattern classification techniques is clarified as it relates to the initial signal-to-symbol transformation.

Four parametric representations were chosen for study: a set of amplitudes and zero-crossing measurements from 5 octave filters; a set of energy measurements from a 1/3 octave filter bank; a smoothed, short-time spectrum computed from the LPC filter; and the LPC coefficients themselves. Note that the first two involve the use of analog devices. Each method yields a set of measurements (a pattern) at uniform, short intervals. Distance functions, chosen from pattern classification theory, are then applied to the parameter patterns as measures of acoustic similarity.
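
The idea of measuring a pattern at uniform, short intervals and comparing patterns with a distance function can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names, the two-element pattern (peak amplitude and zero-crossing count over the whole signal, with no filter bank), the Euclidean distance, and the template labels are all assumptions chosen for illustration.

```python
import math

def frame_features(samples, frame_len):
    """Split samples into non-overlapping frames of frame_len samples and
    return one (peak amplitude, zero-crossing count) pattern per frame.
    Hypothetical stand-in for a parametric representation."""
    patterns = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        amplitude = max(abs(x) for x in frame)
        # Count sign changes between consecutive samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        patterns.append((amplitude, crossings))
    return patterns

def euclidean(p, q):
    """Euclidean distance between two equal-length patterns (an assumed choice)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def label_frame(pattern, templates):
    """Return the label of the template closest to the pattern."""
    return min(templates, key=lambda label: euclidean(pattern, templates[label]))

# Toy signal: one rapidly alternating frame followed by one near-constant frame.
signal = [1, -1, 1, -1, 0.1, 0.1, 0.1, 0.1]
# Hypothetical reference patterns; in practice these would come from training.
templates = {"noisy": (1.0, 3.0), "quiet": (0.0, 0.0)}

patterns = frame_features(signal, frame_len=4)
labels = [label_frame(p, templates) for p in patterns]
print(labels)  # ['noisy', 'quiet']
```

Because the labeling step only needs a distance between patterns, the same nearest-template scheme applies whichever measurements make up the pattern, which is the sense in which such a scheme is parameter-independent.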

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1975
Accession Number
ADA025094

Entities

People

  • Henry G. Goldberg

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Human Systems

DTIC Thesaurus Topics

  • Acoustic Signals
  • Artificial Intelligence
  • Automated Speech Recognition
  • Computational Science
  • Computer Science
  • Databases
  • Grammars
  • Information Processing
  • Information Science
  • Language
  • Linguistics
  • Machine Learning
  • Network Science
  • Pattern Recognition
  • Probabilistic Models
  • Statistical Analysis
  • Surveys

Fields of Study

  • Engineering

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Computer Vision.
  • Theoretical Analysis.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference