Multi-Level Processing in Human Speech Recognition

Abstract

This project has investigated the thesis that perception of the speech signal occurs at different levels of resolution. It has addressed this thesis in the domain of the temporal components of speech, where multiple levels of resolution are evident in the prosodic (macrostructure) and segmental (microstructure) levels of analysis. The body of this report is divided into three parts. The first part addresses interactions between different levels of temporal information in the speech signal. The second part addresses complexities that occur in the use of temporal cues in recognizing phonetic segments. One study in this part explores the dependencies between vowel and fricative identities that are cued by the same durational acoustic cue. A second series of studies, conducted with Jennifer L. Eberhardt, explores the effects of attention on the perceptual salience of temporal cues to the identity of phonetic segments. The third part of this report discusses work, conducted with David W. Gow, that addresses the macro-level of temporal information. This work explores the role of stress in recognition and memory.

Keywords: Speech perception, Prosody, Context effects, Phonetic segments, Fricatives.

Document Details

Document Type
Technical Report
Publication Date
Sep 06, 1989
Accession Number
ADA216475

Entities

People

  • Peter C. Gordon

Organizations

  • Harvard University

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Accuracy
  • Ambiguity
  • Amplitude
  • Automated Speech Recognition
  • Cognition
  • Computers
  • Consonants
  • Contrast
  • Information Processing
  • Language
  • Linguistics
  • Microstructure
  • Phonemes
  • Phonology
  • Psychology
  • Reaction Time
  • Recognition

Readers

  • Speech Processing/Speech Recognition.
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance.

Technology Areas

  • AI & ML