Learning the Hidden Structure of Speech.

Abstract

In the work described here, we apply the back-propagation neural network learning procedure to the analysis and recognition of speech. Because this learning procedure requires only examples of input-output pairs, it is not necessary to provide it with any initial description of speech features. Rather, the network develops its own set of representational features during the course of learning. A series of computer simulation studies was carried out to assess the ability of these networks to label sounds accurately, to learn to recognize sounds without labels, and to learn feature representations of continuous speech. These studies demonstrated that the networks can learn to label pre-segmented naive sound tokens with accuracies of up to 95%. Networks trained on segmented sounds using a strategy that requires no external labels were able to recognize and delineate sounds in continuous speech. These networks developed rich internal representations that included units corresponding to such traditional distinctions as vowels and consonants, as well as units sensitive to novel and non-standard features. Networks trained on a large corpus of unsegmented, continuous speech without labels also developed interesting feature representations that may be useful in both segmentation and label learning.

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 1987
Accession Number: ADA178058

Entities

People

David Zipser
Jeffrey L. Elman

Organizations

University of California, San Diego
