Tree-Based Models of Speech and Language

Abstract

We describe the application of classification and regression trees to several problems in speech and language. We begin with a brief overview of the technique and then describe its application to three problems: (1) End-of-sentence detection: The not-so-simple problem of deciding when a period in text corresponds to the end of a declarative sentence (and not an abbreviation) is addressed with trees using the Brown corpus as input. The result is 99.8% correct classification. (2) Segment duration modelling in speech synthesis: 400 utterances from a single speaker and 4000 utterances from 400 speakers were used to build decision trees that predict segment durations based on features such as lexical position, stress, and phonetic context. Over 70% of the durational variance for the single speaker and over 60% for the multiple speakers was accounted for by these methods. (3) Phoneme-to-phone prediction: A lattice of possible close phonetic transcriptions given a phonemic transcription (from the orthography and a dictionary) is produced using 4000 utterances from the TIMIT database as input. The most likely phone corresponding to a phoneme is predicted correctly 83% of the time, and the correct phone is among the five most likely candidates 99% of the time.
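
The end-of-sentence task in (1) can be illustrated with a very small classification tree. The sketch below is not the method or feature set used in the report: the two binary features (`prev_is_abbrev`, `next_is_capitalized`), the six toy rows, and all function names are hypothetical, chosen only to show how a tree picks the split that minimizes Gini impurity and then classifies a period.

```python
from collections import Counter

# Hypothetical toy data: each row describes one period in text.
# Features (assumed for illustration, not taken from the report):
#   prev_is_abbrev      - token before the period is a known abbreviation
#   next_is_capitalized - token after the period starts with a capital
# Label: True if the period ends a sentence.
ROWS = [
    ({"prev_is_abbrev": False, "next_is_capitalized": True}, True),
    ({"prev_is_abbrev": False, "next_is_capitalized": True}, True),
    ({"prev_is_abbrev": True, "next_is_capitalized": False}, False),
    ({"prev_is_abbrev": True, "next_is_capitalized": True}, False),
    ({"prev_is_abbrev": False, "next_is_capitalized": False}, True),
    ({"prev_is_abbrev": True, "next_is_capitalized": False}, False),
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return (weighted impurity, feature) for the best binary split, or None."""
    best = None
    for feature in rows[0][0]:
        yes = [y for x, y in rows if x[feature]]
        no = [y for x, y in rows if not x[feature]]
        if not yes or not no:
            continue  # this feature does not separate the rows at all
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if best is None or score < best[0]:
            best = (score, feature)
    return best


def build_tree(rows):
    """Grow a tree; leaves are labels, internal nodes are (feature, yes, no)."""
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    split = best_split(rows) if gini(labels) > 0 else None
    if split is None:
        return majority
    feature = split[1]
    yes_rows = [(x, y) for x, y in rows if x[feature]]
    no_rows = [(x, y) for x, y in rows if not x[feature]]
    return (feature, build_tree(yes_rows), build_tree(no_rows))


def predict(tree, x):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        feature, yes, no = tree
        tree = yes if x[feature] else no
    return tree


TREE = build_tree(ROWS)
```

On this toy data the tree splits once, on `prev_is_abbrev`, because that feature alone separates the two classes. A real system would need many more features and training examples, drawn from a corpus such as the one named in the abstract.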
