Parsing Conversational Speech Using Enhanced Segmentation

Abstract

The lack of sentence boundaries and presence of disfluencies pose difficulties for parsing conversational speech. This work investigates the effects of automatically detecting these phenomena on a probabilistic parser's performance. We demonstrate that a state-of-the-art segmenter, relative to a pause-based segmenter, gives more than 45% of the possible error reduction in parser performance, and that presentation of interruption points to the parser improves performance over using sentence boundaries alone. Parsing speech can be useful for a number of tasks, including information extraction and question answering from audio transcripts. However, parsing conversational speech presents a different set of challenges than parsing text: sentence boundaries are not well-defined, punctuation is absent, and disfluencies (edits and restarts) impact the structure of language.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2004
Accession Number: ADA457886

Entities

People

Ciprian Chelba
Jeremy G. Kahn
Mari Ostendorf

Organizations

University of Washington
