Language Modeling With Sentence-Level Mixtures

Abstract

This paper introduces a simple mixture language model that attempts to capture long-distance constraints in a sentence or paragraph. The model is an m-component mixture of trigram models. The models were constructed using a 5K vocabulary and trained on a 76-million-word Wall Street Journal text corpus. In experiments with the BU recognition system, the mixture trigram models give a 7% improvement in recognition accuracy compared with a trigram model.
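
As a rough illustration of what an m-component mixture of trigram models applied at the sentence level might look like, the sketch below scores a whole sentence under each component trigram model and then combines the component scores with mixture weights. The abstract does not give the formula, so this form is an assumption. The function name `sentence_log_prob`, the dictionary representation of each trigram model, the padding tokens, and the `floor` value for unseen trigrams are all hypothetical choices made for the example, not details from the report.

```python
import math

def sentence_log_prob(sentence, components, weights, floor=1e-6):
    """Log-probability of a sentence under a sentence-level trigram mixture.

    Assumed form: P(sentence) = sum_k weight_k * prod_i P_k(w_i | w_{i-2}, w_{i-1}).
    Each component is a dict mapping (w_{i-2}, w_{i-1}, w_i) to a probability.
    Unseen trigrams get a fixed floor probability (a placeholder, not real smoothing).
    """
    # Pad so that the first word and the end-of-sentence token have full histories.
    padded = ["<s>", "<s>"] + list(sentence) + ["</s>"]
    per_component = []
    for trigram_probs, weight in zip(components, weights):
        # Start from the log mixture weight, then add each trigram log-probability.
        log_p = math.log(weight)
        for i in range(2, len(padded)):
            trigram = (padded[i - 2], padded[i - 1], padded[i])
            log_p += math.log(trigram_probs.get(trigram, floor))
        per_component.append(log_p)
    # Log-sum-exp over components avoids underflow on long sentences.
    peak = max(per_component)
    return peak + math.log(sum(math.exp(lp - peak) for lp in per_component))
```

The point of the sketch is that the mixture weights are applied once per sentence rather than once per word, so each component can act as a model of one kind of sentence across its whole length.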

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 1994
Accession Number: ADA459584

Entities

People

J. R. Rohlicek
Mari Ostendorf
Rukmini Iyer

Organizations

Boston University
