Lattice Based Language Models

Abstract

This paper introduces lattice-based language models, a new language modeling paradigm. These models construct multi-dimensional hierarchies of partitions and select the most promising partitions to generate the estimated distributions. We discuss a specific two-dimensional lattice and propose two primary features to measure the usefulness of each node: the training-set history count and the smoothed entropy of its prediction. Smoothing techniques are reviewed, and a generalization of the conventional backoff strategy to multiple dimensions is proposed. Preliminary experiments on the SWITCHBOARD corpus yield a 6.5% perplexity reduction over a word trigram model.

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 1997
Accession Number: ADA333294

Entities

People

Pierre Dupont
Roni Rosenfeld

Organizations

Carnegie Mellon University
