A Hybrid Approach to Adaptive Statistical Language Modeling

Abstract

We describe our latest attempt at adaptive language modeling. At the heart of our approach is a Maximum Entropy (ME) model, which incorporates many knowledge sources in a consistent manner. The other components are a selective unigram cache, a conditional bigram cache, and a conventional static trigram. We describe the knowledge sources used to build such a model with ARPA's official WSJ corpus, and report on perplexity and word error rate results obtained with it. Then, three different adaptation paradigms are discussed, and an additional experiment, based on AP wire data, is used to compare them.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1994
Accession Number
ADA458711

Entities

People

  • Roni Rosenfeld

Organizations

  • Carnegie Mellon University

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Computations
  • Computer Science
  • Cross Domain
  • Information Operations
  • Language
  • Military Research
  • Natural Languages
  • Probability
  • Probability Distributions
  • Training
  • Vocabulary

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Parallel and Distributed Computing