A Gaussian Prior for Smoothing Maximum Entropy Models

Abstract

In certain contexts, maximum entropy (ME) modeling can be viewed as maximum likelihood training for exponential models, and, like other maximum likelihood methods, it is prone to overfitting the training data. Several smoothing methods for maximum entropy models have been proposed to address this problem, but previous results do not make it clear how these methods compare with smoothing methods for other types of related models. In this work, we survey previous work in maximum entropy smoothing and compare the performance of several of these algorithms with conventional techniques for smoothing n-gram language models. Because of the mature body of research in n-gram model smoothing and the close connection between maximum entropy and conventional n-gram models, this domain is well suited to gauging the performance of maximum entropy smoothing methods. Over a large number of data sets, we find that an ME smoothing method proposed to us by Lafferty performs as well as or better than all other algorithms under consideration. This general and efficient method involves using a Gaussian prior on the parameters of the model and selecting maximum a posteriori instead of maximum likelihood parameter values. We contrast this method with previous n-gram smoothing methods to explain its superior performance.
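
To make the idea of a Gaussian prior with maximum a posteriori (MAP) estimation concrete, the following is a minimal sketch, not the report's implementation. It assumes a toy unconditional exponential model over a handful of outcomes, a zero-mean Gaussian prior with a single shared variance, and plain gradient ascent. The names `map_objective_and_grad`, `fit`, `feats`, `counts`, and `sigma2` are hypothetical and chosen only for illustration.

```python
import math


def map_objective_and_grad(lambdas, feats, counts, sigma2):
    """Penalized log-likelihood and its gradient for a toy exponential model.

    Assumed setup (illustrative only): p(y) is proportional to
    exp(sum_i lambdas[i] * feats[y][i]); counts[y] is the training count of
    outcome y; sigma2 is the variance of a zero-mean Gaussian prior on each
    parameter.
    """
    scores = [sum(l * f for l, f in zip(lambdas, fy)) for fy in feats]
    m = max(scores)  # subtract the max score for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    probs = [math.exp(s - log_z) for s in scores]
    n = sum(counts)

    loglik = sum(c * (s - log_z) for c, s in zip(counts, scores))
    penalty = sum(l * l for l in lambdas) / (2.0 * sigma2)

    grad = []
    for i, l in enumerate(lambdas):
        observed = sum(c * fy[i] for c, fy in zip(counts, feats))
        expected = n * sum(p * fy[i] for p, fy in zip(probs, feats))
        # Maximum likelihood would stop at observed == expected; the Gaussian
        # prior adds the extra term -lambda_i / sigma2.
        grad.append(observed - expected - l / sigma2)
    return loglik - penalty, grad


def fit(feats, counts, sigma2, steps=2000, lr=0.05):
    """Fixed-step gradient ascent on the MAP objective (a simple stand-in optimizer)."""
    lambdas = [0.0] * len(feats[0])
    for _ in range(steps):
        _, grad = map_objective_and_grad(lambdas, feats, counts, sigma2)
        lambdas = [l + lr * g for l, g in zip(lambdas, grad)]
    return lambdas


def probabilities(lambdas, feats):
    """Model distribution over the outcomes for the given parameters."""
    scores = [sum(l * f for l, f in zip(lambdas, fy)) for fy in feats]
    m = max(scores)
    z = sum(math.exp(s - m) for s in scores)
    return [math.exp(s - m) / z for s in scores]


# Toy data: three outcomes with one indicator feature each; the third outcome
# never occurs in training.
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
counts = [3, 1, 0]

lam = fit(feats, counts, sigma2=1.0)
print(probabilities(lam, feats))
```

In this toy setting, unsmoothed maximum likelihood would drive the probability of the unseen third outcome toward zero, whereas the prior keeps every parameter finite and so leaves that outcome some probability mass. A smaller `sigma2` pulls the parameters more strongly toward zero, which corresponds to heavier smoothing.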

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 1999
Accession Number
ADA360974

Entities

People

  • Roni Rosenfeld
  • Stanley F. Chen

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Automated Speech Recognition
  • Computational Science
  • Computations
  • Computer Science
  • Contrast
  • Data Sets
  • Equations
  • Frequency
  • Gaussian Distributions
  • Language
  • Maximum Likelihood Estimation
  • Probability
  • Probability Distributions
  • Random Variables
  • Test Sets

Fields of Study

  • Mathematics

Readers

  • Approximation Theory
  • Computational Linguistics
  • Computational Modeling and Simulation