Localized Smoothing for Multinomial Language Models

Abstract

We explore a formal approach to the zero frequency problem, which arises when probabilistic models are applied to language. This report introduces the problem in the context of probabilistic language models, describes several popular solutions, and introduces localized smoothing, a potentially better alternative. We formulate localized smoothing as a two-step maximization process, outline the estimation details for both steps, and present experiments suggesting that the technique can improve performance.
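
To make the zero frequency problem concrete, the sketch below contrasts a maximum-likelihood unigram (multinomial) model with additive (add-one) smoothing. Additive smoothing is used here only as a familiar baseline; the abstract does not say which popular solutions the report covers, and this is not the report's localized smoothing method. The toy vocabulary, the training text, and the function names (`mle`, `additive`, `sequence_prob`) are all illustrative assumptions.

```python
from collections import Counter

def mle(counts, vocab):
    """Maximum-likelihood estimate: unseen words get probability 0."""
    total = sum(counts.values())
    return {w: counts[w] / total for w in vocab}

def additive(counts, vocab, alpha=1.0):
    """Additive smoothing (assumed baseline): add alpha to every count."""
    total = sum(counts.values())
    denom = total + alpha * len(vocab)
    return {w: (counts[w] + alpha) / denom for w in vocab}

def sequence_prob(model, words):
    """Probability of a word sequence under a unigram model."""
    p = 1.0
    for w in words:
        p *= model[w]
    return p

# Hypothetical toy data: "dog" is in the vocabulary but never observed.
vocab = ["the", "cat", "sat", "dog"]
counts = Counter("the cat sat the cat".split())

mle_model = mle(counts, vocab)
smoothed_model = additive(counts, vocab)

query = ["the", "dog"]
print(sequence_prob(mle_model, query))       # zero: one unseen word wipes out the product
print(sequence_prob(smoothed_model, query))  # small but nonzero
```

A single unseen word drives the unsmoothed probability of the whole sequence to zero, which is the failure that smoothing techniques are designed to avoid.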

Document Details

Document Type
Technical Report
Publication Date
May 01, 2000
Accession Number
ADA478094

Entities

People

  • Victor Lavrenko

Organizations

  • University of Massachusetts Amherst

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Bayes Theorem
  • Data Compression
  • Data Sets
  • Detection
  • False Alarms
  • Frequency
  • Information Operations
  • Information Retrieval
  • Information Science
  • Language
  • Models
  • Probabilistic Models
  • Probability
  • Random Variables
  • Vocabulary
  • Warning Systems

Fields of Study

  • Computer science
  • Mathematics

Readers

  • Artificial Intelligence
  • Computational Modeling and Simulation