A Maximum Likelihood Ratio Information Retrieval Model

Abstract

In this paper we present a novel probabilistic information retrieval model that scores documents based on the relative change in the document likelihoods, expressed as the ratio of the conditional probability of the document given the query to the prior probability of the document before the query is specified. The document likelihoods are computed using statistical language modeling techniques, and the model parameters are estimated automatically and dynamically for each query to optimize well-specified (maximum likelihood) objective functions. We derive the basic retrieval model, describe the details of the model, and present some extensions to the model, including a method to perform automatic feedback. Development experiments are performed using the TREC-6 ad hoc text retrieval task, and performance is measured using the TREC-7 ad hoc task. Official evaluation results on the 1999 TREC-8 ad hoc task are also reported. The performance results demonstrate that the model is competitive with current state-of-the-art retrieval approaches.
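
The scoring idea in the abstract can be illustrated with a small sketch. By Bayes' rule, the ratio p(D|Q) / p(D) equals p(Q|D) / p(Q), so a document can be ranked by how much more likely the query is under a document language model than under a collection-wide model. The code below is only an illustration under stated assumptions, not the report's implementation: it assumes a unigram language model, a linear mixture of document and collection estimates for smoothing, and a fixed mixture weight `alpha`, whereas the abstract says the model parameters are estimated per query by maximum likelihood. The function name `likelihood_ratio_scores` and all parameter names are hypothetical.

```python
import math
from collections import Counter


def likelihood_ratio_scores(query, docs, alpha=0.5):
    """Return log[p(Q|D) / p(Q)] for each document.

    By Bayes' rule this equals log[p(D|Q) / p(D)].
    Assumptions (not taken from the report): unigram models, linear
    mixture smoothing, and a fixed mixture weight `alpha`.
    """
    doc_counts = [Counter(doc) for doc in docs]

    # Collection-wide term counts stand in for the query prior p(Q).
    coll_counts = Counter()
    for counts in doc_counts:
        coll_counts.update(counts)
    coll_total = sum(coll_counts.values())

    scores = []
    for counts in doc_counts:
        doc_total = sum(counts.values())
        score = 0.0
        for term in query:
            p_coll = coll_counts[term] / coll_total
            if p_coll == 0.0:
                # A term absent from the whole collection gives no evidence.
                continue
            p_doc = counts[term] / doc_total if doc_total else 0.0
            # Smoothed document model: mixture of document and collection.
            p_mix = alpha * p_doc + (1.0 - alpha) * p_coll
            score += math.log(p_mix / p_coll)
        scores.append(score)
    return scores


docs = [
    ["language", "model", "retrieval"],
    ["speech", "recognition", "model"],
]
scores = likelihood_ratio_scores(["retrieval", "model"], docs)
```

In this toy collection the first document contains both query terms and receives a positive log ratio, while the second lacks "retrieval" and receives a negative one. A score above zero means the query made the document more likely than it was a priori.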

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2006
Accession Number
ADA456243

Entities

People

  • Kenney Ng

Organizations

  • Massachusetts Institute of Technology

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Automatic
  • Computations
  • Computer Science
  • Data Sets
  • Equations
  • Feedback
  • Hidden Markov Models
  • Information Retrieval
  • Language
  • Markov Models
  • Models
  • Probabilistic Models
  • Probability
  • Standards
  • Test And Evaluation
  • Test Sets

Fields of Study

  • Computer science

Readers

  • Computational Modeling and Simulation
  • Information Retrieval
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Information Retrieval