Probabilistic Structured Query Methods

Abstract

Structured methods for query term replacement rely on separate estimates of term frequency and document frequency to compute the weight for each query term. This paper reviews prior work on structured query techniques and introduces three new variants that leverage estimates of replacement probabilities. Statistically significant improvements in retrieval effectiveness are demonstrated for cross-language retrieval and for retrieval based on optical character recognition when replacement probabilities are used to estimate both term frequency and document frequency.
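
As a rough illustration of the idea summarized in the abstract, the sketch below weights each replacement term's statistics by its replacement probability to estimate a query term's term frequency and document frequency. It is one plausible reading under stated assumptions, not the report's exact formulas or any of its three variants; the function names, the dictionary layout, and the sample probabilities are all hypothetical.

```python
from collections import Counter


def weighted_tf(replacements, doc_terms):
    """Estimate a query term's frequency in one document.

    replacements: dict mapping replacement term -> assumed replacement probability
    doc_terms: list of terms in the document
    """
    counts = Counter(doc_terms)
    # Each replacement's count contributes in proportion to its probability.
    return sum(p * counts[r] for r, p in replacements.items())


def weighted_df(replacements, doc_freqs):
    """Estimate a query term's document frequency.

    doc_freqs: dict mapping term -> number of documents containing it
    """
    # Replacements missing from the collection contribute zero.
    return sum(p * doc_freqs.get(r, 0) for r, p in replacements.items())


# Hypothetical example: an English query term with two French replacements.
replacements = {"maison": 0.75, "foyer": 0.25}
doc = ["maison", "maison", "foyer", "jardin"]
doc_freqs = {"maison": 40, "foyer": 8}

tf_estimate = weighted_tf(replacements, doc)        # 0.75*2 + 0.25*1
df_estimate = weighted_df(replacements, doc_freqs)  # 0.75*40 + 0.25*8
```

The two estimates could then be passed to any standard term-weighting function in place of the raw term frequency and document frequency of a single term.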

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2003
Accession Number
ADA459304

Entities

People

  • Douglas W. Oard
  • Kareem Darwish

Organizations

  • University of Maryland

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Character Recognition
  • Dictionaries
  • Equations
  • Frequency
  • Index Terms
  • Information Retrieval
  • Language
  • Machine Translation
  • Optical Character Recognition
  • Personality
  • Precision
  • Probability
  • Standards
  • Translations
  • Universities
  • Vector Spaces

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Information Retrieval
  • Statistical Inference