Retrieval Performance Prediction and Document Quality

Abstract

The ability to predict retrieval performance has potential applications in many important areas of information retrieval (IR). In this thesis, we study the problem of predicting retrieval quality at two granularities: the retrieved document set as a whole and individual retrieved documents. At the level of ranked lists of documents, we propose several novel prediction models, each capturing a different aspect of the retrieval process that has a major impact on retrieval effectiveness. These techniques make performance prediction both effective and efficient in various retrieval settings, including a Web search environment. As an application, we also provide a framework to address the problem of query expansion prediction. At the level of documents, we predict the quality of documents in the context of Web ad-hoc retrieval. We explore document features that are predictive of quality. Furthermore, we propose a document quality language model that improves retrieval effectiveness by incorporating quality information.

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2007
Accession Number
ADA477543

Entities

People

  • Yun Zhou

Organizations

  • University of Massachusetts Amherst

Tags

Communities of Interest

  • Biomedical
  • Human Systems

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Computations
  • Computer Science
  • Data Science
  • Data Sets
  • Databases
  • Information Retrieval
  • Information Science
  • Language
  • Machine Learning
  • Network Science
  • Probability
  • Prostate Cancer
  • Random Variables
  • Statistical Algorithms
  • Statistics
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Organizational Process Management (OPM)

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks