FUB at TREC 2008 Relevance Feedback Track: Extending Rocchio with Distributional Term Analysis

Abstract

Our participation in the Relevance Feedback track at TREC 2008 had five main goals: (1) to test the effectiveness of combining Rocchio with distributional term analysis on a relevance feedback task (so far, this approach has mostly been used, with good results, in a pseudo-relevance setting); (2) to test whether and when negative relevance feedback is useful (e.g., is negative relevance feedback most effective when the distribution of terms in the negative documents differs from the distribution in the positive documents?); (3) to study how the performance of relevance feedback varies as the set of feedback documents grows; (4) to check whether and how the performance of relevance feedback is influenced by the size of the expanded query; and (5) to compare relevance feedback with pseudo-relevance feedback (e.g., is relevance feedback both more effective and more robust than pseudo-relevance feedback?). The main conclusions that can be drawn from our experiments are as follows: (1) using distribution-based scores within Rocchio's formula was an effective relevance feedback method; (2) the performance of relevance feedback generally increased as the number of feedback documents and the number of expansion terms grew, even when the two parameters were considered in combination; and (3) other conditions being equal, using truly relevant documents resulted in a clear performance improvement over pseudo-relevance feedback, in terms of both mean retrieval effectiveness and robustness.
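To make the idea of "distribution-based scores within Rocchio's formula" concrete, the following is a minimal Python sketch, not the implementation used in the report. The abstract does not give the scoring function or the parameter values, so everything specific here is an assumption: terms are scored by their Kullback-Leibler contribution p_fb(t) * log(p_fb(t) / p_coll(t)), scores are max-normalised, and the function names (kl_term_scores, expand_query) and the defaults alpha, beta, gamma, and n_terms are hypothetical.

```python
import math
from collections import Counter


def kl_term_scores(feedback_docs, collection_docs):
    """Score each term in the feedback documents by its KL contribution.

    Assumed scoring function (not taken from the report):
        score(t) = p_fb(t) * log(p_fb(t) / p_coll(t))
    Documents are lists of tokens.
    """
    fb = Counter(t for doc in feedback_docs for t in doc)
    coll = Counter(t for doc in collection_docs for t in doc)
    fb_total = sum(fb.values())
    coll_total = sum(coll.values())
    scores = {}
    for term, count in fb.items():
        p_fb = count / fb_total
        p_coll = coll.get(term, 0) / coll_total
        if p_coll == 0:
            # Term missing from the collection statistics: skip it.
            continue
        scores[term] = p_fb * math.log(p_fb / p_coll)
    return scores


def _normalise(scores):
    """Clip negative scores to 0 and divide by the maximum score."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {t: 0.0 for t in scores}
    return {t: max(s, 0.0) / top for t, s in scores.items()}


def expand_query(query, pos_scores, neg_scores=None,
                 alpha=1.0, beta=0.75, gamma=0.15, n_terms=10):
    """Rocchio-style reweighting with distributional term scores.

    new_w(t) = alpha * q(t) + beta * pos(t) - gamma * neg(t)
    Only the n_terms best positive terms are added to the query.
    Query weights are assumed to lie in [0, 1] already.
    """
    pos = _normalise(pos_scores)
    neg = _normalise(neg_scores or {})
    top_terms = sorted(pos, key=pos.get, reverse=True)[:n_terms]
    expanded = {}
    for term in set(query) | set(top_terms):
        weight = (alpha * query.get(term, 0.0)
                  + beta * pos.get(term, 0.0)
                  - gamma * neg.get(term, 0.0))
        if weight > 0:  # drop terms pushed to zero or below
            expanded[term] = weight
    return expanded


if __name__ == "__main__":
    # Toy collection; the first two documents play the role of relevant ones.
    collection = [
        ["rocchio", "feedback", "query"],
        ["feedback", "query", "term"],
        ["weather", "rain", "query"],
        ["weather", "sun", "term"],
    ]
    pos = kl_term_scores(collection[:2], collection)
    neg = kl_term_scores(collection[2:], collection)
    print(expand_query({"query": 1.0}, pos, neg, n_terms=2))
```

In this sketch, n_terms corresponds to the size of the expanded query and the length of feedback_docs to the number of feedback documents, the two parameters whose effect the abstract reports. Passing pseudo-relevant (top-ranked) documents instead of judged ones as feedback_docs gives the pseudo-relevance variant under the same assumptions.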


Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2008
Accession Number: ADA512741

Entities

People

  • Andrea Bernardini
  • Claudio Carpineto

Organizations

  • Fondazione Ugo Bordoni

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Equations
  • Feedback
  • Frequency
  • Information Operations
  • Information Retrieval
  • Instructions
  • Standards
  • Test And Evaluation

Readers

  • Information Retrieval
  • Mathematics or Statistics