Relevance Feedback Track Overview: TREC 2008

Abstract

Relevance feedback has been one of the successes of information retrieval research for the past 30 years. It has proven worthwhile in a wide variety of settings, both when explicit user feedback is available and when the feedback is implicit. However, while the applications of relevance feedback and the types of user input to it have changed over the years, the actual algorithms have not changed much. Most algorithms are either purely statistical and word-based (for example, Rocchio or Language Modeling) or domain-dependent. We should be able to do better now, but there have been surprisingly few advances in the area.

In part, that's because relevance feedback is hard to study, evaluate, and compare. It is difficult to separate out the effects of the initial retrieval run, the decision procedure that determines which documents will be looked at, the user-dependent relevance judgment procedure (including the interface), and the actual relevance feedback reformulation algorithm. Setting up a framework to look at these separate effects for future research is an important goal for this track.

Why now? We have a lot more natural language tools than we had 10 or 20 years ago. We're hopeful we can get people to actually use those tools to suggest what makes a document relevant or non-relevant to a particular topic. The question-answering community has been very successful at categorizing questions and taking different approaches for different categories. That success has not transferred over to the IR task, partly because there simply isn't enough syntactic information in a typical IR topic to offer clues as to what is wanted. But given relevant and non-relevant judgments, it should be much easier to form categories for topics (e.g., this topic requires these two aspects to both be present, while this other topic does not) and to take different approaches depending on the topic.

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2008
Accession Number
ADA512675

Entities

People

  • Chris Buckley
  • Stephen Robertson

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Agreements
  • Algorithms
  • Feedback
  • Hong Kong
  • Information Operations
  • Information Retrieval
  • Judgment
  • Language
  • Natural Languages
  • Precision
  • Residuals
  • Standards
  • Terabytes
  • Test And Evaluation
  • Test Sets
  • Urban Areas

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Systems Analysis and Design
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks