Overview of the TREC 2009 Web Track

Abstract

The TREC Web Track explores and evaluates Web retrieval technologies. Currently, the Web Track conducts experiments using the new billion-page ClueWeb09 collection. The TREC 2009 Web Track includes both a traditional ad hoc retrieval task and a new diversity task. The goal of this diversity task is to return a ranked list of pages that together provide complete coverage for a query, while avoiding excessive redundancy in the result list. Topics for the track were created from the logs of a commercial search engine, with the aid of tools developed at Microsoft Research. Given a target query, these tools extracted and analyzed groups of related queries, using co-clicks and other information, to identify clusters of queries that highlight different aspects and interpretations of the target query. These clusters were employed by NIST for topic development. Each resulting topic is structured as a representative set of subtopics, each related to a different user need. Documents were judged with respect to the subtopics, as well as with respect to the topic as a whole. For each subtopic, NIST assessors made a binary judgment as to whether or not the document satisfies the information need associated with the subtopic. These topics were used for both the ad hoc task and the diversity task. A total of 26 groups submitted runs to the track, with many groups participating in both tasks. This report provides an overview of the track, including topic development, evaluation measures, and results.
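
As a rough illustration of how binary per-subtopic judgments can reward coverage over redundancy, the sketch below computes subtopic recall at a rank cutoff: the fraction of a topic's subtopics satisfied by at least one of the top k documents. This measure is chosen here only for simplicity and is not necessarily one of the measures the report uses. The function name, document identifiers, and judgments are all hypothetical.

```python
from typing import Dict, List, Set


def subtopic_recall(
    ranking: List[str],
    judgments: Dict[str, Set[int]],
    num_subtopics: int,
    k: int,
) -> float:
    """Fraction of subtopics covered by the top-k documents of a ranking.

    judgments maps a document id to the set of subtopic numbers for which
    it was judged relevant (binary judgments); unjudged documents are
    treated as covering no subtopic.
    """
    if num_subtopics <= 0:
        raise ValueError("num_subtopics must be positive")
    covered: Set[int] = set()
    for doc_id in ranking[:k]:
        covered |= judgments.get(doc_id, set())
    return len(covered) / num_subtopics


# Hypothetical judgments for one topic with three subtopics.
judgments = {"d1": {1}, "d2": {1}, "d3": {2, 3}, "d4": set()}

redundant_run = ["d1", "d2", "d4"]  # top two documents cover the same subtopic
diverse_run = ["d1", "d3", "d4"]    # top two documents cover all three subtopics

redundant_score = subtopic_recall(redundant_run, judgments, num_subtopics=3, k=2)
diverse_score = subtopic_recall(diverse_run, judgments, num_subtopics=3, k=2)
```

Under these assumed judgments, the redundant run covers one of three subtopics at rank 2 while the diverse run covers all three, which is the behavior the diversity task is meant to encourage.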

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2009
Accession Number
ADA517817

Entities

People

  • Charles L. Clarke
  • Ian Soboroff
  • Nick Craswell

Organizations

  • University of Waterloo

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Clustering
  • Computations
  • English Language
  • Information Operations
  • Judgment
  • Language
  • Natural Language Processing
  • New Jersey
  • Newspapers
  • Periodicals
  • Precision
  • Probability
  • Standards
  • Test And Evaluation

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Information Retrieval
  • Systems Analysis and Design