University of Waterloo: Logistic Regression and Reciprocal Rank Fusion at the Microblog Track

Abstract

For the second iteration of the Microblog Track, participants were given two tasks. The first was the same ad hoc search task as in the 2011 iteration; its goal was to expand on last year's methods with 60 new topics and to explore different evaluation measures. The second was to filter the corpus with respect to the 2011 topics, in an attempt to simulate a streaming environment and to examine how simulated user feedback can affect retrieval results. For the ad hoc search task, we expanded on last year's approach by continuing to use the Wumpus Search Engine and adding a logistic regression classifier (denoted GCLR in this article), first used in the TREC 2007 Spam Track. In addition, pseudo-relevance feedback was conducted this year using a swapdocs approach, which is expanded upon later in the article. A semi-automatic logistic regression run was also conducted using seed documents provided by a user. For the filtering task, only different methods of training GCLR were examined; no manual feedback was conducted.

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2012
Accession Number: ADA581525

Entities

People

  • Adam Roegiest
  • Gordon V. Cormack

Organizations

  • University of Waterloo

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Classification
  • Computer Science
  • English Language
  • Feedback
  • Filtration
  • Information Operations
  • Iterations
  • Language
  • Precision
  • Probability
  • Schools
  • Standards
  • Test And Evaluation
  • Training
  • Universities

Readers

  • Aerospace Test and Evaluation
  • Information Retrieval
  • Regression Analysis