Northeastern University in TREC 2009. Million Query Track

Abstract

Ranking is a central problem in information retrieval. Modern search engines, especially those designed for the World Wide Web, commonly analyze and combine hundreds of features extracted from the submitted query and underlying documents in order to assess the relative relevance of a document to a given query and thus rank the underlying collection. The sheer size of this problem has led to the development of learning-to-rank (LTR) algorithms that can automate the construction of such ranking functions: given a training set of (feature vector, relevance) pairs, a machine learning procedure learns how to combine the query and document features so as to effectively assess the relevance of any document to any query and thus rank a collection in response to a user input. Much thought and research have been devoted to the development of sophisticated learning-to-rank algorithms. However, relatively little research has been conducted on the construction of appropriate learning-to-rank data sets or on the effect of these data sets on the ability of a learning-to-rank algorithm to "learn" effectively.
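
As an illustration of the learning-to-rank setup described in the abstract, the following is a minimal pointwise sketch in Python. It is not the method used in the report: the linear scoring function, the squared-error objective, the feature values, and all names (`TRAIN`, `fit_pointwise`, `rank`, the document ids) are assumptions made for this example only.

```python
# Hypothetical toy data: each entry is a (feature vector, relevance) pair.
# The two features stand in for query-document signals (for example a
# term-match score and a link-based score); the values are invented.
TRAIN = [
    ([0.9, 0.8], 2.0),
    ([0.7, 0.2], 1.0),
    ([0.1, 0.3], 0.0),
    ([0.2, 0.1], 0.0),
]


def score(weights, features):
    """Linear ranking function: weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features))


def fit_pointwise(train, lr=0.1, epochs=2000):
    """Fit the weights by stochastic gradient descent on squared error."""
    weights = [0.0] * len(train[0][0])
    for _ in range(epochs):
        for features, relevance in train:
            error = score(weights, features) - relevance
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights


def rank(weights, docs):
    """Return document ids sorted by descending predicted relevance."""
    return sorted(docs, key=lambda d: score(weights, docs[d]), reverse=True)


weights = fit_pointwise(TRAIN)
# Hypothetical unseen documents for a single query, keyed by document id.
docs = {"d1": [0.2, 0.2], "d2": [0.8, 0.9], "d3": [0.6, 0.3]}
ranking = rank(weights, docs)
print(ranking)
```

The learned weights play the role of the automatically constructed ranking function; which (feature vector, relevance) pairs go into `TRAIN` is exactly the data set construction question the abstract raises.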

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2009
Accession Number: ADA517738

Entities

People

Evangelos Kanoulas
Javed Aslam
Keshi Dai
Stefan Savev
Virgil Pavlu

Organizations

University of Sheffield
