Northeastern University in TREC 2009 Million Query Track
Abstract
Ranking is a central problem in information retrieval. Modern search engines, especially those designed for the World Wide Web, commonly analyze and combine hundreds of features extracted from the submitted query and the underlying documents in order to assess the relative relevance of a document to a given query and thus rank the underlying collection. The sheer size of this problem has led to the development of learning-to-rank (LTR) algorithms that automate the construction of such ranking functions: given a training set of (feature vector, relevance) pairs, a machine learning procedure learns how to combine the query and document features so as to assess the relevance of any document to any query, and thus rank a collection in response to user input. Much thought and research have gone into the development of sophisticated learning-to-rank algorithms. However, relatively little research has been conducted on the construction of appropriate learning-to-rank data sets, or on the effect of these data sets on the ability of a learning-to-rank algorithm to "learn" effectively.
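As a rough illustration of the setup the abstract describes, the following minimal sketch trains a pointwise ranker on (feature vector, relevance) pairs and then orders unseen documents by predicted score. It is not the method used in the report: the linear model, the gradient-descent fit, the function names (`train_linear_ranker`, `score`, `rank`), and all feature values and relevance labels are assumptions chosen only to make the idea concrete.

```python
# Pointwise learning-to-rank sketch (illustrative only, not the report's method):
# fit a linear scoring function by least-squares gradient descent, then rank
# candidate documents by predicted score. All data below is made up.

def score(weights, bias, features):
    """Linear relevance score for one query-document feature vector."""
    return bias + sum(w * x for w, x in zip(weights, features))

def train_linear_ranker(examples, epochs=2000, lr=0.05):
    """examples: list of (feature_vector, relevance) pairs."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        # Accumulate the squared-error gradient over the whole training set.
        for features, relevance in examples:
            error = score(weights, bias, features) - relevance
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def rank(weights, bias, candidates):
    """candidates: dict of doc_id -> feature_vector; returns ids, best first."""
    return sorted(candidates,
                  key=lambda d: score(weights, bias, candidates[d]),
                  reverse=True)

# Hypothetical training set: two features per query-document pair,
# graded relevance labels in {0, 1, 2}.
training = [
    ([0.9, 0.8], 2),
    ([0.7, 0.4], 1),
    ([0.2, 0.3], 0),
    ([0.1, 0.1], 0),
    ([0.8, 0.9], 2),
    ([0.5, 0.5], 1),
]
weights, bias = train_linear_ranker(training)

# Hypothetical unseen documents for a new query.
candidates = {"docA": [0.15, 0.2], "docB": [0.85, 0.85], "docC": [0.5, 0.45]}
ranking = rank(weights, bias, candidates)
print(ranking)
```

On this toy data both learned weights are positive, so the candidate with the largest feature values (`docB`) is ranked first and `docA` last. The choice of which (feature vector, relevance) pairs go into `training` is exactly the data set construction question the abstract raises.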
Document Details
- Document Type
- Technical Report
- Publication Date
- Nov 01, 2009
- Accession Number
- ADA517738
Entities
People
- Evangelos Kanoulas
- Javed Aslam
- Keshi Dai
- Stefan Savev
- Virgil Pavlu
Organizations
- University of Sheffield