Better Contextual Suggestions in ClueWeb12 Using Domain Knowledge Inferred from The Open Web
Abstract
This paper provides an overview of our participation in the TREC 2014 Contextual Suggestion Track. The track allowed participants to submit personalized rankings using documents either from the Open Web or from an archived, static Web collection (ClueWeb12). One of the main steps in recommending attractions for a particular user in a given context is the selection of candidate documents. This task is more challenging when relying on the ClueWeb12 collection rather than on public tourist APIs to find suggestions. In this paper, we present our approach for selecting candidate suggestions from the entire ClueWeb12 collection using the tourist domain knowledge available on the Open Web. We show that the recommendations generated for the provided user profiles and contexts improve significantly when this inferred domain knowledge is used.
The TREC Contextual Suggestion Track investigates search techniques for complex information needs that are highly dependent on context and user interests. The input to the task consists of a set of profiles (users), a set of example suggestions (attractions), and a set of contexts (locations). Each attraction has a title, a description, and a URL. Each profile corresponds to a single user and indicates that user's preference with respect to each attraction. Two ratings are used: one for the attraction's description and another for its website. Finally, each context corresponds to a particular geographical location (a city and its corresponding state in the United States). With this information, the task is to provide a personalized ranked list of up to 50 suggestions for every (user, context) pair. Each suggestion should be appropriate to both the user's profile and the context. The description and title of the suggestion may be tailored to reflect the preferences of that user.
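The task inputs and output described in the abstract can be sketched as simple data structures. This is only an illustrative sketch: the class names, field names, and the `rank_suggestions` helper are hypothetical, and the `score` function is a placeholder supplied by the caller, not the ranking method used in the report. The rating scale is not stated in the abstract, so ratings are left as opaque values.

```python
from dataclasses import dataclass, field

MAX_SUGGESTIONS = 50  # the track asks for up to 50 suggestions per (user, context) pair


@dataclass(frozen=True)
class Attraction:
    """An example suggestion: title, description, and URL."""
    title: str
    description: str
    url: str


@dataclass(frozen=True)
class Context:
    """A geographical location: a city and its state in the United States."""
    city: str
    state: str


@dataclass
class Profile:
    """A single user's preferences over the example attractions."""
    user_id: int
    # Hypothetical layout: attraction URL -> (description rating, website rating)
    ratings: dict = field(default_factory=dict)


def rank_suggestions(profile, context, candidates, score):
    """Return at most MAX_SUGGESTIONS candidates, best first.

    `score(profile, context, attraction)` is an assumed, caller-supplied
    function returning a number; higher means a better match.
    """
    ranked = sorted(
        candidates,
        key=lambda attraction: score(profile, context, attraction),
        reverse=True,
    )
    return ranked[:MAX_SUGGESTIONS]
```

Any scoring function with this signature can be plugged in; the sketch only fixes the shape of the input and the 50-item cap on the output.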
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2014
- Accession Number: ADA618737
Entities
People
- Alejandro Bellogin
- Arjen P. De Vries
- Thaer Samar
Organizations
- Autonomous University of Madrid