POSTECH at TREC 2009 Blog Track: Top Stories Identification
Abstract
This paper describes our participation in the Top Stories Identification task of the TREC 2009 Blog Track. Our system is built on the language modeling framework and consists of two components: a query likelihood component and a news headline prior component. For the query likelihood, we propose several approaches to estimating the query language model and the news headline language model. We also propose two approaches to choosing the 10 relevant supporting posts: Feed-Based Selection and Cluster-Based Selection. Furthermore, we propose two criteria for estimating the news headline prior for a given day. Experimental results show that using the prior significantly improves performance on the task.
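The two-component design in the abstract can be illustrated with a minimal sketch that ranks headlines by log P(Q | H) + log P(H). This is not the authors' implementation: the abstract does not say how either language model or the prior is estimated, so the unigram model, the Dirichlet smoothing, the parameter `mu`, and the names `query_log_likelihood` and `rank_headlines` are all assumptions made for illustration.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, headline_terms,
                         collection_counts, collection_len, mu=1000.0):
    """Return log P(Q | H) under a unigram headline language model.

    Dirichlet smoothing is an assumed choice here; the abstract does
    not specify how the headline model is smoothed or estimated.
    """
    tf = Counter(headline_terms)
    h_len = len(headline_terms)
    score = 0.0
    for term in query_terms:
        p_coll = collection_counts.get(term, 0) / collection_len
        p = (tf[term] + mu * p_coll) / (h_len + mu)
        if p == 0.0:
            # Term unseen in both the headline and the collection.
            return float("-inf")
        score += math.log(p)
    return score


def rank_headlines(query_terms, headlines, priors,
                   collection_counts, collection_len, mu=1000.0):
    """Rank headline ids by log P(Q | H) + log P(H), best first.

    `priors` maps each headline id to a positive prior probability
    (hypothetical input; how the prior is estimated is not shown).
    """
    scored = []
    for hid, terms in headlines.items():
        ll = query_log_likelihood(query_terms, terms,
                                  collection_counts, collection_len, mu)
        scored.append((ll + math.log(priors[hid]), hid))
    scored.sort(reverse=True)
    return [hid for _, hid in scored]
```

With a uniform prior the ranking depends on the query likelihood alone. A non-uniform prior can reorder headlines whose likelihoods are close, which is the kind of effect the abstract attributes to the prior component.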
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2009
- Accession Number: ADA517740
Entities
People
- Hun-young Jung
- Jong-hyeok Lee
- Woosang Song
- Yeha Lee
Organizations
- Pohang University of Science and Technology