POSTECH at TREC 2009 Blog Track: Top Stories Identification

Abstract

This paper describes our participation in the TREC 2009 Blog Track. Our system consists of a query likelihood component and a news headline prior component, both based on the language modeling framework. For the query likelihood, we propose several approaches to estimating the query language model and the news headline language model. We also suggest two approaches to choosing the 10 supporting relevant posts: Feed-Based Selection and Cluster-Based Selection. Furthermore, we propose two criteria for estimating the news headline prior for a given day. Experimental results show that using the prior significantly improves performance on the task.
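
As a rough illustration of how a query likelihood component and a headline prior can be combined in a language model framework, the sketch below ranks headlines by a smoothed log-likelihood plus a log prior. It is not the authors' implementation: the function names, the Dirichlet smoothing, the parameter `mu`, the background model, and the idea of building the query model from a day's blog posts are all assumptions made for illustration.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram model over a token list (hypothetical helper)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def smoothed_prob(word, counts, length, background, mu):
    """Dirichlet-smoothed P(word | headline); the smoothing choice is an assumption."""
    return (counts.get(word, 0) + mu * background.get(word, 1e-9)) / (length + mu)


def score_headline(query_model, headline_tokens, background, prior, mu=10.0):
    """Log query likelihood of the headline model plus the log headline prior."""
    counts = Counter(headline_tokens)
    length = len(headline_tokens)
    log_likelihood = sum(
        p_query * math.log(smoothed_prob(word, counts, length, background, mu))
        for word, p_query in query_model.items()
    )
    return log_likelihood + math.log(prior)


# Toy data: the "query" for a day is modeled from that day's blog post tokens.
day_tokens = ["election", "vote", "election", "result"]
query_model = unigram_model(day_tokens)
vocabulary = ["election", "vote", "result", "football", "match"]
background = {word: 1.0 / len(vocabulary) for word in vocabulary}

headline_a = ["election", "result"]
headline_b = ["football", "match"]

# With equal priors, the headline that matches the day's posts ranks first.
equal_a = score_headline(query_model, headline_a, background, prior=0.5)
equal_b = score_headline(query_model, headline_b, background, prior=0.5)

# A strongly skewed prior can reverse that ordering.
skewed_a = score_headline(query_model, headline_a, background, prior=0.1)
skewed_b = score_headline(query_model, headline_b, background, prior=0.9)
```

In this toy setting the prior acts as an additive term in log space, so it can change the ranking even when the likelihood term favors another headline, which is the kind of effect the abstract attributes to the news headline prior.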

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2009
Accession Number: ADA517740

Entities

People

Hun-young Jung
Jong-hyeok Lee
Woosang Song
Yeha Lee

Organizations

Pohang University of Science and Technology
