ICTNET at Blog Track TREC 2009

Abstract

This paper describes our participation in the Blog Track of TREC 2009. We submitted runs for both tasks: the top stories identification task and the faceted blog distillation task. The "FirteX" platform was used to index and retrieve posts. For the top stories identification task, we measure the importance of each headline by accumulating its BM25 relevance scores against the posts published on the query day. To identify diverse blog posts, we propose two approaches: a graph-based iterative approach and an approach based on sub-topic detection. For the faceted blog distillation task, we adopt a very straightforward approach and measure topical relevance using only the top 10,000 posts returned by ad-hoc retrieval. To identify facet inclination, we either train a centroid classifier or compute facet inclination weights for terms; either method yields a facet inclination score, and we rerank feeds by combining the relevance score with the facet inclination score.
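The headline-importance idea in the abstract (accumulating BM25 relevance scores between a headline and the posts of the query day) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use FirteX: the function names `bm25_score`, `headline_importance`, and `rank_headlines`, the whitespace tokenization, the BM25 parameters `k1=1.2` and `b=0.75`, and the choice to compute collection statistics over the query-day posts only are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avg_len, k1=1.2, b=0.75):
    """BM25 score of one document for a bag of query terms.

    k1 and b are common default values, assumed here; the abstract
    does not state which parameters were used.
    """
    tf = Counter(doc_terms)
    score = 0.0
    for term in set(query_terms):
        if term not in tf:
            continue
        n = doc_freq.get(term, 0)
        # Smoothed IDF variant that is always non-negative.
        idf = math.log(1.0 + (num_docs - n + 0.5) / (n + 0.5))
        norm = tf[term] + k1 * (1.0 - b + b * len(doc_terms) / avg_len)
        score += idf * tf[term] * (k1 + 1.0) / norm
    return score


def headline_importance(headline, posts_on_day):
    """Sum the BM25 scores of a headline (used as the query) over all
    posts published on the query day. Hypothetical helper."""
    docs = [post.lower().split() for post in posts_on_day]
    if not docs:
        return 0.0
    avg_len = sum(len(d) for d in docs) / len(docs)
    if avg_len == 0:
        return 0.0
    # Document frequency of each term within the query-day posts (assumption).
    doc_freq = Counter(term for d in docs for term in set(d))
    query = headline.lower().split()
    return sum(bm25_score(query, d, doc_freq, len(docs), avg_len) for d in docs)


def rank_headlines(headlines, posts_on_day):
    """Order headlines by accumulated BM25 score, most important first."""
    return sorted(
        headlines,
        key=lambda h: headline_importance(h, posts_on_day),
        reverse=True,
    )
```

Under these assumptions, a headline whose terms recur across many posts of the day accumulates a higher total than one that matches few or no posts, which is the intuition the abstract describes.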

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2009
Accession Number
ADA517734

Entities

People

  • Feng Guan
  • Hongbo Xu
  • Linhai Song
  • Xiaoming Yu
  • Xueke Xu
  • Xueqi Cheng
  • Yue Liu
  • Zeying Peng

Organizations

  • Chinese Academy of Sciences

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Automated Text Summarization
  • Coverings
  • Distillation
  • Extraction
  • Filtration
  • Identification
  • Information Operations
  • Machine Learning
  • Models
  • Online Communications
  • Preprocessing
  • Prototypes
  • Standards
  • Terminals

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Materials Science (Mechanical Engineering)
  • Regression Analysis