Document and Query Expansion Models for Blog Distillation

Abstract

This paper presents the CMU submission to the 2008 TREC blog distillation track. As in last year's experiments, we evaluate different retrieval models and apply a query expansion method that leverages the link structure of Wikipedia. We also explore a corpus that combines several representations of each document, drawing on both the feed XML and the permalink HTML, and report initial experiments with spam filtering.

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2008
Accession Number
ADA512700

Entities

People

  • Changkuk Yoo
  • Jaime Arguello
  • Jamie Callan
  • Jaime Carbonell
  • Jonathan L. Elsas

Organizations

  • Carnegie Mellon University

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Artificial Intelligence
  • Compression
  • Computer Science
  • Distillation
  • Error Analysis
  • Filtration
  • Information Operations
  • Intervals
  • Judgment
  • Language
  • Machine Learning
  • National Parks
  • Natural Language Processing
  • Precision
  • Singapore
  • Time Intervals

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Information Retrieval
  • Systems Analysis and Design