HU_DB at TREC 2014 Microblog Track

Abstract

This paper describes our system for the Tweet Timeline Generation (TTG) task of the Microblog track at the Text Retrieval Conference (TREC) 2014. Intuitively, given a collection of microblog posts (i.e., tweets) and a keyword query Q, the goal is to generate a timeline of representative tweets relevant to Q. Our system employs query expansion to identify highly relevant tweets and then uses affinity propagation to cluster them based on word similarity, hashtag similarity, and temporal similarity. It then returns a representative tweet from each cluster. The result is a system with relatively good precision but, unfortunately, poor recall. We discuss the techniques employed as well as the insights gleaned while developing and testing the system.
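The abstract names three signals (word, hashtag, and temporal similarity) but does not say how they are computed or combined. The following is a minimal sketch of one way a combined pairwise similarity could be built, not the authors' implementation. The Jaccard overlap, the exponential time decay, the weights, the `tau_hours` parameter, and the function names are all assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

def jaccard(a, b):
    """Jaccard overlap of two sets; defined here as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def split_tweet(text):
    """Split a tweet into lowercase plain words and hashtags (naive tokenizer)."""
    words, hashtags = set(), set()
    for tok in text.lower().split():
        tok = tok.strip(".,!?:;\"'()")
        if tok.startswith("#") and len(tok) > 1:
            hashtags.add(tok[1:])
        elif tok:
            words.add(tok)
    return words, hashtags

def tweet_similarity(t1, t2, weights=(0.5, 0.3, 0.2), tau_hours=24.0):
    """Weighted sum of word, hashtag, and temporal similarity.

    The weights and the decay constant are hypothetical, not from the report.
    """
    w1, h1 = split_tweet(t1["text"])
    w2, h2 = split_tweet(t2["text"])
    dt_hours = abs((t1["time"] - t2["time"]).total_seconds()) / 3600.0
    temporal = math.exp(-dt_hours / tau_hours)  # 1.0 for simultaneous tweets
    ww, wh, wt = weights
    return ww * jaccard(w1, w2) + wh * jaccard(h1, h2) + wt * temporal

def similarity_matrix(tweets, **kwargs):
    """Full pairwise similarity matrix as a list of lists."""
    return [[tweet_similarity(a, b, **kwargs) for b in tweets] for a in tweets]

if __name__ == "__main__":
    t0 = datetime(2014, 11, 1, 12, 0)
    tweets = [
        {"text": "Flooding reported downtown #storm", "time": t0},
        {"text": "Downtown flooding gets worse #storm",
         "time": t0 + timedelta(hours=2)},
        {"text": "New phone released today", "time": t0 + timedelta(days=30)},
    ]
    for row in similarity_matrix(tweets):
        print([round(value, 3) for value in row])
```

A matrix of this kind could then be handed to an affinity propagation implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`, whose exemplars would serve as the representative tweet of each cluster.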

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2014
Accession Number
ADA618673

Entities

People

  • Jennifer Klein
  • Nerya Or
  • Sara Cohen
  • Yishai Oltchik

Organizations

  • Hebrew University of Jerusalem

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Clustering
  • Computer Science
  • Data Sets
  • Demographic Cohorts
  • Information Operations
  • Information Retrieval
  • Ontologies
  • Precision
  • Standards

Fields of Study

  • Computer science

Readers

  • Information Retrieval
  • Systems Analysis and Design