Frequent Itemset Mining for Query Expansion in Microblog Ad-hoc Search

Abstract

The high volume of Tweets arriving every second, together with the requirement to index them in real time, makes the computational complexity of the algorithms used to process them a central concern. In this paper, we investigate the use of Frequent Itemset Mining (FIM) to quickly discover patterns that can later be used for query expansion. FIM has been widely adopted for mining data streams because of its computational simplicity and the possibility of parallelizing some of its steps. Initial experiments using the TREC 2011 Microblog track queries showed that it is possible to improve the performance of BM25; however, this was not the case with the 2012 queries. Our analysis of the difference in performance provides insight into how to make the best use of FIM for microblog search.
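As a rough illustration of the idea in the abstract, the sketch below mines frequent term pairs (itemsets of size two only) from a handful of short texts and uses them to suggest expansion terms for a query. It is a minimal sketch under stated assumptions, not the report's method: the function names `frequent_pairs` and `expand_query`, the `min_support` and `max_terms` parameters, the whitespace tokenization, and the sample texts are all hypothetical, and a real system would use a full FIM algorithm over a much larger stream.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(docs, min_support):
    """Return term pairs that co-occur in at least min_support documents.

    Hypothetical helper: only itemsets of size two are mined, and
    tokenization is a plain lowercase whitespace split.
    """
    counts = Counter()
    for doc in docs:
        # Deduplicate and sort so each pair has one canonical ordering.
        terms = sorted(set(doc.lower().split()))
        counts.update(combinations(terms, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def expand_query(query, pairs, max_terms=3):
    """Suggest terms that frequently co-occur with any query term.

    Hypothetical helper: candidates are scored by summed pair support,
    with ties broken alphabetically.
    """
    q = set(query.lower().split())
    scores = Counter()
    for (a, b), n in pairs.items():
        if a in q and b not in q:
            scores[b] += n
        elif b in q and a not in q:
            scores[a] += n
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


# Illustrative sample texts (assumed, not taken from the TREC collection).
docs = [
    "egypt protest cairo",
    "egypt protest tahrir",
    "egypt cairo news",
    "football match",
]
pairs = frequent_pairs(docs, min_support=2)
expansion = expand_query("egypt", pairs)
```

The expansion terms would then be appended to the original query before it is scored by a ranking function such as BM25; how the terms are weighted is a separate design choice that this sketch does not address.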

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2012
Accession Number
ADA581528

Entities

People

  • Charles L. Clarke
  • Younos Aboulnaga

Organizations

  • University of Waterloo

Tags

Communities of Interest

  • Biomedical
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Analysis Of Variance
  • Computational Complexity
  • Computer Science
  • Computers
  • Databases
  • Governments
  • Information Retrieval
  • Language
  • Natural Languages
  • New York
  • Online Communications
  • Personality
  • Probability
  • Social Media
  • Standards

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Computational Modeling and Simulation
  • Information Retrieval