Frequent Itemset Mining for Query Expansion in Microblog Ad-hoc Search
Abstract
The high volume of tweets arriving every second, together with the requirement to index them in real time, makes the computational complexity of the algorithms used to process them an important concern. In this paper, we investigate the use of Frequent Itemset Mining (FIM) to quickly discover patterns that can later be used for query expansion. FIM has been widely adopted for mining data streams because of its computational simplicity and because some of its steps can be parallelized. Initial experiments using the TREC 2011 Microblog track queries showed that it is possible to improve the performance of BM25; however, this was not the case with the 2012 queries. Our analysis of the difference in performance provides insight into how to make the best use of FIM for microblog search.
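As a rough illustration of the idea described in the abstract, the following minimal sketch mines frequent term sets from a handful of short documents and uses them to add terms to a query. It is not the authors' method: it enumerates term combinations by brute force rather than using an algorithm such as Apriori or FP-growth, and the function names, the `min_support`, `max_size`, and `top_k` parameters, and the scoring rule are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(docs, min_support=2, max_size=2):
    """Return term sets (as sorted tuples) found in at least min_support docs.

    Hypothetical helper: brute-force enumeration, suitable only for short
    texts such as tweets and small values of max_size.
    """
    counts = Counter()
    for doc in docs:
        # Each document contributes at most once to an itemset's support.
        terms = sorted(set(doc.lower().split()))
        for size in range(1, max_size + 1):
            for itemset in combinations(terms, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def expand_query(query, itemsets, top_k=3):
    """Append the terms that co-occur most with query terms in frequent itemsets.

    Hypothetical scoring: a candidate term's score is the summed support of
    the multi-term itemsets it shares with any query term.
    """
    query_terms = set(query.lower().split())
    scores = Counter()
    for itemset, support in itemsets.items():
        if len(itemset) > 1 and query_terms & set(itemset):
            for term in set(itemset) - query_terms:
                scores[term] += support
    return sorted(query_terms) + [term for term, _ in scores.most_common(top_k)]
```

For example, calling `mine_frequent_itemsets` on a small batch of tweets and passing the result to `expand_query` together with a one-word query returns the query term followed by the terms that share a frequent itemset with it. The expanded term list could then be handed to a ranking function such as BM25.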
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2012
- Accession Number: ADA581528
Entities
People
- Charles L. Clarke
- Younos Aboulnaga
Organizations
- University of Waterloo