Siena's Twitter Information Retrieval System: The 2012 Microblog Track
Abstract
Since 1992, the National Institute of Standards and Technology (NIST) has hosted the Text Retrieval Conference (TREC) annually. One of its newest tracks, started in 2011, is the Microblog Track, which uses the well-known social networking site Twitter as its source of microblog data. Twitter allows its users to post tweets of up to 140 characters to share messages with their followers, post personal updates, and share major media stories from around the world. To evaluate information retrieval on microblog data, groups were provided with a file of about 16 million tweet IDs covering January 24th to February 8th, 2011. This allowed us to download the tweet content for each ID, for a total of 16,141,812 tweets. Participating teams were given a set of topics to test their retrieval process, and their programs were expected to return tweets relevant to each topic. The Siena College Institute of Artificial Intelligence expanded on STIRS, Siena's Twitter Information Retrieval System. The results for our ad hoc task showed STIRS' best run at 18.08% precision, while the average of the median across all participating teams was 14.86%.
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2012
- Accession Number: ADA581304
Entities
People
- Darren Lim
- Karl Appel
- Lauren Mathews
- Sharon Small
Organizations
- Siena College