Topic Time Series Analysis of Microblogs
Abstract
Social media data tends to cluster in time and space around events such as sports competitions and local newsworthy phenomena. However, transforming raw, free-form, real-time text into meaningful information remains a challenging task. Confounding factors include the massive volume of posted data, the lack of reliable event information, hidden temporal trends, and the vastly diverse nature of content. In the present work, we examine spatio-temporal topic distributions and self-exciting time series models as applied to social media microblog data. We apply topic modeling using non-negative matrix factorization with sparsity constraints to discover prevalent topics as well as latent thematic word associations within topics. We then present two methods for mining interesting spatio-temporal dynamics and relations among topics: one that compares the topic distributions directly, and another that models topics over time as temporal or spatio-temporal Hawkes processes with exponential trigger functions. This second method allows identification of self-exciting topics and reveals unique temporal and spatial relationships among them.
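To illustrate what "self-exciting" means for the second method, the following is a minimal sketch of the conditional intensity of a purely temporal Hawkes process with an exponential trigger function. It is not taken from the report: the function name `hawkes_intensity`, the parameter names `mu`, `alpha`, and `beta`, and the specific parameterization (background rate plus `alpha * beta * exp(-beta * (t - t_i))` summed over past events) are assumptions chosen for illustration, and the report may use a different form.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a temporal Hawkes process at time t.

    Assumed (illustrative) parameterization:
        lambda(t) = mu + sum over t_i < t of alpha * beta * exp(-beta * (t - t_i))

    mu    : background rate of events (e.g. posts on a topic per hour)
    alpha : expected number of follow-on events triggered by one event
    beta  : decay rate of the triggering effect
    """
    if mu < 0 or alpha < 0 or beta <= 0:
        raise ValueError("require mu >= 0, alpha >= 0, beta > 0")
    # Only events strictly before t contribute to the intensity at t.
    triggered = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t
    )
    return mu + triggered


# Hypothetical timestamps (in hours) of posts assigned to one topic.
posts = [1.0, 1.2, 1.3, 4.0]

# Shortly after a burst of posts, the intensity sits above the background rate.
rate_after_burst = hawkes_intensity(1.4, posts, mu=0.5, alpha=0.8, beta=1.0)

# Well after the last post, the intensity has decayed back toward mu.
rate_much_later = hawkes_intensity(20.0, posts, mu=0.5, alpha=0.8, beta=1.0)
```

Under this parameterization, a topic whose fitted `alpha` is well above zero would be one where each post tends to trigger further posts, which is the sense in which a topic is self-exciting. Fitting `mu`, `alpha`, and `beta` to data is a separate step that is not shown here.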
Document Details
- Document Type: Technical Report
- Publication Date: Oct 01, 2014
- Accession Number: ADA610278
Entities
People
- Andrea Bertozzi
- Baichuan Yuan
- Blake Hunter
- Daniel Moyer
- Eric Fox
- Eric Lai
- Jeffrey Brantingham
Organizations
- University of California, Los Angeles