Studying the Spread of Fake News with Population Genetic Models via Longitudinal Data
Abstract
This project was funded under DARPA's AIDA program. It studied the problem of combining dynamic and semantic information in news articles to characterize how information changes over time, with the goal of fake news detection. The project collected three unique datasets of news articles from various news outlets on the internet: (1) the Jussie Smollett incident (1/28/2019 - 3/14/2019, 1009 outlets), (2) Ukraine International Airlines Flight 752 (1/2/2020 - 2/1/2020, 2136 outlets), and (3) COVID news in early 2020 (1/8/2020 - 4/9/2020, 10449 outlets). Semantic triples (subject; predicate; object) are extracted from the collected article text each day. These triples correspond to the nodes (subjects and objects) and edges (predicates) of an RDF (Resource Description Framework) graph. A sequence of RDF graphs represents a corpus of articles covering the same event over multiple days. We considered four features to describe how the structure of the RDF graphs changes over time: copy, mutate, append, and extend. Analyzing the sequence of RDF graphs, which contains both dynamic and semantic information, showed a distinction between articles published on the Jussie Smollett incident by major news outlets and those published by celebrity news outlets. We showed that the discrete-time, multivariate Hawkes process can be used to model the dynamics of copy, mutate, append, and extend events.
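As a rough illustration of the modeling idea in the abstract, the sketch below computes the conditional intensity of a discrete-time, multivariate Hawkes process over the four event types. It is not the report's implementation: the exponential decay kernel, the function name `hawkes_intensity`, and the parameters `mu`, `alpha`, and `beta` are assumptions chosen for illustration, and the abstract does not state which kernel or parameterization was used.

```python
import math

# The four event types named in the abstract.
EVENT_TYPES = ["copy", "mutate", "append", "extend"]


def hawkes_intensity(counts, mu, alpha, beta):
    """Expected event rate per type at the next time step (illustrative sketch).

    counts: list of per-day count vectors; counts[s][j] is the number of
            type-j events observed on day s.
    mu:     baseline rate for each event type.
    alpha:  alpha[i][j] is how strongly one past type-j event excites type i.
    beta:   decay rate of the (assumed) exponential kernel.
    """
    t = len(counts)          # index of the day being predicted
    k = len(mu)              # number of event types
    lam = list(mu)           # start from the baseline rates
    for s, n_s in enumerate(counts):
        decay = math.exp(-beta * (t - s))  # older days contribute less
        for i in range(k):
            lam[i] += decay * sum(alpha[i][j] * n_s[j] for j in range(k))
    return lam


if __name__ == "__main__":
    # Hypothetical parameters: past "copy" events raise the rate of "mutate".
    mu = [0.1, 0.1, 0.1, 0.1]
    alpha = [[0.0] * 4 for _ in range(4)]
    alpha[1][0] = 0.5
    history = [[2, 0, 0, 0], [1, 1, 0, 0]]  # made-up daily counts
    for name, rate in zip(EVENT_TYPES, hawkes_intensity(history, mu, alpha, 1.0)):
        print(f"{name}: {rate:.3f}")
```

In this form, each off-diagonal entry of `alpha` expresses cross-excitation between event types (for example, copy events making later mutate events more likely), which is what makes the multivariate model suited to describing how the four kinds of graph change interact over days.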
Document Details
- Document Type: Technical Report
- Publication Date: Jul 01, 2021
- Accession Number: AD1139413
Entities
People
- June Zhang
Organizations
- University of Hawaiʻi at Mānoa