The Use of Twitter to Predict the Level of Influenza Activity in the United States

Abstract

Controlling outbreaks of epidemic diseases such as influenza has long been a concern for the United States. Traditional surveillance tools such as ILINet and Virologic provide the Centers for Disease Control and Prevention (CDC) with influenza surveillance statistics at a lag of 1 to 2 weeks. The CDC requires a tool that can forecast the level of influenza activity. The rising popularity of social media websites such as Flickr, Twitter, and Facebook has transformed the web into an interactive sharing platform, and the large volume of unstructured data they generate has become an invaluable source for detecting patterns or novelties. This research explores the correlation between Twitter messages (tweets) and CDC influenza-like illness (ILI) and Virologic surveillance data. Using 17 months of tweets, regression models are developed to predict influenza-related statistics. The proposed approach aggregates the weekly frequencies of hand-chosen words that are indicative of an influenza attack and uses them as separate predictor variables. The predictions generated by the best models have a Pearson's correlation coefficient of 0.900 (95% CI: 0.732, 0.965) against the CDC ILI surveillance data and 0.833 (95% CI: 0.574, 0.940) against the CDC Virologic surveillance data.
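
The evaluation described in the abstract can be illustrated with a minimal sketch. It counts hand-chosen keywords per week, then compares a predicted series with a reference series using Pearson's correlation coefficient and an approximate 95% confidence interval. Everything here is an assumption for illustration: the function names, the whitespace tokenization, and the use of the Fisher z-transformation for the interval are not taken from the report, which does not state in this abstract how its intervals were computed.

```python
import math
from collections import Counter


def weekly_keyword_counts(tweets, keywords):
    """Count keyword occurrences per week.

    tweets: iterable of (week_index, text) pairs (hypothetical input format).
    Returns a dict mapping week_index to a Counter of keyword hits.
    """
    counts = {}
    for week, text in tweets:
        tokens = text.lower().split()  # naive tokenization, an assumption
        week_counts = counts.setdefault(week, Counter())
        for kw in keywords:
            week_counts[kw] += tokens.count(kw)
    return counts


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (needs n > 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

In use, the per-week counts for each keyword would serve as the separate predictor variables of a regression model, and `pearson_r` with `fisher_ci` would be applied to the model's weekly predictions and the corresponding CDC series.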

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2014
Accession Number
ADA620696

Entities

People

  • Kok W. Ng

Organizations

  • Naval Postgraduate School

Tags

DTIC Thesaurus Topics

  • Data Analysis
  • Diseases And Disorders
  • Geographic Regions
  • Health Services
  • Infectious Diseases
  • Information Science
  • Language
  • Linguistics
  • Media
  • Online Communications
  • Regression Analysis
  • Respiratory Tract Diseases
  • Social Media
  • Social Networking Services
  • Statistics
  • Text Messaging
  • United States

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Infectious Disease/Epidemiology
  • Regression Analysis.