The Islamic State Battle Plan: Press Release Natural Language Processing

Abstract

The purpose of this study is to develop methods to accelerate and enhance the analysis of Islamic State Movement text documents. We analyze a unique database collected by Dr. Craig Whiteside, which comprises nearly 3,000 open-source translated press releases from 2003–2014. Using Natural Language Processing tools, the text data is aggregated into a corpus and processed based on document term structure and frequency. To reduce analyst workload, we validate Whiteside's manual analysis and construct cross-validated generalized linear models to automatically classify documents into one of seven types. A cascade classification model outperforms all other models, with a mean cross-validated misclassification rate of 5.71 percent. Islamic State Movement operational summaries are classified as type Celebrate. We develop a layered algorithm based on regular expressions and location searches to extract critical information from each attack event and display the details on a map using a web-based interactive R Shiny application. With the ability to automatically classify Islamic State Movement text documents and visually interact with the data contained within those classified as type Celebrate, analysts and decision makers are able to process and understand large amounts of text data more quickly and effectively.
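
The layered extraction idea in the abstract (regular expressions first, then a location search) can be illustrated with a minimal sketch. This is not the report's algorithm: the sample sentence, the two-entry gazetteer with approximate coordinates, the patterns, and the function name extract_event are all hypothetical and chosen only to show the general shape of such a pipeline.

```python
import re

# Hypothetical gazetteer (assumption): place name -> approximate (lat, lon).
GAZETTEER = {
    "Mosul": (36.34, 43.13),
    "Baghdad": (33.31, 44.36),
}

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Layer 1: regular expressions for structured fields (illustrative patterns).
DATE_RE = re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b")
COUNT_RE = re.compile(r"\b(\d+) (?:vehicles?|checkpoints?)\b")


def extract_event(text):
    """Return a dict of fields found in one event description."""
    event = {"date": None, "count": None, "location": None, "coords": None}

    date_match = DATE_RE.search(text)
    if date_match:
        event["date"] = date_match.group(0)

    count_match = COUNT_RE.search(text)
    if count_match:
        event["count"] = int(count_match.group(1))

    # Layer 2: location search against the gazetteer, whole words only.
    for name, coords in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            event["location"] = name
            event["coords"] = coords
            break

    return event


# Invented sample sentence, not taken from the press-release database.
sample = "On 12 March 2007, a device destroyed 2 vehicles near Mosul."
event = extract_event(sample)
```

Records of this shape (a date, a count, and a coordinate pair) are the kind of input a map display would need; how the report itself structures the extracted fields is not stated in the abstract.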

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 2016
Accession Number
AD1026573

Entities

People

  • James R Friedlein

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies
  • Materials and Manufacturing Processes
  • Weapons Technologies

DTIC Thesaurus Topics

  • Computer Languages
  • Computer Programming
  • Computers
  • Data Visualization
  • Databases
  • Information Science
  • Language
  • Machine Learning
  • Natural Language Processing
  • Natural Languages
  • Reliability
  • Social Media
  • Spreadsheet Software
  • Terrorism
  • United States
  • Word Processors
  • World Geodetic System

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Geospatial Intelligence and Artificial Intelligence Analytics
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks