Automated Chat Thread Analysis: Untangling the Web

Abstract

As networked digital communications proliferate in military operational command and control (C2), chat messaging is emerging as a preferred method of team coordination. Chat room logs are a potentially rich source of data for after-action reviews, affording considerable insight into the decision-making processes of the training audience. The multitasking nature of these operations, together with the large number of chat channels and participants, leads to multiple parallel dialog threads that are tightly intertwined. These threads must be identified and separated to facilitate analysis of chat communication in support of team performance assessment. This is a significant challenge because chat is prone to informal language, abbreviations, and typos, so conventional language analysis techniques do not transfer well. Few inroads have been made into dialog analysis and topic detection for chat messages. In this paper, we discuss the application of natural language techniques to automated chat log analysis, using an AOC team training exercise as the source of data. We found it necessary to enhance these techniques to account for the specific characteristics of chat-based C2 communications. Additionally, our domain of interest provides data sources other than chat that can be leveraged to improve classification accuracy. We describe how such considerations have been folded into traditional data analysis techniques and discuss their performance. In particular, we explore the problem of automatically detecting content-based coherence between messages. We present techniques to address this problem and compare their performance with that of distinguishing keywords provided by subject matter experts. We discuss the lessons learned from our results and how they affect future work.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2010
Accession Number: ADA532774

Entities

People

  • Oscar Bascara
  • Randy Jansen
  • Shaun Sucillon
  • Sowmya Ramachandran
  • Tamitha Carpenter
  • Todd Denning

Organizations

  • Air Force Research Laboratory

Tags

Communities of Interest

  • C4I
  • Engineered Resilient Systems
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Artificial Intelligence
  • Behavioral Sciences
  • Command And Control
  • Computational Science
  • Data Analysis
  • Education
  • Information Science
  • Instructors
  • Language
  • Machine Learning
  • Military Research
  • Natural Language Processing
  • Natural Languages
  • Simulations
  • Training

Fields of Study

  • Computer science
  • Engineering

Readers

  • Parallel and Distributed Computing
  • Systems Analysis and Design
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • Fully Networked C3
  • Fully Networked C3 - Command and Control