Approaches to Generate Keywords
Abstract
Insider threat analysts require effective and reliable methods to generate lists of keywords that support the detection of a given topic of interest. These keywords may be used to create keyword-based detection policies or may serve analysts as a reference guide. Direct keyword detection is a generalizable approach: it is much less complex than context-based detection and can therefore be used in a wider variety of tools. This document presents two approaches for generating lists of keywords, describing the intuition and science behind each approach and discussing the accompanying software code that performs the automatic keyword extraction. Experimentally, we found that both approaches generated lists of keywords that are reasonably indicative of hate or extremism. We recommend considering the incorporation of these approaches into a keyword development process. However, we also recommend that each generated list of keywords be manually reviewed before its terms are included in any automated detection capability. We discuss this recommendation in the Results and Implementation section.
Document Details
- Document Type: Technical Report
- Publication Date: Jun 01, 2021
- Accession Number: AD1137194
Entities
Organizations
- Carnegie Mellon University