Empirical Determination of Pattern Match Confidence in Labeled Graphs

Abstract

The ability to represent complex, arbitrary situations in terms of labeled graphs has profound implications for situational awareness across domains. However, such graphs are fundamentally difficult for experts to process manually: although our visual system typically outperforms all algorithms for pattern detection, a graph with as few as several hundred nodes and edges reveals very little upon visual inspection. Thus we are forced to rely on pattern-matching algorithms to extract meaning from graph representations in which nodes, labels, and edges represent specific entities, general categories, and relationships, respectively. Algorithms such as Complex Event Processing (CEP) search a graph for a particular set of relationships between the categories that make up the ontology of the situation. In this paper, we present an empirical method for determining the likelihood that a pattern search will return a false positive for a given pattern, ontology, and graph. This likelihood is analogous to the signal-to-noise ratio in traditional sensing schemes. We demonstrate our method using algorithmically generated datasets and datasets with known ground truth. We also show scale-free (power-law) behavior in several graph types, which allows for estimation of the maximum graph size before false positives are expected to occur. Finally, we present a preliminary analytical study that describes the number of arbitrary pattern matches expected to appear by chance in a larger labeled graph. In any operationally relevant situation, assigning a confidence or quality metric to data used for decision making is crucial. The method presented in this paper is one of the first for doing so with complex patterns detected in large, highly interrelated datasets. We believe that an improved understanding of pattern match quality will improve the usefulness of search techniques applied to social media, operational intelligence, and tactical intelligence.
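To make the idea of chance matches concrete, the following is a minimal illustrative sketch, not the report's method. It assumes an Erdős-Rényi random graph with uniformly random node labels and a three-node labeled path as the pattern; the function names and parameters are all hypothetical. It estimates by simulation how often the pattern appears purely by chance and compares that with the closed-form expectation that holds under these specific assumptions.

```python
import random


def random_labeled_graph(n, p, labels, rng):
    """Build an undirected G(n, p) random graph with one uniform random label per node."""
    node_labels = [rng.choice(labels) for _ in range(n)]
    neighbors = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbors[u].add(v)
                neighbors[v].add(u)
    return node_labels, neighbors


def count_path_matches(node_labels, neighbors, pattern):
    """Count matches of a 3-node path (first, middle, last); the three labels must be distinct."""
    first, middle, last = pattern
    total = 0
    for b, label in enumerate(node_labels):
        if label != middle:
            continue
        n_first = sum(1 for a in neighbors[b] if node_labels[a] == first)
        n_last = sum(1 for c in neighbors[b] if node_labels[c] == last)
        total += n_first * n_last  # every (first, last) neighbor pair is one match
    return total


def chance_match_stats(n, p, labels, pattern, trials=200, seed=0):
    """Return (mean chance matches, fraction of random graphs with at least one match)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        node_labels, neighbors = random_labeled_graph(n, p, labels, rng)
        counts.append(count_path_matches(node_labels, neighbors, pattern))
    mean = sum(counts) / trials
    frac_any = sum(1 for c in counts if c > 0) / trials
    return mean, frac_any


def expected_path_matches(n, p, num_labels):
    """Closed-form expectation under the assumptions above: n(n-1)(n-2) p^2 / k^3."""
    return n * (n - 1) * (n - 2) * p ** 2 / num_labels ** 3


if __name__ == "__main__":
    labels = ["A", "B", "C", "D"]
    mean, frac_any = chance_match_stats(30, 0.1, labels, ("A", "B", "C"))
    print("simulated mean chance matches:", mean)
    print("fraction of graphs with a chance match:", frac_any)
    print("analytic expectation:", expected_path_matches(30, 0.1, len(labels)))
```

In this toy setting the expected number of chance matches grows roughly as the cube of the node count, so for a fixed edge probability and label set there is a graph size beyond which a match is more likely than not to be spurious. Real graphs and ontologies are not uniform or independent in this way, so the numbers from this sketch should not be read as estimates for any operational dataset.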

Document Details

Document Type
Technical Report
Publication Date
Feb 07, 2014
Accession Number
ADA607717

Entities

People

  • Ben Migliori
  • Daniel Grady
  • James Law

Organizations

  • Naval Information Warfare Systems Command

Tags

Communities of Interest

  • C4I
  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Counterterrorism
  • Data Sets
  • High Resolution
  • Inspection
  • Low Resolution
  • Media
  • Models
  • Naval Warfare
  • Network Centric Warfare
  • Ontologies
  • Probability
  • Situational Awareness
  • Social Media
  • Social Networks
  • Tactical Networks
  • Warfare

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design.
  • Computer Vision.
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance.