Empirical Determination of Pattern Match Confidence in Labeled Graphs
Abstract
The ability to represent complex, arbitrary situations in terms of labeled graphs has profound implications for situational awareness across domains. However, such graphs are fundamentally difficult for experts to process manually; although our visual system typically outperforms all algorithms for pattern detection, a graph with as few as several hundred nodes and edges reveals very little upon visual inspection. Thus we are forced to rely on pattern-matching algorithms to extract meaning from graph representations in which nodes, labels, and edges represent specific entities, general categories, and relationships, respectively. Algorithms such as Complex Event Processing (CEP) search a graph for a particular set of relationships between the categories that make up the ontology of the situation. In this paper, we present an empirical method for determining the likelihood that a pattern search will return a false positive for a given pattern, ontology, and graph. This likelihood is analogous to the signal-to-noise ratio in traditional sensing schemes. We demonstrate our method using algorithmically generated datasets and datasets with known ground truths. We also show scale-free (power-law) behavior in several graph types, which allows for estimation of the maximum graph size before false positives are expected to occur. Finally, we present a preliminary analytical study that describes the number of arbitrary pattern matches expected to appear by chance in a larger labeled graph. In any operationally relevant situation, assigning a confidence or quality metric to data used for decision making is crucial. The method presented in this paper is one of the first methods for doing so with complex patterns detected in large, highly interrelated datasets. We believe that an improved understanding of pattern match quality will improve the usefulness of search techniques applied to social media, operational intelligence, and tactical intelligence.
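As a rough illustration of the idea of counting pattern matches that arise purely by chance, the sketch below is not the report's method. It assumes a directed Erdos-Renyi random graph, uniformly random node labels, and a simple three-node labeled path pattern; all function names and parameters are hypothetical. It estimates by simulation how often such a pattern appears with no real signal present, and compares the mean count with a simple analytic expectation under the same assumptions.

```python
import random


def random_labeled_graph(n, p, labels, rng):
    """Assumed null model: directed Erdos-Renyi graph, uniform random node labels."""
    node_labels = [rng.choice(labels) for _ in range(n)]
    edges = {(u, v) for u in range(n) for v in range(n)
             if u != v and rng.random() < p}
    return node_labels, edges


def count_path_matches(node_labels, edges, pattern):
    """Count ordered triples of distinct nodes (a, b, c) whose labels equal
    `pattern` and that are joined by edges a->b and b->c."""
    la, lb, lc = pattern
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    count = 0
    for a, b in edges:
        if node_labels[a] != la or node_labels[b] != lb:
            continue
        for c in successors.get(b, []):
            if c != a and node_labels[c] == lc:
                count += 1
    return count


def chance_match_stats(n, p, labels, pattern, trials, seed=0):
    """Return (fraction of random graphs with at least one match,
    mean number of matches per graph) over `trials` random graphs."""
    rng = random.Random(seed)
    hits = 0
    total = 0
    for _ in range(trials):
        node_labels, edges = random_labeled_graph(n, p, labels, rng)
        matches = count_path_matches(node_labels, edges, pattern)
        hits += matches > 0
        total += matches
    return hits / trials, total / trials


def expected_path_matches(n, p, k):
    """Analytic expectation under the assumed null model with k equally
    likely labels: n(n-1)(n-2) ordered triples, each matching the three
    labels with probability k**-3 and both edges with probability p**2."""
    return n * (n - 1) * (n - 2) * p ** 2 / k ** 3


if __name__ == "__main__":
    rate, mean = chance_match_stats(12, 0.2, ["A", "B", "C"],
                                    ("A", "B", "C"), trials=2000)
    print("fraction of random graphs with a chance match:", rate)
    print("mean chance matches per graph:", mean)
    print("analytic expectation:", expected_path_matches(12, 0.2, 3))
```

Under this null model the expected count grows roughly as the cube of the node count, which illustrates why chance matches eventually become likely as a graph grows. Real graphs and patterns would need a null model suited to their structure.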
Document Details
- Document Type: Technical Report
- Publication Date: Feb 07, 2014
- Accession Number: ADA607717
Entities
People
- Ben Migliori
- Daniel Grady
- James Law
Organizations
- Naval Information Warfare Systems Command