Associativity-Peakiness Metrics for Contingency Tables
Abstract
Comparing the performance of clustering algorithms whose output is a contingency table requires a single performance metric for contingency tables. A survey of the publicly available literature did not find such a metric. Metrics do exist for vector pairs of truth values and predicted values, which are an alternative form of clustering algorithm output. Because contingency tables and vector pairs are interchangeable, these metrics could also be used to characterize output expressed as a contingency table. However, the metrics for vector pairs do not reveal detailed performance features that are apparent in contingency tables. This report presents the Associativity-Peakiness (AP) metric, which characterizes aspects of clustering algorithm performance that are critical for predicting a clustering algorithm's performance when deployed. The AP metric is analogous to measures of quality for confusion matrices that are outputs of supervised learning algorithms. This report presents results from simulations in which 500 contingency tables were generated for multiple test scenarios. The results show that, for the use case of evaluating clustering algorithms, the AP metric characterizes the performance of contingency tables with higher dynamic range than publicly available metrics, and that those existing metrics do not correlate well with the AP metric.
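The interchangeability of contingency tables and truth/prediction vector pairs mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the report and does not implement the AP metric, whose definition is not given in this record. The function names and the example labels are hypothetical, and the sketch assumes that the order of samples within the vector pair does not matter.

```python
import numpy as np

def contingency_from_labels(truth, pred):
    """Count how often each (truth class, predicted cluster) pair occurs."""
    truth = np.asarray(truth)
    pred = np.asarray(pred)
    # Map arbitrary labels to consecutive row and column indices.
    classes, t_idx = np.unique(truth, return_inverse=True)
    clusters, p_idx = np.unique(pred, return_inverse=True)
    table = np.zeros((classes.size, clusters.size), dtype=int)
    # Unbuffered add so repeated (row, column) pairs accumulate correctly.
    np.add.at(table, (t_idx, p_idx), 1)
    return classes, clusters, table

def labels_from_contingency(classes, clusters, table):
    """Expand a contingency table back into a truth/prediction vector pair."""
    rows, cols = np.nonzero(table)
    counts = table[rows, cols]
    truth = np.repeat(classes[rows], counts)
    pred = np.repeat(clusters[cols], counts)
    return truth, pred

# Hypothetical example: 8 samples, 3 true classes, 3 predicted clusters.
truth = [0, 0, 0, 1, 1, 2, 2, 2]
pred = [1, 1, 0, 0, 0, 2, 2, 2]

classes, clusters, table = contingency_from_labels(truth, pred)
print(table)

# Round trip: the rebuilt vectors yield the same table (sample order aside).
truth2, pred2 = labels_from_contingency(classes, clusters, table)
_, _, table2 = contingency_from_labels(truth2, pred2)
print(np.array_equal(table, table2))
```

In this sketch, rows correspond to true classes and columns to predicted clusters. Any metric defined on vector pairs can therefore be evaluated from a table by expanding it first, although, as the abstract notes, such metrics may not expose the detailed features visible in the table itself.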
Document Details
- Document Type: Technical Report
- Publication Date: Dec 11, 2023
- Accession Number: AD1216913
Entities
People
- Naomi E. Zirkind
- William J. Diehl
Organizations
- United States Army Research Laboratory