Associativity-Peakiness Metrics for Contingency Tables

Abstract

To compare the performance of clustering algorithms whose output is a contingency table, a single performance metric for contingency tables is needed. A survey of publicly available literature did not find such a metric. Metrics do exist for vector pairs of truth values and predicted values, which are an alternative form of clustering algorithm output. Because contingency tables and vector pairs are interchangeable, these metrics could also be used to characterize output expressed as a contingency table. However, the metrics for vector pairs do not reveal detailed performance features that are apparent in contingency tables. This report presents the Associativity-Peakiness (AP) metric, which characterizes aspects of clustering algorithm performance that are critical for predicting a clustering algorithm's performance when deployed. The AP metric is analogous to measures of quality for confusion matrices that are outputs of supervised learning algorithms. This report presents results from simulations in which 500 contingency tables were generated for multiple test scenarios. The results show that, for the use case of evaluating clustering algorithms, the AP metric characterizes the performance of contingency tables with higher dynamic range than publicly available metrics, and that those metrics do not correlate well with the AP metric.
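
The interchangeability of contingency tables and vector pairs mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the report and does not implement the AP metric, whose definition is not given here; the function names `contingency_table` and `vector_pair` and the sample labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def contingency_table(truth, predicted):
    """Count how often each (truth label, predicted cluster) pair occurs."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    rows = sorted(set(truth))      # one row per truth label
    cols = sorted(set(predicted))  # one column per predicted cluster
    counts = Counter(zip(truth, predicted))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def vector_pair(rows, cols, table):
    """Expand a contingency table back into a (truth, predicted) vector pair.

    The original sample order is not recoverable; only the counts of
    each (truth, predicted) pair are preserved.
    """
    truth, predicted = [], []
    for r, row in zip(rows, table):
        for c, n in zip(cols, row):
            truth.extend([r] * n)
            predicted.extend([c] * n)
    return truth, predicted


# Hypothetical example data: six samples, three truth labels, three clusters.
truth = ["a", "a", "a", "b", "b", "c"]
predicted = [0, 0, 1, 1, 1, 2]

rows, cols, table = contingency_table(truth, predicted)
truth_back, predicted_back = vector_pair(rows, cols, table)
```

In this sketch the round trip preserves the count of every (truth, predicted) pair, which is the sense in which the two representations carry the same information; any metric defined on one form can therefore be computed from the other.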

Document Details

Document Type
Technical Report
Publication Date
Dec 11, 2023
Accession Number
AD1216913

Entities

People

  • Naomi E. Zirkind
  • William J. Diehl

Organizations

  • United States Army Research Laboratory

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Artificial Intelligence
  • Computational Complexity
  • Data Analysis
  • Data Sets
  • Dynamic Range
  • Frequency Domain
  • Learning
  • Literature
  • Machine Learning
  • Neurobehavioral Manifestations
  • Peak Values
  • Simulations
  • Supervised Machine Learning
  • Two Dimensional
  • Unsupervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Computational Modeling and Simulation
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference