Multi-Sample Cluster Analysis as an Alternative to Multiple Comparison Procedures.

Abstract

This paper studies multi-sample cluster analysis, the problem of grouping samples, as an alternative to multiple comparison procedures. It develops and introduces model-selection criteria, such as Akaike's Information Criterion (AIC) and Schwarz's Criterion (SC), as new procedures for comparing means, groups, or samples, and for identifying and separating the homogeneous groups or samples from the heterogeneous ones in multi-sample data analysis problems. An enumerative clustering technique is presented that uses efficient combinatorial algorithms to generate all possible clustering alternatives of the groups or samples on the computer, without forcing an arbitrary choice among them, in order to find all sufficiently simple groupings consistent with the data and to identify the best clustering among the alternatives. Numerical examples on a real data set illustrate grouping the samples into fewer than K groups. Through a Monte Carlo study, an application of multi-sample cluster analysis is shown in designing optimal decision tree classifiers for reducing the dimensionality of remotely sensed heterogeneous data sets to achieve a parsimonious grouping of samples. The results demonstrate the utility and versatility of model-selection criteria, which avoid the notorious choice of significance levels and are free from the ambiguities inherent in conventional hypothesis testing procedures. Originator-suggested keywords include: Multi-Sample Cluster Analysis; Multiple Comparison Procedures; Model Selection Criteria; Akaike's Information Criterion; Schwarz's Criterion. (Author)
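
The enumerate-and-score idea described in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation. It assumes univariate data, a normal model with one mean per cluster and a common variance, and the textbook forms AIC = -2 log L + 2p and SC = -2 log L + p log n. The function names (`set_partitions`, `criteria`, `best_clustering`) and the three small samples are hypothetical and chosen only for illustration.

```python
import math
from typing import Iterator, List, Sequence, Tuple

Partition = List[List[int]]


def set_partitions(items: Sequence[int]) -> Iterator[Partition]:
    """Yield every way to split `items` into non-empty, unordered blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Option 1: add `first` to one of the existing blocks.
        for i, block in enumerate(partial):
            yield partial[:i] + [[first] + block] + partial[i + 1:]
        # Option 2: give `first` a block of its own.
        yield [[first]] + partial


def criteria(samples: Sequence[Sequence[float]],
             partition: Partition) -> Tuple[float, float]:
    """Return (AIC, SC) for one clustering of the samples.

    Assumed model (not taken from the report): normal observations,
    one mean per cluster, a single common variance.
    """
    n = sum(len(s) for s in samples)
    rss = 0.0
    for block in partition:
        pooled = [x for idx in block for x in samples[idx]]
        mean = sum(pooled) / len(pooled)
        rss += sum((x - mean) ** 2 for x in pooled)
    # Maximized log-likelihood with sigma^2 = rss / n (requires rss > 0).
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    n_params = len(partition) + 1  # cluster means plus the common variance
    aic = -2.0 * log_lik + 2.0 * n_params
    sc = -2.0 * log_lik + n_params * math.log(n)
    return aic, sc


def best_clustering(samples: Sequence[Sequence[float]],
                    criterion: str = "aic") -> Tuple[Partition, float]:
    """Score every clustering and return the one minimizing the criterion."""
    pick = 0 if criterion == "aic" else 1
    scored = [(p, criteria(samples, p)[pick])
              for p in set_partitions(list(range(len(samples))))]
    return min(scored, key=lambda pair: pair[1])


# Hypothetical data: samples 0 and 1 share a mean, sample 2 does not.
samples = [
    [4.9, 5.1, 5.0, 5.2],
    [5.0, 4.8, 5.1, 5.3],
    [7.9, 8.1, 8.0, 8.2],
]

for name in ("aic", "sc"):
    partition, value = best_clustering(samples, name)
    print(name, partition, round(value, 3))
```

On these made-up data both criteria merge samples 0 and 1 and keep sample 2 apart, with no significance level involved. The number of clusterings grows as the Bell numbers (5 for three samples, 15 for four), so exhaustive enumeration of this kind is practical only for a modest number of samples.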

Document Details

Document Type
Technical Report
Publication Date
Jul 20, 1984
Accession Number
ADA149960

Entities

People

  • H. Bozdogan

Organizations

  • University of Illinois at Chicago

Tags

Communities of Interest

  • Biomedical
  • Energy and Power Technologies
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Analysis Of Variance
  • Artificial Intelligence
  • Business Administration
  • Computer Programs
  • Computers
  • Data Analysis
  • Data Science
  • Data Sets
  • Information Science
  • Machine Learning
  • Mathematics
  • Multivariate Analysis
  • Probability
  • Statistical Algorithms
  • Statistics
  • Theorems

Readers

  • Graph Algorithms and Convex Optimization.
  • Statistical inference.
  • Systems Analysis and Design