Model-Based Clustering, Discriminant Analysis, and Density Estimation

Abstract

Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures, and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?" We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, and discuss recent developments in model-based clustering for non-Gaussian data, high-dimensional datasets, large datasets, and Bayesian estimation.

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 2000
Accession Number
ADA458798

Entities

People

  • Adrian Raftery
  • Chris Fraley

Organizations

  • George Washington University

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Abstracts
  • Clustering
  • Computer Programs
  • Computing-Related Activities
  • Data Science
  • Data Sets
  • Discriminant Analysis
  • Information Operations
  • Information Science
  • Interdisciplinary Science
  • Mathematical Analysis
  • Mathematics
  • Multivariate Analysis
  • Statistical Analysis
  • Statistics

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Regression Analysis.
  • Software Engineering.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms