Determining the Number of Subpopulations

Abstract

The aim of cluster analysis is to find groups of similar objects, and an important problem in clustering is determining the number of clusters. When the objects to be clustered are sampled from an underlying population, this is a statistical inference problem. This thesis addresses the problem of inferring from a data sample the location and number of subpopulations in the underlying population. It does so by introducing a new measure of the degree of multimodality of a density f and using the sample value of this measure as the basis for determining the number of modes, or subpopulations, when sampling from f. Each subpopulation is assumed to correspond to a mode of the underlying density f, so each population cluster is characterized as a modal region: a high-density region surrounded by low-density regions. Two ways of characterizing these modes are considered. The first measures relative distance, that is, how far apart two modes are. The second measures how flat, or how close to uniform, the corresponding modal regions are. Combining these two concepts into a single parameter yields a measure of the degree of multimodality of a density f. In general, for a density with two or more modes, the degree of multimodality measures how far from the other modes, and how close to uniform, the weakest and flattest mode is.
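
As a rough illustration of the modal view of subpopulations described in the abstract, the following Python sketch counts the local maxima of a Gaussian kernel density estimate built from a sample. It is not the multimodality measure introduced in the thesis, whose definition is not given in this record. The kernel estimator, the function names (kde, count_modes, estimated_modes), the grid, and the bandwidth values are all assumptions chosen for illustration.

```python
import math
import random


def kde(sample, x, bandwidth):
    """Gaussian kernel density estimate of `sample`, evaluated at the point x."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm


def count_modes(values):
    """Count strict interior local maxima in a sequence of density values."""
    return sum(
        1
        for left, mid, right in zip(values, values[1:], values[2:])
        if mid > left and mid > right
    )


def estimated_modes(sample, bandwidth, lo, hi, n_grid=401):
    """Evaluate the estimate on an evenly spaced grid over [lo, hi] and count its modes."""
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    return count_modes([kde(sample, x, bandwidth) for x in grid])


# Hypothetical data: two subpopulations centred at -3 and +3 (not from the report).
random.seed(0)
sample = [random.gauss(-3.0, 1.0) for _ in range(300)]
sample += [random.gauss(3.0, 1.0) for _ in range(300)]

# The mode count depends on the assumed bandwidth: a small bandwidth tends to
# produce spurious bumps, while a large one can merge distinct modes.
for h in (0.2, 1.0, 5.0):
    print(h, estimated_modes(sample, h, -8.0, 8.0))
```

The dependence of the count on the bandwidth is one reason a raw count of local maxima is a fragile estimate, and it suggests why a measure that also reflects how separated and how flat each modal region is, as the abstract proposes, may be more informative.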

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 1986
Accession Number
ADA175655

Entities

People

  • Guy Manuel

Organizations

  • Massachusetts Institute of Technology

Tags

DTIC Thesaurus Topics

  • Clustering
  • Data Science
  • High Density
  • Information Science
  • Low Density
  • Physical Properties
  • Sampling
  • Statistical Inference

Fields of Study

  • Mathematics

Readers

  • Computer Vision
  • Statistical Inference
  • Wave Propagation and Nonlinear Chaotic Dynamics

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms