Cluster Analysis.
Abstract
Current clustering techniques share several features that seem undesirable. For example, a 'cluster' remains an undefined concept; each clustering technique tends to work properly only under unstated, but often restrictive, implied assumptions; and the nonexistence of clustering statistics, or the lack of theory about the sampling distributions of those statistics when they do exist, makes it impossible to assess the statistical significance of a cluster. In this paper, after a brief review and critique of the most widely used clustering methods, definitions of a cluster and its related concepts are proposed. The clusters so defined and their associated statistics remain invariant under any monotonic transformation of the elements of the data matrix on which they depend. Their sampling distributions are investigated by analytic and Monte Carlo methods. Both artificial and real data are employed to illustrate the methodology and probability theory of the proposed clustering method. (Author)
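The invariance property named in the abstract can be illustrated with a small sketch. This is not the report's method: it uses ordinary single-linkage merging as a stand-in, chosen because its merge sequence depends only on the rank order of the pairwise dissimilarities. The data values, the function name `single_linkage_merges`, and the transformation `log1p(d) ** 3` are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def single_linkage_merges(dist):
    """Return the sequence of partitions produced by single-linkage merging.

    `dist` maps a pair (i, j) with i < j to a dissimilarity.
    Illustrative stand-in only; not the clustering method of the report.
    """
    points = sorted({p for pair in dist for p in pair})
    parent = {p: p for p in points}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    history = []
    # Only the ordering of the dissimilarities is used; ties are broken
    # by the pair labels so the result is deterministic.
    for i, j in sorted(dist, key=lambda pair: (dist[pair], pair)):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            clusters = {}
            for p in points:
                clusters.setdefault(find(p), []).append(p)
            history.append(sorted(tuple(c) for c in clusters.values()))
    return history


# Hypothetical one-dimensional data; dissimilarity is absolute difference.
values = [0.0, 0.4, 1.0, 5.0, 5.3, 9.0]
dist = {
    (i, j): abs(values[i] - values[j])
    for i, j in combinations(range(len(values)), 2)
}

# A strictly increasing (monotonic) transformation of every dissimilarity.
transformed = {pair: math.log1p(d) ** 3 for pair, d in dist.items()}

same = single_linkage_merges(dist) == single_linkage_merges(transformed)
print(same)
```

Because the transformation preserves the ordering of the dissimilarities, both runs visit the pairs in the same order and so produce the same sequence of partitions. A method whose output depended on the magnitudes of the dissimilarities, rather than their ranks, would not generally pass this check.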
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 1971
- Accession Number: AD0717333
Entities
People
- R. F. Ling
Organizations
- Yale University