A Framework for Hierarchical Ensemble Clustering

Abstract

Ensemble clustering, an important extension of the clustering problem, combines different input clusterings of a given dataset into a final consensus clustering that is, in some sense, a better fit than the input clusterings. Many ensemble clustering approaches have been developed over the past few years. However, most of them are designed for partitional clustering methods, and few research efforts have been reported on ensembles of hierarchical clusterings. In this article, we propose a hierarchical ensemble clustering framework that can naturally combine both partitional and hierarchical clustering results. In addition, we develop a novel method that learns an ultra-metric distance from the aggregated distance matrices and uses it to generate a final hierarchical clustering with enhanced cluster separation. We study three important problems: dendrogram description, dendrogram combination, and dendrogram selection. We develop two approaches for dendrogram selection based on tree distances, and we investigate various dendrogram distances for representing dendrograms. We provide a systematic empirical study of the ensemble hierarchical clustering problem. Experimental results demonstrate the effectiveness of our proposed approaches.
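
To make the pipeline named in the abstract concrete (describe each dendrogram by a distance matrix, aggregate the matrices, fit an ultra-metric), here is a minimal sketch using SciPy. It is an illustration under stated assumptions, not the article's algorithm: describing each dendrogram by its cophenetic distances, averaging them, and using average linkage as a stand-in for the ultra-metric learning step are all choices made for this example, and the function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import squareform


def cophenetic_vector(points, method):
    """Describe one dendrogram by its condensed cophenetic distances, scaled to [0, 1]."""
    tree = linkage(points, method=method)  # Euclidean distances by default
    coph = cophenet(tree)                  # condensed cophenetic distance vector
    return coph / coph.max()


def consensus_dendrogram(points, methods=("single", "complete", "average")):
    """Aggregate several dendrogram descriptions and build one final dendrogram."""
    # Dendrogram combination (assumption): plain average of the descriptions.
    aggregated = np.mean([cophenetic_vector(points, m) for m in methods], axis=0)
    # Ultra-metric fit (stand-in, not the article's method): average linkage
    # on the aggregated distances; its cophenetic distances are ultra-metric.
    return linkage(aggregated, method="average")


def is_ultrametric(condensed, tol=1e-9):
    """Check d(i, j) <= max(d(i, k), d(k, j)) for every triple (i, j, k)."""
    dist = squareform(condensed)
    # worst[i, k, j] = max(d(i, k), d(k, j)); minimise over the middle index k.
    worst = np.maximum(dist[:, :, None], dist[None, :, :])
    return bool(np.all(dist <= worst.min(axis=1) + tol))


# Toy data: two well-separated groups of five points each.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.1, size=(5, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.1, size=(5, 2)),
])

final_tree = consensus_dendrogram(points)
labels = fcluster(final_tree, t=2, criterion="maxclust")
print("final cophenetic distances are ultra-metric:",
      is_ultrametric(cophenet(final_tree)))
print("two-cluster cut:", labels)
```

Cutting the final dendrogram with `fcluster` is only one way to inspect the result. A partitional input clustering could be folded into the same aggregate as a 0/1 co-membership distance vector, which is one way to read the abstract's claim that both kinds of input can be combined.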

Document Details

Document Type
Pub Defense Publication
Publication Date
Sep 23, 2014
Source ID
10.1145/2611380

Entities

People

  • Chris Ding
  • Tao Li
  • Zheng Li

Organizations

  • Army Research Office
  • Division of Biological Infrastructure
  • Division of Human Resource Development
  • Florida International University
  • Nanjing University of Science and Technology
  • National Science Foundation Division of Mathematical Sciences
  • United States Department of Homeland Security
  • University of Texas at Arlington

Tags

Fields of Study

  • Computer science

Readers

  • Pavement Materials Engineering
  • Regression Analysis