Utilizing Fused Features to Mine Unknown Clusters in Training Data
Abstract
In this paper, a previously introduced data mining technique that uses the Mean Field Bayesian Data Reduction Algorithm (BDRA) is extended to find unknown data clusters in a fused multidimensional feature space. The BDRA's modeling assumption is that the discrete symbol probabilities of each class are a priori uniformly Dirichlet distributed, and its primary metric for selecting and discretizing all relevant features is an analytic formula for the probability of error conditioned on the training data. In extending the BDRA to this application, its built-in dimensionality reduction is exploited to isolate, automatically sort out, and mine all points contained in each unknown data cluster. In previous work, this approach was shown to perform comparably to a classifier that knows all cluster information when mining a single feature containing multiple unknown clusters. The primary contribution of the work presented here is therefore to demonstrate that the approach extends to cases where the features are fused and span more than one dimension. Performance is illustrated with simulated data containing multiple clusters, in which the fused feature space contains relevant classification information.
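The uniform Dirichlet assumption on discrete symbol probabilities can be illustrated with a minimal Python sketch. It is not the BDRA itself: it omits the probability-of-error formula, the feature selection and discretization search, and the cluster mining step. It only shows how two discretized features might be fused into one symbol index and how a uniform Dirichlet prior turns symbol counts into posterior mean probabilities, (count + 1) / (N + M). The function names, the fusing scheme, and the toy training data are all assumptions made for illustration.

```python
from collections import Counter


def fuse(f1, f2, levels2):
    """Map a pair of discrete feature values to one fused symbol index.

    Hypothetical fusing scheme: row-major indexing over the two features.
    """
    return f1 * levels2 + f2


def posterior_mean(symbols, num_symbols):
    """Posterior mean of symbol probabilities under a uniform Dirichlet prior.

    With N observations and M symbols, symbol s gets (count_s + 1) / (N + M).
    """
    counts = Counter(symbols)
    n = len(symbols)
    return [(counts.get(s, 0) + 1) / (n + num_symbols) for s in range(num_symbols)]


# Toy training data (assumed): two binary features per sample, two classes.
class_k = [(0, 0), (0, 0), (0, 1), (1, 0)]
class_l = [(1, 1), (1, 1), (0, 1), (1, 0)]
M = 4  # number of fused symbols: 2 levels x 2 levels

p_k = posterior_mean([fuse(a, b, 2) for a, b in class_k], M)
p_l = posterior_mean([fuse(a, b, 2) for a, b in class_l], M)


def classify(f1, f2):
    """Pick the class whose estimated symbol probability is larger (equal priors assumed)."""
    s = fuse(f1, f2, 2)
    return "k" if p_k[s] >= p_l[s] else "l"
```

In this sketch every fused symbol keeps a nonzero probability even when it never appears in the training data, which is the practical effect of the uniform prior.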
Document Details
- Document Type: Technical Report
- Publication Date: Jul 01, 2006
- Accession Number: ADA521524
Entities
People
- Peter Willett
- Robert S. Lynch Jr.
Organizations
- Naval Undersea Warfare Center