Bayesian Classification Using Noninformative Dirichlet Priors

Abstract

In this dissertation, the Combined Bayes Test (CBT) and its average probability of error, P(e), are developed. The CBT combines training and test data to infer symbol probabilities, where a Dirichlet (completely noninformative) prior is assumed for all classes. Using P(e), several results are shown based on the best quantization complexity, M* (which is related to the Hughes Phenomenon). For example, it is shown that M* increases with the training and test data. Also, it is demonstrated that the CBT outperforms both a more conventional Maximum Likelihood (ML) based test and the Kolmogorov-Smirnov Test (KST). With this, the Bayesian Data Reduction Algorithm (BDRA) is developed. The BDRA uses P(e) (conditioned on the training data) and a greedy approach for reducing irrelevant features from each class, and its performance is shown to be superior to that of a neural network. From here, the CBT is extended to demonstrate performance when the training data of each class are mislabeled. Performance is shown to degrade when mislabeling exists in the training data, with the degradation depending on the mislabeling probabilities. However, it is also shown that the BDRA can be used to diminish the effect of mislabeling. Further, the BDRA is modified, using two different approaches, to classify test observations when the training data of each class contain missing feature values. In the first approach, each missing feature is assumed to be uniformly distributed over its range of values; in the second approach, the number of discrete levels for each feature is increased by one. Both methods of modeling missing features are shown to perform similarly, and both also outperform a neural network.
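
To make the idea of inferring symbol probabilities under a noninformative Dirichlet prior concrete, the sketch below scores a set of test symbol counts against each class's training counts using the standard Dirichlet-multinomial posterior predictive with a uniform Dirichlet(1, ..., 1) prior. It is an illustration under stated assumptions, not the report's implementation: the function names `log_predictive` and `classify`, the equal class priors, and the choice of this particular predictive form as a stand-in for the CBT are all assumptions made here.

```python
from math import lgamma


def log_predictive(train_counts, test_counts):
    """Log probability of test_counts given train_counts.

    Assumes M discrete symbols, multinomial data, and a uniform
    Dirichlet(1, ..., 1) prior on the symbol probabilities, which gives
    the Dirichlet-multinomial posterior predictive. Hypothetical helper,
    not taken from the report.
    """
    if len(train_counts) != len(test_counts):
        raise ValueError("count vectors must have the same length M")
    m = len(train_counts)
    n_train = sum(train_counts)
    n_test = sum(test_counts)
    # Normalizing terms: N_y! * (N_x + M - 1)! / (N_x + N_y + M - 1)!
    out = lgamma(n_test + 1) + lgamma(n_train + m) - lgamma(n_train + n_test + m)
    # Per-symbol terms: (x_i + y_i)! / (x_i! * y_i!)
    for x, y in zip(train_counts, test_counts):
        out += lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
    return out


def classify(train_by_class, test_counts):
    """Pick the class whose training counts best predict the test counts.

    Assumes equal prior probability for every class (an assumption of
    this sketch).
    """
    scores = {
        label: log_predictive(counts, test_counts)
        for label, counts in train_by_class.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Two classes, M = 2 quantization levels, invented counts.
    training = {"a": [8, 2], "b": [2, 8]}
    label, scores = classify(training, [3, 0])
    print(label, scores)
```

With a single test observation, the predictive probability of symbol i reduces to (x_i + 1) / (N + M), where x_i is the training count for that symbol, N is the total training count, and M is the number of levels. This makes visible how M enters the estimate: with fixed training data, a larger M spreads the prior mass over more symbols.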

Document Details

Document Type: Technical Report
Publication Date: Jun 15, 1999
Accession Number: ADA366461

Entities

People

Robert S. Lynch

Organizations

Naval Undersea Warfare Center
