Bayesian Kernel Methods for Non-Gaussian Distributions: Binary and Multi-class Classification Problems

Abstract

Recent advances in data mining have integrated kernel functions with Bayesian probabilistic analysis of Gaussian distributions. These machine learning approaches can incorporate prior information with new data to calculate probabilistic rather than deterministic values for unknown parameters. This paper extensively analyzes a specific Bayesian kernel model that uses a kernel function to calculate a posterior beta distribution that is conjugate to the prior beta distribution. Numerical testing of the beta kernel model on several benchmark data sets reveals that this model's accuracy is comparable with that of the support vector machine and the relevance vector machine, and that the model runs more quickly than the other algorithms. When one class occurs much more frequently than the other class, the beta kernel model often outperforms other strategies for handling imbalanced data sets. If data arrive sequentially over time, the beta kernel model easily and quickly updates the probability distribution, and this model is more accurate than an incremental support vector machine algorithm for online learning when fewer than 50 data points are available.
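
The conjugate beta update described in the abstract can be illustrated with a minimal sketch. This is not the report's exact formulation: the Gaussian (RBF) kernel, the uniform Beta(1, 1) prior, and the names rbf_kernel, beta_kernel_posterior, and posterior_mean are all assumptions made for illustration. The sketch assumes that each labeled point adds its kernel similarity to the query point to one of the two beta parameters, so nearby points of a class pull the posterior toward that class.

```python
import math

def rbf_kernel(x, z, width=1.0):
    """Gaussian (RBF) kernel; the kernel choice and width are assumptions."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def beta_kernel_posterior(x_new, points, labels, prior_a=1.0, prior_b=1.0, width=1.0):
    """Return the (a, b) parameters of a posterior beta distribution at x_new.

    Assumed update rule: every training point adds its kernel weight to `a`
    if its label is 1 and to `b` otherwise, starting from the prior.
    """
    a, b = prior_a, prior_b
    for x, y in zip(points, labels):
        w = rbf_kernel(x_new, x, width)
        if y == 1:
            a += w
        else:
            b += w
    return a, b

def posterior_mean(a, b):
    """Mean of Beta(a, b): the estimated probability that x_new is in class 1."""
    return a / (a + b)

# Hypothetical toy data: two positive points near the origin, one negative far away.
points = [(0.0, 0.0), (0.1, 0.2), (2.0, 2.0)]
labels = [1, 1, 0]

a, b = beta_kernel_posterior((0.0, 0.0), points, labels)
print(posterior_mean(a, b))

# Sequential updating: the posterior from earlier data serves as the prior for later data.
a1, b1 = beta_kernel_posterior((0.0, 0.0), points[:2], labels[:2])
a2, b2 = beta_kernel_posterior((0.0, 0.0), points[2:], labels[2:], prior_a=a1, prior_b=b1)
```

Because the update only adds to two running parameters, processing the data in one batch or one point at a time yields the same posterior under this assumed rule, which is consistent with the abstract's remark that the model updates easily as data arrive.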

Document Details

Document Type
Technical Report
Publication Date
May 28, 2013
Accession Number
ADA595533

Entities

People

  • Cameron A. MacKenzie
  • Kash Barker
  • Theodore B. Trafalis

Organizations

  • University of Oklahoma

Tags

Communities of Interest

  • Autonomy
  • Engineered Resilient Systems
  • Space

DTIC Thesaurus Topics

  • Algorithms
  • Bayesian Networks
  • Data Mining
  • Data Sets
  • Distance Learning
  • Gaussian Distributions
  • Information Science
  • Kernel Functions
  • Machine Learning
  • Operations Research
  • Probability
  • Probability Distributions
  • Reliability Engineering
  • Statistical Analysis
  • Students
  • Supervised Machine Learning
  • Systems Engineering

Fields of Study

  • Computer science

Readers

  • Computational Modeling and Simulation
  • Parallel and Distributed Computing
  • Statistical Inference

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms