Stochastic Algorithms for Learning with Incomplete Data: An Application to Bayesian Networks

Abstract

A goal of machine learning is to develop intelligent systems that improve with experience. To achieve this goal the field has drawn on many diverse disciplines. Because of this diversity, there is no common framework for the field to develop a research vision. This research begins by proposing a machine learning framework. From the framework three hypotheses pertaining to learning Bayesian networks are proposed. The current state-of-the-art learning paradigm for inducing Bayesian networks from incomplete data involves using deterministic greedy hill-climbing algorithms. These algorithms suffer the fate of getting "stuck" at the nearest local maximum. In order to get around this problem, researchers use random restarts. The first hypotheses is that stochastic population-based algorithms will find networks with "good" predictive performance and do not get "stuck" at the nearest local maximum. I demonstrate this hypothesis with an evolutionary algorithm and a Markov Chain Monte Carlo algorithm. A problem with the Markov Chain Monte Carlo algorithm is that it is slow to converge to the stationary distribution. The second hypothesis developed from the framework is that using global information to propose new states will speed convergence. Finally, because the population-based algorithms have multiple models readily available, I explore the hypothesis that multiple models have a better predictive capability than the single "best" model. I demonstrate these hypotheses empirically with three carefully selected datasets.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Oct 25, 1999
Accession Number
ADA370490

Entities

People

  • James W. Myers

Organizations

  • Air Force Institute of Technology

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Bayesian Networks
  • Cognitive Science
  • Computational Science
  • Data Mining
  • Databases
  • Information Processing
  • Information Science
  • Information Systems
  • Intelligent Agents
  • Machine Learning
  • Monte Carlo Method
  • Network Science
  • Probabilistic Models
  • Surveys

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks