Stochastic Algorithms for Learning with Incomplete Data: An Application to Bayesian Networks
Abstract
A goal of machine learning is to develop intelligent systems that improve with experience. To achieve this goal the field has drawn on many diverse disciplines. Because of this diversity, there is no common framework for the field to develop a research vision. This research begins by proposing a machine learning framework. From the framework three hypotheses pertaining to learning Bayesian networks are proposed. The current state-of-the-art learning paradigm for inducing Bayesian networks from incomplete data involves using deterministic greedy hill-climbing algorithms. These algorithms suffer the fate of getting "stuck" at the nearest local maximum. In order to get around this problem, researchers use random restarts. The first hypotheses is that stochastic population-based algorithms will find networks with "good" predictive performance and do not get "stuck" at the nearest local maximum. I demonstrate this hypothesis with an evolutionary algorithm and a Markov Chain Monte Carlo algorithm. A problem with the Markov Chain Monte Carlo algorithm is that it is slow to converge to the stationary distribution. The second hypothesis developed from the framework is that using global information to propose new states will speed convergence. Finally, because the population-based algorithms have multiple models readily available, I explore the hypothesis that multiple models have a better predictive capability than the single "best" model. I demonstrate these hypotheses empirically with three carefully selected datasets.
Document Details
- Document Type
- Technical Report
- Publication Date
- Oct 25, 1999
- Accession Number
- ADA370490
Entities
People
- James W. Myers
Organizations
- Air Force Institute of Technology