MARKOVIAN DECISION PROCESSES WITH UNCERTAIN TRANSITION PROBABILITIES OR REWARDS
Abstract
In most Markov process studies to date it has been assumed that both the transition probabilities and the rewards are known exactly. The primary purpose of this thesis is to study the effects of relaxing these assumptions to allow more realistic models of real-world situations. The Bayesian approach used leads to statistical decision frameworks for Markov processes. The first section is concerned with situations where the transition probabilities are not known exactly. The first approach uses the concept of multi-matrix Markov processes: processes in which one of several known transition matrices is assumed to govern the process, but only a probability vector over the candidate matrices is available, rather than knowledge of which one is in force. The second approach assumes more directly that the transition probabilities themselves are random variables. It is shown that the multidimensional Beta distribution is a most convenient distribution (for Bayes calculations) to place over the probabilities of a single row of the transition matrix. Several important properties of the distribution are displayed. A method is then suggested for determining the multidimensional Beta prior distributions to use for any particular Markov process.
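The two Bayesian updates described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis. It assumes the standard modern parameterization of the multidimensional Beta distribution (now usually called the Dirichlet distribution), in which observing a transition into state j adds one to the j-th parameter of that row. The function names and the list-of-lists matrix representation are hypothetical choices made for this illustration only.

```python
def update_matrix_beliefs(prior, matrices, i, j):
    """Multi-matrix case: Bayes update of the probability vector over the
    candidate transition matrices after observing one transition i -> j.
    (Illustrative helper; name and signature are assumptions.)"""
    # Weight each prior probability by the likelihood of the observed transition.
    weighted = [p * m[i][j] for p, m in zip(prior, matrices)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("observed transition is impossible under every matrix")
    return [w / total for w in weighted]


def update_row_parameters(alpha, j):
    """Random-probabilities case: conjugate update of the multidimensional
    Beta (Dirichlet) parameters for a single row of the transition matrix.
    Assumes the standard parameterization, where one observed transition
    into state j increments alpha[j] by 1."""
    updated = list(alpha)  # copy so the caller's prior is left unchanged
    updated[j] += 1
    return updated


def expected_row(alpha):
    """Mean transition probabilities for one row under the current parameters."""
    total = sum(alpha)
    return [a / total for a in alpha]
```

For instance, with two candidate matrices weighted 0.5 each that assign probabilities 0.1 and 0.8 to the observed transition, the updated weights are 1/9 and 8/9. A row with parameters (1, 1, 1) that then sees a transition into the third state has parameters (1, 1, 2) and mean probabilities (0.25, 0.25, 0.5).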
Document Details
- Document Type: Technical Report
- Publication Date: Aug 01, 1963
- Accession Number: AD0417150
Entities
People
- Edward A. Silver
Organizations
- Massachusetts Institute of Technology