ALTERNATIVE SCHEMES FOR INVESTIGATING MARKOV DECISION PROCESSES.

Abstract

The report is concerned with the problem of finding optimal policies for infinite-horizon Markovian decision models. Several algorithms applicable to discrete time, discrete state, single-chain Markov processes with undiscounted rewards are described. One of these algorithms, called 'the iterative method with limits,' is proved and its properties are discussed. The method consists of finding upper and lower limits for the optimal gain per transition of the infinite-horizon process. These limits improve on each iteration of the algorithm, and converge to the final value of the optimal gain, ultimately. Simultaneously, the algorithm produces the 'relative value' variables, as well as the optimal policy. The computational efficiency of the aforementioned algorithms was compared by solving several test problems. It was found that for problems for which the number of alternatives per state is considerably less than the number of states, the iterative method with limits enjoys a serious speed advantage over the well-known policy iteration algorithms. The former method also produces very accurate results with very little programming effort. (Author)

Document Details

Document Type: Technical Report
Publication Date: Jun 01, 1967
Accession Number: AD0655893

Entities

People

Amedeo R. Odoni

Organizations

Massachusetts Institute of Technology

ALTERNATIVE SCHEMES FOR INVESTIGATING MARKOV DECISION PROCESSES.

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Readers