A MODIFIED DYNAMIC PROGRAMMING METHOD FOR MARKOVIAN DECISION PROBLEMS

Abstract

At the beginning of each period a system is in a certain state. Depending on the state, choice of an action determines the income for that period and the transition probabilities for moving to the next state. The problem is to choose the action at the beginning of each period which will maximize future total discounted income. A modified dynamic programming method for this problem is described. The method gives improved error control by showing how to compute upper and lower bonds on the optimal return which are, respectively, monotone decreasing and monotone increasing, and converge to the optimal return. The convergence appears to be quite rapid.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 1965
Accession Number
AD0619764

Entities

People

  • James B. Macqueen

Organizations

  • University of California, Los Angeles

Tags

Communities of Interest

  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • California
  • Computer Programming
  • Contracts
  • Convergence
  • Dynamic Programming
  • Equations
  • Iterations
  • Military Research
  • Probability
  • Random Variables
  • Sequences
  • Standards
  • Stationary
  • United States
  • Universities

Readers

  • Military Mobilization and Reserve Forces Studies.
  • Operations Research