FINDING OPTIMAL POLICIES IN DISCRETE DYNAMIC PROGRAMMING
Abstract
The problem considered is the discrete-time multichain decision problem introduced and largely solved by Howard (Dynamic Programming and Markov Processes, Technology Press and John Wiley & Sons, Inc., New York, 1960). A policy iteration algorithm is developed that obtains a stationary policy that is optimal for the discounted problem at all discount factors sufficiently close to one. (These are the policies that Blackwell has termed optimal and has proved to exist.) The algorithm is well suited to a decision maker who knows that he should use a low discount rate but does not know what it should be. (Author)
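As a rough illustration of the idea in the abstract, the sketch below runs ordinary policy iteration at several fixed discount factors and shows the resulting policy settling down as the factor approaches one. This is the standard fixed-discount scheme in the style of Howard, not the algorithm developed in the report, which works for all discount factors near one at once. The two-state problem, the names `P`, `R`, `solve`, and `policy_iteration`, and the tolerance are all assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def policy_iteration(P, R, beta):
    """Policy iteration for one fixed discount factor beta in (0, 1).

    P[s][a] is the list of transition probabilities from state s under
    action a, and R[s][a] is the one-step reward (hypothetical layout).
    """
    n = len(P)
    policy = [0] * n
    while True:
        # Policy evaluation: solve (I - beta * P_pi) v = r_pi.
        A = [[(1.0 if i == j else 0.0) - beta * P[i][policy[i]][j]
              for j in range(n)] for i in range(n)]
        v = solve(A, [R[i][policy[i]] for i in range(n)])
        # Policy improvement: switch only when an action is strictly better.
        new_policy = []
        for s in range(n):
            q = [R[s][a] + beta * sum(p * v[j] for j, p in enumerate(P[s][a]))
                 for a in range(len(P[s]))]
            best = max(range(len(q)), key=q.__getitem__)
            new_policy.append(best if q[best] > q[policy[s]] + 1e-12 else policy[s])
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Hypothetical example: in state 0, action 0 earns 1 and stays put, while
# action 1 earns 0 and moves to state 1; state 1 earns 2 forever.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]]]
R = [[1.0, 0.0], [2.0]]

for beta in (0.3, 0.9, 0.99):
    print(beta, policy_iteration(P, R, beta)[0])
```

In this example, staying is worth 1/(1 - beta) and moving is worth 2 beta/(1 - beta), so moving is preferred for every discount factor above one half. A decision maker who rerun the fixed-discount scheme for each candidate factor would see the same policy for all factors near one; the report's algorithm is aimed at finding such a policy directly.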
Document Details
- Document Type: Technical Report
- Publication Date: Apr 01, 1968
- Accession Number: AD0668760
Entities
People
- Bruce L. Miller
Organizations
- RAND Corporation