FINITE STATE CONTINUOUS TIME MARKOV DECISION PROCESSES WITH AN INFINITE PLANNING HORIZON
Abstract
The system considered may be in one of n states at any point in time; its probability law is a Markov process that depends on the policy (control) chosen. The return to the system over a given planning horizon is the integral over that horizon of a return rate that depends on both the policy and the sample path of the process. The objective is to find a policy that maximizes the expected discounted return as the planning horizon tends to infinity. The case where the discount factor goes to zero is also considered. In all cases it is shown that there is a stationary policy that is optimal, and an algorithm is given to obtain that policy. (Author)
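The setting in the abstract can be illustrated with a minimal sketch of policy iteration for a discounted continuous-time Markov decision process. This is a standard textbook-style construction, not necessarily the algorithm given in the report. The names `rates`, `rewards`, `alpha`, and the two-state example data are assumptions made for illustration only: `rates[i][a]` is the row of transition rates out of state `i` under action `a` (summing to zero), `rewards[i][a]` is the return rate, and `alpha` is a positive discount rate.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def evaluate(policy, rates, rewards, alpha):
    """Expected discounted return of a stationary policy.

    Solves (alpha * I - Q_f) v = r_f, where Q_f and r_f are the rate
    matrix and return-rate vector induced by the policy.
    """
    n = len(policy)
    a = [[(alpha if i == j else 0.0) - rates[i][policy[i]][j]
          for j in range(n)] for i in range(n)]
    b = [rewards[i][policy[i]] for i in range(n)]
    return solve_linear(a, b)


def policy_iteration(rates, rewards, alpha):
    """Alternate policy evaluation and improvement until the policy is stable."""
    n = len(rates)
    policy = [0] * n
    while True:
        v = evaluate(policy, rates, rewards, alpha)
        new_policy = []
        for i in range(n):
            def score(a):
                return rewards[i][a] + sum(rates[i][a][j] * v[j] for j in range(n))
            best = max(range(len(rates[i])), key=score)
            # Switch only on a strict improvement, so ties cannot cause cycling.
            if score(best) > score(policy[i]) + 1e-12:
                new_policy.append(best)
            else:
                new_policy.append(policy[i])
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Hypothetical two-state example, two actions per state (illustrative data only).
rates = [
    [[-1.0, 1.0], [-3.0, 3.0]],   # state 0: action 0, action 1
    [[2.0, -2.0], [0.5, -0.5]],   # state 1: action 0, action 1
]
rewards = [
    [1.0, 0.5],
    [3.0, 2.0],
]
alpha = 0.1

policy, value = policy_iteration(rates, rewards, alpha)
print("stationary policy:", policy)
print("discounted value by starting state:", value)
```

The returned policy is stationary: it picks one action per state regardless of time, which matches the form of optimal policy the abstract describes. The sketch covers only the discounted case with a strictly positive discount rate; the limiting case treated in the report would need a different evaluation step.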
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 1967
- Accession Number: AD0659730
Entities
People
- Bruce L. Miller
Organizations
- RAND Corporation