Markov Decision Processes with Policy Constraints
Abstract
This work is concerned with Markov Decision Processes with policy constraints. The selection of an optimum stationary policy for such processes, in the absence of policy constraints, is a problem which has received a great deal of attention, and has been satisfactorily solved. Relatively little attention has been given to the case when policy constraints are present or to the formulation of such constraints. Optimum policy sensitivity analysis is also a subject in which little has been achieved. Towards those ends, this work makes three major contributions. First, policy constraints are formulated and categorized. Secondly, a computationally efficient iterative algorithm is developed for selecting the optimum policy for completely ergodic, infinite time horizon Markov Decision Processes with policy constraints for both the risk-indifferent and risk-sensitive cases. Finally, the sensitivity of optimum policies to the policy constraints is analyzed by using the algorithm to compute the value of removing a constraint or a group of constraints. (Author)
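To make the abstract's idea of "the value of removing a constraint" concrete, the following is a minimal sketch and not the report's algorithm. It assumes the simplest kind of policy constraint, a set of allowed actions per state, and substitutes standard relative value iteration for the report's iterative method. It covers only the risk-indifferent (expected average reward) case. The function name `constrained_gain`, the `allowed` structure, and the two-state numbers are all illustrative assumptions.

```python
# Sketch under stated assumptions: a "policy constraint" here is only a
# restriction of the actions allowed in each state. The report formulates
# and categorizes constraints more generally.

def constrained_gain(P, R, allowed, iters=500):
    """Relative value iteration restricted to the allowed actions.

    P[s][a]    : list of next-state probabilities for action a in state s
    R[s][a]    : expected one-step reward for action a in state s
    allowed[s] : set of actions permitted in state s (hypothetical format)
    Returns (estimated average reward per step, greedy policy).
    Assumes every allowed policy gives an ergodic, aperiodic chain.
    """
    n = len(P)
    h = [0.0] * n            # relative values, with state 0 as reference
    gain = 0.0
    policy = [None] * n
    for _ in range(iters):
        new_h = []
        for s in range(n):
            best_a, best_q = None, float("-inf")
            for a in sorted(allowed[s]):
                q = R[s][a] + sum(p * h[t] for t, p in enumerate(P[s][a]))
                if q > best_q:
                    best_a, best_q = a, q
            policy[s] = best_a
            new_h.append(best_q)
        gain = new_h[0]                      # h[0] is held at 0, so this estimates the gain
        h = [v - new_h[0] for v in new_h]    # renormalize against state 0
    return gain, policy


# Illustrative two-state, two-action process (numbers are not from the report).
P = [
    [[0.5, 0.5], [0.8, 0.2]],   # state 0: action 0, action 1
    [[0.4, 0.6], [0.7, 0.3]],   # state 1: action 0, action 1
]
R = [
    [6.0, 4.0],                 # state 0
    [-3.0, -5.0],               # state 1
]

# Unconstrained problem: every action is allowed in every state.
g_free, pi_free = constrained_gain(P, R, [{0, 1}, {0, 1}])

# Constrained problem: action 1 is forbidden in state 0.
g_con, pi_con = constrained_gain(P, R, [{0}, {0, 1}])

# Value of removing the constraint: the gain lost by imposing it.
value_of_removal = g_free - g_con
print(pi_free, g_free)
print(pi_con, g_con)
print(value_of_removal)
```

In this example the unconstrained optimum uses action 1 in both states with an average reward of 2 per step, while forbidding action 1 in state 0 lowers the best achievable average reward to 17/12. The difference of 7/12 per step is what removing that single constraint would be worth.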
Document Details
- Document Type: Technical Report
- Publication Date: Apr 01, 1976
- Accession Number: ADA034249
Entities
People
- John Nafeh
Organizations
- Stanford University