Dynamic Programming with Negative Rewards and Average Reward Criterion.
Abstract
Basic results on dynamic programming with finite state and action spaces are generalized to a class of dynamic programming problems that includes most queuing control problems treated in literature. The results establish existence of average reward optimal stationary policies and uniqueness of solution to the functional equation of dynamic programming in the average reward case among nonpositive functions up to an additive constant. The question of convergence of discount optimal policies to average optimal policies is settled in a fairly general setting. Existence of nearly optimal policies is shown in a particular queuing control problem. The principle which turns out to be most useful in the study is a basic probabilistic systems theorem to the effect that a fair game remains fair under transformation by certain systems of optional stopping. (Author)
Document Details
- Document Type
- Technical Report
- Publication Date
- May 01, 1975
- Accession Number
- ADA016636
Entities
People
- Prakash Awate
Organizations
- Cornell University