Dynamic Programming with Negative Rewards and Average Reward Criterion.

Abstract

Basic results on dynamic programming with finite state and action spaces are generalized to a class of dynamic programming problems that includes most queuing control problems treated in literature. The results establish existence of average reward optimal stationary policies and uniqueness of solution to the functional equation of dynamic programming in the average reward case among nonpositive functions up to an additive constant. The question of convergence of discount optimal policies to average optimal policies is settled in a fairly general setting. Existence of nearly optimal policies is shown in a particular queuing control problem. The principle which turns out to be most useful in the study is a basic probabilistic systems theorem to the effect that a fair game remains fair under transformation by certain systems of optional stopping. (Author)

Document Details

Document Type: Technical Report
Publication Date: May 01, 1975
Accession Number: ADA016636

Entities

People

Prakash Awate

Organizations

Cornell University

Dynamic Programming with Negative Rewards and Average Reward Criterion.

Abstract

Document Details

Entities

People

Organizations

Tags

DTIC Thesaurus Topics

Readers

Technology Areas