Dynamic Programming with Negative Rewards and Average Reward Criterion.

Abstract

Basic results on dynamic programming with finite state and action spaces are generalized to a class of dynamic programming problems that includes most queuing control problems treated in literature. The results establish existence of average reward optimal stationary policies and uniqueness of solution to the functional equation of dynamic programming in the average reward case among nonpositive functions up to an additive constant. The question of convergence of discount optimal policies to average optimal policies is settled in a fairly general setting. Existence of nearly optimal policies is shown in a particular queuing control problem. The principle which turns out to be most useful in the study is a basic probabilistic systems theorem to the effect that a fair game remains fair under transformation by certain systems of optional stopping. (Author)

Document Details

Document Type
Technical Report
Publication Date
May 01, 1975
Accession Number
ADA016636

Entities

People

  • Prakash Awate

Organizations

  • Cornell University

Tags

DTIC Thesaurus Topics

  • Additives (Chemicals)
  • Computer Programming
  • Convergence
  • Dynamic Programming
  • Equations
  • Literature
  • Mathematics
  • Stationary

Readers

  • Mathematical Modeling and Probability Theory.

Technology Areas

  • Space
  • Space - Spacecraft Maneuvers