DISCRETE DYNAMIC PROGRAMMING WITH SENSITIVE DISCOUNT OPTIMALITY CRITERIA.

Abstract

Stationary, discrete and continuous time parameter, finite state and action Markovian decision processes with substochastic transition matrices are studied where the future rewards are discounted. The methods of successive approximations and policy improvement are shown to converge to the maximal expected discounted reward when the interst rate is negative provided the expected discounted time until the process stops is finite under every stationary policy. A family of discount optimality criteria of varying degrees of sensitivity is introduced. The family includes and unifies the discount criteria studied by Blackwell. The existence of stationary optimal policies is established constructively and the class of such policies is characterized. The algorithms are essentially specializations and refinements of a policy improvement method recently devised by Miller and the author for finding stationary policies having maximal expected discounted reward for all small enough positive interest rates. The results exploit the fact that the Laurent expansion in rho (the interest rate) about the origin of the discounted return under a stationary policy has a simple form. (Author)

Document Details

Document Type: Technical Report
Publication Date: Aug 30, 1968
Accession Number: AD0673965

Entities

People

Arthur F. Veinott Jr.

Organizations

Stanford University

DISCRETE DYNAMIC PROGRAMMING WITH SENSITIVE DISCOUNT OPTIMALITY CRITERIA.

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Readers