Optimality of Stationary Halting Policies and Finite Termination of Successive Approximations.
Abstract
The stopping and halting optimality of stationary halting policies in discrete-time-parameter S-state finite-action branching Markov decision chains are characterized by the finite termination of successive approximations. A policy is called halting (resp., stopping) if the expected population size at time N is zero for some N (resp., converges to zero as N approaches infinity). The value of a policy is the expected infinite-horizon income that it earns. An optimal stopping (resp., halting) policy is one having maximum value in that class of policies. It is shown that when the rewards are real (resp., real or minus infinity) valued, the N-th iterate of successive approximations (and a Gauss-Seidel improvement thereof) is a fixed point of the optimal return operator for some N when initiated with the value of a stationary halting policy if and only if that is so for some N < or = S; moreover this occurs if and only if there exists a halting stationary optimal stopping (resp., halting) policy.
Document Details
- Document Type
- Technical Report
- Publication Date
- May 01, 1978
- Accession Number
- ADA061522
Entities
People
- Ranel E. Erickson
Organizations
- Stanford University