Constrained Markov Decision Chains.

Abstract

Consider a finite state and action, discrete time parameter Markov decision chain. The objective is to provide an algorithm for finding a policy that minimizes the long-run expected average cost when there are linear side conditions on the limit points of the expected state-action frequencies. This problem has been solved previously only for the case where every deterministic stationary policy has at most one ergodic class. The note removes that restriction by applying the Dantzig-Wolfe decomposition principle. (Author)
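
For orientation, the previously solved case mentioned in the abstract (every deterministic stationary policy has at most one ergodic class) is commonly written as a linear program over state-action frequencies. The sketch below is that standard formulation, not the note's own algorithm. All notation is assumed here for illustration and does not come from the record: x(s,a) is the long-run frequency of taking action a in state s, c(s,a) the one-step cost, p(j | s,a) the transition probability, and d_k(s,a), b_k the data of the K linear side conditions.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{s \in S} \sum_{a \in A(s)} c(s,a)\, x(s,a) \\
\text{s.t.}\quad
 & \sum_{a \in A(j)} x(j,a) \;-\; \sum_{s \in S} \sum_{a \in A(s)} p(j \mid s,a)\, x(s,a) = 0,
   && j \in S, \\
 & \sum_{s \in S} \sum_{a \in A(s)} x(s,a) = 1, \\
 & \sum_{s \in S} \sum_{a \in A(s)} d_k(s,a)\, x(s,a) \le b_k,
   && k = 1, \dots, K, \\
 & x(s,a) \ge 0, && s \in S,\ a \in A(s).
\end{aligned}
```

Under the single-ergodic-class assumption, an optimal x yields a stationary (possibly randomized) policy that chooses a in state s with probability x(s,a) / \sum_{a'} x(s,a') wherever that denominator is positive. When policies can have several ergodic classes, the achievable frequencies also depend on the initial state, so this single balance system is no longer sufficient; that is the restriction the note removes.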

Document Details

Document Type
Technical Report
Publication Date
Oct 22, 1971
Accession Number
AD0737644

Entities

People

  • Arthur F. Veinott Jr.
  • Cyrus Derman

Organizations

  • Stanford University

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Chemical Reactions
  • Decomposition
  • Frequency
  • Stationary

Readers

  • Mathematical Modeling and Probability Theory
  • Operations Research