Approximate Receding Horizon Approach for Markov Decision Processes: Average Reward Case

Abstract

The authors consider an approximation scheme for solving Markov decision processes (MDPs) with countable state space, finite action space, and bounded rewards. The scheme uses an approximate solution of a fixed finite-horizon sub-MDP of a given infinite-horizon MDP to create a stationary policy, which they call "approximate receding horizon control." They first analyze the performance of approximate receding horizon control for the infinite-horizon average reward criterion under an ergodicity assumption; this analysis also generalizes the result obtained by White. The authors then study two examples of approximate receding horizon control that are based on lower bounds on the exact solution of the sub-MDP. The first control policy is based on a finite-horizon approximation of Howard's policy improvement of a single policy, and the second is based on a generalization of single-policy improvement to multiple policies. They also provide a simple alternative proof of policy improvement for countable state spaces. Finally, the authors discuss practical implementations of these schemes via simulation.
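
The receding horizon idea summarized above can be illustrated with a minimal Python sketch. It is not taken from the report: it assumes a small finite MDP, solves the finite-horizon sub-MDP exactly by backward induction (the report studies approximate solutions on countable state spaces), and evaluates the resulting stationary policy by its long-run average reward under an ergodicity assumption. The function names, the two-state example, and the array layout (P indexed as action, state, next state; R indexed as state, action) are all hypothetical choices made for illustration.

```python
import numpy as np


def finite_horizon_values(P, R, horizon):
    """Optimal total reward over `horizon` steps, by backward induction.

    P has shape (A, S, S): P[a, s, t] is the probability of s -> t under a.
    R has shape (S, A): R[s, a] is the one-step reward.
    """
    V = np.zeros(R.shape[0])
    for _ in range(horizon):
        # Q[s, a] = R[s, a] + sum_t P[a, s, t] * V[t]
        Q = R + np.einsum("ast,t->sa", P, V)
        V = Q.max(axis=1)
    return V


def receding_horizon_policy(P, R, horizon):
    """Stationary policy: in every state, take the first action of an
    optimal `horizon`-step plan (horizon >= 1)."""
    V_rest = finite_horizon_values(P, R, horizon - 1)
    Q = R + np.einsum("ast,t->sa", P, V_rest)
    return Q.argmax(axis=1)


def average_reward(P, R, policy, iterations=1000):
    """Long-run average reward of a stationary policy.

    Assumes the induced Markov chain is ergodic, so power iteration on the
    state distribution converges to the unique stationary distribution.
    """
    states = np.arange(R.shape[0])
    P_pi = P[policy, states, :]   # row s is the transition law under policy[s]
    r_pi = R[states, policy]      # reward collected in each state
    mu = np.full(len(states), 1.0 / len(states))
    for _ in range(iterations):
        mu = mu @ P_pi
    return float(mu @ r_pi)


# Hypothetical two-state, two-action MDP (illustration only).
# Action 0 tends to keep the current state; action 1 tends to switch it.
P = np.array([
    [[0.9, 0.1], [0.1, 0.9]],   # action 0
    [[0.1, 0.9], [0.9, 0.1]],   # action 1
])
R = np.array([
    [0.2, 0.0],   # state 0: small reward for staying, none for switching
    [1.0, 0.5],   # state 1: large reward for staying
])

myopic_policy = receding_horizon_policy(P, R, horizon=1)
lookahead_policy = receding_horizon_policy(P, R, horizon=2)
myopic_gain = average_reward(P, R, myopic_policy)
lookahead_gain = average_reward(P, R, lookahead_policy)
```

In this toy example, a horizon of 1 gives the myopic policy that stays put in both states, while a horizon of 2 already switches out of state 0 and earns a higher average reward. Replacing the exact backward induction with a simulation-based estimate would be closer in spirit to the schemes the abstract describes, but that substitution is not shown here.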

Document Details

Document Type
Technical Report
Publication Date
May 24, 2002
Accession Number
ADA438476

Entities

People

  • Hyeong S. Chang
  • Steven I. Marcus

Organizations

  • University of Maryland

Tags

Communities of Interest

  • C4I
  • Human Systems
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Air Force
  • Algorithms
  • Communication Networks
  • Dynamic Programming
  • Electronic Mail
  • Engineering
  • Equations
  • Ergodic Processes
  • Information Operations
  • Inventory Control
  • Monte Carlo Method
  • Probability
  • Probability Distributions
  • Random Variables
  • Simulations
  • Stationary
  • Universities

Readers

  • Mathematical Modeling and Probability Theory

Technology Areas

  • Space
  • Space - Spacecraft Maneuvers