Markov Games: Receding Horizon Approach

Abstract

The authors consider a receding horizon approach as an approximate solution to two-person, zero-sum Markov games with infinite horizon discounted cost and average cost criteria. They first present error bounds from the optimal equilibrium value of the game when both players take correlated equilibrium receding horizon policies that are based on exact or approximate solutions of receding finite horizon subgames. Motivated by the worst-case optimal control of queueing systems by Altman, they then analyze error bounds when the minimizer plays the (approximate) receding horizon control and the maximizer plays the worst case policy. They give three heuristic examples of the approximate receding horizon control. They extend "rollout" by Bertsekas and Castanon and "parallel rollout" and "hindsight optimization" by Chang et al. into the Markov game setting within the framework of the approximate receding horizon approach and analyze their performances. From the rollout/parallel rollout approaches, the minimizing player seeks to improve the performance of a single heuristic policy it rolls out or to combine dynamically multiple heuristic policies in a set to improve the performances of all of the heuristic policies simultaneously under the guess that the maximizing player has chosen a fixed worst-case policy. Given epsilon > 0, they give the value of the receding horizon, which guarantees that the parallel rollout policy with the horizon played by the minimizer dominates any heuristic policy in the set by epsilon. From the hindsight optimization approach, the minimizing player makes a decision based on his or her expected optimal hindsight performance over a finite horizon. The authors discuss practical implementations of the receding horizon approaches via simulation.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Dec 15, 2001
Accession Number
ADA438504

Entities

People

  • Hyeong S. Chang
  • Steven I Marcus

Organizations

  • University of Maryland

Tags

Communities of Interest

  • C4I
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Electronic Mail
  • Engineering
  • Equations
  • Errors
  • Game Theory
  • Inequalities
  • Iterations
  • Markov Chains
  • Optimization
  • Probability
  • Probability Distributions
  • Random Variables
  • Sampling
  • Simulations
  • Theorems
  • Zero-Sum Games

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Game Theory.
  • Operations Research