Multi-time Scale Markov Decision Processes

Abstract

This paper proposes a simple analytical model, the M time-scale Markov Decision Process (MMDP), for hierarchically structured sequential decision-making processes in which decisions at each level of the M-level hierarchy are made on a different time-scale. In this model, the state space and the control space of each level are disjoint from those of the other levels, and the hierarchy is structured as a "pyramid": a decision made at level m (slower time-scale) affects the evolution of the decision-making process at the lower level m + 1 (faster time-scale) until a new decision is made at the higher level, but the lower-level decisions do not themselves affect the higher level's transition dynamics. The performance produced by the lower level's decisions does, however, affect the higher level's decisions. A hierarchical objective function is defined such that the single-step reward for the level m decision-making process is the finite-horizon value of following a (nonstationary) policy at level m + 1 over one decision epoch of level m, plus an immediate reward at level m. From this the authors define a "multi-level optimal value function" and derive a "multi-level optimality equation." They then discuss how to solve MMDPs exactly or approximately and examine heuristic online methods for solving them. Finally, they give some example control problems that can be modeled as MMDPs.
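
The hierarchical objective described in the abstract can be illustrated with a minimal two-level (M = 2) sketch. This is not the report's formulation. It makes several simplifying assumptions: the fast level restarts from a fixed state X0 at every slow-level epoch, the slow level uses a discounted infinite-horizon criterion, and the fast level acts optimally within each epoch. All names and numbers (T, GAMMA, X0, the transition and reward functions) are hypothetical and chosen only for illustration.

```python
# Hypothetical two-level toy instance; none of these values come from the report.
T = 3        # fast-level steps per slow-level decision epoch
GAMMA = 0.9  # slow-level discount factor (an assumption)
UPPER_STATES, UPPER_ACTIONS = [0, 1], [0, 1]
LOWER_STATES, LOWER_ACTIONS = [0, 1], [0, 1]
X0 = 0       # assumed fast-level state at the start of every epoch


def upper_transition(i, lam):
    """P(next slow state | i, lam); it ignores fast-level decisions."""
    stay = 0.8 if lam == 0 else 0.3
    return {i: stay, 1 - i: 1.0 - stay}


def upper_reward(i, lam):
    """Immediate slow-level reward."""
    return 1.0 if i == lam else 0.0


def lower_transition(x, a, i, lam):
    """P(next fast state | x, a), shaped by the slow-level pair (i, lam)."""
    p = 0.5 + 0.2 * a + 0.1 * lam - 0.1 * i  # probability of moving to state 1
    return {1: p, 0: 1.0 - p}


def lower_reward(x, a, i, lam):
    """Fast-level reward: being in state 1 pays, action 1 costs."""
    return float(x) - 0.25 * a


def epoch_value(i, lam):
    """Optimal T-step fast-level value from X0 with (i, lam) held fixed."""
    v = {x: 0.0 for x in LOWER_STATES}
    for _ in range(T):  # finite-horizon backward induction
        v = {
            x: max(
                lower_reward(x, a, i, lam)
                + sum(p * v[y] for y, p in lower_transition(x, a, i, lam).items())
                for a in LOWER_ACTIONS
            )
            for x in LOWER_STATES
        }
    return v[X0]


def single_step_reward(i, lam):
    """Slow-level reward = immediate reward + fast-level value over one epoch."""
    return upper_reward(i, lam) + epoch_value(i, lam)


def solve_upper(iterations=200):
    """Value iteration on the slow level using the composite reward."""
    v = {i: 0.0 for i in UPPER_STATES}
    for _ in range(iterations):
        v = {
            i: max(
                single_step_reward(i, lam)
                + GAMMA * sum(p * v[j] for j, p in upper_transition(i, lam).items())
                for lam in UPPER_ACTIONS
            )
            for i in UPPER_STATES
        }
    return v


if __name__ == "__main__":
    print(solve_upper())
```

The point of the sketch is the coupling: the slow-level pair (i, lam) shapes the fast-level dynamics and reward, while the fast level feeds back only through the value it accumulates over one epoch, never through the slow-level transition probabilities.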

Document Details

Document Type: Technical Report
Publication Date: Feb 24, 2002
Accession Number: ADA438506

Entities

People

  • Hyeong S. Chang
  • Mark Shayman
  • Pedram Fard
  • Steven I Marcus

Organizations

  • University of Maryland

Tags

Communities of Interest

  • C4I

DTIC Thesaurus Topics

  • Algorithms
  • Composite Materials
  • Differential Equations
  • Engineering
  • Equations
  • Ergodic Processes
  • Governments
  • Heuristic Methods
  • Markov Chains
  • Money
  • Operations Research
  • Probability
  • Probability Distributions
  • Production Planning
  • Random Variables
  • Scale Models
  • Standards

Readers

  • Computational Fluid Dynamics (CFD)
  • Mathematical Modeling and Probability Theory
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • Space