Hierarchical Average Reward Reinforcement Learning

Abstract

Hierarchical reinforcement learning (HRL) is the study of mechanisms for exploiting the structure of tasks in order to learn more quickly. By decomposing tasks into subtasks, fully or partially specified subtask solutions can be reused in solving tasks at higher levels of abstraction. The theory of semi-Markov decision processes (SMDPs) provides a theoretical basis for HRL. Several representational schemes based on SMDP models have been studied in previous work, all of which are based on the discrete-time discounted SMDP model. In that approach, policies are learned that maximize the long-term discounted sum of rewards. In this paper we investigate two formulations of HRL based on the average-reward SMDP model, in both discrete and continuous time. In the average-reward model, policies are sought that maximize the expected reward per step. The two formulations correspond to two different notions of optimality that have been explored in previous work on HRL: hierarchical optimality, which corresponds to the set of optimal policies in the space defined by a task hierarchy, and a weaker local model called recursive optimality.
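
As an illustration of the average-reward criterion mentioned in the abstract, the following minimal sketch computes the long-run reward per unit time (the gain) of one fixed policy in a small SMDP. It is not taken from the report: the two-state transition matrix, the rewards, the durations, and the function names are all hypothetical. It assumes an ergodic embedded chain and uses the standard renewal-reward ratio, expected reward per decision divided by expected duration per decision, both weighted by the stationary distribution.

```python
# Illustrative sketch only: the model and names below are assumptions,
# not the report's algorithm or data.

def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by repeated multiplication (assumes an ergodic, aperiodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def smdp_gain(P, rewards, durations):
    """Average reward per unit time of a fixed policy in an SMDP.

    P[i][j]      : transition probability between decision epochs
    rewards[i]   : expected reward accumulated per decision taken in state i
    durations[i] : expected time until the next decision from state i
    """
    pi = stationary_distribution(P)
    reward_per_decision = sum(p * r for p, r in zip(pi, rewards))
    time_per_decision = sum(p * t for p, t in zip(pi, durations))
    return reward_per_decision / time_per_decision


# Hypothetical two-state example under one fixed policy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
rewards = [1.0, 4.0]    # reward earned per decision in each state
durations = [1.0, 3.0]  # time taken per decision in each state

gain = smdp_gain(P, rewards, durations)
print(gain)
```

For this toy model the stationary distribution is (5/6, 1/6), so the gain works out analytically to (9/6) / (8/6) = 9/8. Setting every duration to 1 recovers the discrete-time reward per step.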

Document Details

Document Type
Technical Report
Publication Date
Jun 25, 2003
Accession Number
ADA445728

Entities

People

  • Mohammad Ghavamzadeh
  • Sridhar Mahadevan

Organizations

  • University of Massachusetts Amherst

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Automated Guided Vehicles
  • Autonomous Systems
  • Computer Science
  • Learning
  • Machine Learning
  • Manufacturing
  • Markov Chains
  • Navigation
  • Probability
  • Probability Distributions
  • Random Variables
  • Reinforcement Learning
  • Simulations
  • Steady State
  • Systems Engineering

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Mathematical Modeling and Probability Theory

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • Space
  • Space - Spacecraft Maneuvers