Tree-Based Hierarchical Reinforcement Learning

Abstract

In this thesis, the author investigates methods for speeding up automatic control algorithms. Specifically, he provides new abstraction techniques for Reinforcement Learning and Semi-Markov Decision Processes (SMDPs). He also introduces the use of policies as temporally abstract actions. This differs from previous definitions of temporally abstract actions in that these policies have no termination criteria. He provides an approach for processing previously solved problems to extract these policies. He also contributes a method for using supplied or extracted policies to guide and speed up the solving of new problems. He treats policy extraction as a supervised learning task and introduces the Lumberjack algorithm, which extracts repeated sub-structure within a decision tree. He then introduces the TTree algorithm, which combines state and temporal abstraction to increase problem-solving speed on new problems. TTree solves SMDPs by using both user- and machine-supplied policies as temporally abstract actions while generating its own tree-based abstract state representation. By combining state and temporal abstraction in this way, TTree is the only known SMDP algorithm that is able to ignore irrelevant or harmful subregions within a supplied abstract action while still making use of other parts of the abstract action.


Document Details

Document Type: Technical Report
Publication Date: Aug 01, 2002
Accession Number: ADA457553

Entities

People

William T. Uther

Organizations

Carnegie Mellon University
