PECASE: Gradients and Hierarchy in Deep Reinforcement Learning

Abstract

Short Work Statement: Develop deep hierarchical reinforcement learning for integrating perception and control in complex, long-duration tasks where rewards are delayed.

Objective: Investigate reinforcement learning for the integration of perception and control, and develop algorithms for learning complex, long-duration tasks.

Approach: The PI will investigate reinforcement learning, particularly in cases where rewards are substantially delayed. The primary domain of investigation is the integration of perception and control. Reinforcement learning for problems with substantial reward delays is generally intractable. The PI proposes to develop deep hierarchical architectures to make these problems tractable, and will also investigate whether such hierarchical architectures make learning new tasks easier. The other aspect of this proposal is developing methods for computing policy gradients, which are at the core of many learning algorithms, including reinforcement learning, deep neural nets, LSTM networks and others. The PI will investigate the stochastic computation graphs formalism for computing policy gradients.

Merit/Relevance: This effort is related to ONR's Autonomy focus area, as well as its Information Dominance focus area. It will contribute to building capable robots that can perform complex tasks that require long durations. The work is expected to result in tractable algorithms for reinforcement learning on problems with substantial reward delays, and in fast and accurate methods for computing policy gradients used in a wide variety of learning algorithms.

Document Details

Document Type: DoD Grant Award
Publication Date: Aug 12, 2016
Source ID: N000141612723

Entities

People

Pieter Abbeel

Organizations

Office of Naval Research
United States Navy
University of California Regents
