AFOSR YIP ROBUST MAXIMUM ENTROPY PLANNING, LEARNING AND CONTROL IN UNCERTAIN ENVIRONMENTS

Abstract

This work will develop flexible, robust, and efficient methods for sequential decision making in scenarios where there is significant uncertainty in the environment and reward signal. This work is motivated by the hypothesis that learning accurate models of complex environments is prohibitive, and that learning must be robust even in the setting of low-fidelity models. The approach builds on maximum entropy reinforcement learning (MaxEnt RL), which encourages high reward while maintaining policy uncertainty via entropy. The first research effort focuses on developing robust and sample-efficient model-based learning methods that extend the MaxEnt RL approach. The proposed methods simultaneously learn model representations and policy, while encouraging high policy uncertainty. Additional robustness is obtained by developing a diversity-preserving sampling mechanism to identify distinct high-quality trajectories in high-dimensional continuous state-action spaces. The second research effort addresses random and unknown reward signals by specifying a generative model that includes a prior belief over random reward functions. Efficient variational techniques are developed to marginalize over the unknown rewards. Finally, the proposed work will build on this random rewards model to learn rewards from expert demonstrations, when they are available, performing so-called inverse reinforcement learning (IRL). The proposed approach is referred to as MaxEnt IRL, since it extends the maximum entropy RL framework that is developed throughout this project.
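
To illustrate the trade-off that MaxEnt RL makes between reward and policy uncertainty, the following is a minimal sketch of the standard entropy-regularized objective for a single state with a handful of discrete actions. It is not the method proposed in this award; the function names, the action values in `q`, and the `temperature` weight on the entropy term are all assumptions chosen for illustration. Under those assumptions, the policy that maximizes expected value plus `temperature` times entropy is a softmax over the action values.

```python
import math


def soft_policy(q_values, temperature):
    """Softmax policy: maximizes E_pi[q] + temperature * H(pi) for one state."""
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(policy):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in policy if p > 0.0)


def maxent_objective(q_values, policy, temperature):
    """Expected action value plus the entropy bonus."""
    expected_value = sum(p * q for p, q in zip(policy, q_values))
    return expected_value + temperature * entropy(policy)


# Hypothetical action values for one state (illustrative numbers only).
q = [1.0, 0.5, -1.0]
temperature = 0.5

pi_soft = soft_policy(q, temperature)
pi_greedy = [1.0, 0.0, 0.0]  # deterministic policy on the best action

soft_score = maxent_objective(q, pi_soft, temperature)
greedy_score = maxent_objective(q, pi_greedy, temperature)
```

In this sketch the greedy policy earns the highest expected value but no entropy bonus, so `soft_score` is at least as large as `greedy_score`. Raising `temperature` spreads probability more evenly across actions, which is the sense in which the objective maintains policy uncertainty.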

Document Details

Document Type: DoD Grant Award
Publication Date: Mar 07, 2023
Source ID: FA95502210194

Entities

People

Jason Pacheco

Organizations

Air Force Office of Scientific Research
United States Air Force
University of Arizona
