Modularity, Constraints and Multimodality in Learning for Complex, Long-Horizon Sequential Decision-Making

Abstract

The objective of this work is to develop a theoretical and algorithmic foundation for efficiently learning sequential decision-making policies that are safe during training and deployment, generalize to unforeseen environments, adapt easily to new tasks, and are robust to adversarial perturbations. While these attributes are critical for real-world learning, current algorithms fail to provide such guarantees in a manner that scales beyond simple, short-horizon tasks. The proposed effort seeks to take a significant step toward overcoming these limitations using a range of techniques that draw upon three core organizing principles: (1) using modularity and compositionality to provide guarantees for long-horizon problems, (2) leveraging constraints at learning time, rather than post hoc, and (3) using multimodal, heterogeneous data to triangulate concepts. Together, these synergistic principles support the safety, generalization, adaptability, and adversarial robustness of the resulting learning algorithms.

Document Details

Document Type: DoD Grant Award
Publication Date: Jan 04, 2021
Source ID: W911NF2110009

Entities

People

Scott Niekum

Organizations

Army Contracting Command
Defense Advanced Research Projects Agency
University of Texas at Austin
