Robot Learning with Formal Guarantees on Out-of-Distribution Detection and Generalization

Abstract

The goal of this project is to develop a principled framework for learning control policies for robots with guaranteed out-of-distribution (OOD) generalization. As an example, consider a micro aerial vehicle (MAV) trained to perform vision-based navigation using a dataset of outdoor environments and deployed in environments with varying weather conditions, lighting, or obstacle densities. Similarly, consider a robotic manipulator tasked with manipulating a new set of objects or an autonomous vehicle deployed in a new city. Current state-of-the-art techniques for learning-based control of robots are extremely brittle when faced with such distribution shifts, i.e., when the distribution of environments the robot is tested on differs from the training distribution. The significant consequences of failure for safety-critical robotic systems demand an approach that allows us to make formal guarantees on OOD generalization. The goal of this project is to develop precisely such an approach.

Technical approach. The key technical insight of this project is to leverage and extend ideas from generalization theory, differential privacy, and causal inference to achieve three tightly integrated objectives. First, we propose algorithms for detecting when a robot is operating in OOD environments (e.g., in order to trigger an emergency safety policy). These algorithms are developed using the PAC-Bayes generalization framework and provide guaranteed confidence bounds on OOD detection while only responding to task-relevant variations in the robot's environment. Second, we propose techniques based on differential privacy to learn control policies that are insensitive to a large set of realistic distribution shifts (e.g., as measured by the Wasserstein distance). Third, we propose approaches based on causal inference for learning policies that generalize beyond the support of the training distribution (i.e., generalize to environments that have zero probability of appearing under the training distribution).

Anticipated outcome. The proposed effort targets the fundamental science of OOD generalization for robotic systems. We anticipate that the proposed project will lead to a foundational theoretical framework and practical algorithms for providing guarantees on OOD detection and generalization for robots with rich sensory inputs and neural network-based control policies. An integral part of this effort will be to thoroughly demonstrate and validate our approach on hardware platforms, with a particular focus on aerial inspection and manipulation using MAVs. We hypothesize that our experiments will demonstrate significant gains over state-of-the-art approaches in terms of (i) the ability to detect task-relevant distribution shifts, (ii) OOD generalization performance (as measured by sample efficiency and amount of distribution shift that can be tolerated), and (iii) the ability to provide strong theoretical guarantees on OOD generalization.

Impact on DoD capabilities. The proposed work has the potential to significantly improve the U.S. Navy's ability to deploy robotic systems in novel, unstructured, and complex environments. The proposed framework is directly applicable to a broad range of robotic systems and application domains, including mobile manipulators performing infrastructure inspection and repair tasks, MAVs performing reconnaissance missions in cluttered environments, and underwater vehicles operating in contested maritime environments. The proposed approach for learning control policies for robots with guaranteed OOD generalization could overcome challenges associated with state-of-the-art solutions and allow the U.S. Navy to deploy such systems in settings that were previously beyond reach. The proposed project is well aligned with ONR programs in machine learning and autonomy (Codes 311 and 351), along with the broad goals outlined in the Naval Research and Development Framework.

Document Details

Document Type: DoD Grant Award
Publication Date: Jan 12, 2023
Source ID: N000142312148

Entities

People

Anirudha Majumdar

Organizations

Office of Naval Research
Trustees of Princeton University
United States Navy

