Advanced Autonomy for Unmanned Surface Vehicles via Distributional Reinforcement Learning

Abstract

This proposal describes a five-year project that will investigate the potential for distributional reinforcement learning to support advanced autonomy capabilities for unmanned surface vehicles (USVs). Reinforcement learning (RL) has enjoyed many successes in its application thus far to robots and autonomous systems. However, as a tool to support reliable, long-duration autonomy in the real world, the standard process of decision-making based only on expectation offers a limited perspective on the potential outcomes of the heavy-tailed and multi-modal probability distributions that may govern a USV's actions in real-world stochastic environments. Distributional reinforcement learning offers a potential way forward. Compared to traditional RL methods, distributional RL has been shown to provide more stable learning behavior in environments with high uncertainty, because it learns full return distributions rather than only their expected values. In addition, risk measures can be applied to adjust the level of sensitivity to aleatoric uncertainty in the learned distributions and to enhance the safety of a distributional RL agent. Our research to date suggests that distributional RL, and specifically our adaptation of Implicit Quantile Networks (IQN) to USV navigation, can permit USVs to navigate safely and efficiently through densely packed obstacles and flow disturbances under minimalistic sensing. Our preliminary results from congested multi-vehicle scenarios also suggest that there is strong potential for distributional RL to support safe multi-vehicle autonomy and complex tasks.

This research project will investigate how distributional reinforcement learning can support advanced autonomy capabilities for USVs. Firstly, we will use distributional RL to learn a diverse portfolio of individual policies capable of safely executing complex tasks in dynamic, unstructured, and uncertain environments, characterized by harsh disturbances, minimalistic sensing, and degraded perception. In such settings, distributional RL offers the potential for improved performance and increased safety. Secondly, one layer above the aforementioned task execution policies, we will establish the infrastructure for centralized mission planning and task allocation to address complex missions requiring multiple USVs, alongside decentralized task execution compatible with low-bandwidth and intermittent communications. When a deployment of USVs is sufficiently large, our planning layer will also manage team formation, and tasks will be allocated hierarchically to teams of USVs based on their physical locations. Thirdly, to support our development, testing, and validation of the above capabilities, experiments will be performed at different levels of fidelity, using a "Sim2Sim2Real" approach that encompasses a low-fidelity simulator, a high-fidelity simulator, and physical experiments with real USVs.
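
To make the contrast between expectation-based and risk-sensitive decision-making concrete, the following is a minimal sketch and not the project's implementation. It assumes each action's return distribution is summarized by a small set of equally weighted quantile samples, which is the kind of output a quantile-based method such as IQN can provide, and it uses lower-tail conditional value at risk (CVaR) as one common choice of risk measure. The action names, the sample values, and the functions expected_return, cvar, and select_action are hypothetical and chosen only for illustration.

```python
import math


def expected_return(quantiles):
    """Risk-neutral value: mean of equally weighted quantile samples."""
    return sum(quantiles) / len(quantiles)


def cvar(quantiles, alpha):
    """Lower-tail CVaR: mean of the worst ceil(alpha * N) quantile samples.

    alpha = 1.0 recovers the plain expectation; smaller alpha puts more
    weight on bad outcomes, i.e. makes the agent more risk-averse.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    k = max(1, math.ceil(alpha * len(quantiles)))
    worst = sorted(quantiles)[:k]
    return sum(worst) / k


def select_action(quantiles_by_action, alpha=1.0):
    """Pick the action whose return distribution has the highest CVaR."""
    return max(
        quantiles_by_action,
        key=lambda action: cvar(quantiles_by_action[action], alpha),
    )


# Hypothetical return samples for two navigation choices (illustrative only).
# Cutting through a gap usually pays off but occasionally ends in a collision;
# going around is slower but has no bad tail.
returns = {
    "cut_through_gap": [-50.0, 8.0, 10.0, 12.0, 40.0],
    "go_around": [2.0, 3.0, 3.0, 4.0, 5.0],
}

risk_neutral_choice = select_action(returns, alpha=1.0)
risk_averse_choice = select_action(returns, alpha=0.5)
```

With these illustrative numbers, the two actions have similar means (4.0 and 3.4), so the risk-neutral rule prefers cutting through the gap, while the CVaR rule with alpha = 0.5 prefers going around because it weighs only the worse half of the outcomes. An IQN-style agent would typically obtain a similar effect by changing how quantile levels are sampled instead of sorting a fixed set of samples; the fixed-sample form is used here only to keep the sketch short.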

Document Details

Document Type
DoD Grant Award
Publication Date
Nov 09, 2024
Source ID
N000142412522

Entities

People

  • Brendan Englot

Organizations

  • Office of Naval Research
  • Stevens Institute of Technology
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Statistical inference.
  • Unmanned Aerial System (UAS) Autonomous Capabilities and Mission Reconnaissance.

Technology Areas

  • AI & ML
  • AI & ML - Autonomous Systems
  • AI & ML - DoD AI Strategy
  • Autonomy
  • Autonomy - Autonomous System Control