Collaborative Proposal: Feasible, Model-Free Distributionally Robust Policy Learning

Abstract

Research Problem and Objectives

Standard reinforcement learning (RL) assumes that the training environment matches the deployment environment. This proposal addresses distributionally robust RL, which aims to provide reliable performance under distribution shifts at test time. The goals are to: develop sample-efficient, model-free algorithms with theoretical guarantees; ensure feasibility without generative models; provide guarantees for episodic and infinite-horizon RL; handle large state spaces and partial observability; and evaluate robustness empirically.

Technical Approaches

The project will design model-free algorithms for episodic and infinite-horizon RL settings. Episodic RL will use threshold estimators and the empirical Bernstein inequality. Infinite-horizon RL will modify Q-learning and RMAX/UCRL, incorporating constrained MDPs. Function approximation and history-based approximations will be explored.

Anticipated Outcomes

New sample complexity bounds, feasible algorithms, demonstrated robustness under distribution shifts, an open-source release, guidelines for tuning and tradeoffs, insights into robustness versus efficiency, and extensions for generalization and partial observability. The work aims to significantly advance the theory and practice of safe, reliable RL under uncertainty.

Impact on DoD Capabilities

Enable reliable decision-making and control for autonomous systems, reducing the real-world data needed for sim-to-real transfer. Speed certification and acquisition. Produce adaptable autonomy that safely handles novel scenarios. Provide performance guarantees that increase trust in autonomous systems. Overall, enable robust, reliable, practical AI that can operate under uncertainty across defense applications.

Approved for public release.

Document Details

Document Type
DoD Grant Award
Publication Date
Nov 09, 2024
Source ID
N000142412655

Entities

People

  • Jose Blanchet

Organizations

  • Office of Naval Research
  • Stanford University
  • United States Navy

Tags

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Neural Network Machine Learning
  • Operations Research

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • Autonomy
  • Autonomy - Autonomous System Control
  • Space
  • Space - Spacecraft Maneuvers