Reinforcement Learning as a Rehearsal for Planning in Air Battle Management (RLAR)

Abstract

This project leveraged recent advances in reinforcement learning (RL) to develop planners for real-time strategy games, specifically MicroRTS, in lieu of the Stratagem program's wargame. One of these advances, from the PI's lab, is called reinforcement learning as a rehearsal (RLaR). Previously, RLaR had only been evaluated on toy benchmark tasks to establish its efficacy in reducing sample complexity. This project developed RLaR for the actor-critic architecture and applied it for the first time to a complex domain with incomplete information, MicroRTS. Another technique applied in this project originated from the recent successes of multi-agent learning in the complex StarCraft II game, specifically the multi-stage training architecture that develops league and league-exploiter policies during intermediate stages in order to train robust policies. We trained RLaR against MicroPhantom, the runner-up from recent MicroRTS competitions, and showed its ability to plan effectively against this opponent while using fewer samples than relevant baselines. Separately, we trained RLaR in self-play using the 4-stage training scheme and evaluated the trained policy against MentalSeal (the champion program) and MicroPhantom. While the policy once again showed good performance against MicroPhantom, it did not perform competently against MentalSeal. Based on an earlier preliminary finding that training against MentalSeal is extremely slow, we speculate that vastly more training time is required than what we could devote to this step during the extended period for this project.

Document Details

Document Type: Technical Report
Publication Date: Mar 01, 2023
Accession Number: AD1196764

Entities

People

  • Bikramjit Banerjee

Organizations

  • University of Southern Mississippi

Tags

Communities of Interest

  • Autonomy
  • Human Systems

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Battle Management
  • Computational Science
  • Contracts
  • Engineering
  • Information Exchange
  • Information Processing
  • Information Science
  • Information Systems
  • Machine Learning
  • Military Research
  • Network Architecture
  • Neural Networks
  • Probability
  • Reinforcement Learning
  • Standards
  • United States

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems
  • Military Training and Readiness Simulation
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy
  • AI & ML - Neural Networks