Optimal game theoretic solution of the pursuit‐evasion intercept problem using on‐policy reinforcement learning

Abstract

This article presents a rigorous formulation of the pursuit‐evasion (PE) game when velocity constraints are imposed on the agents (players) of the game. The game is formulated as an infinite‐horizon problem using a non‐quadratic functional, and sufficient conditions are then derived to prove capture in finite time. A novel tracking Hamilton–Jacobi–Isaacs (HJI) equation associated with the non‐quadratic value function is employed, which is solved for the Nash equilibrium velocity policies of each agent with arbitrary nonlinear dynamics. In contrast to existing approaches to proving capture in PE games, the proposed method does not assume that the players move at their maximum velocities, and it accounts for the velocity constraints a priori. Attaining the optimal actions requires solving the HJI equations online and in real time. We address this problem with an on‐policy iteration scheme based on the integral reinforcement learning (IRL) technique. The persistence of excitation condition required for IRL is satisfied inherently until capture occurs, at which time the game ends. Furthermore, a nonlinear backstepping control method is proposed to track the desired optimal velocity trajectories for players with generalized Newtonian dynamics. Simulation results are provided to show the validity of the proposed methods.
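
To make the idea of velocity-constrained players and finite-time capture concrete, the following is a minimal illustrative sketch. It is not the article's HJI/IRL solution. It assumes single-integrator kinematics in the plane, a heuristic line-of-sight policy for both players, and a tanh saturation that keeps each commanded speed within its bound. All names and numeric values (`bounded_velocity`, `simulate`, `gain`, the speed limits, the capture radius) are hypothetical choices made for illustration.

```python
import math


def bounded_velocity(dx, dy, v_max, gain):
    """Velocity along (dx, dy) whose speed never exceeds v_max.

    Assumption: speed = v_max * tanh(gain * distance), a smooth saturation
    chosen for illustration; it is not taken from the article.
    """
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    speed = v_max * math.tanh(gain * dist)
    return (speed * dx / dist, speed * dy / dist)


def simulate(pursuer, evader, vp_max=2.0, ve_max=1.0, gain=5.0,
             capture_radius=0.1, dt=0.01, t_end=20.0):
    """Integrate both players with forward Euler steps.

    Returns the capture time, or None if no capture occurs before t_end.
    """
    px, py = pursuer
    ex, ey = evader
    steps = int(round(t_end / dt))
    for k in range(steps):
        dx, dy = ex - px, ey - py  # line of sight from pursuer to evader
        if math.hypot(dx, dy) <= capture_radius:
            return k * dt
        # The pursuer closes along the line of sight; the evader flees along it.
        vpx, vpy = bounded_velocity(dx, dy, vp_max, gain)
        vex, vey = bounded_velocity(dx, dy, ve_max, gain)
        px, py = px + vpx * dt, py + vpy * dt
        ex, ey = ex + vex * dt, ey + vey * dt
    return None


if __name__ == "__main__":
    print("faster pursuer:", simulate((0.0, 0.0), (5.0, 0.0)))
    print("equal speed limits:", simulate((0.0, 0.0), (5.0, 0.0), vp_max=1.0))
```

In this toy setting the separation shrinks only while the pursuer's speed bound exceeds the evader's, so the first call reports a capture time and the second returns `None`. The article instead derives sufficient conditions for capture from the non‐quadratic value function rather than from a fixed heuristic policy.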

Document Details

Document Type
Pub Defense Publication
Publication Date
Aug 01, 2021
Source ID
10.1002/rnc.5719

Entities

People

  • Atilla Dogan
  • Frank Lewis
  • Kamesh Subbarao
  • Yusuf Kartal

Organizations

  • Army Research Office
  • Center for Hierarchical Manufacturing
  • Office of Naval Research Global
  • University of Texas at Arlington

Tags

Readers

  • Calculus or Mathematical Analysis
  • Game Theory
  • Robotics and Automation

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms