Autonomous Learning in a Pseudo-Episodic Physical Environment

Abstract

Forpractical considerations reinforcement learning has proven to be a difficult task outside of simulation when applied to a physical experiment. Here we derive an optional approach to model free reinforcement learning, achieved entirely online, through careful experimental design and algorithmic decision making. We design a reinforcement learning scheme to implement traditionally episodic algorithms for an unstable 1-dimensional mechanical environment. The training scheme is completely autonomous, requiring no human to be present throughout the learning process. We show that the pseudo-episodic technique allows for additional learning updates with off-policy actor-critic and experience replay methods. We show that including these additional updates between periods of traditional training episodes can improve speed and consistency of learning. Furthermore, we validate the procedure in experimental hardware. In the physical environment, several algorithm variants learned rapidly, each surpassing baseline maximum reward. The algorithms in this research are model free and use only information obtained by an onboard sensor during training.

Document Details

Document Type
Pub Defense Publication
Publication Date
Feb 01, 2022
Source ID
10.1007/s10846-022-01577-5

Entities

People

  • Daniel J Inman
  • Kevin Haughn

Organizations

  • Air Force Office of Scientific Research

Tags

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Computational Modeling and Simulation
  • Neural Network Machine Learning.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks