Autonomous Learning in a Pseudo-Episodic Physical Environment
Abstract
For practical considerations, reinforcement learning has proven to be a difficult task outside of simulation when applied to a physical experiment. Here we derive an optional approach to model-free reinforcement learning, achieved entirely online, through careful experimental design and algorithmic decision making. We design a reinforcement learning scheme to implement traditionally episodic algorithms for an unstable 1-dimensional mechanical environment. The training scheme is completely autonomous, requiring no human to be present throughout the learning process. We show that the pseudo-episodic technique allows for additional learning updates with off-policy actor-critic and experience replay methods. We show that including these additional updates between periods of traditional training episodes can improve the speed and consistency of learning. Furthermore, we validate the procedure in experimental hardware. In the physical environment, several algorithm variants learned rapidly, each surpassing the baseline maximum reward. The algorithms in this research are model-free and use only information obtained by an onboard sensor during training.
Document Details
- Document Type: Pub Defense Publication
- Publication Date: Feb 01, 2022
- Source ID: 10.1007/s10846-022-01577-5
Entities
People
- Daniel J Inman
- Kevin Haughn
Organizations
- Air Force Office of Scientific Research