Understanding the Adverse Effects of Accelerating Reinforcement Learning with Human Trainers
Abstract
Recent advances in reinforcement learning (RL) have propelled the idea that artificially intelligent agents may one day replace humans in performing complex tasks. Moving RL from a simulated environment to the real world poses numerous challenges. In particular, understanding the decision-making process of RL agents and ascertaining their viability for use in safety-constrained environments are key challenges. An evolving approach to addressing these challenges is to impart human knowledge into the learning algorithms. Through a comprehensive evaluation using a Pong RL agent, this thesis provides evidence that incorporating human influence into an RL algorithm can cause a strategy conflict and impede learning. In particular, it shows that (i) there is an inflection point, measured in training episodes, in the positive effect of incorporating human influence for the Pong agent, and that (ii) if human influence is not decayed beyond the inflection point, the negative effect can intensify and eventually undo all prior training gains.
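As an illustration only, the idea of decaying human influence past an inflection point might be sketched as follows. This is not the thesis's method: the function names, the exponential decay form, and the parameters `initial` and `decay_rate` are all hypothetical assumptions, and the abstract does not state how human influence was actually combined with the agent's policy.

```python
import random


def human_influence(episode, inflection_episode, initial=0.5, decay_rate=0.99):
    """Probability of deferring to the human trainer at a given episode.

    Hypothetical schedule: hold the influence constant up to the
    inflection point, then decay it exponentially afterwards.
    """
    if episode <= inflection_episode:
        return initial
    return initial * decay_rate ** (episode - inflection_episode)


def choose_action(agent_action, human_action, episode, inflection_episode,
                  initial=0.5, decay_rate=0.99, rng=random):
    """Pick the human's suggested action with the scheduled probability,
    otherwise fall back to the agent's own action."""
    p = human_influence(episode, inflection_episode, initial, decay_rate)
    return human_action if rng.random() < p else agent_action
```

With `decay_rate=1.0` the influence never decays, which corresponds to the undecayed case the abstract warns about; a value below 1.0 lets the agent's own policy gradually take over once the inflection point has passed.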
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2020
- Accession Number: AD1126458
Entities
People
- Brandon R Hee
Organizations
- Naval Postgraduate School