An Evaluation of Using Deterministic Heuristics to Accelerate Reinforcement Learning
Abstract
Neural networks often require long training times, depending on the corpus of data available to them. Reinforcement learning in particular can take a long time to attain satisfactory performance. Recent efforts to incorporate deterministic logical rules and physical laws into neural networks have shown promising results. Starting from an existing baseline neural network designed to learn Pong strictly from a pixel representation of the game board, this thesis adds a ball-trajectory-based heuristic to the learning process and evaluates its performance. The evaluation shows initial game-score improvements, followed by a sharp score degradation after about 25,000 games. A second evaluation shows that the heuristic increases training time by approximately 35%. More work remains to assess the long-term viability of this approach.
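As an illustration of what a ball-trajectory-based heuristic for Pong might look like, the following is a minimal sketch. It is not the thesis's implementation, which this record does not describe. It assumes the ball position and velocity are already known, that the ball travels in a straight line and reflects perfectly off the top and bottom walls, and that y increases downward as in screen coordinates. The function names, parameters, and action labels are hypothetical.

```python
def predict_intercept_y(ball_x, ball_y, vel_x, vel_y, paddle_x, field_height):
    """Estimate the y coordinate where the ball reaches column paddle_x.

    Assumes straight-line motion with perfect reflection off the walls at
    y = 0 and y = field_height. Returns None if the ball is not moving
    toward the paddle.
    """
    if vel_x == 0:
        return None
    steps = (paddle_x - ball_x) / vel_x
    if steps < 0:
        return None  # ball is moving away from the paddle

    # Position the ball would reach if there were no walls.
    raw_y = ball_y + vel_y * steps

    # Fold the unbounded position back into [0, field_height]; each wall
    # bounce mirrors the path, which gives a triangle wave of this period.
    period = 2 * field_height
    folded = raw_y % period
    return folded if folded <= field_height else period - folded


def heuristic_action(paddle_y, target_y, dead_zone=2.0):
    """Choose a paddle move toward target_y (hypothetical action labels)."""
    if target_y is None:
        return "STAY"
    if target_y < paddle_y - dead_zone:
        return "UP"      # smaller y is higher on screen (assumption)
    if target_y > paddle_y + dead_zone:
        return "DOWN"
    return "STAY"
```

How such a suggested action would be combined with the network's own learned policy during training is a separate design decision that this sketch does not cover.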
Document Details
- Document Type: Technical Report
- Publication Date: Dec 01, 2018
- Accession Number: AD1069772
Entities
People
- Garret M Walton
Organizations
- Naval Postgraduate School