Enhanced Experience Replay for Deep Reinforcement Learning

Abstract

Deep reinforcement learning has recently performed very well at learning control policies for Atari 2600 games. Using raw frames taken directly from an Atari emulator, these systems train a convolutional neural network to interpret the state of the game and select the optimal action. The network is trained with temporal-difference Q-learning, and a memory of state-action-reward transitions is kept and used in an experience-replay algorithm to increase training efficiency. Recent work reports performance at or above the level of an expert human player in many of the games; however, when behavior is evaluated on a more qualitative level, there are major inconsistencies with the actions of an intelligent player. To improve these behavioral characteristics, we introduce three new techniques: 1) we bias the experience-replay selection step toward state transitions that received a positive reward; 2) we compare newly observed states to a set of recently observed states and, if the states are similar to within a threshold, take a random action rather than the action chosen by the current policy; and 3) we perform the reinforcement learning updates only on the topmost linear layers as experiences are generated. This report details these techniques and presents preliminary results.
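
The first two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the abstract does not state how the replay bias is applied or how state similarity is measured, so the names BiasedReplayMemory and choose_action, the positive_weight parameter, the Euclidean distance metric, and the rule that a match against any one recent state triggers a random action are all assumptions made for illustration.

```python
import random
from collections import deque

import numpy as np


class BiasedReplayMemory:
    """Replay memory that oversamples transitions with positive reward.

    Hypothetical sketch: positive_weight is an assumed bias strength,
    not a value taken from the report.
    """

    def __init__(self, capacity=10000, positive_weight=4.0, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out
        self.positive_weight = positive_weight
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Positive-reward transitions get a larger sampling weight.
        weights = [self.positive_weight if t[2] > 0 else 1.0
                   for t in self.buffer]
        # random.Random.choices samples with replacement.
        return self.rng.choices(list(self.buffer), weights=weights,
                                k=batch_size)


def choose_action(state, greedy_action, recent_states, num_actions,
                  threshold, rng):
    """Return (action, was_random).

    Assumption: two states count as similar when their Euclidean
    distance is below threshold, and one similar recent state is enough
    to override the policy's greedy action with a random one.
    """
    for past in recent_states:
        if np.linalg.norm(state - past) < threshold:
            return rng.randrange(num_actions), True
    return greedy_action, False
```

In this sketch, setting positive_weight to 1.0 recovers uniform replay sampling, and setting threshold to 0 disables the similarity check, so each technique can be switched off independently when comparing behavior.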

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2015
Accession Number
ADA624278

Entities

People

  • Bryan Dawson
  • David Doria
  • Manuel Vindiola

Organizations

  • United States Army Research Laboratory

Tags

Communities of Interest

  • Autonomy
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence Software
  • Convolutional Neural Networks
  • Data Science
  • Failure Mode And Effect Analysis
  • Information Science
  • Learning
  • Machine Learning
  • Military Research
  • Neural Networks
  • Reinforcement Learning
  • Sequences
  • Standards
  • Training
  • Unsupervised Machine Learning
  • Video Games

Fields of Study

  • Computer science

Readers

  • Mathematical Modeling and Probability Theory
  • Neural Network Machine Learning
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks