Measuring and characterizing generalization in deep reinforcement learning

Abstract

Deep reinforcement learning (RL) methods have achieved remarkable performance on challenging control tasks. Observations of the resulting behavior give the impression that the agent has constructed a generalized representation that supports insightful action decisions. We re‐examine what is meant by generalization in RL, and propose several definitions based on an agent's performance in on‐policy, off‐policy, and unreachable states. We propose a set of practical methods for evaluating agents with these definitions of generalization. We demonstrate these techniques on a common benchmark task for deep RL, and we show that the learned networks make poor decisions for states that differ only slightly from on‐policy states, even though those states are not selected adversarially. We focus our analyses on the deep Q‐networks (DQNs) that kicked off the modern era of deep RL. Taken together, these results call into question the extent to which DQNs learn generalized representations, and suggest that more experimentation and analysis is necessary before claims of representation learning can be supported.

Document Details

Document Type: Pub Defense Publication
Publication Date: Nov 17, 2021
Source ID: 10.1002/ail2.45

Entities

People

Akanksha Atrey
David Jensen
Emma Tosch
Jun K. Lee
Kaleigh Clary
Michael L. Littman
Sam Witty

Organizations

Brown University
Defense Advanced Research Projects Agency
United States Air Force
University of Vermont
