Convergence Behavior of Temporal Difference Learning.

Abstract

Temporal difference learning is an important class of incremental learning procedures that learn to predict the outcomes of sequential processes through experience. Although these algorithms have been used in a variety of well-known intelligent systems, such as Samuel's checker player and Tesauro's backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.

Document Details

Document Type: Technical Report
Publication Date: May 01, 1996
Accession Number: ADA318671

Entities

People

Raj P. Malhotra
