Explorations of the Practical Issues of Learning Prediction-Control Tasks Using Temporal Difference Learning Methods

Abstract

There has been recent interest in using a class of incremental learning algorithms called temporal difference learning methods to attack problems of prediction. These algorithms have been brought to bear on various prediction problems in the past, but have remained poorly understood. It is the purpose of this thesis to further explore this class of algorithms, particularly the TD(lambda) algorithm. A number of practical issues are raised and discussed from a general theoretical perspective and then explored in the context of several case studies. The thesis presents a framework for viewing these algorithms independent of the particular task at hand and uses this framework to explore not only tasks of prediction, but also prediction tasks that require control, whether complete or partial. This includes applying the TD(lambda) algorithm to two tasks: (1) learning to play tic-tac-toe from the outcome of self-play and the outcome of play against a perfectly playing opponent and (2) learning two simple one-dimensional image segmentation tasks.
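
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of tabular TD(lambda) with accumulating eligibility traces. It is not taken from the thesis: the task (a five-state random walk that pays 1 on exiting to the right and 0 on exiting to the left), the function name td_lambda_random_walk, and all parameter values are assumptions chosen only for illustration.

```python
import random


def td_lambda_random_walk(n_states=5, episodes=5000, alpha=0.01,
                          lam=0.8, gamma=1.0, seed=0):
    """Estimate state values on a random walk with tabular TD(lambda).

    Hypothetical illustration: the task and the parameter values are
    assumptions, not the setup used in the thesis.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states              # one prediction per non-terminal state
    for _ in range(episodes):
        traces = [0.0] * n_states          # eligibility traces, reset each episode
        state = n_states // 2              # every episode starts in the middle
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:             # fell off the left end: reward 0
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:   # fell off the right end: reward 1
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # Temporal difference error: successive predictions should agree.
            td_error = reward + gamma * next_value - values[state]

            # Mark the current state as eligible, then credit all recently
            # visited states in proportion to their (decaying) traces.
            traces[state] += 1.0
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    for i, v in enumerate(td_lambda_random_walk()):
        print(f"state {i}: estimated value {v:.3f}")
```

In this assumed task the true value of state i (counting from 0) is (i + 1) / 6, so the estimates should increase from left to right. Setting lam to 0 reduces the update to one-step TD, while values closer to 1 spread each error further back along the visited states.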

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1992
Accession Number
ADA270836

Entities

People

  • Charles L. Isbell

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Cognitive Science
  • Computer Languages
  • Computer Science
  • Computer Vision
  • Computers
  • Dimensionality Reduction
  • Electrical Engineering
  • Information Science
  • Machine Learning
  • Neural Networks
  • Signal Processing
  • Supervised Machine Learning
  • Unsupervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Computer Vision.
  • Game Theory.