Reinforcement Learning With High-Dimensional, Continuous Actions

Abstract

Many reinforcement learning systems, such as Q-learning (Watkins, 1989) or advantage updating (Baird, 1993), require that a function f(x,u) be learned, and that the value of argmax f(x,u) be calculated quickly for any given x. The function f could be learned by a function approximation system such as a multilayer perceptron, but the maximum of f for a given x cannot be found analytically and is difficult to approximate numerically for high-dimensional u vectors. A new method is proposed, wire fitting, in which a function approximation system is used to learn a set of functions called control wires, and the function f is found by fitting a surface to the control wires. Wire fitting has the following four properties: (1) any continuous f function can be represented to any desired accuracy given sufficient parameters; (2) the function f(x,u) can be evaluated quickly; (3) argmax f(x,u) can be found exactly in constant time after evaluating f(x,u); (4) wire fitting can incorporate any general function approximation system. These four properties are discussed, and it is shown how wire fitting can be combined with a memory-based learning system and Q-learning to control an inverted-pendulum system.
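
The abstract does not give the interpolation formula, so the following is only a minimal sketch of the idea under stated assumptions. It assumes that, for one fixed state x, a function approximator has already produced a handful of control wires, each a pair (u_i, y_i) of an action vector and a value, and that the surface is fitted with an inverse-distance weighting whose penalty term c * (y_max - y_i) favors the highest wire. The names wire_fit and greedy_action, the smoothing constant c, and the small eps are all hypothetical choices made for illustration, not details taken from the report.

```python
from typing import List, Sequence, Tuple

# One control wire evaluated at a fixed state x: (action u_i, value y_i).
# In a full system these pairs would be outputs of a learned function
# approximator; here they are supplied directly (an assumption).
Wire = Tuple[Sequence[float], float]


def wire_fit(u: Sequence[float], wires: List[Wire],
             c: float = 0.1, eps: float = 1e-9) -> float:
    """Interpolate f(x, u) from the control wires of one state x.

    Assumed form: a weighted average of the wire values y_i with weights
    1 / (||u - u_i||^2 + c * (y_max - y_i) + eps).
    """
    y_max = max(y for _, y in wires)
    num = 0.0
    den = 0.0
    for u_i, y_i in wires:
        dist2 = sum((a - b) ** 2 for a, b in zip(u, u_i))
        weight = 1.0 / (dist2 + c * (y_max - y_i) + eps)
        num += weight * y_i
        den += weight
    return num / den


def greedy_action(wires: List[Wire]) -> Tuple[List[float], float]:
    """Return the maximizing action and its value for this state.

    Because the surface is a weighted average of the y_i, it never
    exceeds the highest wire, so the argmax is that wire's action and
    no numerical search over u is needed.
    """
    u_best, y_best = max(wires, key=lambda w: w[1])
    return list(u_best), y_best


if __name__ == "__main__":
    # Three hypothetical wires for a two-dimensional action.
    wires = [([0.0, 0.0], 1.0), ([1.0, 0.5], 3.0), ([-1.0, 2.0], 2.0)]
    u_star, y_star = greedy_action(wires)
    print("greedy action:", u_star, "value:", y_star)
    print("f at greedy action:", wire_fit(u_star, wires))
    print("f at origin:", wire_fit([0.0, 0.0], wires))
```

Under these assumptions the greedy step is a scan over a fixed, small number of wires, whatever the dimension of u, which is the property that makes action selection cheap. The eps term only guards against division by zero when u lies exactly on the best wire.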

Document Details

Document Type
Technical Report
Publication Date
Nov 04, 1993
Accession Number
ADA280844

Entities

People

  • A. H. Klopf
  • Leemon C. Baird III

Organizations

  • Wright Laboratory

Tags

Communities of Interest

  • Air Platforms
  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Air Force
  • Algorithms
  • Applied Computer Science
  • Artificial Intelligence
  • Artificial Intelligence Computing
  • Avionics
  • Computer Science
  • Control Systems
  • Dynamic Programming
  • Governments
  • Learning
  • Machine Learning
  • Neural Networks
  • Reinforcement Learning
  • Simulations
  • United States

Readers

  • Approximation Theory
  • Electrical Engineering
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks