Control And Learning Of Uncertain Dynamical Systems: Optimization, Sampling, And Regret

Abstract

This report shows that first-order methods can provide an effective bridge between optimal control theory and sample-based reinforcement learning. The work focuses on the linear quadratic regulator (LQR) problem and Markov decision processes. The results include a proof that gradient descent starting from a stabilizing policy converges to the globally optimal policy, and an algorithm that provides nearly tight regret bounds for the control of a linear dynamical system with adversarial disturbances.
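
The gradient-descent result mentioned above can be illustrated with a minimal sketch. It is not taken from the report: it assumes a discrete-time LQR problem with dynamics x_{t+1} = A x_t + B u_t, a static feedback policy u_t = -K x_t, exact (model-based) gradients, and a backtracking step size. The matrices A, B, Q, R, the initial-state covariance Sigma0, the starting gain K0, and all function names are hypothetical values chosen only for illustration.

```python
import numpy as np

def lyap(M, W):
    """Solve X = W + M X M^T (discrete Lyapunov equation) by vectorization."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)

def cost_and_grad(K, A, B, Q, R, Sigma0):
    """Exact LQR cost and its gradient with respect to the gain K."""
    M = A - B @ K                              # closed-loop matrix
    if np.max(np.abs(np.linalg.eigvals(M))) >= 1.0:
        return np.inf, None                    # K is not stabilizing
    P = lyap(M.T, Q + K.T @ R @ K)             # cost-to-go matrix
    Sigma = lyap(M, Sigma0)                    # summed state covariance
    cost = np.trace(P @ Sigma0)
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return cost, grad

def gradient_descent(K, A, B, Q, R, Sigma0, iters=200, step=1.0):
    """Gradient descent on K with Armijo backtracking (an assumed step rule)."""
    cost, grad = cost_and_grad(K, A, B, Q, R, Sigma0)
    assert np.isfinite(cost), "the initial gain must be stabilizing"
    history = [cost]
    for _ in range(iters):
        eta = step
        while eta > 1e-12:
            K_new = K - eta * grad
            c_new, g_new = cost_and_grad(K_new, A, B, Q, R, Sigma0)
            # Accept only steps that keep K stabilizing and decrease the cost.
            if c_new <= cost - 1e-4 * eta * np.sum(grad ** 2):
                break
            eta *= 0.5
        else:
            break                              # no acceptable step was found
        K, cost, grad = K_new, c_new, g_new
        history.append(cost)
    return K, history

# Hypothetical two-state, one-input system (illustrative values only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.eye(1)
Sigma0 = np.eye(2)
K0 = np.array([[1.0, 2.0]])                    # stabilizing starting gain

K_final, history = gradient_descent(K0, A, B, Q, R, Sigma0)
print("initial cost:", history[0])
print("final cost:  ", history[-1])
```

The backtracking rule rejects any step that would leave the set of stabilizing gains, so every iterate has finite cost and the recorded costs never increase. The sample-based setting described in the abstract would replace the exact gradient with an estimate built from rollouts, which this sketch does not attempt.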

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2019
Accession Number
AD1093314

Entities

People

  • Maryam Fazel
  • Mehran Mesbahi
  • Sham Kakade

Organizations

  • University of Washington

Tags

Communities of Interest

  • Autonomy
  • Space

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Computational Complexity
  • Control Systems
  • Control Theory
  • Equations
  • Learning
  • Linear Systems
  • Machine Learning
  • Optimization
  • Regulators
  • Reinforcement Learning
  • Riccati Equation
  • Sampling
  • Standards
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Control Systems Engineering.
  • Statistical inference.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms