Control And Learning Of Uncertain Dynamical Systems: Optimization, Sampling, And Regret

Abstract

This report shows that first-order methods can provide an effective bridge between optimal control theory and sample-based reinforcement learning. The work focuses on the linear quadratic regulator problem and Markov decision processes. Results include a proof that gradient descent starting from a stabilizing policy converges to the globally optimal policy, and an algorithm that provides nearly tight regret bounds for the control of a linear dynamical system with adversarial disturbances.

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2019
Accession Number: AD1093314

Entities

People

Maryam Fazel
Mehran Mesbahi
Sham Kakade

Organizations

University of Washington
