Non-parametric methods in reinforcement learning: Instance-optimality, adaptivity and data-dependent

Abstract

Research problem and objectives: This research proposal addresses fundamental questions concerning the use of non-parametric methods in reinforcement learning. For the subclass of ADP/RL problems in which the state and action spaces are discrete, taking on finitely many values, our theoretical understanding of algorithms and fundamental limits is fairly complete. However, many RL problems involve state and/or action spaces that are continuous, and apart from certain special cases (e.g., linear-quadratic control), there are fewer theoretical guarantees in such settings. Such problems require the flexibility and richness provided by non-parametric function classes, including kernel-based methods, regression trees, and neural networks, among others. The research will leverage a combination of techniques from high-dimensional statistics and non-parametric and semi-parametric statistics to develop new procedures that are theoretically well-grounded.

Technical tools: In terms of techniques, this research will exploit cutting-edge techniques from empirical process theory, concentration of measure, and high-dimensional statistics in order to obtain sharp upper and lower bounds. It will also make use of randomized algorithms and approximation-theoretic techniques to derive computationally efficient procedures.

Anticipated outcomes: Expected outcomes of this work are fundamental theoretical guarantees for non-parametric methods in application to RL problems, including the problems of fitted value iteration, fitted policy optimization and Q-learning, as well as off-policy versions of these same problems. In addition to theoretical guarantees, this research should lead to computationally efficient procedures.

Impact on DOD capabilities: This is a fundamental research project that is not expected to produce any developmental items. Should any developmental items result from this work, they will have both civilian and military applications. The intended research is theoretical and will not result in any environmental impacts.

Document Details

Document Type: DoD Grant Award
Publication Date: Sep 08, 2022
Source ID: N000142212756

Entities

People

  • Martin J. Wainwright

Organizations

  • Massachusetts Institute of Technology
  • Office of Naval Research
  • United States Navy

Tags

Readers

  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning
  • Statistical inference

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • Space