A Computational Theory of Optimal Feedback Control
Abstract
Overview
This proposal investigates the problem of optimal feedback control of partially observed nonlinear stochastic systems. These problems can, in theory, be solved via an associated Dynamic Programming (DP) problem, but DP suffers from Bellman's infamous "Curse of Dimensionality" that severely limits its applicability. This proposal investigates the fundamental structure of feedback control and proposes a novel computational theory for how such problems can, and should, be solved in a Scalable, Accurate, Reliable and Globally Optimal (SARGO) fashion. We examine the implications of this theory on various seemingly diverse sub-fields of Control including Partially Observed systems, Model Predictive Control (MPC) and Reinforcement Learning (RL) problems.
Technical Approach
We utilize a perturbation characterization of feedback laws, along with the DP equation, and the classical Method of Characteristics to study the optimal open loop/deterministic problem, in order to reveal the underlying structure of feedback control problems, and show that the implicit MPC feedback law of replanning the open loop at every time step is, in fact, the globally optimal solution to the nonlinear problem. Moreover, we show empirical evidence that this is the best one can do as solving the stochastic problem is intractable and inaccurate. Algorithmically, we propose a highly data efficient approach to the solution of the open loop problem, that results in SARGO techniques, even in the presence of nonlinear costs and dynamics, via utilizing the beautiful causal structure of optimal control. The implications of this result are considered for Partially Observed problems, and might provide tractable solutions to longstanding questions regarding the "Dual Nature of feedback control" and "Certainty Equivalence". Finally, we consider the implications for MPC and RL, and argue that "MPC is the correct way to perform RL" in a SARGO fashion.
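The MPC feedback law described above, replanning the open-loop problem at every time step and applying only the first control, can be illustrated with a minimal sketch. This is a toy example and not the proposal's algorithm: it assumes a scalar linear system x_{t+1} = a x_t + b u_t + w_t with a quadratic cost, so that each open-loop plan reduces to a small least-squares solve. The function names (`plan_open_loop`, `run_mpc`) and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def plan_open_loop(x0, a, b, q, r, horizon):
    """Solve the deterministic (noise-free) open-loop problem from state x0.

    Minimizes sum_k q*x_k^2 (k = 1..N) + sum_j r*u_j^2 (j = 0..N-1) subject to
    x_{k+1} = a*x_k + b*u_k. Assumed toy model, not the proposal's method.
    """
    k = np.arange(1, horizon + 1)
    F = a ** k                      # free response: x_k = a^k * x0 when u = 0
    G = np.zeros((horizon, horizon))
    for i in range(horizon):        # row i predicts x_{i+1}
        for j in range(i + 1):      # effect of control u_j on x_{i+1}
            G[i, j] = a ** (i - j) * b
    # Normal equations of the stacked quadratic cost in the control sequence.
    H = q * G.T @ G + r * np.eye(horizon)
    return np.linalg.solve(H, -q * G.T @ (F * x0))

def run_mpc(x0, steps, a=1.2, b=1.0, q=1.0, r=0.1,
            horizon=10, noise_std=0.0, seed=0):
    """Closed loop: replan at every step, apply only the first planned control."""
    rng = np.random.default_rng(seed)
    x = x0
    xs = [x]
    for _ in range(steps):
        u = plan_open_loop(x, a, b, q, r, horizon)[0]
        x = a * x + b * u + noise_std * rng.standard_normal()
        xs.append(x)
    return np.array(xs)

if __name__ == "__main__":
    trajectory = run_mpc(x0=1.0, steps=20, noise_std=0.05)
    print(trajectory)
```

The feedback here is implicit: no control law is stored, and the dependence of the control on the current state arises only because the plan is recomputed from wherever the noise has pushed the system. The uncontrolled system in this sketch is unstable (a = 1.2), so the state stays near the origin only because of the replanning.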
Relevance to DoD
The proposed work will lay the foundation for a fundamental examination of the structure of feedback control systems. Such systems are ubiquitous in the natural as well as the engineered world, and the principles engendered, such as the "Principle of Nominal Action", should form a solid basis for the comprehension of decision making in such systems. The algorithmic work envisioned in this proposal can lead to the development of feedback systems that are capable of controlling complex robotic systems, such as swimming robots in information-sparse domains, and can be of great importance to the Navy, in particular, and the DoD in general.
Approved for Public Release
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jul 13, 2022
- Source ID: N000142212475
Entities
People
- Suman Chakravorty
Organizations
- Office of Naval Research
- Texas Engineering Experiment Station
- United States Navy