Integration of Domain Knowledge and Machine Learning in Iterative Learning for Complex Tasks
Abstract
Though machine learning (ML) has seen dramatic improvements over the past decade, we are still far from systems that can interact with an uncertain environment to learn about system deficiencies and improve performance. Such improvements are critical to advancing the field to the point where we can run machine learning in the wild. Until then, unforeseen feedback loops in ML systems will cause amplification of nefarious actors on social platforms and fatalities in autonomous vehicles. In this proposal, we aim to design a paradigm for reliable, explainable reinforcement learning based on the premise of physical modeling, iterative improvement, and uncertainty management. We will focus on data-rich but supervision-light, complex tasks which are: 1. Dynamically challenging, insofar as they involve modeling and understanding nonlinear forces such as contact forces in legged robots, tire forces in wheeled robots, and wave and current impacts on autonomous maritime vessels. 2. Executed in complex dynamical environments that are uncertain and possibly time-varying. 3. Iterative in nature, so that the system learns how to execute increasingly complex tasks while maintaining safety.
Our approach will be to combine domain knowledge based on dynamics, Markovian principles, and the ability to repeat a task with domain-agnostic tools from machine learning, including uncertainty estimation, search algorithms, and nonparametric estimation. Statistical learning theory will be merged with predictive control theory using a mix of physics-based and data-driven models in the learning process. Our work will involve both theoretical and experimental components, with a major focus on the theoretical component. The main theoretical thrusts will be:
(G1) How to fuse hard-coded, model-based declarative knowledge with statistically learned uncertainty sets. This will require coupling tools from dynamical system identification with methods to bound uncertainty from high-dimensional statistics.
(G2) Guarantees of performance improvement and safety during the learning process of a complex task. This will require combining tools from adaptive control and online learning theory, and establishing new bounds on the approximation errors of model forecasts.
(G3) Skill acquisition inside iterative learning feedback loops. We will determine strategies and algorithms to identify new models (with possibly different structure outside the operating region already covered) and use these models in a predictive control framework.
We will use existing instrumentation, simulators, and test beds available at the PIs' laboratories for the experimental component. The main experimental test beds will be in autonomous driving and robotic locomotion, focusing on autonomous systems that learn agile maneuvers via safe exploration with multi-modal sensor data. These test beds will be used to show that the proposed approach outperforms existing learning controllers designed with both classical model-based approaches and ML-based approaches (such as reinforcement learning or deep learning).
Document Details
- Document Type: DoD Grant Award
- Publication Date: Sep 04, 2018
- Source ID: N000141812833
Entities
People
- Francesco Borrelli
Organizations
- Office of Naval Research
- United States Navy
- University of California Regents