Probably Approximately Correct Protocols for Reactive Control and Learning

Abstract

The objective of this proposal is to develop decision-making algorithms for autonomous and intelligent systems that jointly learn and react in environments with both stochastic and adversarial uncertainties. The algorithms will be not only efficient in learning in terms of their use of samples, time, and space (i.e., in the traditional probably approximately correct, or "PAC", sense) but also provably correct (by synthesis) with respect to rich temporal logic mission specifications. We balance theoretical development with immediate relevance by using shared decision-making between a human operator and autonomous mobile vehicles as a case study.

The gap between present autonomy capabilities and those needed for future mission scenarios calls for a new generation of control protocols that enable operator-autonomy interactions at higher levels of decision-making. Such control protocols must not only react to changes that are anticipated, yet not necessarily known precisely, at design time, but also adapt to unforeseen changes at run time. Furthermore, the difficulty of establishing provable trust that these autonomous systems operate as intended, together with the lack of proper tools for doing so, constitutes a major obstacle to their greater integration. We tackle these challenges by developing methods and tools for the formal specification, analysis, and synthesis of control protocols that blend reactivity, adaptation, and human operators' inputs for the operation of autonomous systems. Despite recent fragmented efforts in the respective domains, including formal methods, controls, and learning, we still lack a unified formalism that can support the multitude of uncertainties, heterogeneity in dynamics and requirements, and variability in operational requirements.

We address the limitations discussed above through a research plan with three main thrusts. The first thrust aims to establish a mathematically based formalism for joint reactive control and learning. The second builds on the first to investigate several under-explored topics in joint control and learning subject to rich temporal logic constraints. The third thrust demonstrates our results on shared autonomy scenarios.

Thrust I: Probably approximate correctness in joint learning and temporal logic-constrained reactive synthesis. How can we adapt the notions of probably approximate correctness for safety-critical systems operating in environments with both stochastic uncertainties and adversarial opponents subject to temporal logic specifications?

Thrust II: Quantitative trade-offs, resilience, and regret in joint learning and synthesis. How can we leverage the formalism from Thrust I to investigate exploration vs. exploitation trade-offs, resilience of the joint protocols to changes between the design and deployment domains, and unconventional interactions between the controlled system and its environment and opponents?

Thrust III: Shared autonomy case study. We will put the interplay between learning and reactive control into a concrete context and demonstrate the utility of our results through a case study in which a human operator works with unmanned systems with various levels of autonomy capabilities.

Document Details

Document Type: DoD Grant Award
Publication Date: Sep 11, 2018
Source ID: W911NF1510592

Entities

People

Ufuk Topcu

Organizations

Army Contracting Command
United States Army
University of Texas at Austin
