TRACTABLE MODELS FOR PLANNING UNDER UNCERTAINTY: EXPLOITING THE POWER OF BELIEF RESETS
Abstract
Approved for Public Release

To exhibit intelligence, autonomous systems must act prudently despite having imprecise or erroneous state information: imperfections inherent in sensing and actuation, as well as limitations of world models, mean robotic systems and autonomous vehicles are forced to cope with uncertainty. Unfortunately, it is computationally intractable to produce optimal solutions for general formulations of the problem of planning under uncertainty. To address this challenge, this project has identified a sub-class of settings with direct application to planning problems of DoD importance. The approach employed is to focus on this specialized class of settings that possesses specific structure, which can be exploited to obtain high-quality solutions efficiently.

Specifically, the research project studies the class of partially observable planning problems where information arrives via an extrinsic observation process, a process which may be periodic or have some other known temporal regularity. We refer to planning in this regime as sequential decision-making under temporally-structured observations, a novel class of decision-making problems in between Markov Decision Processes (MDPs), which do not account for uncertainty, and partially observable MDPs (POMDPs), which are well known to be intractable in general.
Their relevance for DoD-related problems (e.g., operation in the presence of adversaries, scheduling of information-gathering UAV flyovers, circumstances where pre-determined arrangements for when communication can improve stealthiness) means they form a class of planning problems that justifies closer examination.

Indeed, decision-making with temporally structured observations appears at multiple levels of abstraction, from navigation to task planning, and can also offer insight into optimal communication scheduling and multi-agent coordination.

The study's objectives are to understand these problems by examining them from two complementary perspectives: that of an MDP with exponentially many actions (but which possess a specific subsequence structure), and a POMDP with shallow belief space possessing what we term belief resets. The research activity entails employing solution strategies from both classes (MDPs and POMDPs), drawing insights from them, and developing a new specialized solution method, which will likely take the form of a hybrid.

A second objective is to understand the interplay of planning problems given some temporally-structured observation process, and how one might optimize or design such processes to best suit particular problem instances, naturally subject to constraints on how that information might be disclosed. The task here is to identify opportunities for specialized strategies or useful heuristics (e.g., decoupling multiple agents, sequentializing search, etc.) to tame the linked optimization problems.

The third objective is to broaden our initial characterization of the problem class, seeking to understand how sharply the solution algorithm depends on the assumptions and whether they can be relaxed.
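The two perspectives named in the abstract can be illustrated on a toy problem. The sketch below is not the project's method: the chain world, the slip probability, the reward, and the assumption that each observation reveals the state exactly every `PERIOD` steps are all hypothetical choices made for illustration. Under those assumptions, the belief collapses to a single state at each observation (a belief reset), so the agent can plan over open-loop action sequences of length `PERIOD`, which gives an MDP over states with `len(ACTIONS) ** PERIOD` macro-actions.

```python
from itertools import product

# All constants below are illustrative assumptions, not taken from the project.
N_STATES = 5          # hypothetical chain world with states 0..4
ACTIONS = (-1, +1)    # move left / move right
SLIP = 0.2            # assumed probability that a move fails and the agent stays put
GOAL = 4              # reward of 1 for each step spent in this state
PERIOD = 3            # assumed: the true state is revealed every PERIOD steps
GAMMA = 0.95          # per-step discount factor


def step_belief(belief, action):
    """Push a belief (list of state probabilities) through one unobserved step."""
    nxt = [0.0] * N_STATES
    for s, p in enumerate(belief):
        target = min(max(s + action, 0), N_STATES - 1)  # walls clamp the move
        nxt[target] += p * (1.0 - SLIP)
        nxt[s] += p * SLIP
    return nxt


def rollout(state, plan):
    """Execute `plan` open-loop from a known state.

    Returns the expected discounted reward over the plan and the belief
    held just before the next observation resets it.
    """
    belief = [0.0] * N_STATES
    belief[state] = 1.0  # belief reset: a point mass on the observed state
    reward = 0.0
    for t, action in enumerate(plan):
        belief = step_belief(belief, action)
        reward += (GAMMA ** t) * belief[GOAL]
    return reward, belief


def solve(iterations=200):
    """Value iteration over reset states, maximizing over whole action sequences."""
    plans = list(product(ACTIONS, repeat=PERIOD))  # exponentially many in PERIOD
    model = {(s, p): rollout(s, p) for s in range(N_STATES) for p in plans}

    def backup(s, p, values):
        reward, belief = model[(s, p)]
        future = sum(b * v for b, v in zip(belief, values))
        return reward + (GAMMA ** PERIOD) * future

    values = [0.0] * N_STATES
    for _ in range(iterations):
        values = [max(backup(s, p, values) for p in plans) for s in range(N_STATES)]
    policy = [max(plans, key=lambda p: backup(s, p, values)) for s in range(N_STATES)]
    return values, policy


if __name__ == "__main__":
    values, policy = solve()
    for s in range(N_STATES):
        print(s, round(values[s], 3), policy[s])
```

Enumerating every plan is only feasible for short periods; the subsequence structure the abstract mentions is presumably what a specialized solver would exploit to avoid this enumeration.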
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jul 13, 2022
- Source ID: N000142212476
Entities
People
- Dylan Shell
Organizations
- Office of Naval Research
- Texas Engineering Experiment Station
- United States Navy