Dynamic Context-Centric Commander’s Decision Support through Real-time Inverse Reinforcement Learning

Abstract

Project Summary (Santos and Nguyen)

Existing technologies do not fully address the key component necessary for developing a fully context-aware proactive decision support (PDS) system: determining what the Commander wants and why. This requires the system to understand the Commander’s decision-making goals, in order to anticipate the Commander’s needs, as well as the Commander’s decision-making processes, in order to build the context that is central to the different decision trajectories available to support the Commander. Decision modeling (DM) and user modeling (UM) are necessarily at the center of proactive decision support. Unfortunately, there is a fundamental gap between the two: user models focus on the activities and information-seeking (as well as cognitive) behaviors of the user, whereas decision models focus on how decisions are made. Information-seeking and cognitive behaviors represent the decision maker’s logical thinking and the information being considered, but they do not capture how (or even whether) these cognitive processes and information are used in the actual decision-making process. What research has missed is that this gap arises from a fundamental misunderstanding of the relationship between DM and UM: decision making is a sequence of decisions, and that sequence includes information-seeking actions, which are themselves decisions. With a model of this sequential/episodic decision-making process, information-seeking behavior can itself be placed into context, i.e., UM in the context of DM, with DM accounting for UM. The fundamental research question now becomes: how can we determine this sequential decision-making process?
Thus, the gap points to a missing formal mechanism that serves as the guide/principle driving the sequential decision process and that can also be learned or derived from observing the decision maker and their activities over time. We propose to extend our concept of intent inferencing (Santos and Zhao, 2006) by using inverse reinforcement learning (Ng and Russell, 2000) and user modeling techniques to model the Commander’s decision-making process. Specifically, we can formalize the Commander’s decision-making process over time, for example as a Markov decision process (MDP) in which each state reflects a stage on the way towards a decision, each action reflects a possible move (from collecting data to hypothesizing to inferencing), and the reward reflects how close a particular stage is to the final decision. The observations, including the actions, their relevant information, and estimates of how far the actions advance towards the given goals, can be used to build the MDP dynamically and incrementally. The MDP will then be used together with the domain knowledge captured in the UM to infer the next decision and actions that the Commander is likely to take. Through inverse reinforcement learning methods, our next generation of UM learns from discrepancies between observed and predicted user behavior to track the shifting interests and goals of a Commander. The novelty of this approach is two-fold. First, it fills the gap between the DM and UM areas by linking the captured knowledge with a decision-making process to help make decisions more effectively and efficiently. Second, it models the decision maker’s thought processes (trajectories) over time through MDPs.
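
To make the MDP formulation above concrete, the following is a minimal Python sketch, not the project's model. The stage names, the two actions, the reward values, and the discount factor are all hypothetical and chosen only for illustration. The reward is written by hand here; under inverse reinforcement learning it would instead be estimated from the observed behavior. The last function only flags where observed and predicted actions disagree, which is the kind of discrepancy signal the summary describes, and does not perform any learning.

```python
# Toy MDP over decision stages. All names and numbers below are illustrative
# assumptions, not values taken from the project summary.
STATES = ["collect", "hypothesize", "infer", "decide"]  # stages towards a decision
ACTIONS = ["advance", "gather_more"]                    # possible moves
GAMMA = 0.9                                             # assumed discount factor


def step(state, action):
    """Deterministic transition: 'advance' moves one stage closer to 'decide'."""
    if state == "decide" or action == "gather_more":
        return state  # 'decide' is absorbing; 'gather_more' stays in place
    return STATES[STATES.index(state) + 1]


def reward(state, action):
    """Hand-written reward: closer stages pay more; lingering has a small cost."""
    if state == "decide":
        return 0.0
    if action == "gather_more":
        return -0.1
    return float(STATES.index(step(state, action)))


def q_value(state, action, values):
    """One-step lookahead value of taking `action` in `state`."""
    return reward(state, action) + GAMMA * values[step(state, action)]


def value_iteration(tol=1e-9):
    """Standard in-place value iteration until the largest update is below tol."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(q_value(s, a, values) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def predict_next_action(state, values):
    """Greedy prediction of the next move under the current reward."""
    return max(ACTIONS, key=lambda a: q_value(state, a, values))


def discrepancies(observed, values):
    """Indices of observed (state, action) pairs that differ from the prediction."""
    return [i for i, (s, a) in enumerate(observed)
            if predict_next_action(s, values) != a]


values = value_iteration()
observed = [("collect", "advance"), ("hypothesize", "gather_more")]
mismatches = discrepancies(observed, values)
```

In this sketch the second observed step disagrees with the greedy prediction. A learning method would use such mismatches as evidence that the assumed reward does not match the decision maker's actual goals and would adjust the reward accordingly.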
Our objective for this effort is to develop a computational framework for Context-Centric Commander’s Decision Support (C3DS) as follows: (a) design and develop a model to assist the Commander’s decision-making process; (b) develop a set of measures to assess the effectiveness of this model; and (c) identify testbeds and conduct experiments to evaluate the model.

Document Details

Document Type: DoD Grant Award
Publication Date: Aug 12, 2016
Source ID: N000141512154

Entities

People

Eugene Santos

Organizations

Board of Trustees of Dartmouth College
Office of Naval Research
United States Navy
