Online Modeling of Heterogeneous Autonomy

Abstract

In our recent publication [1] we considered the task of estimating a model of dynamic decisions by a single human agent, based upon the history of implemented actions, when states are hidden (or partially observable). Under the assumption of complete state observability (for both the agent and the modeler), this problem has been widely studied in two strands of the literature, where it is referred to as structural estimation of Markov decision processes (MDP) [5] or, alternatively, as inverse reinforcement learning (IRL) [6]. In this paper we consider the case in which the agent makes decisions under partial observability of the relevant state variable and the modeler (who also only partially observes the state) uses a POMDP (Partially Observable MDP) model. We analyze the structural properties of the model and specify conditions under which the model is identifiable without knowledge of the state dynamics. We consider a soft policy gradient algorithm to compute a maximum likelihood estimator and provide a finite-time characterization of convergence to a stationary point. We test the model with data from dynamic engine replacement decisions. First, we use synthetic data to highlight the robustness of the proposed methodology and to characterize the potential for mis-specification when partial state observability is ignored. We then apply the model to a subset of the dataset in [5] on bus engine replacement decisions. The results show that the proposed model can significantly improve model fit, as measured by the log-likelihood function, by 17.7 percent. More interestingly, the model reveals a feature of bus route assignment behavior in the dataset that was hitherto ignored: buses with engines believed to be in worse condition exhibit less utilization (mileage) and higher maintenance costs.
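
The estimation setting described in the abstract can be illustrated with a minimal Python sketch. It is not the report's algorithm: it only shows how a belief over a hidden engine condition is updated by Bayes' rule and how the log-likelihood of an observed action history is scored under a soft (softmax) policy. The two-state engine model, the matrices T and O, the function names, and the myopic action values in make_q_fn are all hypothetical choices made for illustration; in particular, the action values stand in for a solved POMDP value function.

```python
import math

def belief_update(belief, action, observation, T, O):
    """Bayes filter: b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b(s)."""
    n = len(belief)
    unnorm = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnorm]

def soft_policy(q_values):
    """Softmax over action values (shifted by the max for numerical stability)."""
    m = max(q_values)
    exps = [math.exp(q - m) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def make_q_fn(maintenance_cost, replacement_cost):
    """Hypothetical myopic action values; a placeholder, not a solved POMDP."""
    def q_fn(belief):
        keep = -maintenance_cost * belief[1]   # expected cost if the engine is bad
        replace = -replacement_cost            # fixed cost of replacing the engine
        return [keep, replace]
    return q_fn

def log_likelihood(actions, observations, b0, T, O, q_fn):
    """Log-likelihood of an action history given the modeler's belief path."""
    belief, ll = list(b0), 0.0
    for a, o in zip(actions, observations):
        ll += math.log(soft_policy(q_fn(belief))[a])
        belief = belief_update(belief, a, o, T, O)
    return ll

# Toy model (assumed): states 0=good, 1=bad; actions 0=keep, 1=replace;
# observations 0=low-wear signal, 1=high-wear signal.
T = [
    [[0.9, 0.1], [0.0, 1.0]],  # keep: a good engine may degrade, a bad one stays bad
    [[1.0, 0.0], [1.0, 0.0]],  # replace: the engine is reset to good
]
O = [
    [[0.8, 0.2], [0.3, 0.7]],  # O[a][s'][o], same signal quality for both actions
    [[0.8, 0.2], [0.3, 0.7]],
]

q_fn = make_q_fn(maintenance_cost=5.0, replacement_cost=2.0)
ll = log_likelihood(actions=[0, 0, 1], observations=[0, 1, 0],
                    b0=[1.0, 0.0], T=T, O=O, q_fn=q_fn)
print(ll)
```

A maximum likelihood estimator would search over the cost parameters (here the two arguments of make_q_fn) to maximize this quantity; the gradient-based procedure and its convergence analysis are the subject of the report and are not reproduced in this sketch.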

Document Details

Document Type
Technical Report
Publication Date
Jan 19, 2023
Accession Number
AD1230537

Entities

People

  • Alfredo Garcia
  • Mingyi Hong

Organizations

  • Texas Engineering Experiment Station
  • University of Minnesota

Tags

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms