Online Modeling of Heterogeneous Autonomy
Abstract
In our recent publication [1], we considered the task of estimating a model of dynamic decisions by a single human agent, based on the history of implemented actions, when states are hidden (or partially observable). Under the assumption of complete state observability (for both the agent and the modeler), this problem has been widely studied in two strands of the literature, where it is referred to as structural estimation of Markov decision processes (MDPs) [5] or, alternatively, as inverse reinforcement learning (IRL) [6]. In this paper we consider the case in which the agent makes decisions under partial observability of the relevant state variable, and the modeler (who also only partially observes the state) uses a partially observable MDP (POMDP) model. We analyze the structural properties of the model and specify conditions under which the model is identifiable without knowledge of the state dynamics. We consider a soft policy gradient algorithm to compute a maximum likelihood estimator and provide a finite-time characterization of convergence to a stationary point. We test the model with data from dynamic engine replacement decisions. First, we use synthetic data to highlight the robustness of the proposed methodology and to characterize the potential for mis-specification when partial state observability is ignored. We then apply the model to a subset of the dataset in [5] on bus engine replacement decisions. The results show that the proposed model significantly improves model fit, as measured by the log-likelihood function, by 17.7 percent. More interestingly, the model reveals a feature of bus route assignment behavior in the dataset that was hitherto ignored: buses with engines believed to be in worse condition exhibit less utilization (mileage) and higher maintenance costs.
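As a rough illustration of the quantities the abstract refers to, the sketch below computes a Bayesian belief over a hidden engine condition and the log-likelihood of observed keep/replace decisions under a soft (logit) policy. It is not the report's algorithm: the two-state engine, the matrices TRANSITION and OBS_LIK, the parameter names maintenance_cost and replacement_cost, and the myopic action values are all assumptions chosen for illustration, and no policy gradient step is shown.

```python
import math

# Hypothetical two-state engine: 0 = good condition, 1 = bad condition.
TRANSITION = [[0.9, 0.1],   # P(next state | current state = good)
              [0.0, 1.0]]   # P(next state | current state = bad)
OBS_LIK = [[0.8, 0.2],      # P(signal z | state = good), z in {0, 1}
           [0.3, 0.7]]      # P(signal z | state = bad)
PRIOR = [1.0, 0.0]          # assumed belief right after a replacement


def belief_update(belief, transition, obs_lik, z):
    """One Bayes-filter step: predict with the dynamics, then correct with signal z."""
    n = len(belief)
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    unnorm = [obs_lik[s2][z] * predicted[s2] for s2 in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def soft_policy(belief, theta):
    """Logit choice probabilities over [keep, replace].

    The action values are a myopic stand-in (assumption), not a solved POMDP.
    """
    q_keep = -theta["maintenance_cost"] * belief[1]
    q_replace = -theta["replacement_cost"]
    m = max(q_keep, q_replace)  # subtract the max for numerical stability
    weights = [math.exp(q_keep - m), math.exp(q_replace - m)]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(signals, actions, theta, transition, obs_lik, prior):
    """Sum of log choice probabilities along one observed trajectory."""
    belief = list(prior)
    ll = 0.0
    for z, a in zip(signals, actions):
        belief = belief_update(belief, transition, obs_lik, z)
        ll += math.log(soft_policy(belief, theta)[a])
        if a == 1:  # a replacement resets the belief to the prior
            belief = list(prior)
    return ll


# Made-up trajectory: noisy signals and the decisions taken (1 = replace).
signals = [0, 1, 1, 0]
actions = [0, 0, 1, 0]
theta = {"maintenance_cost": 4.0, "replacement_cost": 2.0}
ll = log_likelihood(signals, actions, theta, TRANSITION, OBS_LIK, PRIOR)
```

A maximum likelihood estimator would search over theta to maximize ll across all observed trajectories; ignoring partial observability would amount to replacing the belief with a directly observed state.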
Document Details
- Document Type: Technical Report
- Publication Date: Jan 19, 2023
- Accession Number: AD1230537
Entities
People
- Alfredo Garcia
- Mingyi Hong
Organizations
- Texas Engineering Experiment Station
- University of Minnesota