Online POMDP Algorithms for Very Large Observation Spaces
Abstract
The Partially Observable Markov Decision Process (POMDP) provides a mathematically elegant modeling tool for planning and control under uncertainty. Substantial progress over the past decade has allowed some large-scale problems to be solved using POMDPs. However, very large observation spaces still pose substantial difficulties for effective planning. This project studies two aspects of these difficulties. First, the Monte Carlo methods used to scale solvers up to very large problems may fail to sample rare but critical events that are important for planning. The PI's team developed methods that use importance sampling to focus sampling on these events. They show that their online planning method retains good theoretical properties when importance sampling is used, and they propose a method for learning the importance sampling distribution. Experimentally, the method works well in simulation and on realistic data. Second, handling a very large observation space incurs high computational complexity. The team studied the approach of using maximum likelihood determination, where only the most likely observations are used during the search for a solution. They showed that solutions to some subclasses of POMDP problems can be well approximated in polynomial time using this approach.
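The role of importance sampling described in the abstract can be illustrated with a toy sketch. It is not the team's planner or their learned sampling distribution. It only estimates the expected cost of a single rare event, first by plain Monte Carlo sampling and then by sampling the event more often and reweighting each sample. All names and numbers (`P_RARE`, `Q_RARE`, `COST`, the function names) are assumptions chosen for illustration.

```python
import random


def estimate_naive(p_rare, cost, n, rng):
    """Plain Monte Carlo: sample the rare event at its true probability."""
    total = 0.0
    for _ in range(n):
        if rng.random() < p_rare:
            total += cost
    return total / n


def estimate_importance(p_rare, q_rare, cost, n, rng):
    """Importance sampling: draw the rare event with probability q_rare,
    then reweight each hit by the likelihood ratio p_rare / q_rare."""
    total = 0.0
    for _ in range(n):
        if rng.random() < q_rare:
            total += (p_rare / q_rare) * cost
        # The common outcome has zero cost here, so it adds nothing.
    return total / n


# Hypothetical toy numbers: a 1-in-10,000 event with a large cost.
P_RARE, Q_RARE, COST, N = 1e-4, 0.5, 1000.0, 1000
rng = random.Random(0)

true_value = P_RARE * COST
naive = estimate_naive(P_RARE, COST, N, rng)
weighted = estimate_importance(P_RARE, Q_RARE, COST, N, rng)

print("true value:         ", true_value)
print("plain Monte Carlo:  ", naive)
print("importance sampling:", weighted)
```

With only 1,000 samples, plain sampling usually never observes the event at all, while the reweighted estimate stays close to the true expected cost. This is the failure mode, and the kind of remedy, that the abstract attributes to Monte Carlo planning with rare but critical events.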
Document Details
- Document Type: Technical Report
- Publication Date: Jun 06, 2017
- Accession Number: AD1043680
Entities
People
- Wee Sun Lee
Organizations
- National University of Singapore