Symbolic Dynamic Programming for Continuous State and Observation POMDPs
Zahra Zamani · Scott Sanner · Pascal Poupart · Kristian Kersting

Mon Dec 03 07:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor

Partially-observable Markov decision processes (POMDPs) provide a powerful model for real-world sequential decision-making problems. In recent years, point-based value iteration methods have proven to be extremely effective techniques for finding (approximately) optimal dynamic programming solutions to POMDPs when an initial set of belief states is known. However, no point-based work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key insight is that while there may be an infinite number of possible observations, there are only a finite number of observation partitionings that are relevant for optimal decision-making when a finite, fixed set of reachable belief states is known. To this end, we make two important contributions: (1) we show how previous exact symbolic dynamic programming solutions for continuous state MDPs can be generalized to continuous state POMDPs with discrete observations, and (2) we show how this solution can be further extended via recently developed symbolic methods to continuous state and observations to derive the minimal relevant observation partitioning for potentially correlated, multivariate observation spaces. We demonstrate proof-of-concept results on uni- and multi-variate state and observation steam plant control.

Author Information

Zahra Zamani (ANU and NICTA)
Scott Sanner (University of Toronto)
Pascal Poupart (University of Waterloo)
Kristian Kersting (University of Bonn and Fraunhofer IAIS)
