We address the scalability of symbolic planning under uncertainty with factored states and actions. Prior work has focused almost exclusively on factored states rather than factored actions, and on value iteration (VI) rather than policy iteration (PI). Our first contribution is a novel method for symbolic policy backups via the application of constraints, which yields a new, efficient symbolic implementation of modified PI (MPI) for factored action spaces. While this approach improves scalability in some cases, naive handling of policy constraints comes with its own scalability issues. This leads to our second and main contribution, symbolic Opportunistic Policy Iteration (OPI), a novel convergent algorithm lying between VI and MPI. The core idea is a symbolic procedure that applies policy constraints only when they reduce the space and time complexity of the update, and otherwise performs full Bellman backups, thus automatically adjusting the backup per state. We also give a memory-bounded version of this algorithm that allows a space-time tradeoff. Empirical results show significantly improved scalability over the state of the art.
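For readers unfamiliar with the two baselines that OPI is described as lying between, the following is a minimal tabular sketch of modified policy iteration, in which `k = 0` reduces to plain value iteration. It is an illustration under stated assumptions, not the symbolic method of the paper: it enumerates states and joint actions explicitly instead of using decision diagrams, it does not implement OPI or policy constraints, and the function names, the random test MDP, and all parameters are hypothetical.

```python
import itertools
import numpy as np


def random_factored_action_mdp(n_states=6, n_action_vars=2, seed=0):
    """Build a small random MDP (hypothetical test problem).

    Each joint action is an assignment to n_action_vars boolean action
    variables, so the joint action space has 2**n_action_vars members.
    """
    rng = np.random.default_rng(seed)
    actions = list(itertools.product([0, 1], repeat=n_action_vars))
    P = rng.random((len(actions), n_states, n_states))
    P /= P.sum(axis=2, keepdims=True)  # normalize each row into a distribution
    R = rng.random((len(actions), n_states))  # rewards in [0, 1)
    return actions, P, R


def modified_policy_iteration(P, R, gamma=0.9, k=5, tol=1e-8, max_iters=10_000):
    """Tabular MPI: one full Bellman backup, then k policy backups.

    P has shape (A, S, S) and R has shape (A, S). With k = 0 this is
    value iteration.
    """
    n_actions, n_states = R.shape
    states = np.arange(n_states)
    V = np.zeros(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iters):
        # Full Bellman backup: maximize over every joint action.
        Q = R + gamma * (P @ V)  # shape (A, S)
        policy = Q.argmax(axis=0)
        V_new = Q.max(axis=0)
        # Policy backups: the action in each state is fixed by the policy,
        # so no maximization over actions is needed.
        for _ in range(k):
            V_new = R[policy, states] + gamma * (P[policy, states] @ V_new)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, policy
        V = V_new
    return V, policy


actions, P, R = random_factored_action_mdp()
V_vi, pi_vi = modified_policy_iteration(P, R, k=0)    # value iteration
V_mpi, pi_mpi = modified_policy_iteration(P, R, k=5)  # modified PI
```

Both calls should converge to approximately the same value function. The difference is in the cost of each sweep: the full backup maximizes over all joint actions, whose number grows exponentially with the number of action variables, while a policy backup evaluates a single fixed action per state.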
Author Information
Aswin Raghavan (Oregon State University)
Roni Khardon (Indiana University, Bloomington)
Alan Fern (Oregon State University)
Prasad Tadepalli (Oregon State University)
More from the Same Authors
- 2021 Spotlight: Optimal Policies Tend To Seek Power
  Alex Turner · Logan Smith · Rohin Shah · Andrew Critch · Prasad Tadepalli
- 2021 : Deep RePReL--Combining Planning and Deep RL for acting in relational domains
  Harsha Kokel · Arjun Manoharan · Sriraam Natarajan · Balaraman Ravindran · Prasad Tadepalli
- 2021 Poster: One Explanation is Not Enough: Structured Attention Graphs for Image Classification
  Vivswan Shitole · Fuxin Li · Minsuk Kahng · Prasad Tadepalli · Alan Fern
- 2021 Poster: Optimal Policies Tend To Seek Power
  Alex Turner · Logan Smith · Rohin Shah · Andrew Critch · Prasad Tadepalli
- 2020 Poster: Avoiding Side Effects in Complex Environments
  Alex Turner · Neale Ratzlaff · Prasad Tadepalli
- 2020 Spotlight: Avoiding Side Effects in Complex Environments
  Alex Turner · Neale Ratzlaff · Prasad Tadepalli
- 2018 Poster: From Stochastic Planning to Marginal MAP
  Hao(Jackson) Cui · Radu Marinescu · Roni Khardon
- 2017 Poster: Excess Risk Bounds for the Bayes Risk using Variational Inference in Latent Gaussian Models
  Rishit Sheth · Roni Khardon
- 2012 Poster: A Bayesian Approach for Policy Learning from Trajectory Preference Queries
  Aaron Wilson · Alan Fern · Prasad Tadepalli
- 2011 Poster: Budgeted Optimization with Concurrent Stochastic-Duration Experiments
  Javad Azimi · Alan Fern · Xiaoli Fern
- 2011 Spotlight: Budgeted Optimization with Concurrent Stochastic-Duration Experiments
  Javad Azimi · Alan Fern · Xiaoli Fern
- 2011 Poster: Autonomous Learning of Action Models for Planning
  Neville Mehta · Prasad Tadepalli · Alan Fern
- 2011 Poster: Inverting Grice's Maxims to Learn Rules from Natural Language Extractions
  M. Shahed Sorower · Thomas Dietterich · Janardhan Rao Doppa · Walker Orr · Prasad Tadepalli · Xiaoli Fern
- 2010 Poster: A Computational Decision Theory for Interactive Assistants
  Alan Fern · Prasad Tadepalli