Timezone: »
Markov Decision Processes (MDPs) are extremely useful for modeling and solving sequential decision making problems. Graph-based MDPs provide a compact representation for MDPs with large numbers of random variables. However, the complexity of exactly solving a graph-based MDP usually grows exponentially in the number of variables, which limits their application. We present a new variational framework to describe and solve the planning problem of MDPs, and derive both exact and approximate planning algorithms. In particular, by exploiting the graph structure of graph-based MDPs, we propose a factored variational value iteration algorithm in which the value function is first approximated by the multiplication of local-scope value functions, then solved by minimizing a Kullback-Leibler (KL) divergence. The KL divergence is optimized using the belief propagation algorithm, with complexity exponential in only the cluster size of the graph. Experimental comparison on different models shows that our algorithm outperforms existing approximation algorithms at finding good policies.
Author Information
Qiang Cheng (Tsinghua University)
Qiang Liu (UC Irvine)
Feng Chen (Tsinghua University)
Alexander Ihler (UC Irvine)
More from the Same Authors
-
2021 : Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates »
Litian Liang · Yaosheng Xu · Stephen McAleer · Dailin Hu · Alexander Ihler · Pieter Abbeel · Roy Fox -
2020 Poster: Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning »
Tianren Zhang · Shangqi Guo · Tian Tan · Xiaolin Hu · Feng Chen -
2020 Spotlight: Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning »
Tianren Zhang · Shangqi Guo · Tian Tan · Xiaolin Hu · Feng Chen -
2019 Poster: Convolution with even-sized kernels and symmetric padding »
Shuang Wu · Guanrui Wang · Pei Tang · Feng Chen · Luping Shi -
2018 Poster: Lifted Weighted Mini-Bucket »
Nicholas Gallo · Alexander Ihler -
2017 Workshop: NIPS Highlights (MLTrain), Learn How to code a paper with state of the art frameworks »
Alex Dimakis · Nikolaos Vasiloglou · Guy Van den Broeck · Alexander Ihler · Assaf Araki -
2017 Poster: Dynamic Importance Sampling for Anytime Bounds of the Partition Function »
Qi Lou · Rina Dechter · Alexander Ihler -
2016 Poster: Learning Infinite RBMs with Frank-Wolfe »
Wei Ping · Qiang Liu · Alexander Ihler -
2015 Poster: Probabilistic Variational Bounds for Graphical Models »
Qiang Liu · John Fisher III · Alexander Ihler -
2015 Poster: Decomposition Bounds for Marginal MAP »
Wei Ping · Qiang Liu · Alexander Ihler -
2014 Poster: Distributed Estimation, Information Loss and Exponential Families »
Qiang Liu · Alexander Ihler -
2013 Workshop: Crowdsourcing: Theory, Algorithms and Applications »
Jennifer Wortman Vaughan · Greg Stoddard · Chien-Ju Ho · Adish Singla · Michael Bernstein · Devavrat Shah · Arpita Ghosh · Evgeniy Gabrilovich · Denny Zhou · Nikhil Devanur · Xi Chen · Alexander Ihler · Qiang Liu · Genevieve Patterson · Ashwinkumar Badanidiyuru Varadaraja · Hossein Azari Soufiani · Jacob Whitehill -
2013 Poster: Scoring Workers in Crowdsourcing: How Many Control Questions are Enough? »
Qiang Liu · Alexander Ihler · Mark Steyvers -
2013 Spotlight: Scoring Workers in Crowdsourcing: How Many Control Questions are Enough? »
Qiang Liu · Alexander Ihler · Mark Steyvers -
2012 Poster: Variational Inference for Crowdsourcing »
Qiang Liu · Jian Peng · Alexander Ihler -
2009 Poster: Particle-based Variational Inference for Continuous Systems »
Alexander Ihler · Andrew Frank · Padhraic Smyth -
2006 Poster: Learning Time-Intensity Profiles of Human Activity using Non-Parametric Bayesian Models »
Alexander Ihler · Padhraic Smyth