Spotlight
Stable Dual Dynamic Programming
Tao Wang · Daniel Lizotte · Michael Bowling · Dale Schuurmans
Recently, a novel approach to dynamic programming and reinforcement learning has been proposed based on maintaining explicit representations of stationary distributions instead of value functions. However, the convergence properties and practical effectiveness of these algorithms have not previously been studied. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
Author Information
Tao Wang (Australian National University / University of Alberta)
Daniel Lizotte (The University of Western Ontario)
Michael Bowling (DeepMind / University of Alberta)
Dale Schuurmans (Google Brain & University of Alberta)
Related Events (a corresponding poster, oral, or spotlight)
-
2007 Poster: Stable Dual Dynamic Programming »
Tue. Dec 4th 06:30 -- 06:40 PM Room
More from the Same Authors
-
2019 Poster: Surrogate Objectives for Batch Policy Optimization in One-step Decision Making »
Minmin Chen · Ramki Gummadi · Chris Harris · Dale Schuurmans -
2016 : Computer Curling: AI in Sports Analytics »
Michael Bowling -
2016 Poster: The Forget-me-not Process »
Kieran Milan · Joel Veness · James Kirkpatrick · Michael Bowling · Anna Koop · Demis Hassabis -
2016 Poster: Deep Learning Games »
Dale Schuurmans · Martin A Zinkevich -
2016 Poster: Reward Augmented Maximum Likelihood for Neural Structured Prediction »
Mohammad Norouzi · Samy Bengio · Zhifeng Chen · Navdeep Jaitly · Mike Schuster · Yonghui Wu · Dale Schuurmans -
2015 Poster: Embedding Inference for Structured Multilabel Prediction »
Farzaneh Mirzazadeh · Siamak Ravanbakhsh · Nan Ding · Dale Schuurmans -
2014 Workshop: Representation and Learning Methods for Complex Outputs »
Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian YU · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem SAHBI · Fabio Massimo Zanzotto -
2014 Poster: Convex Deep Learning via Normalized Kernels »
Özlem Aslan · Xinhua Zhang · Dale Schuurmans -
2013 Workshop: Output Representation Learning »
Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · Julien Audiffren · Carlo Ciliberto · Dan Goldwasser -
2013 Poster: Convex Two-Layer Modeling »
Özlem Aslan · Hao Cheng · Xinhua Zhang · Dale Schuurmans -
2013 Spotlight: Convex Two-Layer Modeling »
Özlem Aslan · Hao Cheng · Xinhua Zhang · Dale Schuurmans -
2013 Poster: Polar Operators for Structured Sparse Estimation »
Xinhua Zhang · Yao-Liang Yu · Dale Schuurmans -
2012 Poster: Sketch-Based Linear Value Function Approximation »
Marc Bellemare · Joel Veness · Michael Bowling -
2012 Poster: Convex Multi-view Subspace Learning »
Martha White · Yao-Liang Yu · Xinhua Zhang · Dale Schuurmans -
2012 Poster: Accelerated Training for Matrix-norm Regularization: A Boosting Approach »
Xinhua Zhang · Yao-Liang Yu · Dale Schuurmans -
2012 Poster: Tractable Objectives for Robust Policy Optimization »
Katherine Chen · Michael Bowling -
2012 Poster: A Polynomial-time Form of Robust Regression »
Yao-Liang Yu · Özlem Aslan · Dale Schuurmans -
2011 Poster: Convergent Fitted Value Iteration with Linear Function Approximation »
Daniel Lizotte -
2011 Poster: Variance Reduction in Monte-Carlo Tree Search »
Joel Veness · Marc Lanctot · Michael Bowling -
2010 Workshop: Learning and Planning from Batch Time Series Data »
Daniel Lizotte · Michael Bowling · Susan Murphy · Joelle Pineau · Sandeep Vijan -
2010 Poster: Relaxed Clipping: A Global Training Method for Robust Regression and Classification »
Yao-Liang Yu · Min Yang · Linli Xu · Martha White · Dale Schuurmans -
2009 Poster: Strategy Grafting in Extensive Games »
Kevin G Waugh · Nolan Bard · Michael Bowling -
2009 Poster: Convex Relaxation of Mixture Regression with Efficient Algorithms »
Novi Quadrianto · Tiberio Caetano · John Lim · Dale Schuurmans -
2009 Poster: A General Projection Property for Distribution Families »
Yao-Liang Yu · Yuxi Li · Dale Schuurmans · Csaba Szepesvari -
2009 Poster: Monte Carlo Sampling for Regret Minimization in Extensive Games »
Marc Lanctot · Kevin G Waugh · Martin A Zinkevich · Michael Bowling -
2008 Session: Oral session 3: Learning from Reinforcement: Modeling and Control »
Michael Bowling -
2007 Session: Spotlights »
Dale Schuurmans -
2007 Spotlight: Regret Minimization in Games with Incomplete Information »
Martin A Zinkevich · Michael Johanson · Michael Bowling · Carmelo Piccione -
2007 Poster: Regret Minimization in Games with Incomplete Information »
Martin A Zinkevich · Michael Johanson · Michael Bowling · Carmelo Piccione -
2007 Poster: Convex Relaxations of EM »
Yuhong Guo · Dale Schuurmans -
2007 Poster: Computing Robust Counter-Strategies »
Michael Johanson · Martin A Zinkevich · Michael Bowling -
2007 Poster: Discriminative Batch Mode Active Learning »
Yuhong Guo · Dale Schuurmans -
2006 Poster: Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields »
Chi-Hoon Lee · Shaojun Wang · Feng Jiao · Dale Schuurmans · Russell Greiner -
2006 Poster: Implicit Online Learning with Kernels »
Li Cheng · Vishwanathan S V N · Dale Schuurmans · Shaojun Wang · Terry Caelli -
2006 Poster: iLSTD: Convergence, Eligibility Traces, and Mountain Car »
Alborz Geramifard · Michael Bowling · Martin A Zinkevich · Richard Sutton