The performance of a reinforcement learning (RL) system depends on the computational architecture used to approximate a value function. We propose an online RL algorithm for adapting a value function’s architecture and efficiently finding useful nonlinear features. The algorithm is evaluated in a spatial domain with high-dimensional, stochastic observations. Our method outperforms non-adaptive baseline architectures and approaches the performance of an architecture given side-channel information about observational structure. These results are a step towards scalable RL algorithms for more general problem settings, where observational structure is unavailable.
Author Information
Fatima Davelouis (University of Alberta)
I am a Master's student at the University of Alberta, working on reinforcement learning with Professor Michael Bowling. In my current research, I aim to build representations from an agent's predictions in POMDP settings.
John Martin (University of Alberta)
Joseph Modayil (DeepMind)
Michael Bowling (DeepMind / University of Alberta)
More from the Same Authors
- 2022: Adapting the Function Approximation Architecture in Online Reinforcement Learning
  John Martin · Joseph Modayil · Fatima Davelouis · Michael Bowling
- 2022: Learning to Prioritize Planning Updates in Model-based Reinforcement Learning
  Brad Burega · John Martin · Michael Bowling
- 2022: Oral Presentation 7: Adapting the Function Approximation Architecture in Online Reinforcement Learning
  Fatima Davelouis
- 2020: Invited Talk 2: Michael Bowling (University of Alberta) - Hindsight Rationality: Alternatives to Nash
  Michael Bowling
- 2020 Poster: Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
  Zaheen Ahmad · Levi Lelis · Michael Bowling
- 2020: Discussion Panel: Hugo Larochelle, Finale Doshi-Velez, Devi Parikh, Marc Deisenroth, Julien Mairal, Katja Hofmann, Phillip Isola, and Michael Bowling
  Hugo Larochelle · Finale Doshi-Velez · Marc Deisenroth · Devi Parikh · Julien Mairal · Katja Hofmann · Phillip Isola · Michael Bowling
- 2019 Poster: Ease-of-Teaching and Language Structure from Emergent Communication
  Fushan Li · Michael Bowling
- 2016: Computer Curling: AI in Sports Analytics
  Michael Bowling
- 2016 Poster: The Forget-me-not Process
  Kieran Milan · Joel Veness · James Kirkpatrick · Michael Bowling · Anna Koop · Demis Hassabis
- 2012 Poster: Sketch-Based Linear Value Function Approximation
  Marc Bellemare · Joel Veness · Michael Bowling
- 2012 Poster: Tractable Objectives for Robust Policy Optimization
  Katherine Chen · Michael Bowling
- 2011 Poster: Variance Reduction in Monte-Carlo Tree Search
  Joel Veness · Marc Lanctot · Michael Bowling
- 2010 Workshop: Learning and Planning from Batch Time Series Data
  Daniel Lizotte · Michael Bowling · Susan Murphy · Joelle Pineau · Sandeep Vijan
- 2009 Poster: Strategy Grafting in Extensive Games
  Kevin G Waugh · Nolan Bard · Michael Bowling
- 2009 Poster: Monte Carlo Sampling for Regret Minimization in Extensive Games
  Marc Lanctot · Kevin G Waugh · Martin A Zinkevich · Michael Bowling
- 2008 Session: Oral session 3: Learning from Reinforcement: Modeling and Control
  Michael Bowling
- 2007 Spotlight: Stable Dual Dynamic Programming
  Tao Wang · Daniel Lizotte · Michael Bowling · Dale Schuurmans
- 2007 Poster: Stable Dual Dynamic Programming
  Tao Wang · Daniel Lizotte · Michael Bowling · Dale Schuurmans
- 2007 Spotlight: Regret Minimization in Games with Incomplete Information
  Martin A Zinkevich · Michael Johanson · Michael Bowling · Carmelo Piccione
- 2007 Poster: Regret Minimization in Games with Incomplete Information
  Martin A Zinkevich · Michael Johanson · Michael Bowling · Carmelo Piccione
- 2007 Poster: Computing Robust Counter-Strategies
  Michael Johanson · Martin A Zinkevich · Michael Bowling
- 2006 Poster: iLSTD: Convergence, Eligibility Traces, and Mountain Car
  Alborz Geramifard · Michael Bowling · Martin A Zinkevich · Richard Sutton