We propose a model-based lifelong reinforcement-learning approach that estimates a hierarchical Bayesian posterior distilling the common structure shared across tasks. Combined with a sample-based Bayesian exploration procedure, the learned posterior increases the sample efficiency of learning across a family of related tasks. We first analyze the relationship between sample complexity and the initialization quality of the posterior in the finite MDP setting. We then scale the approach to continuous-state domains by introducing a Variational Bayesian Lifelong Reinforcement Learning algorithm that can be combined with recent model-based deep RL methods and that exhibits backward transfer. Experimental results on several challenging domains show that our algorithms achieve better forward and backward transfer performance than state-of-the-art lifelong RL methods.
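To make the finite-MDP idea concrete, the following is a minimal sketch of sample-based Bayesian exploration with a prior shared across tasks. It is not the authors' algorithm: it assumes a known reward table, uses independent Dirichlet pseudo-counts as a crude stand-in for the hierarchical posterior, and all names (`sample_model`, `greedy_policy`, `update_shared_prior`, the `weight` parameter) are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) using gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]  # every alpha must be > 0
    total = sum(draws)
    return [d / total for d in draws]


def sample_model(prior_counts, task_counts, rng):
    """Sample a transition model P[s][a] from the per-task Dirichlet posterior.

    The posterior pseudo-counts are the shared prior plus the counts observed
    in the current task (assumption: independent Dirichlet per state-action).
    """
    return [
        [
            sample_dirichlet(
                [p + c for p, c in zip(prior_counts[s][a], task_counts[s][a])], rng
            )
            for a in range(len(prior_counts[s]))
        ]
        for s in range(len(prior_counts))
    ]


def greedy_policy(P, R, gamma=0.9, iters=200):
    """Run value iteration on model P with rewards R[s][a]; return greedy actions."""
    n_states = len(P)

    def q(s, a, V):
        return R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))

    V = [0.0] * n_states
    for _ in range(iters):
        V = [max(q(s, a, V) for a in range(len(P[s]))) for s in range(n_states)]
    return [max(range(len(P[s])), key=lambda a: q(s, a, V)) for s in range(n_states)]


def update_shared_prior(prior_counts, task_counts, weight=0.5):
    """Fold a finished task's counts into the shared pseudo-counts.

    `weight` (hypothetical) controls how strongly one task shapes the prior
    that later tasks start from.
    """
    return [
        [
            [p + weight * c for p, c in zip(prior_counts[s][a], task_counts[s][a])]
            for a in range(len(prior_counts[s]))
        ]
        for s in range(len(prior_counts))
    ]
```

In use, an agent would call `sample_model` at the start of each episode, act according to `greedy_policy` on the sampled model, add the observed transitions to `task_counts`, and call `update_shared_prior` once the task ends so that the next task begins from a better-initialized posterior.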
Author Information
Haotian Fu (Brown University)
Shangqun Yu (Brown University)
Michael Littman (Brown University)
George Konidaris (Brown University)
More from the Same Authors
- 2021: Bayesian Exploration for Lifelong Reinforcement Learning
  Haotian Fu · Shangqun Yu · Michael Littman · George Konidaris
- 2023 Poster: Effectively Learning Initiation Sets in Hierarchical Reinforcement Learning
  Akhil Bagaria · Ben Abbatematteo · Omer Gottesman · Matt Corsaro · Sreehari Rammohan · George Konidaris
- 2022 Spotlight: Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex
  Charles Lovering · Jessica Forde · George Konidaris · Ellie Pavlick · Michael Littman
- 2022 Poster: Effects of Data Geometry in Early Deep Learning
  Saket Tiwari · George Konidaris
- 2022 Poster: Faster Deep Reinforcement Learning with Slower Online Network
  Kavosh Asadi · Rasool Fakoor · Omer Gottesman · Taesup Kim · Michael Littman · Alexander Smola
- 2022 Poster: Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex
  Charles Lovering · Jessica Forde · George Konidaris · Ellie Pavlick · Michael Littman
- 2021: George Konidaris Talk Q&A
  George Konidaris
- 2021: Invited Talk: George Konidaris - Signal to Symbol (via Skills)
  George Konidaris
- 2021 Poster: On the Expressivity of Markov Reward
  David Abel · Will Dabney · Anna Harutyunyan · Mark Ho · Michael Littman · Doina Precup · Satinder Singh
- 2021 Poster: Learning Markov State Abstractions for Deep Reinforcement Learning
  Cameron Allen · Neev Parikh · Omer Gottesman · George Konidaris
- 2021 Oral: On the Expressivity of Markov Reward
  David Abel · Will Dabney · Anna Harutyunyan · Mark Ho · Michael Littman · Doina Precup · Satinder Singh
- 2020: Panel Discussions
  Grace Lindsay · George Konidaris · Shakir Mohamed · Kimberly Stachenfeld · Peter Dayan · Yael Niv · Doina Precup · Catherine Hartley · Ishita Dasgupta
- 2020: Invited Talk #4 QnA - George Konidaris
  George Konidaris · Raymond Chua · Feryal Behbahani
- 2020: Invited Talk #4 George Konidaris - Signal to Symbol (via Skills)
  George Konidaris
- 2017 Poster: Active Exploration for Learning Symbolic Representations
  Garrett Andersen · George Konidaris
- 2017 Poster: Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
  Taylor Killian · Samuel Daulton · Finale Doshi-Velez · George Konidaris
- 2017 Oral: Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
  Taylor Killian · Samuel Daulton · Finale Doshi-Velez · George Konidaris
- 2016 Workshop: The Future of Interactive Machine Learning
  Kory Mathewson · Kaushik Subramanian · Mark Ho · Robert Loftin · Joseph L Austerweil · Anna Harutyunyan · Doina Precup · Layla El Asri · Matthew Gombolay · Jerry Zhu · Sonia Chernova · Charles Isbell · Patrick M Pilarski · Weng-Keen Wong · Manuela Veloso · Julie A Shah · Matthew Taylor · Brenna Argall · Michael Littman
- 2016 Oral: Showing versus doing: Teaching by demonstration
  Mark Ho · Michael Littman · James MacGlashan · Fiery Cushman · Joseph L Austerweil
- 2016 Poster: Showing versus doing: Teaching by demonstration
  Mark Ho · Michael Littman · James MacGlashan · Fiery Cushman · Joseph L Austerweil
- 2015 Poster: Policy Evaluation Using the Ω-Return
  Philip Thomas · Scott Niekum · Georgios Theocharous · George Konidaris
- 2011 Poster: TD_gamma: Re-evaluating Complex Backups in Temporal Difference Learning
  George Konidaris · Scott Niekum · Philip Thomas
- 2010 Poster: Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories
  George Konidaris · Scott R Kuindersma · Andrew G Barto · Roderic A Grupen
- 2009 Poster: Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining
  George Konidaris · Andrew G Barto
- 2009 Spotlight: Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining
  George Konidaris · Andrew G Barto