Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without consideration of the empirical prediction error. For example, PAC-MDP approaches such as Rmax base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches which drive exploration solely based on empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then provide experimental studies demonstrating the robustness of these exploration measures in cases of non-stationary environments or where original approaches are misled by wrong domain assumptions.
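The contrast drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the names `rmax_known`, `ProgressTracker`, and `progress_known`, the sliding-window estimate of learning progress, and the threshold test are all assumptions made for illustration. The sketch only shows the difference between declaring a state-action pair "known" from its visit count (as in Rmax) and declaring it "known" once the empirically measured prediction loss has stopped changing.

```python
from collections import deque


def rmax_known(count, m):
    """Count-based criterion in the style of Rmax: known after m visits."""
    return count >= m


class ProgressTracker:
    """Keeps recent empirical prediction losses for one state-action pair.

    Hypothetical helper: learning progress is estimated as the drop in mean
    loss between an older window and a more recent window of equal size.
    """

    def __init__(self, window=4):
        self.window = window
        self.losses = deque(maxlen=2 * window)

    def add(self, loss):
        self.losses.append(loss)

    def progress(self):
        # Not enough data yet: report None so the pair is still explored.
        if len(self.losses) < 2 * self.window:
            return None
        items = list(self.losses)
        older = sum(items[: self.window]) / self.window
        recent = sum(items[self.window:]) / self.window
        return older - recent


def progress_known(tracker, threshold):
    """Empirical criterion: known once the loss has stopped changing.

    Using the absolute value is an assumption of this sketch: a loss that
    rises again (e.g. after the environment changes) re-triggers exploration.
    """
    p = tracker.progress()
    return p is not None and abs(p) < threshold


if __name__ == "__main__":
    tracker = ProgressTracker(window=2)
    for loss in [1.0, 1.0, 0.5, 0.5]:
        tracker.add(loss)
    print(rmax_known(5, m=5), progress_known(tracker, threshold=0.1))
```

Under the count-based test, a pair stays "known" forever once the threshold is reached. Under the progress-based test of this sketch, a pair can return to "unknown" if its measured loss starts changing again, which is the kind of behavior relevant to the non-stationary settings mentioned in the abstract.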
Author Information
Manuel Lopes (INRIA)
Tobias Lang
Marc Toussaint (TU Berlin)
Pierre-Yves Oudeyer (INRIA)
More from the Same Authors
- 2022: Using Confounded Data in Offline RL
  Maxime Gasse · Damien GRASSET · Guillaume Gaudron · Pierre-Yves Oudeyer
- 2022 Poster: EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL
  Thomas Carta · Pierre-Yves Oudeyer · Olivier Sigaud · Sylvain Lamprier
- 2021: Sculpting (human-like) AI systems by sculpting their (social) environments
  Pierre-Yves Oudeyer
- 2021 Poster: Grounding Spatio-Temporal Language with Transformers
  Tristan Karch · Laetitia Teodorescu · Katja Hofmann · Clément Moulin-Frier · Pierre-Yves Oudeyer
- 2020: Panel discussion
  Pierre-Yves Oudeyer · Marc Bellemare · Peter Stone · Matt Botvinick · Susan Murphy · Anusha Nagabandi · Ashley Edwards · Karen Liu · Pieter Abbeel
- 2020: Invited talk: PierreYves Oudeyer "Machines that invent their own problems: Towards open-ended learning of skills"
  Pierre-Yves Oudeyer
- 2020 Poster: Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems
  Mayalen Etcheverry · Clément Moulin-Frier · Pierre-Yves Oudeyer
- 2020 Oral: Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems
  Mayalen Etcheverry · Clément Moulin-Frier · Pierre-Yves Oudeyer
- 2020 Poster: Language as a Cognitive Tool to Imagine Goals in Curiosity Driven Exploration
  Cédric Colas · Tristan Karch · Nicolas Lair · Jean-Michel Dussoux · Clément Moulin-Frier · Peter F Dominey · Pierre-Yves Oudeyer
- 2016 Demonstration: Autonomous exploration, active learning and human guidance with open-source Poppy humanoid robot platform and Explauto library
  Sébastien Forestier · Yoan Mollard · Pierre-Yves Oudeyer
- 2010 Poster: An Approximate Inference Approach to Temporal Optimization in Optimal Control
  Konrad C Rawlik · Marc Toussaint · Sethu Vijayakumar
- 2007 Workshop: Robotics Challenges for Machine Learning
  Jan Peters · Marc Toussaint
- 2007 Poster: Modelling motion primitives and their timing in biologically executed movements
  Ben H Williams · Marc Toussaint · Amos Storkey