It has been a long-standing dream to design artificial agents that explore their environment efficiently via intrinsic motivation, similar to how children perform curious free play. Despite recent advances in intrinsically motivated reinforcement learning (RL), sample-efficient exploration in object manipulation scenarios remains a significant challenge, as most of the relevant information lies in the sparse agent-object and object-object interactions. In this paper, we propose to use structured world models to incorporate relational inductive biases into the control loop, achieving sample-efficient and interaction-rich exploration in compositional multi-object environments. By planning for future novelty inside structured world models, our method generates free-play behavior that starts interacting with objects early on and develops more complex behavior over time. Instead of using models only to compute intrinsic rewards, as is commonly done, our method demonstrates that the self-reinforcing cycle between good models and good exploration also opens up another avenue: zero-shot generalization to downstream tasks via model-based planning. After the entirely intrinsic, task-agnostic exploration phase, our method solves challenging downstream tasks such as stacking, flipping, pick & place, and throwing, generalizing to unseen numbers and arrangements of objects without any additional training.
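To make the "planning for future novelty" mechanism concrete, below is a minimal sketch, not the authors' implementation: an ensemble of learned world models supplies an intrinsic reward via prediction disagreement, and a shooting-style planner picks the action sequence whose imagined rollout maximizes that disagreement. The linear placeholder models, the random-shooting planner, and names such as `EnsembleDynamics` and `plan_for_novelty` are illustrative assumptions; the paper itself uses ensembles of structured (graph network) world models and a stronger sampling-based planner.

```python
import numpy as np

class EnsembleDynamics:
    """Ensemble of placeholder linear dynamics models s' = A_m s + B_m a.
    These stand in for trained world models (the paper uses structured,
    graph-network models over object slots)."""
    def __init__(self, state_dim, action_dim, n_members=5, seed=0):
        rng = np.random.default_rng(seed)
        self.n_members = n_members
        self.A = rng.normal(0.0, 0.1, (n_members, state_dim, state_dim))
        self.B = rng.normal(0.0, 0.1, (n_members, state_dim, action_dim))

    def step(self, states, action):
        """Advance each member's imagined state with its own model.
        states: (n_members, state_dim), action: (action_dim,)."""
        return (np.einsum('mij,mj->mi', self.A, states)
                + np.einsum('mij,j->mi', self.B, action))

def disagreement(states):
    """Intrinsic reward: variance across ensemble predictions, averaged
    over state dimensions (high where the models are uncertain)."""
    return states.var(axis=0).mean()

def plan_for_novelty(model, state, action_dim, horizon=10,
                     n_candidates=256, seed=0):
    """Random-shooting planner: score candidate action sequences by the
    cumulative imagined ensemble disagreement, then return the first
    action of the best sequence (MPC-style)."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, (n_candidates, horizon, action_dim))
    best_score, best_first_action = -np.inf, None
    for seq in candidates:
        # Each ensemble member rolls out its own imagined trajectory.
        states = np.repeat(state[None], model.n_members, axis=0)
        score = 0.0
        for action in seq:
            states = model.step(states, action)
            score += disagreement(states)
        if score > best_score:
            best_score, best_first_action = score, seq[0]
    return best_first_action

# Toy usage: pick the next exploratory action for a 6-D state, 2-D action.
model = EnsembleDynamics(state_dim=6, action_dim=2)
action = plan_for_novelty(model, state=np.zeros(6), action_dim=2)
print(action)
```

In a full exploration loop, the chosen action would be executed, the transition added to a replay buffer, and the ensemble retrained, closing the self-reinforcing cycle between better models and better exploration that the abstract describes; swapping the novelty score for a task reward at test time yields the zero-shot, planning-based task solving.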
Author Information
Cansu Sancaktar (Max Planck Institute for Intelligent Systems)
Sebastian Blaes (Max Planck Institute for Intelligent Systems, Tuebingen, Germany)
Georg Martius (Max Planck Institute for Intelligent Systems)
More from the Same Authors
- 2022 : Fifteen-minute Competition Overview Video »
  Nico Gürtler · Georg Martius · Pavel Kolev · Sebastian Blaes · Manuel Wuethrich · Markus Wulfmeier · Cansu Sancaktar · Martin Riedmiller · Arthur Allshire · Bernhard Schölkopf · Annika Buchholz · Stefan Bauer
- 2022 : Neural All-Pairs Shortest Path for Reinforcement Learning »
  Cristina Pinneri · Georg Martius · Andreas Krause
- 2022 : Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning »
  Onno Eberhard · Jakob Hollenstein · Cristina Pinneri · Georg Martius
- 2023 Poster: Optimistic Active Exploration of Dynamical Systems »
  Bhavya · Lenart Treven · Cansu Sancaktar · Sebastian Blaes · Stelian Coros · Andreas Krause
- 2023 Poster: Object-Centric Learning for Real-World Videos by Predicting Temporal Feature Similarities »
  Andrii Zadaianchuk · Maximilian Seitzer · Georg Martius
- 2023 Poster: Goal-conditioned Offline Planning from Curious Exploration »
  Marco Bagatella · Georg Martius
- 2023 Poster: Online Learning under Adversarial Nonlinear Constraints »
  Pavel Kolev · Georg Martius · Michael Muehlebach
- 2023 Poster: Regularity as Intrinsic Reward for Free Play »
  Cansu Sancaktar · Justus Piater · Georg Martius
- 2023 Workshop: Intrinsically Motivated Open-ended Learning (IMOL) Workshop »
  Cédric Colas · Laetitia Teodorescu · Nadia Ady · Cansu Sancaktar · Junyi Chu
- 2022 Spotlight: Embrace the Gap: VAEs Perform Independent Mechanism Analysis »
  Patrik Reizinger · Luigi Gresele · Jack Brady · Julius von Kügelgen · Dominik Zietlow · Bernhard Schölkopf · Georg Martius · Wieland Brendel · Michel Besserve
- 2022 Competition: Real Robot Challenge III - Learning Dexterous Manipulation from Offline Data in the Real World »
  Nico Gürtler · Georg Martius · Sebastian Blaes · Pavel Kolev · Cansu Sancaktar · Stefan Bauer · Manuel Wuethrich · Markus Wulfmeier · Martin Riedmiller · Arthur Allshire · Annika Buchholz · Bernhard Schölkopf
- 2022 Poster: Embrace the Gap: VAEs Perform Independent Mechanism Analysis »
  Patrik Reizinger · Luigi Gresele · Jack Brady · Julius von Kügelgen · Dominik Zietlow · Bernhard Schölkopf · Georg Martius · Wieland Brendel · Michel Besserve
- 2020 : Opening »
  Marin Vlastelica Pogančić · Georg Martius
- 2020 Workshop: Learning Meets Combinatorial Algorithms »
  Marin Vlastelica · Jialin Song · Aaron Ferber · Brandon Amos · Georg Martius · Bistra Dilkina · Yisong Yue
- 2019 Poster: Control What You Can: Intrinsically Motivated Task-Planning Agent »
  Sebastian Blaes · Marin Vlastelica Pogančić · Jiajie Zhu · Georg Martius
- 2018 Poster: L4: Practical loss-based stepsize adaptation for deep learning »
  Michal Rolinek · Georg Martius