In off-policy deep reinforcement learning with continuous action spaces, exploration is often implemented by injecting action noise into the action selection process. Popular algorithms based on stochastic policies, such as SAC or MPO, inject white noise by sampling actions from uncorrelated Gaussian distributions. In many tasks, however, white noise does not provide sufficient exploration, and temporally correlated noise is used instead. A common choice is Ornstein-Uhlenbeck (OU) noise, which is closely related to Brownian motion (red noise). Both red noise and white noise belong to the broad family of colored noise. In this work, we perform a comprehensive experimental evaluation on MPO and SAC to explore the effectiveness of other colors of noise as action noise. We find that pink noise, which is halfway between white and red noise, significantly outperforms white noise, OU noise, and other alternatives on a wide range of environments. Thus, we recommend it as the default choice for action noise in continuous control.
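The colored-noise family described above can be sampled with a standard spectral method: draw a random spectrum whose power follows 1/f^β and transform it back to the time domain, where β = 0 gives white noise, β = 1 pink noise, and β = 2 red (Brownian) noise. The sketch below illustrates this; the function name, normalization, and FFT-based approach are our own illustration, not the authors' implementation.

```python
import numpy as np

def colored_noise(beta, n_steps, n_dim=1, rng=None):
    """Sample noise with power spectrum S(f) ~ 1/f**beta (spectral method).

    beta = 0: white noise, beta = 1: pink noise, beta = 2: red noise.
    Returns an array of shape (n_dim, n_steps), unit variance per dimension.
    """
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.fft.rfftfreq(n_steps)
    # Magnitude envelope ~ f**(-beta/2); reuse the first nonzero
    # frequency's scale at f = 0 to avoid division by zero.
    scale = np.empty_like(freqs)
    scale[1:] = freqs[1:] ** (-beta / 2)
    scale[0] = scale[1]
    # Random complex spectrum with the desired magnitude envelope.
    spectrum = scale * (rng.standard_normal((n_dim, len(freqs)))
                        + 1j * rng.standard_normal((n_dim, len(freqs))))
    noise = np.fft.irfft(spectrum, n=n_steps, axis=-1)
    # Normalize each dimension to unit variance.
    noise /= noise.std(axis=-1, keepdims=True)
    return noise

# One episode of pink action noise for a 2-dimensional action space.
eps = colored_noise(beta=1.0, n_steps=1000, n_dim=2)
```

In an off-policy agent, the sampled sequence `eps` would simply replace the per-step white-noise draws: at time step t, the action perturbation is `eps[:, t]`, so the noise is correlated across the episode rather than independent at each step.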
Author Information
Onno Eberhard (Max Planck Institute for Intelligent Systems)
Jakob Hollenstein (Universität Innsbruck)
Cristina Pinneri (Max Planck Institute for Intelligent Systems)
Georg Martius (Max Planck Institute for Intelligent Systems)
More from the Same Authors
- 2022: Fifteen-minute Competition Overview Video
  Nico Gürtler · Georg Martius · Pavel Kolev · Sebastian Blaes · Manuel Wuethrich · Markus Wulfmeier · Cansu Sancaktar · Martin Riedmiller · Arthur Allshire · Bernhard Schölkopf · Annika Buchholz · Stefan Bauer
- 2022: Neural All-Pairs Shortest Path for Reinforcement Learning
  Cristina Pinneri · Georg Martius · Andreas Krause
- 2022: Hypernetwork-PPO for Continual Reinforcement Learning
  Philemon Schöpf · Sayantan Auddy · Jakob Hollenstein · Antonio Rodriguez-sanchez
- 2022 Spotlight: Embrace the Gap: VAEs Perform Independent Mechanism Analysis
  Patrik Reizinger · Luigi Gresele · Jack Brady · Julius von Kügelgen · Dominik Zietlow · Bernhard Schölkopf · Georg Martius · Wieland Brendel · Michel Besserve
- 2022 Competition: Real Robot Challenge III - Learning Dexterous Manipulation from Offline Data in the Real World
  Nico Gürtler · Georg Martius · Sebastian Blaes · Pavel Kolev · Cansu Sancaktar · Stefan Bauer · Manuel Wuethrich · Markus Wulfmeier · Martin Riedmiller · Arthur Allshire · Annika Buchholz · Bernhard Schölkopf
- 2022 Poster: Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
  Cansu Sancaktar · Sebastian Blaes · Georg Martius
- 2022 Poster: Embrace the Gap: VAEs Perform Independent Mechanism Analysis
  Patrik Reizinger · Luigi Gresele · Jack Brady · Julius von Kügelgen · Dominik Zietlow · Bernhard Schölkopf · Georg Martius · Wieland Brendel · Michel Besserve
- 2020: Opening
  Marin Vlastelica Pogančić · Georg Martius
- 2020 Workshop: Learning Meets Combinatorial Algorithms
  Marin Vlastelica · Jialin Song · Aaron Ferber · Brandon Amos · Georg Martius · Bistra Dilkina · Yisong Yue
- 2019 Poster: Control What You Can: Intrinsically Motivated Task-Planning Agent
  Sebastian Blaes · Marin Vlastelica Pogančić · Jiajie Zhu · Georg Martius
- 2018 Poster: L4: Practical loss-based stepsize adaptation for deep learning
  Michal Rolinek · Georg Martius