Many motor skills in humanoid robotics can be learned using parametrized motor primitives, as is done in imitation learning. However, most interesting motor learning problems are high-dimensional reinforcement learning problems that are often beyond the reach of current methods. In this paper, we extend previous work on policy learning from the immediate reward case to episodic reinforcement learning. We show that this results in a general, common framework that is also connected to policy gradient methods and that yields a novel algorithm for policy learning when we assume a form of exploration that is particularly well-suited for dynamic motor primitives. The resulting algorithm is an EM-inspired algorithm applicable to complex motor learning tasks. We compare this algorithm to alternative parametrized policy search methods and show that it outperforms previous methods. We apply it in the context of motor learning and show that it can learn a complex Ball-in-a-Cup task using a real Barrett WAM robot arm.
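To make the idea of an EM-inspired, episodic policy update more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a toy setting in which the policy is a plain parameter vector, exploration is Gaussian noise added directly to the parameters once per episode, and the return of an episode is non-negative so that it can serve as a weight. The functions `episode_return`, `reward_weighted_update`, and `learn`, as well as all default values, are hypothetical names and choices made for illustration.

```python
import math
import random


def episode_return(theta, target):
    """Toy stand-in for a rollout (assumption): the return is non-negative
    and grows as theta approaches a hidden target parameter vector."""
    sq_dist = sum((t - g) ** 2 for t, g in zip(theta, target))
    return math.exp(-sq_dist)


def reward_weighted_update(theta, target, rng, n_rollouts=20, n_best=5, sigma=0.3):
    """One EM-style step: perturb the parameters once per episode, then move
    theta by the return-weighted average of the best perturbations."""
    rollouts = []
    for _ in range(n_rollouts):
        # Exploration in parameter space: one Gaussian perturbation per episode.
        eps = [rng.gauss(0.0, sigma) for _ in theta]
        perturbed = [t + e for t, e in zip(theta, eps)]
        rollouts.append((episode_return(perturbed, target), eps))

    # Keep only the highest-return rollouts for the update.
    best = sorted(rollouts, key=lambda r: r[0], reverse=True)[:n_best]
    total = sum(ret for ret, _ in best)
    if total == 0.0:
        # No informative rollouts: leave the parameters unchanged.
        return list(theta)

    # Returns act as (unnormalized) weights on the exploration noise.
    return [
        t + sum(ret * eps[i] for ret, eps in best) / total
        for i, t in enumerate(theta)
    ]


def learn(target, n_iterations=50, seed=0):
    """Run repeated reward-weighted updates starting from the zero vector."""
    rng = random.Random(seed)
    theta = [0.0] * len(target)
    for _ in range(n_iterations):
        theta = reward_weighted_update(theta, target, rng)
    return theta
```

Calling `learn([1.0, -0.5])` should return parameters whose toy return is higher than that of the zero starting vector. Unlike a policy gradient step, this update has no learning rate: the step size follows from the weighted average itself, which is one reason such EM-style updates are attractive in this setting.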
Author Information
Jens Kober (Max Planck Institute for Biological Cybernetics)
Jan Peters (TU Darmstadt & DFKI)
Jan Peters is a full professor (W3) for Intelligent Autonomous Systems at the Computer Science Department of the Technische Universitaet Darmstadt and at the same time a senior research scientist and group leader at the Max-Planck Institute for Intelligent Systems, where he heads the interdepartmental Robot Learning Group. Jan Peters has received the Dick Volz Best 2007 US PhD Thesis Runner-Up Award, the Robotics: Science & Systems - Early Career Spotlight, the INNS Young Investigator Award, and the IEEE Robotics & Automation Society's Early Career Award as well as numerous best paper awards. In 2015, he was awarded an ERC Starting Grant. Jan Peters has studied Computer Science, Electrical, Mechanical and Control Engineering at TU Munich and FernUni Hagen in Germany, at the National University of Singapore (NUS) and the University of Southern California (USC). He has received four Master's degrees in these disciplines as well as a Computer Science PhD from USC.
Related Events (a corresponding poster, oral, or spotlight)
- 2008 Poster: Policy Search for Motor Primitives in Robotics
  Wed. Dec 10th through Tue the 9th Room
More from the Same Authors
- 2020: Differentiable Implicit Layers
  Andreas Look · Simona Doneva · Melih Kandemir · Rainer Gemulla · Jan Peters
- 2022: How crucial is Transformer in Decision Transformer?
  Max Siebenborn · Boris Belousov · Junning Huang · Jan Peters
- 2022: Conditioned Score-Based Models for Learning Collision-Free Trajectory Generation
  Joao Carvalho · Mark Baierl · Julen Urain · Jan Peters
- 2023 Poster: Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures
  Hamish Flynn · David Reeb · Melih Kandemir · Jan Peters
- 2023 Poster: Pseudo-Likelihood Inference
  Theo Gruner · Fabio Muratore · Boris Belousov · Daniel Palenicek · Jan Peters
- 2023 Poster: Accelerating Motion Planning via Optimal Transport
  An T. Le · Georgia Chalvatzaki · Armin Biess · Jan Peters
- 2023 Oral: Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures
  Hamish Flynn · David Reeb · Melih Kandemir · Jan Peters
- 2023 Competition: The Robot Air Hockey Challenge: Robust, Reliable, and Safe Learning Techniques for Real-world Robotics
  Puze Liu · Jonas Günster · Niklas Funk · Dong Chen · Haitham Bou Ammar · Davide Tateo · Jan Peters
- 2022 Poster: Information-Theoretic Safe Exploration with Gaussian Processes
  Alessandro Bottero · Carlos Luis · Julia Vinogradska · Felix Berkenkamp · Jan Peters
- 2020 Poster: Self-Paced Deep Reinforcement Learning
  Pascal Klink · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2020 Oral: Self-Paced Deep Reinforcement Learning
  Pascal Klink · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2017: Panel Discussion
  Matt Botvinick · Emma Brunskill · Marcos Campos · Jan Peters · Doina Precup · David Silver · Josh Tenenbaum · Roy Fox
- 2017: Hierarchical Imitation and Reinforcement Learning for Robotics (Jan Peters)
  Jan Peters
- 2016 Poster: Catching heuristics are optimal control policies
  Boris Belousov · Gerhard Neumann · Constantin Rothkopf · Jan Peters
- 2015 Poster: Model-Based Relative Entropy Stochastic Search
  Abbas Abdolmaleki · Rudolf Lioutikov · Jan Peters · Nuno Lau · Luis Paulo Reis · Gerhard Neumann
- 2014 Demonstration: Learning for Tactile Manipulation
  Tucker Hermans · Filipe Veiga · Janine Hölscher · Herke van Hoof · Jan Peters
- 2013 Workshop: Advances in Machine Learning for Sensorimotor Control
  Thomas Walsh · Alborz Geramifard · Marc Deisenroth · Jonathan How · Jan Peters
- 2013 Workshop: Planning with Information Constraints for Control, Reinforcement Learning, Computational Neuroscience, Robotics and Games.
  Hilbert J Kappen · Naftali Tishby · Jan Peters · Evangelos Theodorou · David H Wolpert · Pedro Ortega
- 2013 Poster: Probabilistic Movement Primitives
  Alexandros Paraschos · Christian Daniel · Jan Peters · Gerhard Neumann
- 2012 Poster: Algorithms for Learning Markov Field Policies
  Abdeslam Boularias · Oliver Kroemer · Jan Peters
- 2011 Poster: A Non-Parametric Approach to Dynamic Programming
  Oliver Kroemer · Jan Peters
- 2011 Oral: A Non-Parametric Approach to Dynamic Programming
  Oliver Kroemer · Jan Peters
- 2010 Spotlight: Switched Latent Force Models for Movement Segmentation
  Mauricio A Alvarez · Jan Peters · Bernhard Schölkopf · Neil D Lawrence
- 2010 Poster: Switched Latent Force Models for Movement Segmentation
  Mauricio A Alvarez · Jan Peters · Bernhard Schölkopf · Neil D Lawrence
- 2010 Poster: Movement extraction by detecting dynamics switches and repetitions
  Silvia Chiappa · Jan Peters
- 2009 Workshop: Probabilistic Approaches for Control and Robotics
  Marc Deisenroth · Hilbert J Kappen · Emo Todorov · Duy Nguyen-Tuong · Carl Edward Rasmussen · Jan Peters
- 2008 Poster: Using Bayesian Dynamical Systems for Motion Template Libraries
  Silvia Chiappa · Jens Kober · Jan Peters
- 2008 Poster: Fitted Q-iteration by Advantage Weighted Regression
  Gerhard Neumann · Jan Peters
- 2008 Spotlight: Fitted Q-iteration by Advantage Weighted Regression
  Gerhard Neumann · Jan Peters
- 2008 Poster: Local Gaussian Process Regression for Real Time Online Model Learning
  Duy Nguyen-Tuong · Matthias Seeger · Jan Peters
- 2007 Workshop: Robotics Challenges for Machine Learning
  Jan Peters · Marc Toussaint
- 2006 Workshop: Towards a New Reinforcement Learning?
  Jan Peters · Stefan Schaal · Drew Bagnell