In this paper, we consider the problem of policy evaluation for continuous-state systems. We present a non-parametric approach to policy evaluation, which uses kernel density estimation to represent the system. The true form of the value function for this model can be determined and computed using Galerkin's method. We also present a unified view of several well-known policy evaluation methods. In particular, we show that the same Galerkin method can be used to derive Least-Squares Temporal Difference learning, Kernelized Temporal Difference learning, and a discrete-state Dynamic Programming solution, as well as our proposed method. In a numerical evaluation of these algorithms, the proposed approach performed better than the other methods.
Author Information
Oliver Kroemer (CMU)
Jan Peters (TU Darmstadt & MPI Intelligent Systems)
Jan Peters is a full professor (W3) for Intelligent Autonomous Systems at the Computer Science Department of the Technische Universitaet Darmstadt and at the same time a senior research scientist and group leader at the Max-Planck Institute for Intelligent Systems, where he heads the interdepartmental Robot Learning Group. Jan Peters has received the Dick Volz Best 2007 US PhD Thesis Runner-Up Award, the Robotics: Science & Systems - Early Career Spotlight, the INNS Young Investigator Award, and the IEEE Robotics & Automation Society's Early Career Award as well as numerous best paper awards. In 2015, he was awarded an ERC Starting Grant. Jan Peters has studied Computer Science, Electrical, Mechanical and Control Engineering at TU Munich and FernUni Hagen in Germany, at the National University of Singapore (NUS) and the University of Southern California (USC). He has received four Master's degrees in these disciplines as well as a Computer Science PhD from USC.
Related Events (a corresponding poster, oral, or spotlight)
- 2011 Oral: A Non-Parametric Approach to Dynamic Programming
  Tue. Dec 13th 09:20 -- 09:40 AM
More from the Same Authors
- 2020: Differentiable Implicit Layers
  Andreas Look · Simona Doneva · Melih Kandemir · Rainer Gemulla · Jan Peters
- 2022: How crucial is Transformer in Decision Transformer?
  Max Siebenborn · Boris Belousov · Junning Huang · Jan Peters
- 2022: Conditioned Score-Based Models for Learning Collision-Free Trajectory Generation
  Joao Carvalho · Mark Baierl · Julen Urain · Jan Peters
- 2022 Poster: Information-Theoretic Safe Exploration with Gaussian Processes
  Alessandro Bottero · Carlos Luis · Julia Vinogradska · Felix Berkenkamp · Jan Peters
- 2020 Poster: Self-Paced Deep Reinforcement Learning
  Pascal Klink · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2020 Oral: Self-Paced Deep Reinforcement Learning
  Pascal Klink · Carlo D'Eramo · Jan Peters · Joni Pajarinen
- 2017: Panel Discussion
  Matt Botvinick · Emma Brunskill · Marcos Campos · Jan Peters · Doina Precup · David Silver · Josh Tenenbaum · Roy Fox
- 2017: Hierarchical Imitation and Reinforcement Learning for Robotics (Jan Peters)
  Jan Peters
- 2016 Poster: Catching heuristics are optimal control policies
  Boris Belousov · Gerhard Neumann · Constantin Rothkopf · Jan Peters
- 2015 Poster: Model-Based Relative Entropy Stochastic Search
  Abbas Abdolmaleki · Rudolf Lioutikov · Jan Peters · Nuno Lau · Luis Paulo Reis · Gerhard Neumann
- 2014 Demonstration: Learning for Tactile Manipulation
  Tucker Hermans · Filipe Veiga · Janine Hölscher · Herke van Hoof · Jan Peters
- 2013 Workshop: Advances in Machine Learning for Sensorimotor Control
  Thomas Walsh · Alborz Geramifard · Marc Deisenroth · Jonathan How · Jan Peters
- 2013 Workshop: Planning with Information Constraints for Control, Reinforcement Learning, Computational Neuroscience, Robotics and Games
  Hilbert J Kappen · Naftali Tishby · Jan Peters · Evangelos Theodorou · David H Wolpert · Pedro Ortega
- 2013 Poster: Probabilistic Movement Primitives
  Alexandros Paraschos · Christian Daniel · Jan Peters · Gerhard Neumann
- 2012 Poster: Algorithms for Learning Markov Field Policies
  Abdeslam Boularias · Oliver Kroemer · Jan Peters
- 2010 Spotlight: Switched Latent Force Models for Movement Segmentation
  Mauricio A Alvarez · Jan Peters · Bernhard Schölkopf · Neil D Lawrence
- 2010 Poster: Switched Latent Force Models for Movement Segmentation
  Mauricio A Alvarez · Jan Peters · Bernhard Schölkopf · Neil D Lawrence
- 2010 Poster: Movement extraction by detecting dynamics switches and repetitions
  Silvia Chiappa · Jan Peters
- 2009 Workshop: Probabilistic Approaches for Control and Robotics
  Marc Deisenroth · Hilbert J Kappen · Emo Todorov · Duy Nguyen-Tuong · Carl Edward Rasmussen · Jan Peters
- 2008 Poster: Using Bayesian Dynamical Systems for Motion Template Libraries
  Silvia Chiappa · Jens Kober · Jan Peters
- 2008 Poster: Fitted Q-iteration by Advantage Weighted Regression
  Gerhard Neumann · Jan Peters
- 2008 Poster: Policy Search for Motor Primitives in Robotics
  Jens Kober · Jan Peters
- 2008 Spotlight: Fitted Q-iteration by Advantage Weighted Regression
  Gerhard Neumann · Jan Peters
- 2008 Oral: Policy Search for Motor Primitives in Robotics
  Jens Kober · Jan Peters
- 2008 Poster: Local Gaussian Process Regression for Real Time Online Model Learning
  Duy Nguyen-Tuong · Matthias Seeger · Jan Peters
- 2007 Workshop: Robotics Challenges for Machine Learning
  Jan Peters · Marc Toussaint
- 2006 Workshop: Towards a New Reinforcement Learning?
  Jan Peters · Stefan Schaal · Drew Bagnell