We present a data-efficient reinforcement learning method for continuous state-action systems under significant observation noise. Data-efficient solutions exist under small noise, such as PILCO, which learns the cartpole swing-up task in 30 s of system interaction. PILCO evaluates policies by planning state trajectories using a dynamics model. However, PILCO applies policies to the observed state, and therefore plans in observation space. We extend PILCO with filtering to instead plan in belief space, consistent with partially observable Markov decision process (POMDP) planning. This enables data-efficient learning under significant observation noise, outperforming more naive methods such as post-hoc application of a filter to policies optimised by the original (unfiltered) PILCO algorithm. We test our method on the cartpole swing-up task, which involves nonlinear dynamics and requires nonlinear control.
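To make the belief-space idea concrete, below is a minimal sketch in Python of a filtered rollout: the policy acts on a filtered Gaussian belief over the latent state rather than on the raw noisy observation. The linear-Gaussian system matrices A, B, C, Q, R and the policy callable are hypothetical stand-ins for illustration; the paper's method instead filters through a learned Gaussian-process dynamics model with moment matching, which a fixed linear model only approximates.

```python
import numpy as np

rng = np.random.default_rng(0)

def filtered_rollout(policy, A, B, C, Q, R, x0, horizon):
    """Roll out a policy that acts on a filtered belief N(b, S) over the
    latent state, rather than on the raw noisy observation y.

    A, B, C, Q, R are assumed linear-Gaussian system matrices; the paper's
    actual method uses a GP dynamics model, not this Kalman filter."""
    n = len(x0)
    x = x0.copy()                        # true latent state (never observed)
    b, S = x0.copy(), np.eye(n)          # Gaussian belief over the state
    for _ in range(horizon):
        u = policy(b)                    # policy input: belief mean, not y
        # True system step with process noise Q.
        x = A @ x + B @ u + rng.multivariate_normal(np.zeros(n), Q)
        # Noisy observation of the latent state (noise covariance R).
        y = C @ x + rng.multivariate_normal(np.zeros(C.shape[0]), R)
        # Filter predict step: propagate the belief through the dynamics.
        b = A @ b + B @ u
        S = A @ S @ A.T + Q
        # Filter update step: standard Kalman measurement update.
        K = S @ C.T @ np.linalg.inv(C @ S @ C.T + R)
        b = b + K @ (y - C @ b)
        S = (np.eye(n) - K @ C) @ S
    return b, S
```

A naive baseline would instead call policy(y) on the raw observation; under significant observation noise, acting on (and, in the paper, also optimising against) the filtered belief is what distinguishes belief-space planning from observation-space planning.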
Author Information
Rowan McAllister (University of California Berkeley)
Carl Edward Rasmussen (University of Cambridge)
More from the Same Authors
- 2020: Paper 1: Multimodal Trajectory Prediction for Autonomous Driving with Semantic Map and Dynamic Graph Attention Network
  Rowan McAllister
- 2020: Paper 12: DepthNet Nano: A Highly Compact Self-Normalizing Neural Network for Monocular Depth Estimation
  Rowan McAllister
- 2020: Paper 15: Calibrating Self-supervised Monocular Depth Estimation
  Rowan McAllister
- 2020: Paper 22: RAMP-CNN: A Novel Neural Network for Enhanced Automotive Radar Object Recognition
  Rowan McAllister
- 2020: Paper 57: Single Shot Multitask Pedestrian Detection and Behavior Prediction
  Rowan McAllister
- 2022: Gaussian Process parameterized Covariance Kernels for Non-stationary Regression
  Vidhi Lalchand · Talay Cheema · Laurence Aitchison · Carl Edward Rasmussen
- 2022: Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?
  Gunshi Gupta · Tim G. J. Rudner · Rowan McAllister · Adrien Gaidon · Yarin Gal
- 2022 Workshop: Machine Learning for Autonomous Driving
  Jiachen Li · Nigamaa Nayakanti · Xinshuo Weng · Daniel Omeiza · Ali Baheri · German Ros · Rowan McAllister
- 2022 Poster: Sparse Gaussian Process Hyperparameters: Optimize or Integrate?
  Vidhi Lalchand · Wessel Bruinsma · David Burt · Carl Edward Rasmussen
- 2021 Poster: Outcome-Driven Reinforcement Learning via Variational Inference
  Tim G. J. Rudner · Vitchyr Pong · Rowan McAllister · Yarin Gal · Sergey Levine
- 2021 Poster: Kernel Identification Through Transformers
  Fergus Simpson · Ian Davies · Vidhi Lalchand · Alessandro Vullo · Nicolas Durrande · Carl Edward Rasmussen
- 2021 Poster: Marginalised Gaussian Processes with Nested Sampling
  Fergus Simpson · Vidhi Lalchand · Carl Edward Rasmussen
- 2020: Combining variational autoencoder representations with structural descriptors improves prediction of docking scores
  Miguel Garcia-Ortegon · Carl Edward Rasmussen · Hiroshi Kajino
- 2020 Workshop: Machine Learning for Autonomous Driving
  Rowan McAllister · Xinshuo Weng · Daniel Omeiza · Nick Rhinehart · Fisher Yu · German Ros · Vladlen Koltun
- 2020: Welcome
  Rowan McAllister
- 2020 Poster: Ensembling geophysical models with Bayesian Neural Networks
  Ushnish Sengupta · Matt Amos · Scott Hosking · Carl Edward Rasmussen · Matthew Juniper · Paul Young
- 2019: Welcome
  Rowan McAllister · Nicholas Rhinehart · Li Erran Li
- 2019 Workshop: Machine Learning for Autonomous Driving
  Rowan McAllister · Nicholas Rhinehart · Fisher Yu · Li Erran Li · Anca Dragan
- 2018: Coffee Break and Poster Session I
  Pim de Haan · Bin Wang · Dequan Wang · Aadil Hayat · Ibrahim Sobh · Muhammad Asif Rana · Thibault Buhet · Nicholas Rhinehart · Arjun Sharma · Alex Bewley · Michael Kelly · Lionel Blondé · Ozgur S. Oguz · Vaibhav Viswanathan · Jeroen Vanbaar · Konrad Żołna · Negar Rostamzadeh · Rowan McAllister · Sanjay Thakur · Alexandros Kalousis · Chelsea Sidrane · Sujoy Paul · Daphne Chen · Michal Garmulewicz · Henryk Michalewski · Coline Devin · Hongyu Ren · Jiaming Song · Wen Sun · Hanzhang Hu · Wulong Liu · Emilie Wirbel
- 2018 Poster: Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
  Kurtland Chua · Roberto Calandra · Rowan McAllister · Sergey Levine
- 2018 Spotlight: Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
  Kurtland Chua · Roberto Calandra · Rowan McAllister · Sergey Levine
- 2017 Poster: Convolutional Gaussian Processes
  Mark van der Wilk · Carl Edward Rasmussen · James Hensman
- 2017 Oral: Convolutional Gaussian Processes
  Mark van der Wilk · Carl Edward Rasmussen · James Hensman
- 2016 Poster: Understanding Probabilistic Sparse Gaussian Process Approximations
  Matthias Bauer · Mark van der Wilk · Carl Edward Rasmussen
- 2014 Poster: Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models
  Yarin Gal · Mark van der Wilk · Carl Edward Rasmussen
- 2014 Poster: Variational Gaussian Process State-Space Models
  Roger Frigola · Yutian Chen · Carl Edward Rasmussen
- 2013 Poster: Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC
  Roger Frigola · Fredrik Lindsten · Thomas Schön · Carl Edward Rasmussen
- 2012 Poster: Active Learning of Model Evidence Using Bayesian Quadrature
  Michael A Osborne · David Duvenaud · Roman Garnett · Carl Edward Rasmussen · Stephen J Roberts · Zoubin Ghahramani
- 2011 Poster: Gaussian Process Training with Input Noise
  Andrew McHutchon · Carl Edward Rasmussen
- 2011 Poster: Additive Gaussian Processes
  David Duvenaud · Hannes Nickisch · Carl Edward Rasmussen
- 2009 Workshop: Probabilistic Approaches for Control and Robotics
  Marc Deisenroth · Hilbert J Kappen · Emo Todorov · Duy Nguyen-Tuong · Carl Edward Rasmussen · Jan Peters
- 2006 Tutorial: Advances in Gaussian Processes
  Carl Edward Rasmussen