Bayesian Active Reinforcement Learning
Viraj Mehta · Biswajit Paria · Jeff Schneider · Willie Neiswanger
Tue Dec 14 09:00 AM -- 11:00 AM (PST)
Many current reinforcement learning algorithms explore by adding some form of randomness to the optimal policy given current knowledge. Here we take a different strategy, and instead aim to leverage ideas from Bayesian Optimal Experimental Design to guide exploration in RL for increased data efficiency. In particular, we first construct an acquisition function that characterizes the value that a given data point provides for reinforcement learning. To the best of our knowledge, this is the first study that gives a practical task-aware criterion for evaluating the relative value of acquiring additional data. We also give a practical method for computing this quantity, given a dataset of transitions from a Markov Decision Process (MDP). Using this acquisition function, we develop an algorithm for reinforcement learning with access to a generative model of the environment, a setting which has not seen algorithms for continuous MDPs despite being thoroughly studied in the tabular case. Our algorithm is able to solve a variety of simulated continuous control problems using 5 to 1,000 times less data than model-based reinforcement learning algorithms and $10^3$ to $10^5$ times less data than model-free techniques. We give several ablated comparisons, which point to substantial improvements arising from the ability to operate in a generative setting as well as the principled method of obtaining data.
Author Information
Viraj Mehta (Carnegie Mellon University)
Biswajit Paria (Carnegie Mellon University)
Jeff Schneider (Carnegie Mellon University)
Willie Neiswanger (Carnegie Mellon University)
More from the Same Authors
-
2021 : BATS: Best Action Trajectory Stitching »
Ian Char · Viraj Mehta · Adam Villaflor · John Dolan · Jeff Schneider -
2022 : Offline Model-Based Reinforcement Learning for Tokamak Control »
Ian Char · Joseph Abbate · Laszlo Bardoczi · Mark Boyer · Youngseog Chung · Rory Conlin · Keith Erickson · Viraj Mehta · Nathan Richner · Egemen Kolemen · Jeff Schneider -
2022 : AutoML for Climate Change: A Call to Action »
Renbo Tu · Nicholas Roberts · Vishak Prasad C · Sibasis Nayak · Paarth Jain · Frederic Sala · Ganesh Ramakrishnan · Ameet Talwalkar · Willie Neiswanger · Colin White -
2022 Poster: Exploration via Planning for Information about the Optimal Trajectory »
Viraj Mehta · Ian Char · Joseph Abbate · Rory Conlin · Mark Boyer · Stefano Ermon · Jeff Schneider · Willie Neiswanger -
2021 : Reinforcement Learning for Autonomous Driving »
Jeff Schneider -
2021 Poster: Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification »
Youngseog Chung · Willie Neiswanger · Ian Char · Jeff Schneider -
2020 Poster: A Study on Encodings for Neural Architecture Search »
Colin White · Willie Neiswanger · Sam Nolen · Yash Savani -
2020 Spotlight: A Study on Encodings for Neural Architecture Search »
Colin White · Willie Neiswanger · Sam Nolen · Yash Savani -
2019 : Coffee + Posters »
Benjamin Caine · Renhao Wang · Nazmus Sakib · Nana Otawara · Meha Kaushik · elmira amirloo · Nemanja Djuric · Johanna Rock · Tanmay Agarwal · Angelos Filos · Panagiotis Tigkas · Donsuk Lee · Wootae Jeon · Nikita Jaipuria · Pin Wang · Jinxin Zhao · Liangjun Zhang · Ashutosh Singh · Ershad Banijamali · Mohsen Rohani · Aman Sinha · Ameya Joshi · Ching-Yao Chan · Mohammed Abdou · Changhao Chen · Jong-Chan Kim · eslam mohamed · Matt OKelly · Nirvan Singhania · Hiroshi Tsukahara · Atsushi Keyaki · Praveen Palanisamy · Justin Norden · Micol Marchetti-Bowick · Yiming Gu · Hitesh Arora · Shubhankar Deshpande · Jeff Schneider · Shangling Jui · Vaneet Aggarwal · Tryambak Gangopadhyay · Qiaojing Yan -
2019 Poster: Offline Contextual Bayesian Optimization »
Ian Char · Youngseog Chung · Willie Neiswanger · Kirthevasan Kandasamy · Oak Nelson · Mark Boyer · Egemen Kolemen · Jeff Schneider -
2018 Poster: Neural Architecture Search with Bayesian Optimisation and Optimal Transport »
Kirthevasan Kandasamy · Willie Neiswanger · Jeff Schneider · Barnabas Poczos · Eric Xing -
2018 Spotlight: Neural Architecture Search with Bayesian Optimisation and Optimal Transport »
Kirthevasan Kandasamy · Willie Neiswanger · Jeff Schneider · Barnabas Poczos · Eric Xing -
2016 Poster: The Multi-fidelity Multi-armed Bandit »
Kirthevasan Kandasamy · Gautam Dasarathy · Barnabas Poczos · Jeff Schneider -
2016 Poster: Gaussian Process Bandit Optimisation with Multi-fidelity Evaluations »
Kirthevasan Kandasamy · Gautam Dasarathy · Junier B Oliva · Jeff Schneider · Barnabas Poczos -
2015 : Bayesian Optimization and Embedded Learning Systems »
Jeff Schneider -
2014 Poster: Flexible Transfer Learning under Support and Model Shift »
Xuezhi Wang · Jeff Schneider -
2013 Poster: Learning Hidden Markov Models from Non-sequence Data via Tensor Decomposition »
Tzu-Kuo Huang · Jeff Schneider -
2013 Poster: Σ-Optimality for Active Learning on Gaussian Random Fields »
Yifei Ma · Roman Garnett · Jeff Schneider -
2011 Poster: Group Anomaly Detection using Flexible Genre Models »
Liang Xiong · Barnabas Poczos · Jeff Schneider -
2011 Poster: Learning Auto-regressive Models from Sequence and Non-sequence Data »
Tzu-Kuo Huang · Jeff Schneider -
2010 Poster: Learning Multiple Tasks with a Sparse Matrix-Normal Penalty »
Yi Zhang · Jeff Schneider -
2008 Poster: Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text »
Yi Zhang · Jeff Schneider · Artur Dubrawski