Deep reinforcement learning algorithms often use two networks for value function optimization: an online network and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping. In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network. This improves the robustness of deep reinforcement learning in the presence of noisy updates. The resultant agents, called DQN Pro and Rainbow Pro, exhibit significant performance improvements over their original counterparts on the Atari benchmark, demonstrating the effectiveness of this simple idea in deep reinforcement learning. The code for our paper is available here: Github.com/amazon-research/fast-rl-with-slow-updates.
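The idea of keeping the online network near the target network can be illustrated with a small sketch. It is not taken from the paper's code. It assumes the incentive is a squared Euclidean penalty on the distance between the two parameter vectors, added to the usual TD loss; the function names, the flat-list parameter representation, and the coefficient `c` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def proximal_penalty(online: Sequence[float], target: Sequence[float], c: float) -> float:
    """Squared Euclidean distance between the two parameter vectors, scaled by 1/(2c).

    Assumption: `c` is a hypothetical coefficient; a smaller `c` pulls the
    online parameters more strongly toward the target parameters.
    """
    if len(online) != len(target):
        raise ValueError("parameter vectors must have the same length")
    return sum((w - t) ** 2 for w, t in zip(online, target)) / (2.0 * c)


def proximal_step(
    online: Sequence[float],
    target: Sequence[float],
    td_grad: Sequence[float],
    lr: float,
    c: float,
) -> List[float]:
    """One gradient-descent step on (TD loss + proximal_penalty).

    `td_grad` stands in for the gradient of the TD loss with respect to the
    online parameters; the penalty contributes the extra term (w - t) / c.
    """
    return [w - lr * (g + (w - t) / c) for w, t, g in zip(online, target, td_grad)]


if __name__ == "__main__":
    online = [1.0, 2.0]
    target = [1.0, 0.0]
    print(proximal_penalty(online, target, c=2.0))
    # With a zero TD gradient, the step moves the online parameters toward the target.
    print(proximal_step(online, target, td_grad=[0.0, 0.0], lr=0.5, c=2.0))
```

With a zero TD gradient, the step only shrinks the gap between the online and target parameters, which is the behavior the abstract describes as remaining in the proximity of the target network.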
Author Information
Kavosh Asadi (Amazon)
Rasool Fakoor (Amazon Web Services)
Omer Gottesman
Taesup Kim (Seoul National University)
Michael Littman (Brown University)
Alexander Smola (Amazon)
**AWS Machine Learning**
More from the Same Authors
-
2021 Spotlight: Mixture Proportion Estimation and PU Learning: A Modern Approach »
Saurabh Garg · Yifan Wu · Alexander Smola · Sivaraman Balakrishnan · Zachary Lipton -
2021 : Benchmarking Multimodal AutoML for Tabular Data with Text Fields »
Xingjian Shi · Jonas Mueller · Nick Erickson · Mu Li · Alexander Smola -
2021 : Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation »
Ramtin Keramati · Omer Gottesman · Leo Celi · Finale Doshi-Velez · Emma Brunskill -
2021 : Robust Reinforcement Learning for Shifting Dynamics During Deployment »
Samuel Stanton · Rasool Fakoor · Jonas Mueller · Andrew Gordon Wilson · Alexander Smola -
2021 : Bayesian Exploration for Lifelong Reinforcement Learning »
Haotian Fu · Shangqun Yu · Michael Littman · George Konidaris -
2022 : RLSBench: A Large-Scale Empirical Study of Domain Adaptation Under Relaxed Label Shift »
Saurabh Garg · Nick Erickson · James Sharpnack · Alexander Smola · Sivaraman Balakrishnan · Zachary Lipton -
2023 Poster: Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition »
Shuhuai Ren · Aston Zhang · Yi Zhu · Shuai Zhang · Shuai Zheng · Mu Li · Alexander Smola · Xu Sun -
2023 Poster: TD Convergence: An Optimization Perspective »
Kavosh Asadi · Shoham Sabach · Yao Liu · Omer Gottesman · Rasool Fakoor -
2023 Poster: Resetting the Optimizer in Deep RL: An Empirical Study »
Kavosh Asadi · Rasool Fakoor · Shoham Sabach -
2023 Poster: Effectively Learning Initiation Sets in Hierarchical Reinforcement Learning »
Akhil Bagaria · Ben Abbatematteo · Omer Gottesman · Matt Corsaro · Sreehari Rammohan · George Konidaris -
2023 Poster: Budgeting Counterfactual for Offline RL »
Yao Liu · Pratik Chaudhari · Rasool Fakoor -
2022 Spotlight: Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex »
Charles Lovering · Jessica Forde · George Konidaris · Ellie Pavlick · Michael Littman -
2022 Workshop: Reinforcement Learning for Real Life (RL4RealLife) Workshop »
Yuxi Li · Emma Brunskill · Minmin Chen · Omer Gottesman · Lihong Li · Yao Liu · Zhiwei Tony Qin · Matthew Taylor -
2022 Poster: Adaptive Interest for Emphatic Reinforcement Learning »
Martin Klissarov · Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Taesup Kim · Alexander Smola -
2022 Poster: Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex »
Charles Lovering · Jessica Forde · George Konidaris · Ellie Pavlick · Michael Littman -
2022 Poster: Graph Reordering for Cache-Efficient Near Neighbor Search »
Benjamin Coleman · Santiago Segarra · Alexander Smola · Anshumali Shrivastava -
2022 Poster: Model-based Lifelong Reinforcement Learning with Bayesian Exploration »
Haotian Fu · Shangqun Yu · Michael Littman · George Konidaris -
2022 Social: RL Social »
Yuxi Li · Omer Gottesman · Niranjani Prasad -
2021 Poster: On the Expressivity of Markov Reward »
David Abel · Will Dabney · Anna Harutyunyan · Mark Ho · Michael Littman · Doina Precup · Satinder Singh -
2021 Poster: Mixture Proportion Estimation and PU Learning: A Modern Approach »
Saurabh Garg · Yifan Wu · Alexander Smola · Sivaraman Balakrishnan · Zachary Lipton -
2021 Poster: Deep Explicit Duration Switching Models for Time Series »
Abdul Fatir Ansari · Konstantinos Benidis · Richard Kurle · Ali Caner Turkmen · Harold Soh · Alexander Smola · Bernie Wang · Tim Januschowski -
2021 Poster: Continuous Doubly Constrained Batch Reinforcement Learning »
Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Pratik Chaudhari · Alexander Smola -
2021 Oral: On the Expressivity of Markov Reward »
David Abel · Will Dabney · Anna Harutyunyan · Mark Ho · Michael Littman · Doina Precup · Satinder Singh -
2020 Poster: Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation »
Rasool Fakoor · Jonas Mueller · Nick Erickson · Pratik Chaudhari · Alexander Smola -
2019 : Invited Talk - Alexander J. Smola - Sets and symmetries »
Alexander Smola -
2019 : Poster Session »
Rishav Chourasia · Yichong Xu · Corinna Cortes · Chien-Yi Chang · Yoshihiro Nagano · So Yeon Min · Benedikt Boecking · Phi Vu Tran · Kamyar Ghasemipour · Qianggang Ding · Shouvik Mani · Vikram Voleti · Rasool Fakoor · Miao Xu · Kenneth Marino · Lisa Lee · Volker Tresp · Jean-Francois Kagy · Marvin Zhang · Barnabas Poczos · Dinesh Khandelwal · Adrien Bardes · Evan Shelhamer · Jiacheng Zhu · Ziming Li · Xiaoyan Li · Dmitrii Krasheninnikov · Ruohan Wang · Mayoore Jaiswal · Emad Barsoum · Suvansh Sanjeev · Theeraphol Wattanavekin · Qizhe Xie · Sifan Wu · Yuki Yoshida · David Kanaa · Sina Khoshfetrat Pakazad · Mehdi Maasoumy -
2019 Poster: Variational Temporal Abstraction »
Taesup Kim · Sungjin Ahn · Yoshua Bengio -
2019 Poster: Fast AutoAugment »
Sungbin Lim · Ildoo Kim · Taesup Kim · Chiheon Kim · Sungwoong Kim -
2018 Poster: Bayesian Model-Agnostic Meta-Learning »
Jaesik Yoon · Taesup Kim · Ousmane Dia · Sungwoong Kim · Yoshua Bengio · Sungjin Ahn -
2018 Spotlight: Bayesian Model-Agnostic Meta-Learning »
Jaesik Yoon · Taesup Kim · Ousmane Dia · Sungwoong Kim · Yoshua Bengio · Sungjin Ahn -
2017 : TBA11 »
Alexander Smola -
2017 : Poster Session Speech: source separation, enhancement, recognition, synthesis »
Shuayb Zarar · Rasool Fakoor · Sri Harsha Dumpala · Minje Kim · Paris Smaragdis · Mohit Dubey · Jong Hwan Ko · Sakriani Sakti · Yuxuan Wang · Lijiang Guo · Garrett T Kenyon · Andros Tjandra · Tycho Tax · Younggun Lee -
2017 Oral: Deep Sets »
Manzil Zaheer · Satwik Kottur · Siamak Ravanbakhsh · Barnabas Poczos · Ruslan Salakhutdinov · Alexander Smola -
2017 Poster: Deep Sets »
Manzil Zaheer · Satwik Kottur · Siamak Ravanbakhsh · Barnabas Poczos · Ruslan Salakhutdinov · Alexander Smola -
2016 Workshop: The Future of Interactive Machine Learning »
Kory Mathewson @korymath · Kaushik Subramanian · Mark Ho · Robert Loftin · Joseph L Austerweil · Anna Harutyunyan · Doina Precup · Layla El Asri · Matthew Gombolay · Jerry Zhu · Sonia Chernova · Charles Isbell · Patrick M Pilarski · Weng-Keen Wong · Manuela Veloso · Julie A Shah · Matthew Taylor · Brenna Argall · Michael Littman -
2016 Oral: Showing versus doing: Teaching by demonstration »
Mark Ho · Michael Littman · James MacGlashan · Fiery Cushman · Joseph L Austerweil -
2016 Poster: Variance Reduction in Stochastic Gradient Langevin Dynamics »
Kumar Avinava Dubey · Sashank J. Reddi · Sinead Williamson · Barnabas Poczos · Alexander Smola · Eric Xing -
2016 Poster: Showing versus doing: Teaching by demonstration »
Mark Ho · Michael Littman · James MacGlashan · Fiery Cushman · Joseph L Austerweil -
2016 Poster: Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization »
Sashank J. Reddi · Suvrit Sra · Barnabas Poczos · Alexander Smola -
2015 : Scaling Machine Learning »
Alexander Smola -
2015 Workshop: Nonparametric Methods for Large Scale Representation Learning »
Andrew G Wilson · Alexander Smola · Eric Xing -
2015 Poster: Fast and Guaranteed Tensor Decomposition via Sketching »
Yining Wang · Hsiao-Yu Tung · Alexander Smola · Anima Anandkumar -
2015 Spotlight: Fast and Guaranteed Tensor Decomposition via Sketching »
Yining Wang · Hsiao-Yu Tung · Alexander Smola · Anima Anandkumar -
2015 Poster: On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants »
Sashank J. Reddi · Ahmed Hefny · Suvrit Sra · Barnabas Poczos · Alexander Smola -
2014 Poster: Communication Efficient Distributed Machine Learning with the Parameter Server »
Mu Li · David G Andersen · Alexander Smola · Kai Yu -
2014 Poster: Spectral Methods for Indian Buffet Process Inference »
Hsiao-Yu Tung · Alexander Smola -
2013 Workshop: Topic Models: Computation, Application, and Evaluation »
David Mimno · Amr Ahmed · Jordan Boyd-Graber · Ankur Moitra · Hanna Wallach · Alexander Smola · David Blei · Anima Anandkumar -
2013 Workshop: Randomized Methods for Machine Learning »
David Lopez-Paz · Quoc V Le · Alexander Smola -
2013 Workshop: Modern Nonparametric Methods in Machine Learning »
Arthur Gretton · Mladen Kolar · Samory Kpotufe · John Lafferty · Han Liu · Bernhard Schölkopf · Alexander Smola · Rob Nowak · Mikhail Belkin · Lorenzo Rosasco · Peter Bickel · Yue Zhao -
2013 Poster: Variance Reduction for Stochastic Gradient Optimization »
Chong Wang · Xi Chen · Alexander Smola · Eric Xing -
2012 Workshop: Confluence between Kernel Methods and Graphical Models »
Le Song · Arthur Gretton · Alexander Smola -
2012 Session: Oral Session 10 »
Alexander Smola -
2012 Poster: Learning Networks of Heterogeneous Influence »
Nan Du · Le Song · Alexander Smola · Ming Yuan -
2012 Poster: FastEx: Fast Clustering with Exponential Families »
Amr Ahmed · Sujith Ravi · Shravan M Narayanamurthy · Alexander Smola -
2012 Spotlight: Learning Networks of Heterogeneous Influence »
Nan Du · Le Song · Alexander Smola · Ming Yuan -
2011 Workshop: Big Learning: Algorithms, Systems, and Tools for Learning at Scale »
Joseph E Gonzalez · Sameer Singh · Graham Taylor · James Bergstra · Alice Zheng · Misha Bilenko · Yucheng Low · Yoshua Bengio · Michael Franklin · Carlos Guestrin · Andrew McCallum · Alexander Smola · Michael Jordan · Sugato Basu -
2011 Tutorial: Graphical Models for the Internet »
Amr Ahmed · Alexander Smola -
2010 Workshop: Challenges of Data Visualization »
Barbara Hammer · Laurens van der Maaten · Fei Sha · Alexander Smola -
2010 Poster: Word Features for Latent Dirichlet Allocation »
James Petterson · Alexander Smola · Tiberio Caetano · Wray L Buntine · Shravan M Narayanamurthy -
2010 Poster: Optimal Web-Scale Tiering as a Flow Problem »
Gilbert Leung · Novi Quadrianto · Alexander Smola · Kostas Tsioutsiouliklis -
2010 Poster: Multitask Learning without Label Correspondences »
Novi Quadrianto · Alexander Smola · Tiberio Caetano · S.V.N. Vishwanathan · James Petterson -
2010 Poster: Parallelized Stochastic Gradient Descent »
Martin A Zinkevich · Markus Weimer · Alexander Smola · Lihong Li -
2009 Workshop: Large-Scale Machine Learning: Parallelism and Massive Datasets »
Alexander Gray · Arthur Gretton · Alexander Smola · Joseph E Gonzalez · Carlos Guestrin -
2009 Poster: Slow Learners are Fast »
Martin A Zinkevich · Alexander Smola · John Langford -
2009 Poster: Distribution Matching for Transduction »
Novi Quadrianto · James Petterson · Alexander Smola -
2008 Poster: Kernelized Sorting »
Novi Quadrianto · Le Song · Alexander Smola -
2008 Poster: Kernel Measures of Independence for non-iid Data »
Xinhua Zhang · Le Song · Arthur Gretton · Alexander Smola -
2008 Spotlight: Kernelized Sorting »
Novi Quadrianto · Le Song · Alexander Smola -
2008 Spotlight: Kernel Measures of Independence for non-iid Data »
Xinhua Zhang · Le Song · Arthur Gretton · Alexander Smola -
2008 Poster: Tighter Bounds for Structured Estimation »
Olivier Chapelle · Chuong B Do · Quoc V Le · Alexander Smola · Choon Hui Teo -
2008 Poster: Robust Near-Isometric Matching via Structured Learning of Graphical Models »
Julian J McAuley · Tiberio Caetano · Alexander Smola -
2007 Workshop: Representations and Inference on Probability Distributions »
Kenji Fukumizu · Arthur Gretton · Alexander Smola -
2007 Poster: Convex Learning with Invariances »
Choon Hui Teo · Amir Globerson · Sam T Roweis · Alexander Smola -
2007 Spotlight: A Kernel Statistical Test of Independence »
Arthur Gretton · Kenji Fukumizu · Choon Hui Teo · Le Song · Bernhard Schölkopf · Alexander Smola -
2007 Spotlight: Bundle Methods for Machine Learning »
Alexander Smola · Vishwanathan S V N · Quoc V Le -
2007 Poster: COFI RANK - Maximum Margin Matrix Factorization for Collaborative Ranking »
Markus Weimer · Alexandros Karatzoglou · Quoc V Le · Alexander Smola -
2007 Oral: Colored Maximum Variance Unfolding »
Le Song · Alexander Smola · Karsten Borgwardt · Arthur Gretton -
2007 Poster: Colored Maximum Variance Unfolding »
Le Song · Alexander Smola · Karsten Borgwardt · Arthur Gretton -
2007 Poster: A Kernel Statistical Test of Independence »
Arthur Gretton · Kenji Fukumizu · Choon Hui Teo · Le Song · Bernhard Schölkopf · Alexander Smola -
2007 Poster: Bundle Methods for Machine Learning »
Alexander Smola · Vishwanathan S V N · Quoc V Le -
2007 Spotlight: COFI RANK - Maximum Margin Matrix Factorization for Collaborative Ranking »
Markus Weimer · Alexandros Karatzoglou · Quoc V Le · Alexander Smola -
2007 Demonstration: Elefant »
Kishor Gawande · Alexander Smola · Vishwanathan S V N · Li Cheng · Simon A Guenter -
2007 Spotlight: Convex Learning with Invariances »
Choon Hui Teo · Amir Globerson · Sam T Roweis · Alexander Smola -
2006 Poster: A Kernel Method for the Two-Sample-Problem »
Arthur Gretton · Karsten Borgwardt · Malte J Rasch · Bernhard Schölkopf · Alexander Smola -
2006 Poster: Correcting Sample Selection Bias by Unlabeled Data »
Jiayuan Huang · Alexander Smola · Arthur Gretton · Karsten Borgwardt · Bernhard Schölkopf -
2006 Spotlight: Correcting Sample Selection Bias by Unlabeled Data »
Jiayuan Huang · Alexander Smola · Arthur Gretton · Karsten Borgwardt · Bernhard Schölkopf -
2006 Talk: A Kernel Method for the Two-Sample-Problem »
Arthur Gretton · Karsten Borgwardt · Malte J Rasch · Bernhard Schölkopf · Alexander Smola