Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms to accelerate convergence and improve the sampling efficiency of deep RL. Despite these empirical gains, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration schemes built on Anderson mixing that improve the convergence of deep RL algorithms. Our main results establish a connection between Anderson mixing and quasi-Newton methods and prove that Anderson mixing increases the convergence radius of policy iteration schemes by an extra contraction factor. The analysis is rooted in the fixed-point iteration nature of RL. We further propose a stabilization strategy that introduces a stable regularization term in Anderson mixing and a differentiable, non-expansive MellowMax operator, which together allow both faster convergence and more stable behavior. Extensive experiments demonstrate that our proposed method enhances the convergence, stability, and performance of RL algorithms.
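To make the fixed-point view in the abstract concrete, here is a minimal sketch of regularized Anderson mixing applied to a generic fixed-point map, together with the standard MellowMax formula. It is not the authors' implementation: the function names, the parameters `m`, `reg`, `omega`, and `tol`, the choice to scale the regularization by the trace of the residual Gram matrix, and the toy three-state policy-evaluation problem are all assumptions made for illustration.

```python
import numpy as np

def mellowmax(x, omega=5.0):
    """MellowMax: log(mean(exp(omega * x))) / omega, computed with a max-shift for stability."""
    x = np.asarray(x, dtype=float)
    c = np.max(omega * x)
    return (c + np.log(np.mean(np.exp(omega * x - c)))) / omega

def anderson_fixed_point(g, x0, m=5, reg=1e-10, tol=1e-8, max_iter=500):
    """Seek x = g(x) by mixing the last m iterates (illustrative sketch only).

    The mixing weights minimize ||F a||^2 + lam * ||a||^2 subject to sum(a) = 1,
    where the columns of F are recent residuals g(x_i) - x_i and lam is a small
    Tikhonov term (here scaled by trace(F^T F), an assumption of this sketch).
    """
    x = np.asarray(x0, dtype=float)
    g_hist, f_hist = [], []
    for k in range(max_iter):
        gx = g(x)
        f = gx - x                       # fixed-point residual
        if np.linalg.norm(f) < tol:
            return x, k
        g_hist = (g_hist + [gx])[-m:]    # keep only the last m entries
        f_hist = (f_hist + [f])[-m:]
        F = np.stack(f_hist, axis=1)     # shape (n, m_k)
        gram = F.T @ F
        lam = reg * np.trace(gram)       # hypothetical scaling choice
        w = np.linalg.solve(gram + lam * np.eye(gram.shape[0]), np.ones(gram.shape[0]))
        alpha = w / w.sum()              # weights sum to one
        x = np.stack(g_hist, axis=1) @ alpha
    return x, max_iter

# Toy policy evaluation: V = r + gamma * P V on a made-up 3-state chain.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.0, 0.8]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

v_am, iters = anderson_fixed_point(lambda v: r + gamma * (P @ v), np.zeros(3))
v_exact = np.linalg.solve(np.eye(3) - gamma * P, r)
print(iters, np.max(np.abs(v_am - v_exact)))
```

With a single stored iterate the update reduces to the plain fixed-point step `x = g(x)`, so the mixing only takes effect once some history has accumulated. Replacing the hard maximum in a Bellman optimality backup with `mellowmax` is one way to obtain a smooth backup in the spirit of the abstract, but that combination is not exercised in this sketch.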
Author Information
Ke Sun (University of Alberta)
Yafei Wang (University of Alberta)
Yi Liu (University of Alberta)
Yingnan Zhao (Harbin Institute of Technology)
Bo Pan (University of Alberta)
Shangling Jui (Huawei)
Dr. Jui is the chief AI scientist of the Huawei Kirin team. His knowledge of AI and reinforcement learning has guided the team in building the ecosystem of the Kirin platform. He supports decisions on AI investment in Canadian universities, including UBC, SFU, UofToronto, UofAlberta, and UofWaterloo, among others, through joint lab collaborations and local Huawei offices.
Bei Jiang (University of Alberta)
Linglong Kong (University of Alberta)
More from the Same Authors
-
2023 Poster: AutoGO: Automated Computation Graph Optimization for Neural Network Evolution »
Mohammad Salameh · Keith Mills · Negar Hassanpour · Fred Han · Shuting Zhang · Wei Lu · Shangling Jui · CHUNHUA ZHOU · Fengyu Sun · Di Niu -
2023 Poster: Gaussian Differential Privacy on Riemannian Manifolds »
Yangdi Jiang · Xiaotian Chang · Yi Liu · Lei Ding · Linglong Kong · Bei Jiang -
2022 Spotlight: Identification, Amplification and Measurement: A bridge to Gaussian Differential Privacy »
Yi Liu · Ke Sun · Bei Jiang · Linglong Kong -
2022 Spotlight: Lightning Talks 1B-4 »
Andrei Atanov · Shiqi Yang · Wanshan Li · Yongchang Hao · Ziquan Liu · Jiaxin Shi · Anton Plaksin · Jiaxiang Chen · Ziqi Pan · yaxing wang · Yuxin Liu · Stepan Martyanov · Alessandro Rinaldo · Yuhao Zhou · Li Niu · Qingyuan Yang · Andrei Filatov · Yi Xu · Liqing Zhang · Lili Mou · Ruomin Huang · Teresa Yeo · kai wang · Daren Wang · Jessica Hwang · Yuanhong Xu · Qi Qian · Hu Ding · Michalis Titsias · Shangling Jui · Ajay Sohmshetty · Lester Mackey · Joost van de Weijer · Hao Li · Amir Zamir · Xiangyang Ji · Antoni Chan · Rong Jin -
2022 Spotlight: Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation »
Shiqi Yang · yaxing wang · kai wang · Shangling Jui · Joost van de Weijer -
2022 Poster: Identification, Amplification and Measurement: A bridge to Gaussian Differential Privacy »
Yi Liu · Ke Sun · Bei Jiang · Linglong Kong -
2022 Poster: Conformalized Fairness via Quantile Regression »
Meichen Liu · Lei Ding · Dengdeng Yu · Wulong Liu · Linglong Kong · Bei Jiang -
2022 Poster: Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation »
Shiqi Yang · yaxing wang · kai wang · Shangling Jui · Joost van de Weijer -
2021 Poster: Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation »
Shiqi Yang · yaxing wang · Joost van de Weijer · Luis Herranz · Shangling Jui -
2019 : Coffee + Posters »
Benjamin Caine · Renhao Wang · Nazmus Sakib · Nana Otawara · Meha Kaushik · elmira amirloo · Nemanja Djuric · Johanna Rock · Tanmay Agarwal · Angelos Filos · Panagiotis Tigkas · Donsuk Lee · Wootae Jeon · Nikita Jaipuria · Pin Wang · Jinxin Zhao · Liangjun Zhang · Ashutosh Singh · Ershad Banijamali · Mohsen Rohani · Aman Sinha · Ameya Joshi · Ching-Yao Chan · Mohammed Abdou · Changhao Chen · Jong-Chan Kim · eslam mohamed · Matt OKelly · Nirvan Singhania · Hiroshi Tsukahara · Atsushi Keyaki · Praveen Palanisamy · Justin Norden · Micol Marchetti-Bowick · Yiming Gu · Hitesh Arora · Shubhankar Deshpande · Jeff Schneider · Shangling Jui · Vaneet Aggarwal · Tryambak Gangopadhyay · Qiaojing Yan