Many recent reinforcement learning (RL) methods learn stochastic policies with entropy regularization for exploration and robustness. However, in continuous action spaces, integrating entropy regularization with expressive policies is challenging and usually requires complex inference procedures. To tackle this problem, we propose a novel regularization method that is compatible with a broad range of expressive policy architectures. An appealing feature is that estimating our regularization terms is simple and efficient even when the policy distributions are unknown. We show that our approach effectively promotes exploration in continuous action spaces. Based on our regularization, we propose an off-policy actor-critic algorithm. Experiments demonstrate that the proposed algorithm outperforms state-of-the-art regularized RL methods in continuous control tasks.
Author Information
Qi Zhou (University of Science and Technology of China)
Yufei Kuang (University of Science and Technology of China)
Zherui Qiu (University of Science and Technology of China)
Houqiang Li (University of Science and Technology of China)
Jie Wang (University of Science and Technology of China)
More from the Same Authors
-
2022 Poster: LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning »
Mingyu Yang · Jian Zhao · Xunhan Hu · Wengang Zhou · Jiangcheng Zhu · Houqiang Li -
2022 : Multi-Agent Reinforcement Learning with Shared Resources for Inventory Management »
Yuandong Ding · Mingxiao Feng · Guozi Liu · Wei Jiang · Chuheng Zhang · Li Zhao · Lei Song · Houqiang Li · Yan Jin · Jiang Bian -
2022 Spotlight: Lightning Talks 3A-3 »
Xu Yan · Zheng Dong · Qiancheng Fu · Jing Tan · Hezhen Hu · Fukun Yin · Weilun Wang · Ke Xu · Heshen Zhan · Wen Liu · Qingshan Xu · Xiaotong Zhao · Chaoda Zheng · Ziheng Duan · Zilong Huang · Xintian Shi · Wengang Zhou · Yew Soon Ong · Pei Cheng · Hujun Bao · Houqiang Li · Wenbing Tao · Jiantao Gao · Bin Kang · Weiwei Xu · Limin Wang · Ruimao Zhang · Tao Chen · Gang Yu · Rynson Lau · Shuguang Cui · Zhen Li -
2022 Spotlight: Hand-Object Interaction Image Generation »
Hezhen Hu · Weilun Wang · Wengang Zhou · Houqiang Li -
2022 Poster: Hand-Object Interaction Image Generation »
Hezhen Hu · Weilun Wang · Wengang Zhou · Houqiang Li -
2021 Poster: Dual Progressive Prototype Network for Generalized Zero-Shot Learning »
Chaoqun Wang · Shaobo Min · Xuejin Chen · Xiaoyan Sun · Houqiang Li -
2021 Poster: Contextual Similarity Aggregation with Self-attention for Visual Re-ranking »
Jianbo Ouyang · Hui Wu · Min Wang · Wengang Zhou · Houqiang Li -
2021 Poster: Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training »
Hongwei Xue · Yupan Huang · Bei Liu · Houwen Peng · Jianlong Fu · Houqiang Li · Jiebo Luo -
2020 Poster: Duality-Induced Regularizer for Tensor Factorization Based Knowledge Graph Completion »
Zhanqiu Zhang · Jianyu Cai · Jie Wang