DOMINO: Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning
Yao Mu · Yuzheng Zhuang · Fei Ni · Bin Wang · Jianyu Chen · Jianye Hao · Ping Luo

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #223

Adapting to changes in transition dynamics is essential in robotic applications. By learning a conditional policy with a compact context, context-aware meta-reinforcement learning provides a flexible way to adjust behavior according to dynamics changes. However, in real-world applications, the agent may encounter complex dynamics changes: multiple confounders can influence the transition dynamics, making it challenging to infer an accurate context for decision-making. This paper addresses this challenge with decomposed mutual information optimization (DOMINO) for context learning, which explicitly learns a disentangled context to maximize the mutual information between the context and historical trajectories while minimizing the state transition prediction error. Our theoretical analysis shows that, by learning a disentangled context, DOMINO can overcome the underestimation of mutual information that arises when multiple confounders are present, and that it reduces the number of samples that must be collected across environments. Extensive experiments show that the context learned by DOMINO benefits both model-based and model-free reinforcement learning algorithms for dynamics generalization, in terms of both sample efficiency and performance in unseen environments.
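
The two terms named in the abstract (a mutual information term between context and trajectories, and a state transition prediction error) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an InfoNCE-style lower bound as the mutual information estimator, a dot-product score, and a context that has already been split into separate parts, one per confounder. The names `info_nce_bound`, `decomposed_objective`, `context_parts`, and `mi_weight` are hypothetical and chosen only for this illustration.

```python
import math


def info_nce_bound(contexts, targets, temperature=1.0):
    """InfoNCE lower bound on I(context; target) from N paired samples.

    contexts[i] and targets[i] form a positive pair; every other target
    in the batch serves as a negative. The bound never exceeds log(N).
    """
    n = len(contexts)
    total = 0.0
    for i in range(n):
        # Dot-product score between context i and every target (assumption).
        scores = [
            sum(a * b for a, b in zip(contexts[i], targets[j])) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over the batch.
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[i] - log_norm
    return math.log(n) + total / n


def decomposed_objective(context_parts, targets, predicted_next, true_next,
                         mi_weight=1.0):
    """Loss to minimize: prediction error minus the summed per-part MI bounds.

    context_parts is a list of K batches, one per context component
    (hypothetical layout); each component gets its own InfoNCE term.
    """
    mi_sum = sum(info_nce_bound(part, targets) for part in context_parts)
    # Mean squared error of the predicted next states.
    count = sum(len(p) for p in predicted_next)
    mse = sum(
        (p - t) ** 2
        for pred, true in zip(predicted_next, true_next)
        for p, t in zip(pred, true)
    ) / count
    return mse - mi_weight * mi_sum
```

Because a single InfoNCE term is capped at log(N) for a batch of N samples, summing one term per context part raises the largest value the estimate can reach. That is one intuition for why a decomposed estimate may suffer less from underestimation, although the sketch makes no claim about the paper's actual bound.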

Author Information

Yao Mu (The University of Hong Kong)

I am currently a Ph.D. candidate in Computer Science at the University of Hong Kong. I graduated with a master's degree from Tsinghua University in June 2021. My research interests include Reinforcement Learning, Representation Learning, Autonomous Driving, Optimal Control, and Computer Vision.

Yuzheng Zhuang (Huawei Technologies Co. Ltd.)
Fei Ni (Tianjin University)
Bin Wang (Huawei Noah's Ark Lab)
Jianyu Chen (Tsinghua University)
Jianye Hao (Tianjin University)
Ping Luo (The University of Hong Kong)
