This paper is concerned with multi-view reinforcement learning (MVRL), which allows for decision making when agents share common dynamics but adhere to different observation models. We define the MVRL framework by extending partially observable Markov decision processes (POMDPs) to support more than one observation model, and we propose two solution methods based on observation augmentation and cross-view policy transfer. We empirically evaluate our methods and demonstrate their effectiveness in a variety of environments. Specifically, we show reductions in sample complexity and computational time when acquiring policies that handle multi-view environments.
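As a rough illustration of the framework described above, here is a minimal Python sketch of a POMDP tuple extended to carry several observation models over shared dynamics, plus a simple observation-augmentation helper. All names (`MultiViewPOMDP`, `augment`, the one-hot view tag) are hypothetical illustrations, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

import numpy as np


@dataclass
class MultiViewPOMDP:
    """A POMDP with shared states, actions, dynamics, and reward,
    but one observation model per view (a sketch of the MVRL tuple)."""
    states: List[int]                        # shared latent state space S
    actions: List[int]                       # shared action space A
    transition: Callable[[int, int], int]    # shared dynamics T(s, a) -> s'
    reward: Callable[[int, int], float]      # shared reward R(s, a)
    # One observation model per view: O_v(s) -> observation
    observation_models: Dict[int, Callable[[int], np.ndarray]]

    def observe(self, view: int, state: int) -> np.ndarray:
        """Render the latent state through the chosen view's observation model."""
        return self.observation_models[view](state)


def augment(view: int, num_views: int, obs: np.ndarray) -> np.ndarray:
    """Observation augmentation: tag each observation with a one-hot view id
    so a single policy can condition on which view produced it (one plausible
    reading of the augmentation approach; the paper's exact scheme may differ)."""
    one_hot = np.eye(num_views)[view]
    return np.concatenate([one_hot, obs])
```

Under this reading, a single policy trained on augmented observations can serve all views at once, which is one way the stated reductions in sample complexity could arise relative to training a separate policy per view.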
Minne Li (University College London)
Lisheng Wu (University College London)
Jun Wang (University College London)
Haitham Bou Ammar (University College London)