Reinforcement Learning (RL) agents are often unable to generalise well to environment variations in the state space that were not observed during training. This issue is especially problematic for image-based RL, where a change in just one variable, such as the background colour, can alter many pixels in the image. This in turn can drastically change the agent's latent representation of the image, causing the learned policy to fail. To learn more robust representations, we introduce TEmporal Disentanglement (TED), a self-supervised auxiliary task that encourages disentangled image representations by exploiting the sequential nature of RL observations. We find empirically that RL algorithms using TED as an auxiliary task adapt more quickly to changes in environment variables with continued training, compared to state-of-the-art representation learning methods. Since TED enforces a disentangled structure on the representation, we also find that policies trained with TED generalise better to unseen values of variables irrelevant to the task (e.g., background colour) as well as unseen values of variables that affect the optimal policy (e.g., goal positions).
Author Information
Mhairi Dunion (University of Edinburgh)
Trevor McInroe (The University of Edinburgh)

Trevor McInroe is a PhD student at The University of Edinburgh, advised by Amos Storkey. His interests include deep reinforcement learning, representation learning, and world models.
Kevin Sebastian Luck (Aalto University)
Josiah Hanna (University of Wisconsin -- Madison)
Stefano Albrecht (University of Edinburgh)
More from the Same Authors
-
2021 : Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks »
Georgios Papoudakis · Filippos Christianos · Lukas Schäfer · Stefano Albrecht -
2021 : Safe Evaluation For Offline Learning: Are We Ready To Deploy? »
Hager Radi · Josiah Hanna · Peter Stone · Matthew Taylor -
2021 : Robust On-Policy Data Collection for Data-Efficient Policy Evaluation »
Rujie Zhong · Josiah Hanna · Lukas Schäfer · Stefano Albrecht -
2022 : Enhancing Transfer of Reinforcement Learning Agents with Abstract Contextual Embeddings »
Guy Azran · Mohamad Hosein Danesh · Stefano Albrecht · Sarah Keren -
2022 : Verifiable Goal Recognition for Autonomous Driving with Occlusions »
Cillian Brewitt · Massimiliano Tamborski · Stefano Albrecht -
2022 : Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction »
Brahma Pavse · Josiah Hanna -
2022 : Sample Relationships through the Lens of Learning Dynamics with Label Information »
Shangmin Guo · Yi Ren · Stefano Albrecht · Kenny Smith -
2022 : Learning Representations for Reinforcement Learning with Hierarchical Forward Models »
Trevor McInroe · Lukas Schäfer · Stefano Albrecht -
2022 : Co-Imitation: Learning Design and Behaviour by Imitation »
Chang Rajani · Karol Arndt · David Blanco-Mulero · Kevin Sebastian Luck · Ville Kyrki -
2022 Poster: Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning »
Rujie Zhong · Duohan Zhang · Lukas Schäfer · Stefano Albrecht · Josiah Hanna -
2021 Poster: Agent Modelling under Partial Observability for Deep Reinforcement Learning »
Georgios Papoudakis · Filippos Christianos · Stefano Albrecht -
2020 Poster: Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning »
Filippos Christianos · Lukas Schäfer · Stefano Albrecht