This paper addresses the problem of learning a skill-conditioned policy that acts meaningfully in the absence of a reward signal. Mutual-information-based objectives have shown some success in learning skills that reach a diverse set of states in this setting. These objectives include a KL-divergence term, which is maximized by visiting distinct states even if those states are not far apart in the MDP. This paper presents an approach that rewards the agent for learning skills that maximize the Wasserstein distance of their state visitation from the start state of the skill. It shows that such an objective leads to a policy that covers more distance in the MDP than diversity-based objectives, and validates the results on a variety of Atari environments.
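The contrast between the KL-divergence term and the Wasserstein distance can be illustrated with a toy sketch. This is not the paper's method or code: it assumes a hypothetical 10-state chain MDP with unit spacing between neighbouring states, point-mass state visitations, and a small smoothing constant so the KL divergence stays finite. The helper names (`kl_divergence`, `wasserstein_1d`, `point_mass`) are made up for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for discrete distributions, smoothed by eps to avoid log(0)."""
    p_s = [x + eps for x in p]
    q_s = [x + eps for x in q]
    zp, zq = sum(p_s), sum(q_s)
    return sum((a / zp) * math.log((a / zp) / (b / zq)) for a, b in zip(p_s, q_s))

def wasserstein_1d(p, q):
    """Wasserstein-1 distance on a chain with unit spacing: sum of |CDF_p - CDF_q|."""
    total, cdf_p, cdf_q = 0.0, 0.0, 0.0
    for a, b in zip(p, q):
        cdf_p += a
        cdf_q += b
        total += abs(cdf_p - cdf_q)
    return total

def point_mass(index, n):
    """Distribution that puts all probability on a single state (assumed toy case)."""
    return [1.0 if i == index else 0.0 for i in range(n)]

n = 10                       # hypothetical chain length
start = point_mass(0, n)     # visitation concentrated on the start state
near = point_mass(1, n)      # skill that ends one step away
far = point_mass(9, n)       # skill that ends nine steps away

kl_near = kl_divergence(near, start)
kl_far = kl_divergence(far, start)
w_near = wasserstein_1d(near, start)
w_far = wasserstein_1d(far, start)

print(f"KL  near={kl_near:.3f}  far={kl_far:.3f}")
print(f"W1  near={w_near:.1f}  far={w_far:.1f}")
```

In this toy setting the KL divergence is essentially the same for the near and far skills, because it only registers that the two visitations are distinct. The Wasserstein distance grows with how far the skill travels from the start state.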
Author Information
Ishan Durugkar (University of Texas at Austin)
Steven Hansen (DeepMind)
Stephen Spencer (DeepMind)
Volodymyr Mnih (DeepMind)
More from the Same Authors
-
2022 : In-context Reinforcement Learning with Algorithm Distillation »
Michael Laskin · Luyu Wang · Junhyuk Oh · Emilio Parisotto · Stephen Spencer · Richie Steigerwald · DJ Strouse · Steven Hansen · Angelos Filos · Ethan Brooks · Maxime Gazeau · Himanshu Sahni · Satinder Singh · Volodymyr Mnih -
2022 : ABC: Adversarial Behavioral Cloning for Offline Mode-Seeking Imitation Learning »
Eddy Hudson · Ishan Durugkar · Garrett Warnell · Peter Stone -
2023 Poster: f-Policy Gradients: A General Framework for Goal-Conditioned RL using f-Divergences »
Siddhant Agarwal · Ishan Durugkar · Peter Stone · Amy Zhang -
2023 Workshop: The NeurIPS 2023 Workshop on Goal-Conditioned Reinforcement Learning »
Benjamin Eysenbach · Ishan Durugkar · Jason Yecheng Ma · Andi Peng · Tongzhou Wang · Amy Zhang -
2022 Poster: Palm up: Playing in the Latent Manifold for Unsupervised Pretraining »
Hao Liu · Tom Zahavy · Volodymyr Mnih · Satinder Singh -
2021 Poster: Adversarial Intrinsic Motivation for Reinforcement Learning »
Ishan Durugkar · Mauricio Tec · Scott Niekum · Peter Stone -
2021 Poster: Entropic Desired Dynamics for Intrinsic Control »
Steven Hansen · Guillaume Desjardins · Kate Baumli · David Warde-Farley · Nicolas Heess · Simon Osindero · Volodymyr Mnih -
2020 Poster: An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch »
Siddharth Desai · Ishan Durugkar · Haresh Karnan · Garrett Warnell · Josiah Hanna · Peter Stone -
2019 Poster: Generalization of Reinforcement Learners with Working and Episodic Memory »
Meire Fortunato · Melissa Tan · Ryan Faulkner · Steven Hansen · Adrià Puigdomènech Badia · Gavin Buttimore · Charles Deck · Joel Leibo · Charles Blundell -
2019 Poster: Unsupervised Learning of Object Keypoints for Perception and Control »
Tejas Kulkarni · Ankush Gupta · Catalin Ionescu · Sebastian Borgeaud · Malcolm Reynolds · Andrew Zisserman · Volodymyr Mnih -
2018 : Poster Session 1 + Coffee »
Tom Van de Wiele · Rui Zhao · J. Fernando Hernandez-Garcia · Fabio Pardo · Xian Yeow Lee · Xiaolin Andy Li · Marcin Andrychowicz · Jie Tang · Suraj Nair · Juhyeon Lee · Cédric Colas · S. M. Ali Eslami · Yen-Chen Wu · Stephen McAleer · Ryan Julian · Yang Xue · Matthia Sabatelli · Pranav Shyam · Alexandros Kalousis · Giovanni Montana · Emanuele Pesce · Felix Leibfried · Zhanpeng He · Chunxiao Liu · Yanjun Li · Yoshihide Sawada · Alexander Pashevich · Tejas Kulkarni · Keiran Paster · Luca Rigazio · Quan Vuong · Hyunggon Park · Minhae Kwon · Rivindu Weerasekera · Shamane Siriwardhanaa · Rui Wang · Ozsel Kilinc · Keith Ross · Yizhou Wang · Simon Schmitt · Thomas Anthony · Evan Cater · Forest Agostinelli · Tegg Sung · Shirou Maruyama · Alexander Shmakov · Devin Schwab · Mohammad Firouzi · Glen Berseth · Denis Osipychev · Jesse Farebrother · Jianlan Luo · William Agnew · Peter Vrancx · Jonathan Heek · Catalin Ionescu · Haiyan Yin · Megumi Miyashita · Nathan Jay · Noga H. Rotman · Sam Leroux · Shaileshh Bojja Venkatakrishnan · Henri Schmidt · Jack Terwilliger · Ishan Durugkar · Jonathan Sauder · David Kas · Arash Tavakoli · Alain-Sam Cohen · Philip Bontrager · Adam Lerer · Thomas Paine · Ahmed Khalifa · Ruben Rodriguez · Avi Singh · Yiming Zhang -
2018 Poster: Fast deep reinforcement learning using online adjustments from the past »
Steven Hansen · Alexander Pritzel · Pablo Sprechmann · Andre Barreto · Charles Blundell -
2016 Poster: Learning values across many orders of magnitude »
Hado van Hasselt · Arthur Guez · Matteo Hessel · Volodymyr Mnih · David Silver -
2016 Poster: Using Fast Weights to Attend to the Recent Past »
Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu -
2016 Oral: Using Fast Weights to Attend to the Recent Past »
Jimmy Ba · Geoffrey E Hinton · Volodymyr Mnih · Joel Leibo · Catalin Ionescu -
2016 Poster: Strategic Attentive Writer for Learning Macro-Actions »
Alexander (Sasha) Vezhnevets · Volodymyr Mnih · Simon Osindero · Alex Graves · Oriol Vinyals · John Agapiou · Koray Kavukcuoglu -
2015 : The Deep Reinforcement Learning Boom »
Volodymyr Mnih -
2014 Workshop: Deep Learning and Representation Learning »
Andrew Y Ng · Yoshua Bengio · Adam Coates · Roland Memisevic · Sharanyan Chetlur · Geoffrey E Hinton · Shamim Nemati · Bryan Catanzaro · Surya Ganguli · Herbert Jaeger · Phil Blunsom · Leon Bottou · Volodymyr Mnih · Chen-Yu Lee · Rich M Schwartz -
2014 Poster: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu -
2014 Spotlight: Recurrent Models of Visual Attention »
Volodymyr Mnih · Nicolas Heess · Alex Graves · Koray Kavukcuoglu -
2013 Workshop: Deep Learning »
Yoshua Bengio · Hugo Larochelle · Russ Salakhutdinov · Tomas Mikolov · Matthew D Zeiler · David Mcallester · Nando de Freitas · Josh Tenenbaum · Jian Zhou · Volodymyr Mnih -
2010 Poster: Generating more realistic images using gated MRF's »
Marc'Aurelio Ranzato · Volodymyr Mnih · Geoffrey E Hinton