Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term return, in expectation. In distributional RL (DRL), the agent is also interested in the probability distribution of the return, not just its expected value. This so-called distributional perspective of RL has led to new algorithms with improved empirical performance. In this paper, we recall the atomic DRL (ADRL) framework based on atomic distributions projected via the Wasserstein-2 metric. Then, we derive two new deep ADRL algorithms, namely SAD-Q-learning and MAD-Q-learning (both for the control task). Numerical experiments on various environments compare our approach against existing deep (distributional) RL methods.
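As an illustration of projecting a return distribution onto atoms under the Wasserstein-2 metric, the following is a minimal sketch and not the paper's implementation. It assumes equally weighted atoms and an empirical sample of returns whose size is a multiple of the number of atoms; the function name `w2_project_to_atoms` is hypothetical. In one dimension, under these assumptions, the Wasserstein-2-optimal atoms are the means of consecutive blocks of the sorted sample, so a single atom reduces to the sample mean.

```python
import numpy as np

def w2_project_to_atoms(samples, n_atoms):
    """Project an empirical 1D distribution onto n_atoms equally weighted atoms.

    Assumption (not from the paper): atoms carry uniform weight 1/n_atoms and
    the number of samples is a multiple of n_atoms.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    if x.size % n_atoms != 0:
        raise ValueError("sample count must be a multiple of n_atoms")
    # Each atom is the mean of one block of the sorted sample, i.e. the
    # average of the quantile function over an interval of length 1/n_atoms.
    return x.reshape(n_atoms, -1).mean(axis=1)

returns = [4.0, 0.0, 2.0, 6.0]
print(w2_project_to_atoms(returns, 1))  # one atom: the sample mean
print(w2_project_to_atoms(returns, 2))  # two atoms: means of lower and upper halves
```

With one atom the projection keeps only the expected return, as in standard RL; using more atoms retains progressively more of the distribution's shape.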
Author Information
Mastane Achab (Technology Innovation Institute)
REDA ALAMI (Total)
YASSER ABDELAZIZ DAHOU DJILALI (Dublin City University)
Kirill Fedyanin (Skolkovo Institute of Science and Technology)
Eric Moulines (Ecole Polytechnique)
Maxim Panov (Technology Innovation Institute)
More from the Same Authors
- 2021 : Nonparametric Approach to Uncertainty Quantification for Deterministic Neural Networks
  Nikita Kotelevskii · Alexander Fishkov · Kirill Fedyanin · Aleksandr Petiushko · Maxim Panov
- 2022 : Non-Stationary Causal Bandits
  REDA ALAMI
- 2023 Poster: First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
  Aleksandr Beznosikov · Sergey Samsonov · Marina Sheshukova · Alexander Gasnikov · Alexey Naumov · Eric Moulines
- 2023 Poster: Model-free Posterior Sampling via Learning Rate Randomization
  Daniil Tiapkin · Denis Belomestny · Daniele Calandriello · Eric Moulines · Remi Munos · Alexey Naumov · Pierre Perrault · Michal Valko · Pierre Ménard
- 2022 Spotlight: Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees
  Daniil Tiapkin · Denis Belomestny · Daniele Calandriello · Eric Moulines · Remi Munos · Alexey Naumov · Mark Rowland · Michal Valko · Pierre Ménard
- 2022 Poster: Nonparametric Uncertainty Quantification for Single Deterministic Neural Network
  Nikita Kotelevskii · Aleksandr Artemenkov · Kirill Fedyanin · Fedor Noskov · Alexander Fishkov · Artem Shelmanov · Artem Vazhentsev · Aleksandr Petiushko · Maxim Panov
- 2022 Poster: Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees
  Daniil Tiapkin · Denis Belomestny · Daniele Calandriello · Eric Moulines · Remi Munos · Alexey Naumov · Mark Rowland · Michal Valko · Pierre Ménard
- 2022 Poster: Local-Global MCMC kernels: the best of both worlds
  Sergey Samsonov · Evgeny Lagutin · Marylou Gabrié · Alain Durmus · Alexey Naumov · Eric Moulines
- 2022 Poster: BR-SNIS: Bias Reduced Self-Normalized Importance Sampling
  Gabriel Cardoso · Sergey Samsonov · Achille Thin · Eric Moulines · Jimmy Olsson
- 2022 Poster: FedPop: A Bayesian Approach for Personalised Federated Learning
  Nikita Kotelevskii · Maxime Vono · Alain Durmus · Eric Moulines
- 2021 : A deep Q Network approach for stock management optimization
  REDA ALAMI
- 2021 Poster: Federated-EM with heterogeneity mitigation and variance reduction
  Aymeric Dieuleveut · Gersende Fort · Eric Moulines · Geneviève Robin
- 2021 Poster: NEO: Non Equilibrium Sampling on the Orbits of a Deterministic Transform
  Achille Thin · Yazid Janati El Idrissi · Sylvain Le Corff · Charles Ollion · Eric Moulines · Arnaud Doucet · Alain Durmus · Christian X Robert
- 2021 Poster: Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
  Alain Durmus · Eric Moulines · Alexey Naumov · Sergey Samsonov · Kevin Scaman · Hoi-To Wai
- 2020 Poster: A Stochastic Path Integral Differential EstimatoR Expectation Maximization Algorithm
  Gersende Fort · Eric Moulines · Hoi-To Wai
- 2019 Poster: On the Global Convergence of (Fast) Incremental Expectation Maximization Methods
  Belhal Karimi · Hoi-To Wai · Eric Moulines · Marc Lavielle
- 2018 Poster: Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames
  Geneviève Robin · Hoi-To Wai · Julie Josse · Olga Klopp · Eric Moulines
- 2018 Spotlight: Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames
  Geneviève Robin · Hoi-To Wai · Julie Josse · Olga Klopp · Eric Moulines
- 2018 Poster: The promises and pitfalls of Stochastic Gradient Langevin Dynamics
  Nicolas Brosse · Alain Durmus · Eric Moulines
- 2017 : Memory Bandits: a Bayesian Approach for the Switching Bandit problem
  Raphaël Féraud · REDA ALAMI