Timezone: »
We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
Author Information
Ryan Lowe (McGill University)
YI WU (UC Berkeley)
Aviv Tamar (UC Berkeley)
Jean Harb (McGill University)
OpenAI Pieter Abbeel (OpenAI, UC Berkeley)
Igor Mordatch (OpenAI)
More from the Same Authors
-
2021 : Learning Design and Construction with Varying-Sized Materials via Prioritized Memory Resets »
Yunfei Li · Lei Li · YI WU -
2021 : Deep Variational Semi-Supervised Novelty Detection »
Tal Daniel · Thanard Kurutach · Aviv Tamar -
2021 : Deep Variational Semi-Supervised Novelty Detection »
Tal Daniel · Thanard Kurutach · Aviv Tamar -
2022 Poster: Grounded Reinforcement Learning: Learning to Win the Game under Human Commands »
Shusheng Xu · Huaijie Wang · YI WU -
2022 Poster: Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning »
Zhecheng Yuan · Zhengrong Xue · Bo Yuan · Xueqian Wang · YI WU · Yang Gao · Huazhe Xu -
2022 : Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization »
Runlong Zhou · Yuandong Tian · YI WU · Simon Du -
2022 : Learning Control by Iterative Inversion »
Gal Leibovich · Guy Jacob · Or Avner · Gal Novik · Aviv Tamar -
2022 Spotlight: Lightning Talks 5A-3 »
Minting Pan · Xiang Chen · Wenhan Huang · Can Chang · Zhecheng Yuan · Jianzhun Shao · Yushi Cao · Peihao Chen · Ke Xue · Zhengrong Xue · Zhiqiang Lou · Xiangming Zhu · Lei Li · Zhiming Li · Kai Li · Jiacheng Xu · Dongyu Ji · Ni Mu · Kun Shao · Tianpei Yang · Kunyang Lin · Ningyu Zhang · Yunbo Wang · Lei Yuan · Bo Yuan · Hongchang Zhang · Jiajun Wu · Tianze Zhou · Xueqian Wang · Ling Pan · Yuhang Jiang · Xiaokang Yang · Xiaozhuan Liang · Hao Zhang · Weiwen Hu · Miqing Li · YAN ZHENG · Matthew Taylor · Huazhe Xu · Shumin Deng · Chao Qian · YI WU · Shuncheng He · Wenbing Huang · Chuanqi Tan · Zongzhang Zhang · Yang Gao · Jun Luo · Yi Li · Xiangyang Ji · Thomas Li · Mingkui Tan · Fei Huang · Yang Yu · Huazhe Xu · Dongge Wang · Jianye Hao · Chuang Gan · Yang Liu · Luo Si · Hangyu Mao · Huajun Chen · Jianye Hao · Jun Wang · Xiaotie Deng -
2022 Spotlight: Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning »
Zhecheng Yuan · Zhengrong Xue · Bo Yuan · Xueqian Wang · YI WU · Yang Gao · Huazhe Xu -
2022 Poster: Meta Reinforcement Learning with Finite Training Tasks - a Density Estimation Approach »
Zohar Rimon · Aviv Tamar · Gilad Adler -
2022 Poster: The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games »
Chao Yu · Akash Velu · Eugene Vinitsky · Jiaxuan Gao · Yu Wang · Alexandre Bayen · YI WU -
2021 Poster: Offline Meta Reinforcement Learning -- Identifiability Challenges and Effective Data Collection Strategies »
Ron Dorfman · Idan Shenfeld · Aviv Tamar -
2020 : Mini-panel discussion 1 - Bridging the gap between theory and practice »
Aviv Tamar · Emma Brunskill · Jost Tobias Springenberg · Omer Gottesman · Daniel Mankowitz -
2020 : Keynote: Aviv Tamar »
Aviv Tamar -
2020 Poster: Multi-Task Reinforcement Learning with Soft Modularization »
Ruihan Yang · Huazhe Xu · YI WU · Xiaolong Wang -
2020 Poster: Learning to summarize with human feedback »
Nisan Stiennon · Long Ouyang · Jeffrey Wu · Daniel Ziegler · Ryan Lowe · Chelsea Voss · Alec Radford · Dario Amodei · Paul Christiano -
2019 : Poster Presentations »
Rahul Mehta · Andrew Lampinen · Binghong Chen · Sergio Pascual-Diaz · Jordi Grau-Moya · Aldo Faisal · Jonathan Tompson · Yiren Lu · Khimya Khetarpal · Martin Klissarov · Pierre-Luc Bacon · Doina Precup · Thanard Kurutach · Aviv Tamar · Pieter Abbeel · Jinke He · Maximilian Igl · Shimon Whiteson · Wendelin Boehmer · RaphaĆ«l Marinier · Olivier Pietquin · Karol Hausman · Sergey Levine · Chelsea Finn · Tianhe Yu · Lisa Lee · Benjamin Eysenbach · Emilio Parisotto · Eric Xing · Ruslan Salakhutdinov · Hongyu Ren · Anima Anandkumar · Deepak Pathak · Christopher Lu · Trevor Darrell · Alexei Efros · Phillip Isola · Feng Liu · Bo Han · Gang Niu · Masashi Sugiyama · Saurabh Kumar · Janith Petangoda · Johan Ferret · James McClelland · Kara Liu · Animesh Garg · Robert Lange -
2019 : Invited Talk #4: Multi-Agent Interaction and Online Optimization in RL »
Igor Mordatch -
2019 Workshop: Retrospectives: A Venue for Self-Reflection in ML Research »
Ryan Lowe · Yoshua Bengio · Joelle Pineau · Michela Paganini · Jessica Forde · Shagun Sodhani · Abhishek Gupta · Joel Lehman · Peter Henderson · Kanika Madan · Koustuv Sinha · Xavier Bouthillier -
2019 Poster: Implicit Generation and Modeling with Energy Based Models »
Yilun Du · Igor Mordatch -
2019 Spotlight: Implicit Generation and Modeling with Energy Based Models »
Yilun Du · Igor Mordatch -
2018 Workshop: Emergent Communication Workshop »
Jakob Foerster · Angeliki Lazaridou · Ryan Lowe · Igor Mordatch · Douwe Kiela · Kyunghyun Cho -
2018 Poster: Meta-Learning MCMC Proposals »
Tongzhou Wang · YI WU · Dave Moore · Stuart Russell -
2017 Workshop: Emergent Communication Workshop »
Jakob Foerster · Igor Mordatch · Angeliki Lazaridou · Kyunghyun Cho · Douwe Kiela · Pieter Abbeel -
2017 : Competition III: The Conversational Intelligence Challenge »
Mikhail Burtsev · Ryan Lowe · Iulian Vlad Serban · Yoshua Bengio · Alexander Rudnicky · Alan W Black · Shrimai Prabhumoye · Artem Rodichev · Nikita Smetanin · Denis Fedorenko · CheongAn Lee · EUNMI HONG · Hwaran Lee · Geonmin Kim · Nicolas Gontier · Atsushi Saito · Andrey Gershfeld · Artem Burachenok -
2017 Poster: Hindsight Experience Replay »
Marcin Andrychowicz · Filip Wolski · Alex Ray · Jonas Schneider · Rachel Fong · Peter Welinder · Bob McGrew · Josh Tobin · OpenAI Pieter Abbeel · Wojciech Zaremba -
2016 Poster: Value Iteration Networks »
Aviv Tamar · Sergey Levine · Pieter Abbeel · YI WU · Garrett Thomas -
2016 Oral: Value Iteration Networks »
Aviv Tamar · Sergey Levine · Pieter Abbeel · YI WU · Garrett Thomas -
2014 Workshop: 3rd NIPS Workshop on Probabilistic Programming »
Daniel Roy · Josh Tenenbaum · Thomas Dietterich · Stuart J Russell · YI WU · Ulrik R Beierholm · Alp Kucukelbir · Zenna Tavares · Yura Perov · Daniel Lee · Brian Ruttenberg · Sameer Singh · Michael Hughes · Marco Gaboardi · Alexey Radul · Vikash Mansinghka · Frank Wood · Sebastian Riedel · Prakash Panangaden