Workshop
Learning, Inference and Control of Multi-Agent Systems
Thore Graepel · Marc Lanctot · Joel Leibo · Guy Lever · Janusz Marecki · Frans Oliehoek · Karl Tuyls · Vicky Holgate

Fri Dec 9th 08:00 AM -- 06:30 PM @ Room 133 + 134
Event URL: https://sites.google.com/site/malicnips2016/

We live in a multi-agent world. To succeed in it, agents, and artificially intelligent agents in particular, will need to learn to take the agency of others into account. They will need to compete in marketplaces, cooperate in teams, communicate with others, coordinate their plans, and negotiate outcomes. Examples include self-driving cars interacting in traffic, personal assistants acting on behalf of humans and negotiating with other agents, swarms of unmanned aerial vehicles, financial trading systems, robotic teams, and household robots.

Furthermore, the evolution of human intelligence itself presumably depended on interaction among human agents, possibly starting out with confrontational scavenging [1] and culminating in the evolution of culture, societies, and language. Learning from other agents is a key feature of human intelligence and an important field of research in machine learning [2]. It is therefore conceivable that exposing learning AI agents to multi-agent situations is necessary for their development towards intelligence.

We can also think of multi-agent systems as a design philosophy for complex systems. We can analyse complex systems in terms of agents at multiple scales. For example, we can view the system of world politics as an interaction of nation-state agents, nation states as an interaction of organizations, and so on down to departments and people. Conversely, when designing systems we can think of agents as building blocks or modules interacting to produce the behaviour of the system, e.g. [3].

Multi-agent systems can have desirable properties such as robustness and scalability, but their design requires careful consideration of incentive structures, learning, and communication. In the most extreme case, agents with individual views of the world, individual actuators, and individual incentive structures need to coordinate to achieve a common goal. To succeed they may need a Theory of Mind that allows them to reason about other agents’ intentions, beliefs, and behaviours [4]. When multiple learning agents interact, the learning problem from each agent’s perspective may become non-stationary, non-Markovian, and only partially observable. Studying the dynamics of learning algorithms could lead to better insight into the evolution and stability of such systems [5].

Problems involving competing or cooperating agents feature in recent AI breakthroughs in competitive games [6,7], current ambitions of AI such as robotic football teams [8], and new research into emergent language and agent communication in reinforcement learning [9,10].

In summary, multi-agent learning will be of crucial importance to the future of computational intelligence and will pose difficult and fascinating problems that need to be addressed across disciplines. The paradigm shift from single-agent to multi-agent systems will be pervasive and will require efforts across different fields including machine learning, cognitive science, robotics, natural computing, and (evolutionary) game theory. In this workshop we aim to bring together researchers from these different fields to discuss the current state of the art, future avenues, and visions for work on the theory and practice of multi-agent learning, inference, and decision-making.

Topics we consider for inclusion in the workshop include multi-agent reinforcement learning; deep multi-agent learning; theory of mind; multi-agent communication; POMDPs, Dec-POMDPs and partially observable stochastic games; multi-agent robotics, human-robot collaboration, swarm robotics; game theory, mechanism design, algorithms for computing Nash equilibria and other solution concepts; bioinspired approaches, swarm intelligence and collective intelligence; co-evolution, evolutionary dynamics and culture; ad hoc teamwork.

[1] ‘Confrontational scavenging as a possible source for language and cooperation’, Derek Bickerton and Eörs Szathmáry, BMC Evolutionary Biology 2011
[2] ‘Apprenticeship Learning via Inverse Reinforcement Learning’, Pieter Abbeel and Andrew Y. Ng, ICML 2004
[3] ‘The Society of Mind’, Marvin Minsky, 1986
[4] ‘Building Machines That Learn and Think Like People’, Brenden M. Lake et al., CBMM Memo 2016
[5] ‘Evolutionary Dynamics of Multi-Agent Learning: A Survey’, Daan Bloembergen et al., JAIR 2015
[6] ‘Mastering the game of Go with deep neural networks and tree search’, David Silver et al., Nature 2016
[7] ‘Heads-up limit hold’em poker is solved’, Michael Bowling et al., Science 2015
[8] RoboCup, http://www.robocup.org/
[9] ‘Learning to Communicate with Deep Multi-Agent Reinforcement Learning’, Jakob N. Foerster et al., arXiv 2016
[10] ‘Learning Multiagent Communication with Backpropagation’, Sainbayar Sukhbaatar et al., arXiv 2016

08:30 AM Introduction (Talk) Thore Graepel, Karl Tuyls, Frans Oliehoek
08:50 AM Learning to Communicate with Deep Multi-Agent Reinforcement Learning (Talk) Shimon Whiteson
09:40 AM Computer Curling: AI in Sports Analytics (Talk) Michael Bowling
11:00 AM Reverse engineering human cooperation (or, How to build machines that treat people like people) (Talk) Josh Tenenbaum, Max Kleiman-Weiner
11:50 AM Spotlight Session (Spotlight)
12:40 PM Lunch (Break)
01:30 PM Poster Session
02:10 PM Multi-Agent and Multi-Robot Coordination with Uncertainty and Limited Communication (Talk)
03:00 PM Coffee Break (Break)
03:30 PM Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving (Talk)
03:50 PM A Study of Value Iteration with Non-Stationary Strategies in General Sum Markov Games (Talk) Julien Pérolat
04:10 PM Learning to Assemble Objects with Robot Swarms (Talk) Gerhard Neumann
04:30 PM Break
04:50 PM Challenges on the way to fully autonomous swarms of drones (Talk) Guido de Croon
05:40 PM Discussion Panel
06:20 PM Concluding Remarks (Talk) Thore Graepel, Frans Oliehoek, Karl Tuyls

Author Information

Thore Graepel (DeepMind)
Marc Lanctot (DeepMind)
Joel Leibo (Google DeepMind)
Guy Lever (UCL)
Janusz Marecki (DeepMind)
Frans Oliehoek (Delft University of Technology)
Karl Tuyls (University of Liverpool)
Vicky Holgate (DeepMind)
