Learning, Inference and Control of Multi-Agent Systems
Thore Graepel · Marc Lanctot · Joel Leibo · Guy Lever · Janusz Marecki · Frans Oliehoek · Karl Tuyls · Vicky Holgate

Thu Dec 08 11:00 PM -- 09:30 AM (PST) @ Room 133 + 134
Event URL: https://sites.google.com/site/malicnips2016/

We live in a multi-agent world, and to be successful in that world, agents, and in particular artificially intelligent agents, will need to learn to take into account the agency of others. They will need to compete in marketplaces, cooperate in teams, communicate with others, coordinate their plans, and negotiate outcomes. Examples include self-driving cars interacting in traffic, personal assistants acting on behalf of humans and negotiating with other agents, swarms of unmanned aerial vehicles, financial trading systems, robotic teams, and household robots.

Furthermore, the evolution of human intelligence itself presumably depended on interaction among human agents, possibly starting out with confrontational scavenging [1] and culminating in the evolution of culture, societies, and language. Learning from other agents is a key feature of human intelligence and an important field of research in machine learning [2]. It is therefore conceivable that exposing learning AI agents to multi-agent situations is necessary for their development towards intelligence.

We can also think of multi-agent systems as a design philosophy for complex systems. We can analyse complex systems in terms of agents at multiple scales. For example, we can view the system of world politics as an interaction of nation-state agents, nation states as an interaction of organizations, and so on down to departments, people, etc. Conversely, when designing systems we can think of agents as building blocks or modules interacting to produce the behaviour of the system, e.g. [3].

Multi-agent systems can have desirable properties such as robustness and scalability, but their design requires careful consideration of incentive structures, learning, and communication. In the most extreme case, agents with individual views of the world, individual actuators, and individual incentive structures need to coordinate to achieve a common goal. To succeed they may need a Theory of Mind that allows them to reason about other agents’ intentions, beliefs, and behaviours [4]. When multiple learning agents are interacting, the learning problem from each agent’s perspective may become non-stationary, non-Markovian, and only partially observable. Studying the dynamics of learning algorithms could lead to better insight about the evolution and stability of such systems [5].

Problems involving competing or cooperating agents feature in recent AI breakthroughs in competitive games [6,7], current ambitions of AI such as robotic football teams [8], and new research into emergent language and agent communication in reinforcement learning [9,10].

In summary, multi-agent learning will be of crucial importance to the future of computational intelligence and pose difficult and fascinating problems that need to be addressed across disciplines. The paradigm shift from single-agent to multi-agent systems will be pervasive and will require efforts across different fields including machine learning, cognitive science, robotics, natural computing, and (evolutionary) game theory. In this workshop we aim to bring together researchers from these different fields to discuss the current state of the art, future avenues and visions for work regarding theory and practice of multi-agent learning, inference, and decision-making.

Topics we consider for inclusion in the workshop include multi-agent reinforcement learning; deep multi-agent learning; theory of mind; multi-agent communication; POMDPs, Dec-POMDPs and partially observable stochastic games; multi-agent robotics, human-robot collaboration, swarm robotics; game theory, mechanism design, algorithms for computing Nash equilibria and other solution concepts; bioinspired approaches, swarm intelligence and collective intelligence; co-evolution, evolutionary dynamics and culture; ad hoc teamwork.

[1] ‘Confrontational scavenging as a possible source for language and cooperation’, Derek Bickerton and Eörs Szathmáry, BMC Evolutionary Biology 2011
[2] ‘Apprenticeship Learning via Inverse Reinforcement Learning’, Pieter Abbeel and Andrew Y. Ng, ICML 2004
[3] ‘The Society of Mind’, Marvin Minsky, 1986
[4] ‘Building Machines That Learn and Think Like People’, Brenden M. Lake et al., CBMM Memo 2016
[5] ‘Evolutionary Dynamics of Multi-Agent Learning: A Survey’, Daan Bloembergen et al., JAIR 2015
[6] ‘Mastering the game of Go with deep neural networks and tree search’, David Silver et al., Nature 2016
[7] ‘Heads-up limit hold’em poker is solved’, Michael Bowling et al., Science 2015
[8] RoboCup, http://www.robocup.org/
[9] ‘Learning to Communicate with Deep Multi-Agent Reinforcement Learning’, Jakob N. Foerster et al., arXiv 2016
[10] ‘Learning Multiagent Communication with Backpropagation’, Sainbayar Sukhbaatar et al., arXiv 2016

Thu 11:30 p.m. - 11:50 p.m.
Introduction (Talk)
Thore Graepel, Karl Tuyls, Frans Oliehoek
Thu 11:50 p.m. - 12:40 a.m.
Learning to Communicate with Deep Multi-Agent Reinforcement Learning (Talk)
Shimon Whiteson
Fri 12:40 a.m. - 1:30 a.m.
Computer Curling: AI in Sports Analytics (Talk)
Michael Bowling
Fri 2:00 a.m. - 2:50 a.m.
Reverse engineering human cooperation (or, How to build machines that treat people like people) (Talk)
Josh Tenenbaum, Max Kleiman-Weiner
Fri 2:50 a.m. - 3:40 a.m.
Spotlight Session (Spotlight)
Fri 3:40 a.m. - 4:30 a.m.
Lunch (Break)
Fri 4:30 a.m. - 5:10 a.m.
Poster Session
Fri 5:10 a.m. - 6:00 a.m.
Multi-Agent and Multi-Robot Coordination with Uncertainty and Limited Communication (Talk)
Fri 6:00 a.m. - 6:30 a.m.
Coffee Break (Break)
Fri 6:30 a.m. - 6:50 a.m.
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving (Talk)
Fri 6:50 a.m. - 7:10 a.m.
A Study of Value Iteration with Non-Stationary Strategies in General Sum Markov Games (Talk)
Julien Pérolat
Fri 7:10 a.m. - 7:30 a.m.
Learning to Assemble Objects with Robot Swarms (Talk)
Gerhard Neumann
Fri 7:30 a.m. - 7:50 a.m.
Fri 7:50 a.m. - 8:40 a.m.

While a single, small robot is limited in its capabilities to perform complex tasks, large groups or "swarms" of such robots have much greater potential. Physically, they can collaborate to move heavier things, cross gaps bigger than a single robot body length, or explore unknown areas much more quickly. Mentally, they can take in and process much more information than a single robot could, even if communication is extremely limited. In the NIPS 2016 workshop on multi-agent systems, it is suggested that true Artificial Intelligence can only be reached by having robots interact with each other, and it is well known that groups of robots may have a much larger collective learning potential than animals or humans.

So, why are we not yet seeing many such robotic swarms in the real world or even in academia? In my talk I will go into the challenges of making an autonomous swarm of tiny drones explore an unknown building. These drones weigh less than 50 grams and have to fly around, avoid obstacles, navigate, and work together for the most efficient exploration. I will highlight how complex these various challenges are and report on a specific study in which we have drones use their Bluetooth modules to avoid each other, should they find themselves in the same small indoor space. This case study will illustrate what are, in my eyes, the major challenges on the way to the promised autonomous robotic swarms.

Guido de Croon
Fri 8:40 a.m. - 9:20 a.m.
Discussion Panel
Fri 9:20 a.m. - 9:30 a.m.
Concluding Remarks (Talk)
Thore Graepel, Frans Oliehoek, Karl Tuyls

Author Information

Thore Graepel (DeepMind)
Marc Lanctot (DeepMind)
Joel Leibo (Google DeepMind)
Guy Lever (UCL)
Janusz Marecki (DeepMind)
Frans Oliehoek (Delft University of Technology)
Karl Tuyls (University of Liverpool)
Vicky Holgate (DeepMind)
