Poster

Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs

Yanlin Han · Piotr Gmytrasiewicz

Room 517 AB #162

Keywords: [ Planning ] [ Model-Based RL ] [ Decision and Control ] [ Multi-Agent RL ] [ MCMC ]


Abstract:

Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in partially observable, stochastic, multi-agent environments. They extend POMDPs to multi-agent settings by including models of other agents in the state space, which gives rise to a hierarchical belief structure. To predict other agents' actions using I-POMDPs, we propose an approach that uses Bayesian inference and sequential Monte Carlo sampling to learn others' intentional models, which ascribe to them beliefs, preferences and rationality in action selection. Empirical results show that our algorithm accurately learns models of the other agent and outperforms methods that use subintentional models. Our approach serves as a generalized Bayesian learning algorithm that learns other agents' beliefs, strategy levels, and transition, observation and reward functions. It also mitigates the belief space complexity caused by the nested belief hierarchy.
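To make the idea of learning an intentional model by sequential Monte Carlo concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm and involves no I-POMDP machinery or nested beliefs. It assumes a hypothetical other agent with a single unknown preference parameter who chooses between two actions with a softmax (noisily rational) rule. The names `action_likelihood`, `smc_update`, `learn_preference`, and the `rationality` and `jitter` parameters are illustrative assumptions, not taken from the paper.

```python
import math
import random


def action_likelihood(action, preference, rationality=3.0):
    """P(action | preference) under an assumed softmax choice rule.

    Action 1 has utility `preference`; action 0 has utility 0.
    """
    p_one = 1.0 / (1.0 + math.exp(-rationality * preference))
    return p_one if action == 1 else 1.0 - p_one


def smc_update(particles, action, rng, jitter=0.05):
    """One sequential Monte Carlo step: weight, resample, perturb."""
    # Bayesian weighting: how well does each candidate model explain the action?
    weights = [action_likelihood(action, p) for p in particles]
    if sum(weights) == 0.0:
        return list(particles)
    # Resample candidate models in proportion to their weights.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    # Small Gaussian jitter keeps the particle set from collapsing.
    return [p + rng.gauss(0.0, jitter) for p in resampled]


def learn_preference(actions, n_particles=500, seed=0):
    """Return the posterior-mean preference after observing `actions`."""
    rng = random.Random(seed)
    # Prior over the other agent's preference: uniform on [-2, 2] (assumed).
    particles = [rng.uniform(-2.0, 2.0) for _ in range(n_particles)]
    for action in actions:
        particles = smc_update(particles, action, rng)
    return sum(particles) / len(particles)
```

Calling `learn_preference([1] * 50)` returns a positive estimate, since an agent that always picks action 1 is best explained by a positive preference for it. The approach described in the abstract applies the same weight-and-resample pattern to much richer models that include the other agent's beliefs, strategy level, and transition, observation and reward functions.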
