NeurIPS Poster Learning Multi-agent Behaviors from Distributed and Streaming Demonstrations

Shicheng Liu · Minghui Zhu

Great Hall & Hall B1+B2 (level 1) #1418

Abstract: This paper considers the problem of inferring the behaviors of multiple interacting experts by estimating their reward functions and constraints where the distributed demonstrated trajectories are sequentially revealed to a group of learners. We formulate the problem as a distributed online bi-level optimization problem where the outer-level problem is to estimate the reward functions and the inner-level problem is to learn the constraints and corresponding policies. We propose a novel "multi-agent behavior inference from distributed and streaming demonstrations" (MA-BIRDS) algorithm that allows the learners to solve the outer-level and inner-level problems in a single loop through intermittent communications. We formally guarantee that the distributed learners achieve consensus on reward functions, constraints, and policies, the average local regret (over $N$ online iterations) decreases at the rate of $O(1/N^{1-\eta_1}+1/N^{1-\eta_2}+1/N)$, and the cumulative constraint violation increases sub-linearly at the rate of $O(N^{\eta_2}+1)$, where $\eta_1,\eta_2\in (1/2,1)$.
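
To make the stated rates concrete, they can be specialized as below. This is only a worked reading of the orders given in the abstract: the hidden constants and the precise definitions of local regret and constraint violation are in the paper and are not reproduced here, and the choice $\eta_1=\eta_2=3/4$ is purely illustrative.

$$
\begin{aligned}
&\eta_i\in(1/2,1)\ \Rightarrow\ 0<1-\eta_i<1/2\ \Rightarrow\ \frac{1}{N}\le\frac{1}{N^{1-\eta_i}}\quad (N\ge 1),\\
&\text{so}\quad O\!\left(\frac{1}{N^{1-\eta_1}}+\frac{1}{N^{1-\eta_2}}+\frac{1}{N}\right)
 = O\!\left(N^{-(1-\max\{\eta_1,\eta_2\})}\right),\\
&\text{and}\quad \frac{1}{N}\,O\!\left(N^{\eta_2}+1\right)
 = O\!\left(N^{\eta_2-1}+N^{-1}\right)\ \to\ 0 \quad (N\to\infty).\\[4pt]
&\text{Illustration with } \eta_1=\eta_2=3/4:\quad
 \text{average local regret } O\!\left(N^{-1/4}\right),\quad
 \text{cumulative violation } O\!\left(N^{3/4}\right).
\end{aligned}
$$

In other words, the $1/N$ term never dominates the regret bound, and the per-iteration constraint violation vanishes even though the cumulative violation grows. As $\eta_1,\eta_2$ approach $1/2$, the stated orders approach $O(N^{-1/2})$ for the average local regret and $O(N^{1/2})$ for the cumulative violation.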
