An $\alpha$-No-Regret Algorithm For Graphical Bilinear Bandits

Geovani Rizk · Igor Colin · Albert Thomas · Rida Laraki · Yann Chevaleyre

Hall J #728
[ Abstract ]
[ Poster [ OpenReview
Tue 29 Nov 9 a.m. PST — 11 a.m. PST

Abstract: We propose the first regret-based approach to the \emph{Graphical Bilinear Bandits} problem, where $n$ agents in a graph play a stochastic bilinear bandit game with each of their neighbors. This setting reveals a combinatorial NP-hard problem that prevents the use of any existing regret-based algorithm in the (bi-)linear bandit literature. In this paper, we fill this gap and present the first regret-based algorithm for graphical bilinear bandits using the principle of optimism in the face of uncertainty. Theoretical analysis of this new method yields an upper bound of $\tilde{O}(\sqrt{T})$ on the $\alpha$-regret and evidences the impact of the graph structure on the rate of convergence. Finally, we show through various experiments the validity of our approach.

Chat is not available.