

Oral
in
Workshop: Learning in Presence of Strategic Behavior

Near-Optimal No-Regret Learning in General Games

Constantinos Daskalakis · Maxwell Fishelson · Noah Golowich


Abstract: We show that Optimistic Hedge -- a common variant of multiplicative-weights-updates with recency bias -- attains poly(log T) regret in multi-player general-sum games. In particular, when every player of the game uses Optimistic Hedge to iteratively update her strategy in response to the history of play so far, then after T rounds of interaction, each player experiences total regret that is poly(log T). Our bound improves, exponentially, the O(T^{1/2}) regret attainable by standard no-regret learners in games, the O(T^{1/4}) regret attainable by no-regret learners with recency bias (Syrgkanis et al., 2015), and the O(T^{1/6}) bound that was recently shown for Optimistic Hedge in the special case of two-player games (Chen & Peng, 2020). A corollary of our bound is that Optimistic Hedge converges to coarse correlated equilibrium in general games at a rate of Õ(1/T).
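To make the algorithm in the abstract concrete, below is a minimal sketch of Optimistic Hedge self-play in a small two-player game. The update is standard Hedge on the cumulative loss, plus one extra copy of the most recent loss vector as an optimistic prediction of the next loss (the recency bias). The payoff matrices, step size, and horizon are made-up illustration values, not taken from the paper.

```python
import numpy as np

def optimistic_hedge(loss_history, eta):
    """One Optimistic Hedge step: Hedge on cumulative loss plus an extra
    copy of the last observed loss (the optimistic / recency-bias term)."""
    L = np.sum(loss_history, axis=0)   # cumulative loss per action
    pred = loss_history[-1]            # prediction: last loss repeats
    w = np.exp(-eta * (L + pred))
    return w / w.sum()

# Hypothetical general-sum game (loss matrices invented for illustration).
# A[i, j]: row player's loss; B[i, j]: column player's loss.
A = np.array([[0.0, 1.0], [1.0, 0.4]])
B = np.array([[0.3, 0.0], [0.0, 0.6]])

T, eta = 1000, 0.1
x = np.ones(2) / 2                     # row player's mixed strategy
y = np.ones(2) / 2                     # column player's mixed strategy
x_losses, y_losses = [], []
realized = 0.0                         # row player's realized expected loss
for t in range(T):
    lx = A @ y                         # row player's loss vector vs. current y
    ly = B.T @ x                       # column player's loss vector vs. current x
    realized += float(x @ lx)
    x_losses.append(lx)
    y_losses.append(ly)
    x = optimistic_hedge(np.array(x_losses), eta)
    y = optimistic_hedge(np.array(y_losses), eta)

# External regret: realized loss minus the best fixed action in hindsight.
regret = realized - float(np.min(np.sum(x_losses, axis=0)))
print(regret)
```

When both players run this update, the paper's result says each player's regret grows only polylogarithmically in T, far below the O(T^{1/2}) worst-case rate of plain Hedge; in a run like the one above, the printed regret should be a small constant relative to T.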
