

Oral
in
Workshop: Learning in Presence of Strategic Behavior

Near-Optimal No-Regret Learning in General Games

Constantinos Daskalakis · Maxwell Fishelson · Noah Golowich


Abstract: We show that Optimistic Hedge -- a common variant of multiplicative-weights-updates with recency bias -- attains poly(log T) regret in multi-player general-sum games. In particular, when every player of the game uses Optimistic Hedge to iteratively update her strategy in response to the history of play so far, then after T rounds of interaction, each player experiences total regret that is poly(log T). Our bound improves, exponentially, the O(T^{1/2}) regret attainable by standard no-regret learners in games, the O(T^{1/4}) regret attainable by no-regret learners with recency bias (Syrgkanis et al., 2015), and the O(T^{1/6}) bound that was recently shown for Optimistic Hedge in the special case of two-player games (Chen & Peng, 2020). A corollary of our bound is that Optimistic Hedge converges to coarse correlated equilibrium in general games at a rate of Õ(1/T).
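To make the algorithm in the abstract concrete, below is a minimal sketch of Optimistic Hedge self-play in a small two-player game. The update is standard Hedge on the cumulative loss, plus one extra copy of the most recent loss vector as an optimistic prediction of the next loss (the recency bias). The payoff matrices, step size, and horizon are made-up illustration values, not taken from the paper.

```python
import numpy as np

def optimistic_hedge(loss_history, eta):
    """One Optimistic Hedge step: Hedge on cumulative loss plus an extra
    copy of the last observed loss (the optimistic / recency-bias term)."""
    L = np.sum(loss_history, axis=0)   # cumulative loss per action
    pred = loss_history[-1]            # prediction: last loss repeats
    w = np.exp(-eta * (L + pred))
    return w / w.sum()

# Hypothetical general-sum game (loss matrices invented for illustration).
# A[i, j]: row player's loss; B[i, j]: column player's loss.
A = np.array([[0.0, 1.0], [1.0, 0.4]])
B = np.array([[0.3, 0.0], [0.0, 0.6]])

T, eta = 1000, 0.1
x = np.ones(2) / 2                     # row player's mixed strategy
y = np.ones(2) / 2                     # column player's mixed strategy
x_losses, y_losses = [], []
realized = 0.0                         # row player's realized expected loss
for t in range(T):
    lx = A @ y                         # row player's loss vector vs. current y
    ly = B.T @ x                       # column player's loss vector vs. current x
    realized += float(x @ lx)
    x_losses.append(lx)
    y_losses.append(ly)
    x = optimistic_hedge(np.array(x_losses), eta)
    y = optimistic_hedge(np.array(y_losses), eta)

# External regret: realized loss minus the best fixed action in hindsight.
regret = realized - float(np.min(np.sum(x_losses, axis=0)))
print(regret)
```

When both players run this update, the paper's result says each player's regret grows only polylogarithmically in T, far below the O(T^{1/2}) worst-case rate of plain Hedge; in a run like the one above, the printed regret should be a small constant relative to T.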
