Spotlight Poster

Alternation makes the adversary weaker in two-player games

Volkan Cevher · Ashok Cutkosky · Ali Kavis · Georgios Piliouras · Stratis Skoulakis · Luca Viano

Great Hall & Hall B1+B2 (level 1) #919

Abstract: Motivated by alternating game-play in two-player games, we study an alternating variant of \textit{Online Linear Optimization} (OLO). In alternating OLO, a \textit{learner} at each round $t \in [T]$ selects a vector $x_t$ and then an \textit{adversary} selects a cost vector $c_t \in [-1,1]^n$. The learner then experiences cost $(c_t + c_{t-1})^\top x_t$ instead of $c_t^\top x_t$ as in standard OLO. We establish that under this small twist, the $\Omega(\sqrt{T})$ lower bound on the regret is no longer valid. More precisely, we present two online learning algorithms for alternating OLO that respectively admit $\mathcal{O}((\log n)^{4/3} T^{1/3})$ regret for the $n$-dimensional simplex and $\mathcal{O}(\rho \log T)$ regret for the ball of radius $\rho > 0$. Our results imply that in alternating game-play, an agent can always guarantee $\tilde{\mathcal{O}}((\log n)^{4/3} T^{1/3})$ regret regardless of the strategies of the other agent, while the regret bound improves to $\mathcal{O}(\log T)$ in case the agent admits only two actions.
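To make the protocol concrete, below is a minimal Python sketch of the alternating OLO interaction on the simplex. The learner here is plain multiplicative weights used only as a placeholder, not one of the paper's two algorithms (which are designed to exploit the alternation); the uniform-random adversary, the convention $c_0 = 0$, and the function name `simulate_alternating_olo` are illustrative assumptions.

```python
import numpy as np

def simulate_alternating_olo(T=1000, n=5, eta=0.05, seed=0):
    """Sketch of alternating OLO: at round t the learner picks x_t on the
    n-dimensional simplex, the adversary picks c_t in [-1, 1]^n, and the
    learner pays (c_t + c_{t-1})^T x_t (with the convention c_0 = 0)."""
    rng = np.random.default_rng(seed)
    weights = np.ones(n)          # multiplicative-weights state (placeholder learner)
    c_prev = np.zeros(n)          # c_0 = 0 for the first round
    cum_cost = np.zeros(n)        # per-action sum of the charged cost vectors
    total_cost = 0.0
    for _ in range(T):
        x_t = weights / weights.sum()            # learner's point on the simplex
        c_t = rng.uniform(-1.0, 1.0, size=n)     # an arbitrary (here random) adversary
        effective = c_t + c_prev                 # cost vector actually charged
        total_cost += effective @ x_t
        cum_cost += effective
        weights *= np.exp(-eta * effective)      # multiplicative-weights update
        c_prev = c_t
    # Regret against the best fixed action (simplex vertex) in hindsight.
    return total_cost - cum_cost.min()

if __name__ == "__main__":
    print(f"regret after T rounds: {simulate_alternating_olo():.2f}")
```

The only change from standard OLO is the line `effective = c_t + c_prev`: the learner is charged for the previous cost vector as well, which is the "small twist" the abstract says invalidates the $\Omega(\sqrt{T})$ lower bound.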