Spotlight Poster
Alternation makes the adversary weaker in two-player games
Volkan Cevher · Ashok Cutkosky · Ali Kavis · Georgios Piliouras · Stratis Skoulakis · Luca Viano
Great Hall & Hall B1+B2 (level 1) #919
Abstract:
Motivated by alternating game-play in two-player games, we study an alternating variant of Online Linear Optimization (OLO). In alternating OLO, a learner at each round $t$ selects a vector $x_t$ and then an adversary selects a cost-vector $c_t \in [-1,1]^n$. The learner then experiences cost $(c_t + c_{t-1})^\top x_t$ instead of $c_t^\top x_t$ as in standard OLO. We establish that under this small twist, the $\Omega(\sqrt{T})$ lower bound on the regret is no longer valid. More precisely, we present two online learning algorithms for alternating OLO that respectively admit $\mathcal{O}((\log n)^{4/3} T^{1/3})$ regret for the $n$-dimensional simplex and $\mathcal{O}(\rho \log T)$ regret for the ball of radius $\rho > 0$. Our results imply that in alternating game-play, an agent can always guarantee $\widetilde{\mathcal{O}}((\log n)^{4/3} T^{1/3})$ regret regardless of the strategies of the other agent, while the regret bound improves to $\mathcal{O}(\log T)$ in case the agent admits only two actions.
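The alternating cost structure described above can be illustrated with a toy simulation. The snippet below is a minimal sketch, not the paper's algorithm: it uses a standard multiplicative-weights learner on the simplex and a random cost stream as a stand-in adversary, and only demonstrates how the learner is charged $(c_t + c_{t-1})^\top x_t$ each round and how regret against the best fixed action is measured.

```python
import math
import random

def simulate_alternating_olo(T=1000, n=3, eta=0.1, seed=0):
    """Toy simulation of alternating OLO on the n-simplex.

    The learner runs multiplicative weights (a standard baseline,
    not the paper's algorithm); the adversary here is simply a
    random cost stream in [-1, 1]^n. At round t the learner pays
    (c_t + c_{t-1})^T x_t rather than c_t^T x_t.
    Returns regret against the best fixed action in hindsight.
    """
    rng = random.Random(seed)
    weights = [1.0] * n
    prev_c = [0.0] * n            # convention: c_0 = 0
    learner_cost = 0.0
    cum_cost = [0.0] * n          # cumulative alternating cost per action
    for _ in range(T):
        s = sum(weights)
        x = [w / s for w in weights]                 # learner's simplex point
        c = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # adversary's cost vector
        eff = [c[i] + prev_c[i] for i in range(n)]   # alternating cost c_t + c_{t-1}
        learner_cost += sum(eff[i] * x[i] for i in range(n))
        for i in range(n):
            cum_cost[i] += eff[i]
            weights[i] *= math.exp(-eta * eff[i])    # multiplicative-weights update
        prev_c = c
    best_fixed = min(cum_cost)    # best single action in hindsight
    return learner_cost - best_fixed
```

Against a non-adaptive random adversary, this baseline already keeps regret sublinear in $T$; the point of the paper is that the alternating structure allows specialized algorithms to do strictly better than the standard $\Omega(\sqrt{T})$ barrier.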