Timezone: »
Consider a player that in each of T rounds chooses one of K arms. An adversary chooses the cost of each arm in a bounded interval, and a sequence of feedback delays \left{ d{t}\right} that are unknown to the player. After picking arm a{t} at round t, the player receives the cost of playing this arm d{t} rounds later. In cases where t+d{t}>T, this feedback is simply missing. We prove that the EXP3 algorithm (that uses the delayed feedback upon its arrival) achieves a regret of O\left(\sqrt{\ln K\left(KT+\sum{t=1}^{T}d{t}\right)}\right). For the case where \sum{t=1}^{T}d{t} and T are unknown, we propose a novel doubling trick for online learning with delays and prove that this adaptive EXP3 achieves a regret of O\left(\sqrt{\ln K\left(K^{2}T+\sum{t=1}^{T}d{t}\right)}\right). We then consider a two player zero-sum game where players experience asynchronous delays. We show that even when the delays are large enough such that players no longer enjoy the “no-regret property”, (e.g., where d{t}=O\left(t\log t\right)) the ergodic average of the strategy profile still converges to the set of Nash equilibria of the game. The result is made possible by choosing an adaptive step size \eta{t} that is not summable but is square summable, and proving a “weighted regret bound” for this general case.
Author Information
Ilai Bistritz (Stanford)
Zhengyuan Zhou (Stanford University)
Xi Chen (New York University)
Nicholas Bambos (Stanford University)
Jose Blanchet (Stanford University)
More from the Same Authors
-
2022 : Minimax Optimal Kernel Operator Learning via Multilevel Training »
Jikai Jin · Yiping Lu · Jose Blanchet · Lexing Ying -
2022 : Synthetic Principle Component Design: Fast Covariate Balancing with Synthetic Controls »
Yiping Lu · Jiajin Li · Lexing Ying · Jose Blanchet -
2023 Poster: Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage »
Jose Blanchet · Miao Lu · Tong Zhang · Han Zhong -
2023 Poster: When can Regression-Adjusted Control Variate Help? Rare Events, Sobolev Embedding and Minimax Optimality »
Jose Blanchet · Haoxuan Chen · Yiping Lu · Lexing Ying -
2023 Poster: Payoff-based Learning with Matrix Multiplicative Weights in Quantum Games »
Kyriakos Lotidis · Panayotis Mertikopoulos · Nicholas Bambos · Jose Blanchet -
2023 Poster: Doubly Smoothed GDA for Constrained Nonconvex-Nonconcave Minimax Optimization »
Taoli Zheng · Linglingzhi Zhu · Anthony Man-Cho So · Jose Blanchet · Jiajin Li -
2022 Poster: Sobolev Acceleration and Statistical Optimality for Learning Elliptic Equations via Gradient Descent »
Yiping Lu · Jose Blanchet · Lexing Ying -
2022 Poster: Tikhonov Regularization is Optimal Transport Robust under Martingale Constraints »
Jiajin Li · Sirui Lin · Jose Blanchet · Viet Anh Nguyen -
2022 Poster: Queue Up Your Regrets: Achieving the Dynamic Capacity Region of Multiplayer Bandits »
Ilai Bistritz · Nicholas Bambos -
2021 : Statistical Numerical PDE : Fast Rate, Neural Scaling Law and When it’s Optimal »
Yiping Lu · Haoxuan Chen · Jianfeng Lu · Lexing Ying · Jose Blanchet -
2021 Poster: Adversarial Regression with Doubly Non-negative Weighting Matrices »
Tam Le · Truyen Nguyen · Makoto Yamada · Jose Blanchet · Viet Anh Nguyen -
2021 Poster: Modified Frank Wolfe in Probability Space »
Carson Kent · Jiajin Li · Jose Blanchet · Peter W Glynn -
2020 Poster: Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast Algorithm »
Tianyi Lin · Nhat Ho · Xi Chen · Marco Cuturi · Michael Jordan -
2020 Poster: Distributionally Robust Parametric Maximum Likelihood Estimation »
Viet Anh Nguyen · Xuhui Zhang · Jose Blanchet · Angelos Georghiou -
2020 Poster: Optimistic Dual Extrapolation for Coherent Non-monotone Variational Inequalities »
Chaobing Song · Zhengyuan Zhou · Yichao Zhou · Yong Jiang · Yi Ma -
2020 Poster: Distributed Distillation for On-Device Learning »
Ilai Bistritz · Ariana Mann · Nicholas Bambos -
2020 Poster: Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality »
Nian Si · Jose Blanchet · Soumyadip Ghosh · Mark Squillante -
2020 Poster: Cooperative Multi-player Bandit Optimization »
Ilai Bistritz · Nicholas Bambos -
2020 Spotlight: Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality »
Nian Si · Jose Blanchet · Soumyadip Ghosh · Mark Squillante -
2020 Poster: Distributionally Robust Local Non-parametric Conditional Estimation »
Viet Anh Nguyen · Fan Zhang · Jose Blanchet · Erick Delage · Yinyu Ye -
2019 Poster: Learning in Generalized Linear Contextual Bandits with Stochastic Delays »
Zhengyuan Zhou · Renyuan Xu · Jose Blanchet -
2019 Spotlight: Learning in Generalized Linear Contextual Bandits with Stochastic Delays »
Zhengyuan Zhou · Renyuan Xu · Jose Blanchet -
2019 Poster: Multivariate Distributionally Robust Convex Regression under Absolute Error Loss »
Jose Blanchet · Peter W Glynn · Jun Yan · Zhengqing Zhou -
2019 Poster: Semi-Parametric Dynamic Contextual Pricing »
Virag Shah · Ramesh Johari · Jose Blanchet -
2018 Poster: Distributed Multi-Player Bandits - a Game of Thrones Approach »
Ilai Bistritz · Amir Leshem -
2018 Poster: Learning in Games with Lossy Feedback »
Zhengyuan Zhou · Panayotis Mertikopoulos · Susan Athey · Nicholas Bambos · Peter W Glynn · Yinyu Ye -
2017 Poster: Countering Feedback Delays in Multi-Agent Learning »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter W Glynn · Claire Tomlin -
2017 Poster: Stochastic Mirror Descent in Variationally Coherent Optimization Problems »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Stephen Boyd · Peter W Glynn