Distributed Online and Bandit Convex Optimization
Kumar Kshitij Patel · Aadirupa Saha · Nati Srebro · Lingxiao Wang
Event URL: https://openreview.net/forum?id=KKfjOEvDwQ

We study the problems of distributed online and bandit convex optimization against an adaptive adversary. The goal is to minimize the average regret across M machines that communicate intermittently, over R rounds of communication. We show that collaboration is not beneficial when the machines can access gradients of the cost functions at each time step, i.e., when they have first-order feedback: in this setting, simple non-collaborative algorithms are min-max optimal. This contrasts with the provable benefit of collaboration when optimizing against a stochastic adversary, which samples the cost functions from fixed distributions. To identify a benefit of collaboration, we consider the harder setting where the machines can only access values of their cost functions, i.e., they have bandit feedback. Here, we identify a high-dimensional regime where collaboration is beneficial and may even lead to a linear speedup in the number of machines. Our results bridge the gap between distributed online optimization against stochastic and adaptive adversaries.
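To make the non-collaborative baseline concrete, the following is a minimal toy simulation (not the paper's construction or proof): each of M machines independently runs online gradient descent with first-order feedback on simple quadratic losses, with no communication at all, and we measure the average regret against the best fixed point in hindsight. The quadratic losses, step size, and horizon are all illustrative assumptions.

```python
import numpy as np

def non_collaborative_ogd(M=4, T=200, d=5, eta=0.05, seed=0):
    """Toy illustration: M machines each run online gradient descent
    independently (zero communication) on losses f_t(x) = ||x - c_t||^2.
    The abstract's claim is that under first-order feedback, such
    non-collaborative algorithms are already min-max optimal; this
    sketch only computes the average regret in a simple instance."""
    rng = np.random.default_rng(seed)
    xs = np.zeros((M, d))                       # each machine's iterate
    centers = rng.uniform(-1, 1, size=(T, d))   # centers of the quadratic costs
    total_loss = 0.0
    for t in range(T):
        c = centers[t]
        for m in range(M):
            total_loss += np.sum((xs[m] - c) ** 2)
            grad = 2 * (xs[m] - c)              # first-order (gradient) feedback
            xs[m] -= eta * grad                 # projection step omitted for simplicity
    # comparator: best fixed point in hindsight (the mean of the centers)
    x_star = centers.mean(axis=0)
    comparator_loss = np.sum((centers - x_star) ** 2)
    # average regret per machine, normalized by the horizon T
    return (total_loss / M - comparator_loss) / T

avg_regret_per_step = non_collaborative_ogd()
```

With bandit feedback the machines would instead only observe loss values and build gradient estimates (e.g., from randomized function evaluations), which is where the abstract identifies a high-dimensional regime in which communication helps.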

Author Information

Kumar Kshitij Patel (Toyota Technological Institute at Chicago)
Aadirupa Saha (Microsoft Research)

Aadirupa Saha is a PhD student in the Department of Computer Science and Automation (CSA), Indian Institute of Science (IISc), Bangalore, and was a research intern at Google, Mountain View, CA (June-Sept 2019). Her research interests broadly lie in machine learning, statistical learning theory, and optimization. Her current research focuses on decision-making under uncertainty from sequential data, reinforcement learning, and preference-based rank aggregation problems.

Nati Srebro (TTI-Chicago)
Lingxiao Wang (Toyota Technological Institute Chicago)