Comparator-Adaptive Convex Bandits
Dirk van der Hoeven · Ashok Cutkosky · Haipeng Luo

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #1020

We study bandit convex optimization methods that adapt to the norm of the comparator, a question previously studied only in the full-information setting. Specifically, we develop convex bandit algorithms whose regret bounds are small whenever the norm of the comparator is small. We first use techniques from the full-information setting to develop comparator-adaptive algorithms for linear bandits. We then extend these ideas to convex bandits with Lipschitz or smooth loss functions, using a new single-point gradient estimator and carefully designed surrogate losses.
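
For readers unfamiliar with single-point gradient estimation, the sketch below shows the classical one-point estimator from the bandit convex optimization literature (in the style of Flaxman, Kalai, and McMahan). It is not the new estimator introduced in this paper, whose details the abstract does not give. The function names, the step size `delta`, and the linear test loss are all assumptions chosen for illustration. The idea is that a single loss value, queried at a randomly perturbed point, yields an unbiased estimate of the gradient of a smoothed version of the loss.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a point uniformly at random from the unit sphere in R^d."""
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in z))
        if norm > 0.0:
            return [v / norm for v in z]


def one_point_gradient_estimate(loss, x, delta, rng):
    """Classical one-point estimator: (d / delta) * loss(x + delta * u) * u.

    Only one scalar loss value is observed, which matches bandit feedback.
    Illustrative sketch only; not the estimator proposed in the paper.
    """
    d = len(x)
    u = sample_unit_sphere(d, rng)
    query = [xi + delta * ui for xi, ui in zip(x, u)]
    value = loss(query)  # the single scalar observation
    scale = d / delta * value
    return [scale * ui for ui in u]


def average_estimate(loss, x, delta, n_samples, seed=0):
    """Average many independent one-point estimates at the same point x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        g = one_point_gradient_estimate(loss, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]


if __name__ == "__main__":
    c = [1.0, -2.0, 0.5]  # hypothetical linear loss: loss(p) = <c, p>

    def linear_loss(p):
        return sum(ci * pi for ci, pi in zip(c, p))

    # For a linear loss the estimator is unbiased for c itself.
    print(average_estimate(linear_loss, [0.0, 0.0, 0.0], 0.5, 20000))
```

Each individual estimate has norm proportional to `d / delta` times the observed loss, so smaller `delta` reduces smoothing bias for nonlinear losses but increases variance. That trade-off is one reason bandit regret bounds are harder to obtain than full-information ones.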

Author Information

Dirk van der Hoeven (Leiden University)
Ashok Cutkosky (Boston University)
Haipeng Luo (University of Southern California)

