Comparator-Adaptive Convex Bandits
Dirk van der Hoeven · Ashok Cutkosky · Haipeng Luo

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #1020

We study bandit convex optimization methods that adapt to the norm of the comparator, a topic that has only been studied before for its full-information counterpart. Specifically, we develop convex bandit algorithms with regret bounds that are small whenever the norm of the comparator is small. We first use techniques from the full-information setting to develop comparator-adaptive algorithms for linear bandits. Then, we extend the ideas to convex bandits with Lipschitz or smooth loss functions, using a new single-point gradient estimator and carefully designed surrogate losses.
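For readers unfamiliar with single-point gradient estimation, the sketch below illustrates the general idea using the classic sphere-sampling estimator from the bandit convex optimization literature. It is not the new estimator this abstract refers to, whose details are not given here. All names (`sample_unit_sphere`, `single_point_gradient_estimate`, `average_estimate`) and the parameter `delta` are assumptions made for illustration. The learner queries the loss at one perturbed point and rescales the observed value into a gradient estimate.

```python
import math
import random


def sample_unit_sphere(d, rng):
    """Draw a point uniformly from the unit sphere in R^d by normalizing Gaussians."""
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in z))
        if norm > 0.0:
            return [v / norm for v in z]


def single_point_gradient_estimate(loss, x, delta, rng):
    """Classic single-point estimator: (d / delta) * loss(x + delta * u) * u.

    Only one loss value is observed (bandit feedback). In expectation the
    result equals the gradient of a smoothed version of the loss at x.
    """
    d = len(x)
    u = sample_unit_sphere(d, rng)
    query = [xi + delta * ui for xi, ui in zip(x, u)]
    value = loss(query)  # the single scalar the learner gets to see
    scale = d / delta * value
    return [scale * ui for ui in u]


def average_estimate(loss, x, delta, n_samples, seed=0):
    """Average many independent estimates to illustrate unbiasedness (hypothetical helper)."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        g = single_point_gradient_estimate(loss, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]


if __name__ == "__main__":
    # Linear loss <a, x>: its smoothed gradient is exactly a, so the
    # averaged estimate should approach a as the sample count grows.
    a = [1.0, -2.0, 0.5]
    linear_loss = lambda p: sum(ai * pi for ai, pi in zip(a, p))
    print(average_estimate(linear_loss, [0.0, 0.0, 0.0], 0.5, 20000))
```

A single estimate is very noisy, since its magnitude scales with the dimension divided by `delta`; controlling that variance is one reason bandit algorithms pair such estimators with carefully chosen surrogate losses.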

Author Information

Dirk van der Hoeven (Leiden University)
Ashok Cutkosky (Boston University)
Haipeng Luo (University of Southern California)
