Timezone: »

Thresholding Bandit with Optimal Aggregate Regret
Chao Tao · Saúl Blanco · Jian Peng · Yuan Zhou

Tue Dec 10 05:30 PM -- 07:30 PM (PST) @ East Exhibition Hall B + C #22
We consider the thresholding bandit problem, whose goal is to find arms of mean rewards above a given threshold $\theta$, with a fixed budget of $T$ trials. We introduce LSA, a new, simple and anytime algorithm that aims to minimize the aggregate regret (or the expected number of mis-classified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results to demonstrate the algorithm's superior performance over existing algorithms under a variety of different scenarios.

Author Information

Chao Tao (Indiana University Bloomington)
Saúl Blanco (Indiana University)
Jian Peng (University of Illinois at Urbana-Champaign)
Yuan Zhou (UIUC)

More from the Same Authors