Timezone: »

Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
Lilian Besson · Emilie Kaufmann · Odalric-Ambrym Maillard · Julien Seznec

Tue Nov 29 09:00 AM -- 11:00 AM (PST) @ Hall J #1001
We introduce GLRklUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, klUCB, with an efficient, parameter-free, change-point detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLRklUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a $O(\sqrt{TA\Upsilon_T\log(T)})$ regret in $T$ rounds on some ``easy'' instances in which there is sufficient delay between two change-points, where $A$ is the number of arms and $\Upsilon_T$ the number of change-points, without prior knowledge of $\Upsilon_T$. In contrast with recently proposed algorithms that are agnostic to $\Upsilon_T$, we perform a numerical study showing that GLRklUCB is also very efficient in practice, beyond easy instances.

Author Information

Lilian Besson
Emilie Kaufmann (CNRS)
Odalric-Ambrym Maillard (INRIA Lille Nord Europe)
Julien Seznec

More from the Same Authors