NeurIPS Poster Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits

Poster

Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits

Lilian Besson · Emilie Kaufmann · Odalric-Ambrym Maillard · Julien Seznec

Hall J (level 1) #1001

[ Abstract ]

[Paper PDF]

Abstract: We introduce GLRklUCB, a novel algorithm for the piecewise iid non-stationary bandit problem with bounded rewards. This algorithm combines an efficient bandit algorithm, klUCB, with an efficient, parameter-free, change-point detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. Unlike previous non-stationary bandit algorithms using a change-point detector, GLRklUCB does not need to be calibrated based on prior knowledge on the arms' means. We prove that this algorithm can attain a

O (\sqrt{T A Υ_{T} \log (T)})

$O(\sqrt{TA\Upsilon_T\log(T)})$ regret in

T

$T$ rounds on some

easy'' instances in which there is sufficient delay between two change-points, where

A

$A$ is the number of arms and

Υ_{T}

$\Upsilon_T$ the number of change-points, without prior knowledge of

Υ_{T}

$\Upsilon_T$ . In contrast with recently proposed algorithms that are agnostic to

Υ_{T}

$\Upsilon_T$ , we perform a numerical study showing that GLRklUCB is also very efficient in practice, beyond easy instances.

Chat is not available.