Skip to yearly menu bar Skip to main content

Workshop: Heavy Tails in ML: Structure, Stability, Dynamics

Towards Fully Adaptive Regret Minimization in Heavy-Tailed Bandits

Gianmarco Genalti · Lupo Marsigli · Nicola Gatti · Alberto Maria Metelli

Keywords: [ Adaptive ] [ bandits ] [ heavy tail ] [ Multi-armed Bandits ]

Abstract: Heavy-tailed distributions naturally arise in many settings, from finance to telecommunications. While regret minimization under sub-Gaussian or bounded support rewards has been widely studied, learning on heavy-tailed distributions only gained popularity over the last decade.In the stochastic heavy-tailed bandit problem, an agent learns under the assumption that the distributions have finite moments of maximum order $1+\epsilon$ which are uniformly bounded by a constant $u$, for some $\epsilon \in (0,1]$. To the best of our knowledge, literature only provides algorithms requiring these two quantities as an input.In this paper we study the stochastic adaptive heavy-tailed bandit, a variation of the standard setting where both $\epsilon$ and $u$ are unknown to the agent.We show that adaptivity comes at cost, introducing two lower bounds on the regret of any adaptive algorithm implying an higher regret w.r.t. the standard setting. Finally, we introduce a specific distributional assumption and provide Adaptive Robust UCB, a regret minimization strategy matching the known lower bound for the heavy-tailed MAB problem.

Chat is not available.