NeurIPS Poster Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards

Poster

Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards

Bo Xue · Yimu Wang · Yuanyu Wan · Jinfeng Yi · Lijun Zhang

Great Hall & Hall B1+B2 (level 1) #1902

[ Abstract ]

[ Paper] [ Poster] [ OpenReview]

Abstract: This paper investigates the problem of generalized linear bandits with heavy-tailed rewards, whose

(1 + ϵ)

$(1+\epsilon)$ -th moment is bounded for some

ϵ \in (0, 1]

$\epsilon\in (0,1]$ . Although there exist methods for generalized linear bandits, most of them focus on bounded or sub-Gaussian rewards and are not well-suited for many real-world scenarios, such as financial markets and web-advertising. To address this issue, we propose two novel algorithms based on truncation and mean of medians. These algorithms achieve an almost optimal regret bound of

\tilde{O} (d T^{\frac{1}{1 + ϵ}})

$\widetilde{O}(dT^{\frac{1}{1+\epsilon}})$ , where

d

$d$ is the dimension of contextual information and

T

$T$ is the time horizon. Our truncation-based algorithm supports online learning, distinguishing it from existing truncation-based approaches. Additionally, our mean-of-medians-based algorithm requires only

O (\log T)

$O(\log T)$ rewards and one estimator per epoch, making it more practical. Moreover, our algorithms improve the regret bounds by a logarithmic factor compared to existing algorithms when

ϵ = 1

$\epsilon=1$ . Numerical experimental results confirm the merits of our algorithms.

Chat is not available.