Timezone: »
Poster
Position-based Multiple-play Bandit Problem with Unknown Position Bias
Junpei Komiyama · Junya Honda · Akiko Takeda
Motivated by online advertising, we study a multiple-play multi-armed bandit problem with position bias that involves several slots and the latter slots yield fewer rewards. We characterize the hardness of the problem by deriving an asymptotic regret bound. We propose the Permutation Minimum Empirical Divergence (PMED) algorithm and derive its asymptotically optimal regret bound. Because of the uncertainty of the position bias, the optimal algorithm for such a problem requires non-convex optimizations that are different from usual partial monitoring and semi-bandit problems. We propose a cutting-plane method and related bi-convex relaxation for these optimizations by using auxiliary variables.
Author Information
Junpei Komiyama (The University of Tokyo)
Junya Honda (The University of Tokyo / RIKEN)
Akiko Takeda (The University of Tokyo / RIKEN)
More from the Same Authors
-
2019 Poster: On the Calibration of Multiclass Classification with Rejection »
Chenri Ni · Nontawat Charoenphakdee · Junya Honda · Masashi Sugiyama -
2017 Poster: Trimmed Density Ratio Estimation »
Song Liu · Akiko Takeda · Taiji Suzuki · Kenji Fukumizu -
2015 Poster: Regret Lower Bound and Optimal Algorithm in Finite Stochastic Partial Monitoring »
Junpei Komiyama · Junya Honda · Hiroshi Nakagawa