Timezone: »
Poster
Combinatorial Pure Exploration with Bottleneck Reward Function
Yihan Du · Yuko Kuroki · Wei Chen
In this paper, we study the Combinatorial Pure Exploration problem with the Bottleneck reward function (CPE-B) under the fixed-confidence (FC) and fixed-budget (FB) settings.In CPE-B, given a set of base arms and a collection of subsets of base arms (super arms) following a certain combinatorial constraint, a learner sequentially plays a base arm and observes its random reward, with the objective of finding the optimal super arm with the maximum bottleneck value, defined as the minimum expected reward of the base arms contained in the super arm.CPE-B captures a variety of practical scenarios such as network routing in communication networks, and its unique challenges fall on how to utilize the bottleneck property to save samples and achieve the statistical optimality. None of the existing CPE studies (most of them assume linear rewards) can be adapted to solve such challenges, and thus we develop brand-new techniques to handle them.For the FC setting, we propose novel algorithms with optimal sample complexity for a broad family of instances and establish a matching lower bound to demonstrate the optimality (within a logarithmic factor).For the FB setting, we design an algorithm which achieves the state-of-the-art error probability guarantee and is the first to run efficiently on fixed-budget path instances, compared to existing CPE algorithms. Our experimental results on the top-$k$, path and matching instances validate the empirical superiority of the proposed algorithms over their baselines.
Author Information
Yihan Du (IIIS, Tsinghua University)
Yuko Kuroki (University of Tokyo / RIKEN)
Wei Chen (Microsoft Research)
More from the Same Authors
-
2022 Poster: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2023 Poster: Scalable Fair Influence Maximization »
Xiaobin Rui · Zhixiao Wang · Jiayu Zhao · Lichao Sun · Wei Chen -
2023 Poster: Multi-Fidelity Multi-Armed Bandits Revisited »
Xuchuang Wang · Qingyun Wu · Wei Chen · John C.S. Lui -
2022 Spotlight: Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret »
Jiawei Huang · Li Zhao · Tao Qin · Wei Chen · Nan Jiang · Tie-Yan Liu -
2022 Spotlight: Lightning Talks 4A-1 »
Jiawei Huang · Su Jia · Abdurakhmon Sadiev · Ruomin Huang · Yuanyu Wan · Denizalp Goktas · Jiechao Guan · Andrew Li · Wei-Wei Tu · Li Zhao · Amy Greenwald · Jiawei Huang · Dmitry Kovalev · Yong Liu · Wenjie Liu · Peter Richtarik · Lijun Zhang · Zhiwu Lu · R Ravi · Tao Qin · Wei Chen · Hu Ding · Nan Jiang · Tie-Yan Liu -
2022 Poster: Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms or Independent Arms »
Xutong Liu · Jinhang Zuo · Siwei Wang · Carlee Joe-Wong · John C.S. Lui · Wei Chen -
2021 Poster: Continuous Mean-Covariance Bandits »
Yihan Du · Siwei Wang · Zhixuan Fang · Longbo Huang -
2021 Poster: The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle »
Fang Kong · Yueran Yang · Wei Chen · Shuai Li -
2020 Poster: Online Influence Maximization under Linear Threshold Model »
Shuai Li · Fang Kong · Kejie Tang · Qizhi Li · Wei Chen -
2019 Poster: Adaptive Influence Maximization with Myopic Feedback »
Binghui Peng · Wei Chen -
2018 Poster: Community Exploration: From Offline Optimization to Online Learning »
Xiaowei Chen · Weiran Huang · Wei Chen · John C. S. Lui -
2017 Poster: Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications »
Qinshi Wang · Wei Chen -
2017 Poster: Influence Maximization with $\varepsilon$-Almost Submodular Threshold Functions »
Qiang Li · Wei Chen · Institute of Computing Xiaoming Sun · Institute of Computing Jialin Zhang -
2016 Poster: Combinatorial Multi-Armed Bandit with General Reward Functions »
Wei Chen · Wei Hu · Fu Li · Jian Li · Yu Liu · Pinyan Lu -
2015 Poster: Stochastic Online Greedy Learning with Semi-bandit Feedbacks »
Tian Lin · Jian Li · Wei Chen -
2014 Poster: Combinatorial Pure Exploration of Multi-Armed Bandits »
Shouyuan Chen · Tian Lin · Irwin King · Michael R Lyu · Wei Chen -
2014 Oral: Combinatorial Pure Exploration of Multi-Armed Bandits »
Shouyuan Chen · Tian Lin · Irwin King · Michael R Lyu · Wei Chen