Combinatorial Pure Exploration of Multi-Armed Bandits
Shouyuan Chen · Tian Lin · Irwin King · Michael R Lyu · Wei Chen

Wed Dec 10 08:00 AM -- 08:20 AM (PST) @ Level 2, room 210
We study the \emph{combinatorial pure exploration (CPE)} problem in the stochastic multi-armed bandit setting, where a learner explores a set of arms with the objective of identifying the optimal member of a \emph{decision class}: a collection of subsets of arms with a certain combinatorial structure, such as size-$K$ subsets, matchings, spanning trees, or paths. The CPE problem represents a rich class of pure exploration tasks that covers not only many existing models but also novel cases where the object of interest has a non-trivial combinatorial structure. In this paper, we provide a series of results for the general CPE problem. We present general learning algorithms that work for all decision classes admitting offline maximization oracles, in both the fixed-confidence and fixed-budget settings. We prove problem-dependent upper bounds for our algorithms. Our analysis exploits the combinatorial structures of the decision classes and introduces a new analytic tool. We also establish a general problem-dependent lower bound for the CPE problem. Our results show that the proposed algorithms achieve the optimal sample complexity (within logarithmic factors) for many decision classes. In addition, applying our results to the problems of top-$K$ arm identification and multi-bandit best arm identification, we recover the best available upper bounds up to constant factors and partially resolve a conjecture on the lower bounds.
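To make the notion of an offline maximization oracle concrete, here is a minimal Python sketch. It assumes arms are indexed 0..n-1 and that the oracle receives one real-valued weight per arm (for example, an estimated mean reward) and returns the member of the decision class with the largest total weight. The function names `top_k_oracle` and `brute_force_oracle` and the example weights are hypothetical, chosen for illustration only; they are not taken from the paper and this is not the paper's learning algorithm.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Sequence


def top_k_oracle(weights: Sequence[float], k: int) -> FrozenSet[int]:
    """Hypothetical oracle for the decision class of all size-k subsets.

    Returns the indices of the k arms with the largest weights.
    """
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return frozenset(ranked[:k])


def brute_force_oracle(
    weights: Sequence[float], decision_class: Iterable[FrozenSet[int]]
) -> FrozenSet[int]:
    """Hypothetical oracle for an explicitly enumerated decision class.

    Scans every member and returns one with the maximum total weight.
    Only practical when the class is small enough to list.
    """
    return max(decision_class, key=lambda s: sum(weights[i] for i in s))


if __name__ == "__main__":
    # Assumed example weights for four arms.
    weights = [0.2, 0.9, 0.5, 0.7]

    # Size-2 subsets, solved directly by sorting.
    print(sorted(top_k_oracle(weights, 2)))  # prints [1, 3]

    # The same decision class, enumerated explicitly.
    size_two = [frozenset(c) for c in combinations(range(len(weights)), 2)]
    print(sorted(brute_force_oracle(weights, size_two)))  # prints [1, 3]
```

For structured classes such as matchings or spanning trees, the same interface would presumably be served by a standard combinatorial solver (for instance, a maximum-weight spanning tree routine) rather than by enumeration.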

Author Information

Shouyuan Chen (Chinese University of Hong Kong)
Tian Lin (Tsinghua University)
Irwin King (Chinese University of Hong Kong)
Michael R Lyu (Chinese University of Hong Kong)
Wei Chen (Microsoft Research)
