Poster
Asymptotically Optimal Quantile Pure Exploration for Infinite-Armed Bandits
Evelyn Xiao-Yue Gong · Mark Sellke
Great Hall & Hall B1+B2 (level 1) #1818
Abstract:
We study pure exploration with infinitely many bandit arms generated i.i.d. from an unknown distribution. Our goal is to efficiently select a single high-quality arm whose average reward is, with probability 1 − δ, within ε of being among the top η-fraction of arms; this is a natural adaptation of the classical PAC guarantee for infinite action sets. We consider both the fixed-confidence and fixed-budget settings, aiming respectively for optimal expected and fixed sample complexity. For fixed confidence, we give an algorithm with expected sample complexity O(log(1/η) · log(1/δ) / (ηε²)). This is optimal except for the log(1/η) factor, and the δ-dependence closes a quadratic gap in the literature. For fixed budget, we show the asymptotically optimal sample complexity as δ → 0 is c⁻¹ log(1/δ) · (log log(1/δ))² to leading order; equivalently, the optimal failure probability with exactly N samples decays as exp(−cN/log² N). The value of c depends explicitly on the problem parameters (including the unknown arm distribution) through a certain Fisher information distance. Even the strictly super-linear dependence on log(1/δ) was not known and resolves a question of Grossman-Moshkovitz (FOCS 2015).
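To make the (η, ε, δ)-PAC objective concrete, here is a minimal sketch of the folklore two-phase baseline for this setting, not the poster's algorithm: draw enough fresh arms that one likely lands in the top η-fraction, then pull each enough times to estimate its mean within ε/2. Its total budget scales like log²(1/δ)/(ηε²), i.e., it has the quadratic log(1/δ) dependence that the poster's fixed-confidence result improves to linear. The function name, the Uniform(0,1) arm distribution, and the Bernoulli reward model are all illustrative assumptions, not from the paper.

```python
import math
import random

def naive_quantile_exploration(eta, eps, delta, sample_arm_mean, pull, rng):
    """Folklore two-phase baseline for (eta, eps, delta)-quantile PAC selection.

    Phase 1: draw n = ceil(log(1/delta)/eta) fresh arms, so that with
    probability >= 1 - delta at least one lies in the top eta-fraction.
    Phase 2: pull each arm t = ceil(2*log(2n/delta)/eps^2) times (Hoeffding
    gives eps/2 accuracy per arm) and return the empirically best one.
    Total budget is O(log^2(1/delta) / (eta * eps^2)) -- quadratic in
    log(1/delta), which the poster's algorithm improves to linear.
    """
    n = math.ceil(math.log(1 / delta) / eta)
    t = math.ceil(2 * math.log(2 * n / delta) / eps ** 2)
    arms = [sample_arm_mean(rng) for _ in range(n)]
    estimates = []
    for mu in arms:
        rewards = [pull(mu, rng) for _ in range(t)]
        estimates.append(sum(rewards) / t)
    best = max(range(n), key=lambda i: estimates[i])
    return arms[best], n, t

rng = random.Random(0)
# Hypothetical instance: arm means drawn Uniform(0,1), Bernoulli rewards.
chosen_mean, n, t = naive_quantile_exploration(
    eta=0.1, eps=0.1, delta=0.1,
    sample_arm_mean=lambda r: r.random(),
    pull=lambda mu, r: 1.0 if r.random() < mu else 0.0,
    rng=rng,
)
print(f"candidate arms: {n}, pulls per arm: {t}, chosen true mean: {chosen_mean:.3f}")
```

For Uniform(0,1) arm means, the guarantee asks for a chosen mean of at least the (1 − η)-quantile minus ε, here 0.9 − 0.1 = 0.8, with probability at least 1 − δ.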