## A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian Process Bandits

### Ilija Bogunovic · Zihan Li · Andreas Krause · Jonathan Scarlett

##### Hall J #541

Keywords: [ kernelized bandits ] [ corruption-tolerant ] [ Gaussian process ] [ Bandit optimization ]

[ Abstract ]
[ [
Tue 29 Nov 9 a.m. PST — 11 a.m. PST

Abstract: We consider the sequential optimization of an unknown, continuous, and expensive to evaluate reward function, from noisy and adversarially corrupted observed rewards. When the corruption attacks are subject to a suitable budget $C$ and the function lives in a Reproducing Kernel Hilbert Space (RKHS), the problem can be posed as {\em corrupted Gaussian process (GP) bandit optimization}. We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants. Our algorithm, {\em Robust GP Phased Elimination (RGP-PE)}, successfully balances robustness to corruptions with exploration and exploitation such that its performance degrades minimally in the presence (or absence) of adversarial corruptions. When $T$ is the number of samples and $\gamma_T$ is the maximal information gain, the corruption-dependent term in our regret bound is $O(C \gamma_T^{3/2})$, which is significantly tighter than the existing $O(C \sqrt{T \gamma_T})$ for several commonly-considered kernels. We perform the first empirical study of robustness in the corrupted GP bandit setting, and show that our algorithm is robust against a variety of adversarial attacks.

Chat is not available.