Poster
Stochastic convex optimization with bandit feedback
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin
[
Abstract
]
Abstract:
This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point x∈X. We demonstrate a generalization of the ellipsoid algorithm that incurs O(\poly(d)√T) regret. Since any algorithm has regret at least Ω(√T) on this problem, our algorithm is optimal in terms of the scaling with T.
Live content is unavailable. Log in and register to view live content