NIPS Poster Contextual bandits with surrogate losses: Margin bounds and efficient algorithms

Poster

Contextual bandits with surrogate losses: Margin bounds and efficient algorithms

Dylan Foster · Akshay Krishnamurthy

Room 517 AB #165

Keywords: [ Bandit Algorithms ] [ Learning Theory ] [ Online Learning ]

[ Abstract ]

Abstract: We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning. Using the ramp loss, we derive a new margin-based regret bound in terms of standard sequential complexity measures of a benchmark class of real-valued regression functions. Using the hinge loss, we derive an efficient algorithm with a

\sqrt{d T}

$\sqrt{dT}$ -type mistake bound against benchmark policies induced by

d

$d$ -dimensional regressors. Under realizability assumptions, our results also yield classical regret bounds.

Live content is unavailable. Log in and register to view live content