We consider bandit problems, motivated by applications in online advertising and news story selection, in which the learner must repeatedly select a slate, that is, a subset of size s from K possible actions, and then receives rewards for just the selected actions. The goal is to minimize the regret with respect to total reward of the best slate computed in hindsight. We consider unordered and ordered versions of the problem, and give efficient algorithms which have regret O(sqrt(T)), where the constant depends on the specific nature of the problem. We also consider versions of the problem where we have access to a number of policies which make recommendations for slates in every round, and give algorithms with O(sqrt(T)) regret for competing with the best such policy as well. We make use of the technique of relative entropy projections combined with the usual multiplicative weight update algorithm to obtain our algorithms.
Author Information
Satyen Kale (Google)
Lev Reyzin (Georgia Institute of Technology)
Robert E Schapire (MIcrosoft Research)
Robert Schapire received his ScB in math and computer science from Brown University in 1986, and his SM (1988) and PhD (1991) from MIT under the supervision of Ronald Rivest. After a short postdoc at Harvard, he joined the technical staff at AT&T Labs (formerly AT&T Bell Laboratories) in 1991 where he remained for eleven years. At the end of 2002, he became a Professor of Computer Science at Princeton University. His awards include the 1991 ACM Doctoral Dissertation Award, the 2003 GĂ¶del Prize and the 2004 Kanelakkis Theory and Practice Award (both of the last two with Yoav Freund). His main research interest is in theoretical and applied machine learning.
More from the Same Authors

2019 Poster: Breaking the Glass Ceiling for EmbeddingBased Classifiers for Large Output Spaces »
Chuan Guo · Ali Mousavi · Xiang Wu · Daniel HoltmannRice · Satyen Kale · Sashank Reddi · Sanjiv Kumar 
2019 Poster: Hypothesis Set Stability and Generalization »
Dylan Foster · Spencer Greenberg · Satyen Kale · Haipeng Luo · Mehryar Mohri · Karthik Sridharan 
2018 Poster: Online Learning of Quantum States »
Scott Aaronson · Xinyi Chen · Elad Hazan · Satyen Kale · Ashwin Nayak 
2018 Poster: Adaptive Methods for Nonconvex Optimization »
Manzil Zaheer · Sashank Reddi · Devendra Sachan · Satyen Kale · Sanjiv Kumar 
2017 Poster: ParameterFree Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan 
2017 Spotlight: ParameterFree Online Learning via Model Selection »
Dylan J Foster · Satyen Kale · Mehryar Mohri · Karthik Sridharan 
2016 Poster: Hardness of Online Sleeping Combinatorial Optimization Problems »
Satyen Kale · Chansoo Lee · David Pal 
2015 Poster: Online Gradient Boosting »
Alina Beygelzimer · Elad Hazan · Satyen Kale · Haipeng Luo 
2014 Workshop: NIPS Workshop on Transactional Machine Learning and ECommerce »
David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie 
2014 Poster: A DriftingGames Analysis for Online Learning and Applications to Boosting »
Haipeng Luo · Robert E Schapire 
2013 Workshop: Large Scale Matrix Analysis and Inference »
Reza Zadeh · Gunnar Carlsson · Michael Mahoney · Manfred K. Warmuth · Wouter M Koolen · Nati Srebro · Satyen Kale · Malik MagdonIsmail · Ashish Goel · Matei A Zaharia · David Woodruff · Ioannis Koutis · Benjamin Recht 
2013 Poster: Adaptive Market Making via Online Learning »
Jacob D Abernethy · Satyen Kale 
2013 Oral: Adaptive Market Making via Online Learning »
Jacob D Abernethy · Satyen Kale 
2011 Poster: Newtron: an Efficient Bandit algorithm for Online Multiclass Prediction »
Elad Hazan · Satyen Kale 
2010 Poster: A Reduction from Apprenticeship Learning to Classification »
Umar Syed · Robert E Schapire 
2010 Oral: A Theory of Multiclass Boosting »
Indraneel Mukherjee · Robert E Schapire 
2010 Poster: A Theory of Multiclass Boosting »
Indraneel Mukherjee · Robert E Schapire 
2009 Poster: On Stochastic and Worstcase Models for Investing »
Elad Hazan · Satyen Kale 
2009 Oral: On Stochastic and Worstcase Models for Investing »
Elad Hazan · Satyen Kale 
2009 Poster: Beyond Convexity: Online Submodular Minimization »
Elad Hazan · Satyen Kale 
2007 Oral: A Multiplicative Weights Algorithm for Apprenticeship Learning »
Umar Syed · Robert E Schapire 
2007 Oral: FilterBoost: Regression and Classification on Large Datasets »
Joseph K Bradley · Robert E Schapire 
2007 Poster: FilterBoost: Regression and Classification on Large Datasets »
Joseph K Bradley · Robert E Schapire 
2007 Poster: A Multiplicative Weights Algorithm for Apprenticeship Learning »
Umar Syed · Robert E Schapire 
2007 Poster: Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria »
Elad Hazan · Satyen Kale 
2007 Tutorial: Theory and Applications of Boosting »
Robert E Schapire