Poster
Stochastic convex optimization with bandit feedback
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin
This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in X$. We demonstrate a generalization of the ellipsoid algorithm that incurs $O(\operatorname{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
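For context, regret in this setting is usually measured as the cumulative excess function value of the queried points over the minimizer. A sketch of that standard definition, where the query sequence $x_1, \dots, x_T$ and the minimizer $x^*$ are notation assumed here rather than taken from the abstract:

$$
R_T = \sum_{t=1}^{T} \bigl( f(x_t) - f(x^*) \bigr), \qquad x^* \in \arg\min_{x \in X} f(x).
$$

Under this reading, a regret bound of order $\sqrt{T}$ means the average excess cost per query, $R_T / T$, shrinks at the rate $1/\sqrt{T}$, up to the polynomial factor in the dimension $d$.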
Author Information
Alekh Agarwal (Google Research)
Dean P Foster (University of Pennsylvania)
Daniel Hsu (Columbia University)
See <https://www.cs.columbia.edu/~djhsu/>
Sham M Kakade (Harvard University & Amazon)
Sasha Rakhlin (University of Pennsylvania)
More from the Same Authors
- 2020 : Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics
  Bo Cowgill · Fabrizio Dell'Acqua · Augustin Chaintreau · Nakul Verma · Samuel Deng · Daniel Hsu
- 2021 Spotlight: Bayesian decision-making under misspecified priors with applications to meta-learning
  Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire
- 2022 : Provable Benefits of Representational Transfer in Reinforcement Learning
  Alekh Agarwal · Yuda Song · Kaiwen Wang · Mengdi Wang · Wen Sun · Xuezhou Zhang
- 2022 Poster: On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
  Jinglin Chen · Aditya Modi · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal
- 2022 Poster: Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity
  Alekh Agarwal · Tong Zhang
- 2022 Poster: Masked Prediction: A Parameter Identifiability View
  Bingbin Liu · Daniel Hsu · Pradeep Ravikumar · Andrej Risteski
- 2021 Poster: Bellman-consistent Pessimism for Offline Reinforcement Learning
  Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal
- 2021 Poster: Support vector machines and linear regression coincide with very high-dimensional features
  Navid Ardeshir · Clayton Sanford · Daniel Hsu
- 2021 Oral: Bellman-consistent Pessimism for Offline Reinforcement Learning
  Tengyang Xie · Ching-An Cheng · Nan Jiang · Paul Mineiro · Alekh Agarwal
- 2021 Poster: Bayesian decision-making under misspecified priors with applications to meta-learning
  Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire
- 2020 Tutorial: (Track3) Policy Optimization in Reinforcement Learning Q&A
  Sham M Kakade · Martha White · Nicolas Le Roux
- 2020 Poster: Ensuring Fairness Beyond the Training Data
  Debmalya Mandal · Samuel Deng · Suman Jana · Jeannette Wing · Daniel Hsu
- 2020 Poster: Policy Improvement via Imitation of Multiple Oracles
  Ching-An Cheng · Andrey Kolobov · Alekh Agarwal
- 2020 Spotlight: Policy Improvement via Imitation of Multiple Oracles
  Ching-An Cheng · Andrey Kolobov · Alekh Agarwal
- 2020 Poster: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
  Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun
- 2020 Poster: PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
  Alekh Agarwal · Mikael Henaff · Sham Kakade · Wen Sun
- 2020 Oral: FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
  Alekh Agarwal · Sham Kakade · Akshay Krishnamurthy · Wen Sun
- 2020 Poster: Safe Reinforcement Learning via Curriculum Induction
  Matteo Turchetta · Andrey Kolobov · Shital Shah · Andreas Krause · Alekh Agarwal
- 2020 Poster: Provably Good Batch Reinforcement Learning Without Great Exploration
  Yao Liu · Adith Swaminathan · Alekh Agarwal · Emma Brunskill
- 2020 Spotlight: Safe Reinforcement Learning via Curriculum Induction
  Matteo Turchetta · Andrey Kolobov · Shital Shah · Andreas Krause · Alekh Agarwal
- 2020 Tutorial: (Track3) Policy Optimization in Reinforcement Learning
  Sham M Kakade · Martha White · Nicolas Le Roux
- 2019 Poster: Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting
  Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon
- 2019 Poster: On the number of variables to use in principal component regression
  Ji Xu · Daniel Hsu
- 2018 Poster: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
  Mikhail Belkin · Daniel Hsu · Partha P Mitra
- 2018 Poster: Benefits of over-parameterization with EM
  Ji Xu · Daniel Hsu · Arian Maleki
- 2018 Poster: On Oracle-Efficient PAC RL with Rich Observations
  Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire
- 2018 Poster: Leveraged volume sampling for linear regression
  Michal Derezinski · Manfred K. Warmuth · Daniel Hsu
- 2018 Spotlight: On Oracle-Efficient PAC RL with Rich Observations
  Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire
- 2018 Spotlight: Leveraged volume sampling for linear regression
  Michal Derezinski · Manfred K. Warmuth · Daniel Hsu
- 2017 Workshop: OPT 2017: Optimization for Machine Learning
  Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht
- 2017 Poster: Off-policy evaluation for slate recommendation
  Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni
- 2017 Oral: Off-policy evaluation for slate recommendation
  Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni
- 2017 Poster: Linear regression without correspondence
  Daniel Hsu · Kevin Shi · Xiaorui Sun
- 2016 Workshop: Time Series Workshop
  Oren Anava · Marco Cuturi · Azadeh Khaleghi · Vitaly Kuznetsov · Sasha Rakhlin
- 2016 Demonstration: Project Malmo - Minecraft for AI Research
  Katja Hofmann · Matthew A Johnson · Fernando Diaz · Alekh Agarwal · Tim Hutton · David Bignell · Evelyne Viegas
- 2016 Poster: Efficient Second Order Online Learning by Sketching
  Haipeng Luo · Alekh Agarwal · Nicolò Cesa-Bianchi · John Langford
- 2016 Poster: Contextual semibandits via supervised learning oracles
  Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik
- 2016 Poster: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians
  Ji Xu · Daniel Hsu · Arian Maleki
- 2016 Oral: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians
  Ji Xu · Daniel Hsu · Arian Maleki
- 2016 Poster: PAC Reinforcement Learning with Rich Observations
  Akshay Krishnamurthy · Alekh Agarwal · John Langford
- 2016 Poster: Search Improves Label for Active Learning
  Alina Beygelzimer · Daniel Hsu · John Langford · Chicheng Zhang
- 2015 Workshop: Optimization for Machine Learning (OPT2015)
  Suvrit Sra · Alekh Agarwal · Leon Bottou · Sashank J. Reddi
- 2015 Workshop: Time Series Workshop
  Oren Anava · Azadeh Khaleghi · Vitaly Kuznetsov · Alexander Rakhlin
- 2015 Poster: Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path
  Daniel Hsu · Aryeh Kontorovich · Csaba Szepesvari
- 2015 Poster: Efficient and Parsimonious Agnostic Active Learning
  Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire
- 2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning
  Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire
- 2015 Poster: Adaptive Online Learning
  Dylan Foster · Alexander Rakhlin · Karthik Sridharan
- 2015 Spotlight: Adaptive Online Learning
  Dylan Foster · Alexander Rakhlin · Karthik Sridharan
- 2015 Poster: Fast Convergence of Regularized Learning in Games
  Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire
- 2015 Oral: Fast Convergence of Regularized Learning in Games
  Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire
- 2014 Workshop: Modern Nonparametrics 3: Automating the Learning Pipeline
  Eric Xing · Mladen Kolar · Arthur Gretton · Samory Kpotufe · Han Liu · Zoltán Szabó · Alan Yuille · Andrew G Wilson · Ryan Tibshirani · Sasha Rakhlin · Damian Kozbur · Bharath Sriperumbudur · David Lopez-Paz · Kirthevasan Kandasamy · Francesco Orabona · Andreas Damianou · Wacha Bounliphone · Yanshuai Cao · Arijit Das · Yingzhen Yang · Giulia DeSalvo · Dmitry Storcheus · Roberto Valerio
- 2014 Workshop: OPT2014: Optimization for Machine Learning
  Zaid Harchaoui · Suvrit Sra · Alekh Agarwal · Martin Jaggi · Miro Dudik · Aaditya Ramdas · Jean Lasserre · Yoshua Bengio · Amir Beck
- 2014 Poster: large scale canonical correlation analysis with iterative least squares
  Yichao Lu · Dean P Foster
- 2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions
  Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky
- 2014 Poster: The Large Margin Mechanism for Differentially Private Maximization
  Kamalika Chaudhuri · Daniel Hsu · Shuang Song
- 2013 Workshop: Learning Faster From Easy Data
  Peter Grünwald · Wouter M Koolen · Sasha Rakhlin · Nati Srebro · Alekh Agarwal · Karthik Sridharan · Tim van Erven · Sebastien Bubeck
- 2013 Workshop: Workshop on Spectral Learning
  Byron Boots · Daniel Hsu · Borja Balle
- 2013 Workshop: Perturbations, Optimization, and Statistics
  Tamir Hazan · George Papandreou · Sasha Rakhlin · Danny Tarlow
- 2013 Workshop: OPT2013: Optimization for Machine Learning
  Suvrit Sra · Alekh Agarwal
- 2013 Poster: Optimization, Learning, and Games with Predictable Sequences
  Sasha Rakhlin · Karthik Sridharan
- 2013 Poster: One-shot learning and big data with n=2
  Lee H Dicker · Dean P Foster
- 2013 Poster: New Subsampling Algorithms for Fast Least Squares Regression
  Paramveer Dhillon · Yichao Lu · Dean P Foster · Lyle Ungar
- 2013 Poster: Faster Ridge Regression via the Subsampled Randomized Hadamard Transform
  Yichao Lu · Paramveer Dhillon · Dean P Foster · Lyle Ungar
- 2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity
  Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade
- 2013 Poster: Online Learning of Dynamic Parameters in Social Networks
  Shahin Shahrampour · Sasha Rakhlin · Ali Jadbabaie
- 2013 Poster: Contrastive Learning Using Spectral Methods
  James Y Zou · Daniel Hsu · David Parkes · Ryan Adams
- 2012 Workshop: Optimization for Machine Learning
  Suvrit Sra · Alekh Agarwal
- 2012 Poster: Learning Mixtures of Tree Graphical Models
  Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade
- 2012 Poster: A Spectral Algorithm for Latent Dirichlet Allocation
  Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu
- 2012 Poster: Relax and Randomize : From Value to Algorithms
  Sasha Rakhlin · Ohad Shamir · Karthik Sridharan
- 2012 Poster: Identifiability and Unmixing of Latent Parse Trees
  Percy Liang · Sham M Kakade · Daniel Hsu
- 2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation
  Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu
- 2012 Poster: Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions
  Alekh Agarwal · Sahand N Negahban · Martin J Wainwright
- 2012 Oral: Relax and Randomize : From Value to Algorithms
  Sasha Rakhlin · Ohad Shamir · Karthik Sridharan
- 2011 Workshop: Computational Trade-offs in Statistical Learning
  Alekh Agarwal · Sasha Rakhlin
- 2011 Session: Oral Session 12
  Sasha Rakhlin
- 2011 Poster: Distributed Delayed Stochastic Optimization
  Alekh Agarwal · John Duchi
- 2011 Poster: Lower Bounds for Passive and Active Learning
  Maxim Raginsky · Sasha Rakhlin
- 2011 Spotlight: Lower Bounds for Passive and Active Learning
  Maxim Raginsky · Sasha Rakhlin
- 2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure
  Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang
- 2011 Poster: Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression
  Sham M Kakade · Adam Kalai · Varun Kanade · Ohad Shamir
- 2011 Poster: Multi-View Learning of Word Embeddings via CCA
  Paramveer Dhillon · Dean P Foster · Lyle Ungar
- 2011 Poster: Online Learning: Stochastic, Constrained, and Smoothed Adversaries
  Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari
- 2010 Workshop: Learning on Cores, Clusters, and Clouds
  Alekh Agarwal · Lawrence Cayton · Ofer Dekel · John Duchi · John Langford
- 2010 Spotlight: Learning from Logged Implicit Exploration Data
  Alex Strehl · Lihong Li · John Langford · Sham M Kakade
- 2010 Spotlight: Distributed Dual Averaging In Networks
  John Duchi · Alekh Agarwal · Martin J Wainwright
- 2010 Poster: Distributed Dual Averaging In Networks
  John Duchi · Alekh Agarwal · Martin J Wainwright
- 2010 Poster: Learning from Logged Implicit Exploration Data
  Alexander L Strehl · John Langford · Lihong Li · Sham M Kakade
- 2010 Poster: Random Walk Approach to Regret Minimization
  Hariharan Narayanan · Sasha Rakhlin
- 2010 Oral: Fast global convergence rates of gradient methods for high-dimensional statistical recovery
  Alekh Agarwal · Sahand N Negahban · Martin J Wainwright
- 2010 Oral: Online Learning: Random Averages, Combinatorial Parameters, and Learnability
  Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari
- 2010 Poster: Fast global convergence rates of gradient methods for high-dimensional statistical recovery
  Alekh Agarwal · Sahand N Negahban · Martin J Wainwright
- 2010 Poster: Online Learning: Random Averages, Combinatorial Parameters, and Learnability
  Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari
- 2010 Poster: Agnostic Active Learning Without Constraints
  Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang
- 2009 Poster: Information-theoretic lower bounds on the oracle complexity of convex optimization
  Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright
- 2009 Poster: A Parameter-free Hedging Algorithm
  Kamalika Chaudhuri · Yoav Freund · Daniel Hsu
- 2009 Spotlight: Information-theoretic lower bounds on the oracle complexity of convex optimization
  Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright
- 2009 Poster: Multi-Label Prediction via Compressed Sensing
  Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang
- 2009 Oral: Multi-Label Prediction via Compressed Sensing
  Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang
- 2008 Poster: Mind the Duality Gap: Logarithmic regret algorithms for online optimization
  Shai Shalev-Shwartz · Sham M Kakade
- 2008 Poster: On the Generalization Ability of Online Strongly Convex Programming Algorithms
  Sham M Kakade · Ambuj Tewari
- 2008 Spotlight: On the Generalization Ability of Online Strongly Convex Programming Algorithms
  Sham M Kakade · Ambuj Tewari
- 2008 Spotlight: Mind the Duality Gap: Logarithmic regret algorithms for online optimization
  Shai Shalev-Shwartz · Sham M Kakade
- 2008 Poster: On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization
  Sham M Kakade · Karthik Sridharan · Ambuj Tewari
- 2007 Poster: An Analysis of Inference with the Universum
  Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf
- 2007 Spotlight: An Analysis of Inference with the Universum
  Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf
- 2007 Spotlight: A general agnostic active learning algorithm
  Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni
- 2007 Oral: The Price of Bandit Information for Online Optimization
  Varsha Dani · Thomas P Hayes · Sham M Kakade
- 2007 Oral: Adaptive Online Gradient Descent
  Peter Bartlett · Elad Hazan · Sasha Rakhlin
- 2007 Poster: Adaptive Online Gradient Descent
  Peter Bartlett · Elad Hazan · Sasha Rakhlin
- 2007 Poster: The Price of Bandit Information for Online Optimization
  Varsha Dani · Thomas P Hayes · Sham M Kakade
- 2007 Poster: A general agnostic active learning algorithm
  Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni
- 2006 Poster: Stability of $K$-Means Clustering
  Sasha Rakhlin · Andrea Caponnetto