Since its early days, the field of Machine Learning has focused on
developing computationally tractable algorithms with good learning
guarantees. The vast literature on statistical learning theory has led
to a good understanding of how the predictive performance of different
algorithms improves as a function of the number of training
samples. By the same token, the welldeveloped theories of
optimization and sampling methods have yielded efficient computational
techniques at the core of most modern learning methods. The separate
developements in these fields mean that given an algorithm we have a
sound understanding of its statistical and computational
beahvior. However, there hasn't been much joint study of the
computational and statistical complexities of learinng, as a
consequence of which, little is known about the interaction and
tradeoffs between statistical accuracy and computational
complexity. Indeed a systematic joint treatment can answer some very
interesting questions: what is the best attainable statistical error
given a finite computational budget? What is the best learning method
to use given different computational constraints and desired
statistical yardsticks? Is it the case that simple methods outperform
complex ones in computationally impoverished scenarios?
At its core, the PAC framework aims to study learning through the lens
of computation. However, the thrust is on separating polynomialtime
from computationally intractable algorithms. However all
polynomialtime computations are hardly equivalent, and the difference
between linear vs quadratic dependence on problem parameters can have
a profound effect on the applicability of an algorithm. Understanding
the tradeoffs between statistical accuracy and computational demands
in this situation is of paramount importance.
The need for such a theory is more compelling now than ever before
since we routinely face training corpuses with billions of examples,
and often, an even larger number of parameters to be estimated. The
emergence of web and mechanical turk as sources of training data often
stretches learning algorithms to the point that the bottleneck is no
longer the number of examples, but the amount of computation available
to process the examples. A theory to principally choose from a
multitude of learning methods based on the properties of training
examples as well as the computational resources available would be of
clear interest. Another way to pose the same problem would be to
design algorithms that can take as input a computational constraint
and try to learn the best hypothesis they can based on the available
budget and data.
There have been some works that try to address different facets of the above problem. Researchers working on massive datasets in the CS theory community look at streaming methods that aim to impose constraints on both the computational and storage requirements of the algorithms. Online learning presents one particular way of dealing with a computational budget, by processing as many samples as possible with the computational budget.
There have been some more relevant works in the machine learning community in the last few years. Bottou and Bousquet (2008)
compare the amount of computation needed to attain a certain
statistical error for a few routinely used optimization
algorithms. ShalevShwartz and Srebro (2009) show how stochastic gradient descent applied to SVM optimization can experience an inverse dependence on number of training sample in the regime of large
datasets. In some more recent works, ShalevShwartz and coauthors have also used cryptographic conjectures to establish the computational hardness of certain learning problems. On the algorithmic front, coarsetofine learning provides a nice framework to systematically incorporate computational considerations, using computational cost as a regularization term in the objective of the learning method. Other budgeted algorithms such as budgeted SVMs and budgeted perceptrons try to admit hard budget constraints on the running time and storage of the algorithm.
The goals of our workshop are:
* To draw the attention of machine learning researchers to this rich and emerging area of problems and to establish a community of researchers that are interested in understanding these tradeoffs.
* To define a number of common problems in this area and to encourage future research that is comparable and compatible.
* To expose the learning community to relevant work in fields such as CS theory and convex optimization.
We will call for papers on the following topics:
* Fundamental statistical limits with bounded computation
Tradeoffs between statistical accuracy and computational costs
Algorithms to learn under budget constraints
* Budget constraints on other resources (like bounded memory)
* Computationally aware approaches such as coarsetofine learning
Author Information
Alekh Agarwal (Microsoft Research)
Sasha Rakhlin (University of Pennsylvania)
More from the Same Authors

2019 Poster: Bias Correction of Learned Generative Models using LikelihoodFree Importance Weighting »
Aditya Grover · Jiaming Song · Ashish Kapoor · Kenneth Tran · Alekh Agarwal · Eric Horvitz · Stefano Ermon 
2018 Poster: On OracleEfficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire 
2018 Spotlight: On OracleEfficient PAC RL with Rich Observations »
Christoph Dann · Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire 
2017 Workshop: OPT 2017: Optimization for Machine Learning »
Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht 
2017 Poster: Offpolicy evaluation for slate recommendation »
Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni 
2017 Oral: Offpolicy evaluation for slate recommendation »
Adith Swaminathan · Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik · John Langford · Damien Jose · Imed Zitouni 
2016 Workshop: Time Series Workshop »
Oren Anava · Marco Cuturi · Azadeh Khaleghi · Vitaly Kuznetsov · Sasha Rakhlin 
2016 Demonstration: Project Malmo  Minecraft for AI Research »
Katja Hofmann · Matthew A Johnson · Fernando Diaz · Alekh Agarwal · Tim Hutton · David Bignell · Evelyne Viegas 
2016 Poster: Efficient Second Order Online Learning by Sketching »
Haipeng Luo · Alekh Agarwal · Nicolò CesaBianchi · John Langford 
2016 Poster: Contextual semibandits via supervised learning oracles »
Akshay Krishnamurthy · Alekh Agarwal · Miro Dudik 
2016 Poster: PAC Reinforcement Learning with Rich Observations »
Akshay Krishnamurthy · Alekh Agarwal · John Langford 
2015 Workshop: Optimization for Machine Learning (OPT2015) »
Suvrit Sra · Alekh Agarwal · Leon Bottou · Sashank J. Reddi 
2015 Workshop: Time Series Workshop »
Oren Anava · Azadeh Khaleghi · Vitaly Kuznetsov · Alexander Rakhlin 
2015 Poster: Efficient and Parsimonious Agnostic Active Learning »
TzuKuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire 
2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning »
TzuKuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire 
2015 Poster: Adaptive Online Learning »
Dylan Foster · Alexander Rakhlin · Karthik Sridharan 
2015 Spotlight: Adaptive Online Learning »
Dylan Foster · Alexander Rakhlin · Karthik Sridharan 
2015 Poster: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire 
2015 Oral: Fast Convergence of Regularized Learning in Games »
Vasilis Syrgkanis · Alekh Agarwal · Haipeng Luo · Robert Schapire 
2014 Workshop: Modern Nonparametrics 3: Automating the Learning Pipeline »
Eric Xing · Mladen Kolar · Arthur Gretton · Samory Kpotufe · Han Liu · Zoltán Szabó · Alan L Yuille · Andrew G Wilson · Ryan Tibshirani · Sasha Rakhlin · Damian Kozbur · Bharath Sriperumbudur · David LopezPaz · Kirthevasan Kandasamy · Francesco Orabona · Andreas Damianou · Wacha Bounliphone · Yanshuai Cao · Arijit Das · Yingzhen Yang · Giulia DeSalvo · Dmitry Storcheus · Roberto Valerio 
2014 Workshop: OPT2014: Optimization for Machine Learning »
Zaid Harchaoui · Suvrit Sra · Alekh Agarwal · Martin Jaggi · Miro Dudik · Aaditya Ramdas · Jean B Lasserre · Yoshua Bengio · Amir Beck 
2014 Poster: Scalable Nonlinear Learning with Adaptive Polynomial Expansions »
Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky 
2013 Workshop: Learning Faster From Easy Data »
Peter Grünwald · Wouter M Koolen · Sasha Rakhlin · Nati Srebro · Alekh Agarwal · Karthik Sridharan · Tim van Erven · Sebastien Bubeck 
2013 Workshop: Perturbations, Optimization, and Statistics »
Tamir Hazan · George Papandreou · Sasha Rakhlin · Daniel Tarlow 
2013 Workshop: OPT2013: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal 
2013 Poster: Optimization, Learning, and Games with Predictable Sequences »
Sasha Rakhlin · Karthik Sridharan 
2013 Poster: Online Learning of Dynamic Parameters in Social Networks »
Shahin Shahrampour · Sasha Rakhlin · Ali Jadbabaie 
2012 Workshop: Optimization for Machine Learning »
Suvrit Sra · Alekh Agarwal 
2012 Poster: Relax and Randomize : From Value to Algorithms »
Sasha Rakhlin · Ohad Shamir · Karthik Sridharan 
2012 Poster: Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright 
2012 Oral: Relax and Randomize : From Value to Algorithms »
Sasha Rakhlin · Ohad Shamir · Karthik Sridharan 
2011 Session: Oral Session 12 »
Sasha Rakhlin 
2011 Poster: Distributed Delayed Stochastic Optimization »
Alekh Agarwal · John Duchi 
2011 Poster: Lower Bounds for Passive and Active Learning »
Maxim Raginsky · Sasha Rakhlin 
2011 Poster: Stochastic convex optimization with bandit feedback »
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin 
2011 Spotlight: Lower Bounds for Passive and Active Learning »
Maxim Raginsky · Sasha Rakhlin 
2011 Poster: Online Learning: Stochastic, Constrained, and Smoothed Adversaries »
Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari 
2010 Workshop: Learning on Cores, Clusters, and Clouds »
Alekh Agarwal · Lawrence Cayton · Ofer Dekel · John Duchi · John Langford 
2010 Spotlight: Distributed Dual Averaging In Networks »
John Duchi · Alekh Agarwal · Martin J Wainwright 
2010 Poster: Distributed Dual Averaging In Networks »
John Duchi · Alekh Agarwal · Martin J Wainwright 
2010 Poster: Random Walk Approach to Regret Minimization »
Hariharan Narayanan · Sasha Rakhlin 
2010 Oral: Fast global convergence rates of gradient methods for highdimensional statistical recovery »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright 
2010 Oral: Online Learning: Random Averages, Combinatorial Parameters, and Learnability »
Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari 
2010 Poster: Fast global convergence rates of gradient methods for highdimensional statistical recovery »
Alekh Agarwal · Sahand N Negahban · Martin J Wainwright 
2010 Poster: Online Learning: Random Averages, Combinatorial Parameters, and Learnability »
Sasha Rakhlin · Karthik Sridharan · Ambuj Tewari 
2009 Poster: Informationtheoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright 
2009 Spotlight: Informationtheoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright 
2007 Poster: An Analysis of Inference with the Universum »
Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf 
2007 Spotlight: An Analysis of Inference with the Universum »
Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf 
2007 Oral: Adaptive Online Gradient Descent »
Peter Bartlett · Elad Hazan · Sasha Rakhlin 
2007 Poster: Adaptive Online Gradient Descent »
Peter Bartlett · Elad Hazan · Sasha Rakhlin 
2006 Poster: Stability of $K$Means Clustering »
Sasha Rakhlin · Andrea Caponnetto