Timezone: »
Poster
Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
Mikhail Belkin · Daniel Hsu · Partha P Mitra
Many modern machine learning models are trained to achieve zero or near-zero training error in order to obtain near-optimal (but non-zero) test error. This phenomenon of strong generalization performance for ``overfitted'' / interpolated classifiers appears to be ubiquitous in high-dimensional data, having been observed in deep networks, kernel machines, boosting and random forests. Their performance is consistently robust even when the data contain large amounts of label noise.
Very little theory is available to explain these observations. The vast majority of theoretical analyses of generalization allows for interpolation only when there is little or no label noise. This paper takes a step toward a theoretical foundation for interpolated classifiers by analyzing local interpolating schemes, including geometric simplicial interpolation algorithm and singularly weighted $k$-nearest neighbor schemes. Consistency or near-consistency is proved for these schemes in classification and regression problems. Moreover, the nearest neighbor schemes exhibit optimal rates under some standard statistical assumptions.
Finally, this paper suggests a way to explain the phenomenon of adversarial examples, which are seemingly ubiquitous in modern machine learning, and also discusses some connections to kernel machines and random forests in the interpolated regime.
Author Information
Mikhail Belkin (Ohio State University)
Daniel Hsu (Columbia University)
See <https://www.cs.columbia.edu/~djhsu/>
Partha P Mitra (Cold Spring Harbor Laboratory)
More from the Same Authors
-
2020 : Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics »
Bo Cowgill · Fabrizio Dell'Acqua · Augustin Chaintreau · Nakul Verma · Samuel Deng · Daniel Hsu -
2021 Spotlight: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2023 Poster: Representational Strengths and Limitations of Transformers »
Clayton Sanford · Daniel Hsu · Matus Telgarsky -
2022 Poster: Masked Prediction: A Parameter Identifiability View »
Bingbin Liu · Daniel Hsu · Pradeep Ravikumar · Andrej Risteski -
2021 Poster: Support vector machines and linear regression coincide with very high-dimensional features »
Navid Ardeshir · Clayton Sanford · Daniel Hsu -
2021 Poster: Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures »
Yuan Cao · Quanquan Gu · Mikhail Belkin -
2021 Poster: Multiple Descent: Design Your Own Generalization Curve »
Lin Chen · Yifei Min · Mikhail Belkin · Amin Karbasi -
2021 Poster: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2020 Poster: Ensuring Fairness Beyond the Training Data »
Debmalya Mandal · Samuel Deng · Suman Jana · Jeannette Wing · Daniel Hsu -
2019 : Poster Session »
Jonathan Scarlett · Piotr Indyk · Ali Vakilian · Adrian Weller · Partha P Mitra · Benjamin Aubin · Bruno Loureiro · Florent Krzakala · Lenka Zdeborová · Kristina Monakhova · Joshua Yurtsever · Laura Waller · Hendrik Sommerhoff · Michael Moeller · Rushil Anirudh · Shuang Qiu · Xiaohan Wei · Zhuoran Yang · Jayaraman Thiagarajan · Salman Asif · Michael Gillhofer · Johannes Brandstetter · Sepp Hochreiter · Felix Petersen · Dhruv Patel · Assad Oberai · Akshay Kamath · Sushrut Karmalkar · Eric Price · Ali Ahmed · Zahra Kadkhodaie · Sreyas Mohan · Eero Simoncelli · Carlos Fernandez-Granda · Oscar Leong · Wesam Sakla · Rebecca Willett · Stephan Hoyer · Jascha Sohl-Dickstein · Sam Greydanus · Gauri Jagatap · Chinmay Hegde · Michael Kellman · Jonathan Tamir · Nouamane Laanait · Ousmane Dia · Mirco Ravanelli · Jonathan Binas · Negar Rostamzadeh · Shirin Jalali · Tiantian Fang · Alex Schwing · Sébastien Lachapelle · Philippe Brouillard · Tristan Deleu · Simon Lacoste-Julien · Stella Yu · Arya Mazumdar · Ankit Singh Rawat · Yue Zhao · Jianshu Chen · Xiaoyang Li · Hubert Ramsauer · Gabrio Rizzuti · Nikolaos Mitsakos · Dingzhou Cao · Thomas Strohmer · Yang Li · Pei Peng · Gregory Ongie -
2019 Poster: On the number of variables to use in principal component regression »
Ji Xu · Daniel Hsu -
2018 Poster: Benefits of over-parameterization with EM »
Ji Xu · Daniel Hsu · Arian Maleki -
2018 Poster: Leveraged volume sampling for linear regression »
Michal Derezinski · Manfred K. Warmuth · Daniel Hsu -
2018 Spotlight: Leveraged volume sampling for linear regression »
Michal Derezinski · Manfred K. Warmuth · Daniel Hsu -
2017 Poster: Linear regression without correspondence »
Daniel Hsu · Kevin Shi · Xiaorui Sun -
2017 Poster: Diving into the shallows: a computational perspective on large-scale shallow learning »
SIYUAN MA · Mikhail Belkin -
2017 Spotlight: Diving into the shallows: a computational perspective on large-scale shallow learning »
SIYUAN MA · Mikhail Belkin -
2016 Poster: Graphons, mergeons, and so on! »
Justin Eldridge · Mikhail Belkin · Yusu Wang -
2016 Poster: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians »
Ji Xu · Daniel Hsu · Arian Maleki -
2016 Oral: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians »
Ji Xu · Daniel Hsu · Arian Maleki -
2016 Oral: Graphons, mergeons, and so on! »
Justin Eldridge · Mikhail Belkin · Yusu Wang -
2016 Poster: Clustering with Bregman Divergences: an Asymptotic Analysis »
Chaoyue Liu · Mikhail Belkin -
2016 Poster: Search Improves Label for Active Learning »
Alina Beygelzimer · Daniel Hsu · John Langford · Chicheng Zhang -
2015 Poster: Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path »
Daniel Hsu · Aryeh Kontorovich · Csaba Szepesvari -
2015 Poster: A Pseudo-Euclidean Iteration for Optimal Recovery in Noisy ICA »
James R Voss · Mikhail Belkin · Luis Rademacher -
2015 Poster: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions »
Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky -
2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
Kamalika Chaudhuri · Daniel Hsu · Shuang Song -
2014 Poster: Learning with Fredholm Kernels »
Qichao Que · Mikhail Belkin · Yusu Wang -
2013 Workshop: Workshop on Spectral Learning »
Byron Boots · Daniel Hsu · Borja Balle -
2013 Workshop: Modern Nonparametric Methods in Machine Learning »
Arthur Gretton · Mladen Kolar · Samory Kpotufe · John Lafferty · Han Liu · Bernhard Schölkopf · Alexander Smola · Rob Nowak · Mikhail Belkin · Lorenzo Rosasco · peter bickel · Yue Zhao -
2013 Poster: Inverse Density as an Inverse Problem: the Fredholm Equation Approach »
Qichao Que · Mikhail Belkin -
2013 Poster: Fast Algorithms for Gaussian Noise Invariant Independent Component Analysis »
James R Voss · Luis Rademacher · Mikhail Belkin -
2013 Spotlight: Inverse Density as an Inverse Problem: the Fredholm Equation Approach »
Qichao Que · Mikhail Belkin -
2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity »
Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade -
2013 Poster: Contrastive Learning Using Spectral Methods »
James Y Zou · Daniel Hsu · David Parkes · Ryan Adams -
2012 Poster: Learning Mixtures of Tree Graphical Models »
Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade -
2012 Poster: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2012 Poster: Identifiability and Unmixing of Latent Parse Trees »
Percy Liang · Sham M Kakade · Daniel Hsu -
2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2011 Poster: Data Skeletonization via Reeb Graphs »
Xiaoyin Ge · Issam I Safa · Mikhail Belkin · Yusu Wang -
2011 Poster: Stochastic convex optimization with bandit feedback »
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2010 Poster: Agnostic Active Learning Without Constraints »
Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang -
2009 Poster: A Parameter-free Hedging Algorithm »
Kamalika Chaudhuri · Yoav Freund · Daniel Hsu -
2009 Poster: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Oral: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Poster: Semi-supervised Learning using Sparse Eigenfunction Bases »
Kaushik Sinha · Mikhail Belkin -
2007 Spotlight: A general agnostic active learning algorithm »
Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni -
2007 Poster: A general agnostic active learning algorithm »
Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni -
2007 Spotlight: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2007 Poster: The Value of Labeled and Unlabeled Examples when the Model is Imperfect »
Kaushik Sinha · Mikhail Belkin -
2006 Poster: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts »
Hariharan Narayanan · Mikhail Belkin · Partha Niyogi -
2006 Poster: Convergence of Laplacian Eigenmaps »
Mikhail Belkin · Partha Niyogi