Oral
Uniform convergence may be unable to explain generalization in deep learning
Vaishnavh Nagarajan · J. Zico Kolter
Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic technique of uniform convergence. While it is well known that many of these existing bounds are numerically large, through numerous experiments, we bring to light a more concerning aspect of these bounds: in practice, these bounds can increase with the training dataset size. Guided by our observations, we then present examples of overparameterized linear classifiers and neural networks trained by gradient descent (GD) where uniform convergence provably cannot "explain generalization" -- even if we take into account the implicit bias of GD to the fullest extent possible. More precisely, even if we consider only the set of classifiers output by GD, which have test errors less than some small $\epsilon$ in our settings, we show that applying (two-sided) uniform convergence on this set of classifiers will yield only a vacuous generalization guarantee larger than $1-\epsilon$. Through these findings, we cast doubt on the power of uniform convergence-based generalization bounds to provide a complete picture of why overparameterized deep networks generalize well.
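For readers unfamiliar with the term, the claim in the abstract can be written out in symbols. The following is a simplified sketch rather than the paper's exact definition. The notation is assumed here and does not appear in the abstract: $\mathcal{H}$ is the set of classifiers under consideration, $L_{\mathcal{D}}(h)$ is the test error of $h$ on the distribution $\mathcal{D}$, $\hat{L}_S(h)$ is its empirical error on a training set $S$ of $m$ samples, and $\delta$ is a failure probability.

```latex
% Two-sided uniform convergence (generic textbook form; notation assumed):
% with probability at least 1 - \delta over the draw of S \sim \mathcal{D}^m,
\[
  \sup_{h \in \mathcal{H}}
  \bigl| L_{\mathcal{D}}(h) - \hat{L}_S(h) \bigr|
  \;\le\; \epsilon_{\mathrm{unif}}(m, \delta).
\]
% The abstract's claim, in this notation: even when \mathcal{H} is restricted
% to classifiers output by GD, each satisfying L_{\mathcal{D}}(h) < \epsilon,
% any valid bound of this form is vacuous:
\[
  \epsilon_{\mathrm{unif}}(m, \delta) \;>\; 1 - \epsilon .
\]
```

Read this way, the small test error of every classifier in the set and the near-maximal size of the bound are both in force at once, which is the sense in which the guarantee is vacuous.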
Author Information
Vaishnavh Nagarajan (Carnegie Mellon University)
J. Zico Kolter (Carnegie Mellon University / Bosch Center for AI)
Zico Kolter is an Assistant Professor in the School of Computer Science at Carnegie Mellon University, and also serves as Chief Scientist of AI Research for the Bosch Center for Artificial Intelligence. His work focuses on the intersection of machine learning and optimization, with a large focus on developing more robust, explainable, and rigorous methods in deep learning. In addition, he has worked on a number of application areas, highlighted by work on sustainability and smart energy systems. He is the recipient of the DARPA Young Faculty Award, and best paper awards at KDD, IJCAI, and PESGM.
Related Events (a corresponding poster, oral, or spotlight)
2019 Poster: Uniform convergence may be unable to explain generalization in deep learning »
Tue. Dec 10th 06:45 -- 08:45 PM Room East Exhibition Hall B + C #229
More from the Same Authors
2020 : An adversarially robust approach to security-constrained optimal power flow »
Neeraj Vijay Bedmutha · Priya Donti · J. Zico Kolter -
2022 : Generative Posterior Networks for Approximately Bayesian Epistemic Uncertainty Estimation »
Melrose Roderick · Felix Berkenkamp · Fatemeh Sheikholeslami · J. Zico Kolter -
2022 : Denoised Smoothing with Sample Rejection for Robustifying Pretrained Classifiers »
Fatemeh Sheikholeslami · Wan-Yi Lin · Jan Hendrik Metzen · Huan Zhang · J. Zico Kolter -
2022 : A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games »
Samuel Sokota · Ryan D'Orazio · J. Zico Kolter · Nicolas Loizou · Marc Lanctot · Ioannis Mitliagkas · Noam Brown · Christian Kroer -
2022 : Uncertainty-Driven Exploration for Generalization in Reinforcement Learning »
Yiding Jiang · J. Zico Kolter · Roberta Raileanu -
2022 : Improving Adversarial Robustness via Joint Classification and Multiple Explicit Detection Classes »
Sina Baharlouei · Fatemeh Sheikholeslami · Meisam Razaviyayn · J. Zico Kolter -
2023 Poster: On the Importance of Exploration for Generalization in Reinforcement Learning »
Yiding Jiang · J. Zico Kolter · Roberta Raileanu -
2023 Poster: Deep Equilibrium Based Neural Operators for Steady-State PDEs »
Tanya Marwah · Ashwini Pokle · J. Zico Kolter · Zachary Lipton · Jianfeng Lu · Andrej Risteski -
2023 Poster: Learning with Explanation Constraints »
Rattana Pukdee · Dylan Sam · J. Zico Kolter · Maria-Florina Balcan · Pradeep Ravikumar -
2023 Poster: Permutation Equivariant Neural Functionals »
Allan Zhou · Kaien Yang · Kaylee Burns · Adriano Cardace · Yiding Jiang · Samuel Sokota · J. Zico Kolter · Chelsea Finn -
2023 Poster: One-Step Diffusion Distillation via Deep Equilibrium Models »
Zhengyang Geng · Ashwini Pokle · J. Zico Kolter -
2023 Poster: Neural Functional Transformers »
Allan Zhou · Kaien Yang · Yiding Jiang · Kaylee Burns · Winnie Xu · Samuel Sokota · J. Zico Kolter · Chelsea Finn -
2023 Poster: Provably Bounding Neural Network Preimages »
Christopher Brix · Suhas Kotha · Huan Zhang · J. Zico Kolter · Krishnamurthy Dvijotham -
2023 Poster: Language Models are Weak Learners »
Hariharan Manikandan · Yiding Jiang · J. Zico Kolter -
2023 Workshop: XAI in Action: Past, Present, and Future Applications »
Chhavi Yadav · Michal Moshkovitz · Nave Frost · Suraj Srinivas · Bingqing Chen · Valentyn Boreiko · Himabindu Lakkaraju · J. Zico Kolter · Dotan Di Castro · Kamalika Chaudhuri -
2022 Workshop: Trustworthy and Socially Responsible Machine Learning »
Huan Zhang · Linyi Li · Chaowei Xiao · J. Zico Kolter · Anima Anandkumar · Bo Li -
2022 : Zico Kolter, Adapt like you train: How optimization at training time affects model finetuning and adaptation »
J. Zico Kolter -
2022 Poster: Characterizing Datapoints via Second-Split Forgetting »
Pratyush Maini · Saurabh Garg · Zachary Lipton · J. Zico Kolter -
2022 Poster: Learning Options via Compression »
Yiding Jiang · Evan Liu · Benjamin Eysenbach · J. Zico Kolter · Chelsea Finn -
2022 Poster: Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation »
Zhouxing Shi · Yihan Wang · Huan Zhang · J. Zico Kolter · Cho-Jui Hsieh -
2022 Poster: Test Time Adaptation via Conjugate Pseudo-labels »
Sachin Goyal · Mingjie Sun · Aditi Raghunathan · J. Zico Kolter -
2022 Poster: Deep Equilibrium Approaches to Diffusion Models »
Ashwini Pokle · Zhengyang Geng · J. Zico Kolter -
2022 Poster: Agreement-on-the-line: Predicting the Performance of Neural Networks under Distribution Shift »
Christina Baek · Yiding Jiang · Aditi Raghunathan · J. Zico Kolter -
2022 Poster: General Cutting Planes for Bound-Propagation-Based Neural Network Verification »
Huan Zhang · Shiqi Wang · Kaidi Xu · Linyi Li · Bo Li · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2022 Poster: Path Independent Equilibrium Models Can Better Exploit Test-Time Computation »
Cem Anil · Ashwini Pokle · Kaiqu Liang · Johannes Treutlein · Yuhuai Wu · Shaojie Bai · J. Zico Kolter · Roger Grosse -
2022 Poster: The Pitfalls of Regularization in Off-Policy TD Learning »
Gaurav Manek · J. Zico Kolter -
2021 : Panel B: Safe Learning and Decision Making in Uncertain and Unstructured Environments »
Yisong Yue · J. Zico Kolter · Ivan Dario D Jimenez Rodriguez · Dragos Margineantu · Animesh Garg · Melissa Greeff -
2021 : Enforcing Robustness for Neural Network Policies »
J. Zico Kolter -
2021 Poster: Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Neural Network Robustness Verification »
Shiqi Wang · Huan Zhang · Kaidi Xu · Xue Lin · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2021 Poster: Joint inference and input optimization in equilibrium networks »
Swaminathan Gurumurthy · Shaojie Bai · Zachary Manchester · J. Zico Kolter -
2021 Poster: $(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations »
Zhichun Huang · Shaojie Bai · J. Zico Kolter -
2021 Poster: Boosted CVaR Classification »
Runtian Zhai · Chen Dan · Arun Suggala · J. Zico Kolter · Pradeep Ravikumar -
2021 Poster: Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds »
Yujia Huang · Huan Zhang · Yuanyuan Shi · J. Zico Kolter · Anima Anandkumar -
2021 Poster: Adversarially robust learning for security-constrained optimal power flow »
Priya Donti · Aayushya Agarwal · Neeraj Vijay Bedmutha · Larry Pileggi · J. Zico Kolter -
2021 Poster: Robustness between the worst and average case »
Leslie Rice · Anna Bair · Huan Zhang · J. Zico Kolter -
2021 Poster: Monte Carlo Tree Search With Iteratively Refining State Abstractions »
Samuel Sokota · Caleb Y Ho · Zaheen Ahmad · J. Zico Kolter -
2020 : Invited Talk (Zico Kolter) »
J. Zico Kolter -
2020 Workshop: Machine Learning for Engineering Modeling, Simulation and Design »
Alex Beatson · Priya Donti · Amira Abdel-Rahman · Stephan Hoyer · Rose Yu · J. Zico Kolter · Ryan Adams -
2020 : Keynote by Zico Kolter »
J. Zico Kolter -
2020 Poster: Community detection using fast low-cardinality semidefinite programming »
Po-Wei Wang · J. Zico Kolter -
2020 Poster: Deep Archimedean Copulas »
Chun Kai Ling · Fei Fang · J. Zico Kolter -
2020 Tutorial: (Track3) Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization Q&A »
David Duvenaud · J. Zico Kolter · Matthew Johnson -
2020 Poster: Efficient semidefinite-programming-based inference for binary and multi-class MRFs »
Chirag Pabbaraju · Po-Wei Wang · J. Zico Kolter -
2020 Spotlight: Efficient semidefinite-programming-based inference for binary and multi-class MRFs »
Chirag Pabbaraju · Po-Wei Wang · J. Zico Kolter -
2020 Poster: Multiscale Deep Equilibrium Models »
Shaojie Bai · Vladlen Koltun · J. Zico Kolter -
2020 Poster: Denoised Smoothing: A Provable Defense for Pretrained Classifiers »
Hadi Salman · Mingjie Sun · Greg Yang · Ashish Kapoor · J. Zico Kolter -
2020 Poster: Monotone operator equilibrium networks »
Ezra Winston · J. Zico Kolter -
2020 Spotlight: Monotone operator equilibrium networks »
Ezra Winston · J. Zico Kolter -
2020 Oral: Multiscale Deep Equilibrium Models »
Shaojie Bai · Vladlen Koltun · J. Zico Kolter -
2020 Tutorial: (Track3) Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization »
David Duvenaud · J. Zico Kolter · Matthew Johnson -
2019 Poster: Learning Stable Deep Dynamics Models »
J. Zico Kolter · Gaurav Manek -
2019 Poster: Adversarial Music: Real world Audio Adversary against Wake-word Detection System »
Juncheng Li · Shuhui Qu · Xinjian Li · Joseph Szurley · J. Zico Kolter · Florian Metze -
2019 Spotlight: Adversarial Music: Real world Audio Adversary against Wake-word Detection System »
Juncheng Li · Shuhui Qu · Xinjian Li · Joseph Szurley · J. Zico Kolter · Florian Metze -
2019 Poster: Differentiable Convex Optimization Layers »
Akshay Agrawal · Brandon Amos · Shane Barratt · Stephen Boyd · Steven Diamond · J. Zico Kolter -
2019 Poster: Deep Equilibrium Models »
Shaojie Bai · J. Zico Kolter · Vladlen Koltun -
2019 Spotlight: Deep Equilibrium Models »
Shaojie Bai · J. Zico Kolter · Vladlen Koltun -
2018 : Talk 1: Zico Kolter - Differentiable Physics and Control »
J. Zico Kolter -
2018 Poster: Differentiable MPC for End-to-end Planning and Control »
Brandon Amos · Ivan Jimenez · Jacob I Sacks · Byron Boots · J. Zico Kolter -
2018 Poster: End-to-End Differentiable Physics for Learning and Control »
Filipe de Avila Belbute Peres · Kevin Smith · Kelsey Allen · Josh Tenenbaum · J. Zico Kolter -
2018 Spotlight: End-to-End Differentiable Physics for Learning and Control »
Filipe de Avila Belbute Peres · Kevin Smith · Kelsey Allen · Josh Tenenbaum · J. Zico Kolter -
2018 Poster: Scaling provable adversarial defenses »
Eric Wong · Frank Schmidt · Jan Hendrik Metzen · J. Zico Kolter -
2018 Tutorial: Adversarial Robustness: Theory and Practice »
J. Zico Kolter · Aleksander Madry -
2017 : Provable defenses against adversarial examples via the convex outer adversarial polytope »
J. Zico Kolter -
2017 Poster: Gradient descent GAN optimization is locally stable »
Vaishnavh Nagarajan · J. Zico Kolter -
2017 Oral: Gradient descent GAN optimization is locally stable »
Vaishnavh Nagarajan · J. Zico Kolter -
2017 Poster: Task-based End-to-end Model Learning in Stochastic Optimization »
Priya Donti · J. Zico Kolter · Brandon Amos -
2016 Poster: The Multiple Quantile Graphical Model »
Alnur Ali · J. Zico Kolter · Ryan Tibshirani -
2013 Workshop: Machine Learning for Sustainability »
Edwin Bonilla · Thomas Dietterich · Theodoros Damoulas · Andreas Krause · Daniel Sheldon · Iadine Chades · J. Zico Kolter · Bistra Dilkina · Carla Gomes · Hugo P Simao -
2011 Workshop: Machine Learning for Sustainability »
Thomas Dietterich · J. Zico Kolter · Matthew A Brown -
2011 Poster: The Fixed Points of Off-Policy TD »
J. Zico Kolter -
2011 Spotlight: The Fixed Points of Off-Policy TD »
J. Zico Kolter -
2010 Poster: Energy Disaggregation via Discriminative Sparse Coding »
J. Zico Kolter · Siddarth Batra · Andrew Y Ng -
2009 Mini Symposium: Machine Learning for Sustainability »
J. Zico Kolter · Thomas Dietterich · Andrew Y Ng -
2007 Spotlight: Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion »
J. Zico Kolter · Pieter Abbeel · Andrew Y Ng -
2007 Poster: Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion »
J. Zico Kolter · Pieter Abbeel · Andrew Y Ng