Understanding the generalization of deep neural networks is one of the most important tasks in deep learning. Although much progress has been made, theoretical error bounds still often diverge from empirical observations. In this work, we develop margin-based generalization bounds, where the margins are normalized with optimal transport costs between independent random subsets sampled from the training distribution. In particular, the optimal transport cost can be interpreted as a generalization of variance that captures the structural properties of the learned feature space. Our bounds robustly predict the generalization error, given training data and network parameters, on large-scale datasets. Theoretically, we demonstrate that the concentration and separation of features play crucial roles in generalization, supporting empirical results in the literature.
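The idea of normalizing a margin by an optimal transport cost between random subsets can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`ot_cost`, `subset_ot_variance`, `normalized_margin`), the Euclidean ground cost, the brute-force matching, and the plain division used for normalization are all assumptions made for illustration, and the exact quantities in the paper may differ. The two subsets are drawn without replacement from a finite feature list, which only approximates independent sampling from the training distribution.

```python
import itertools
import math
import random


def ot_cost(xs, ys):
    """Exact OT cost (Euclidean ground cost) between two equal-size point
    sets with uniform weights. With uniform weights this is an assignment
    problem; it is brute-forced here, so only tiny subsets are feasible."""
    k = len(xs)
    assert k == len(ys) and k > 0
    best = min(
        sum(math.dist(xs[i], ys[j]) for i, j in enumerate(perm))
        for perm in itertools.permutations(range(k))
    )
    return best / k


def subset_ot_variance(features, k=3, trials=200, seed=0):
    """Monte Carlo estimate of the expected OT cost between two disjoint
    random subsets of size k (hypothetical helper, a variance-like spread)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(features, 2 * k)
        total += ot_cost(sample[:k], sample[k:])
    return total / trials


def normalized_margin(margin, features, k=3):
    """Assumed normalization: raw margin divided by the spread estimate."""
    return margin / subset_ot_variance(features, k)


# Toy features: a concentrated cloud and the same cloud stretched 10x.
_rng = random.Random(1)
tight = [(_rng.gauss(0, 0.1), _rng.gauss(0, 0.1)) for _ in range(50)]
spread = [(10 * x, 10 * y) for x, y in tight]
```

With subsets of size 1, `ot_cost` reduces to the distance between two sampled points, which is the sense in which this quantity behaves like a variance. In the toy data, the same raw margin yields a larger normalized margin on the concentrated features than on the stretched ones. For realistic subset sizes, a dedicated optimal transport or assignment solver would replace the brute-force matching.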
Author Information
Ching-Yao Chuang (MIT)
Youssef Mroueh (IBM T.J. Watson Research Center)
Kristjan Greenewald (MIT-IBM Watson AI Lab; IBM Research)
Antonio Torralba (Massachusetts Institute of Technology)
Stefanie Jegelka (MIT)
Stefanie Jegelka is an X-Consortium Career Development Assistant Professor in the Department of EECS at MIT. She is a member of the Computer Science and AI Lab (CSAIL) and the Center for Statistics, and an affiliate of the Institute for Data, Systems and Society and the Operations Research Center. Before joining MIT, she was a postdoctoral researcher at UC Berkeley, and obtained her PhD from ETH Zurich and the Max Planck Institute for Intelligent Systems. Stefanie has received a Sloan Research Fellowship, an NSF CAREER Award, a DARPA Young Faculty Award, the German Pattern Recognition Award, and a Best Paper Award at the International Conference on Machine Learning (ICML). Her research interests span the theory and practice of algorithmic machine learning.
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Spotlight: Measuring Generalization with Optimal Transport »
More from the Same Authors
-
2021 : ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation »
Chuang Gan · Jeremy Schwartz · Seth Alter · Damian Mrowca · Martin Schrimpf · James Traer · Julian De Freitas · Jonas Kubilius · Abhishek Bhandwaldar · Nick Haber · Megumi Sano · Kuno Kim · Elias Wang · Michael Lingelbach · Aidan Curtis · Kevin Feigelis · Daniel Bear · Dan Gutfreund · David Cox · Antonio Torralba · James J DiCarlo · Josh Tenenbaum · Josh McDermott · Dan Yamins -
2021 Spotlight: Learning to Compose Visual Relations »
Nan Liu · Shuang Li · Yilun Du · Josh Tenenbaum · Antonio Torralba -
2021 Spotlight: Learning to See by Looking at Noise »
Manel Baradad Jurjo · Jonas Wulff · Tongzhou Wang · Phillip Isola · Antonio Torralba -
2021 Spotlight: Sliced Mutual Information: A Scalable Measure of Statistical Dependence »
Ziv Goldfeld · Kristjan Greenewald -
2021 : 3D Neural Scene Representations for Visuomotor Control »
Yunzhu Li · Shuang Li · Vincent Sitzmann · Pulkit Agrawal · Antonio Torralba -
2021 : Optimizing Functionals on the Space of Probabilities with Input Convex Neural Network »
David Alvarez-Melis · Yair Schiff · Youssef Mroueh -
2022 Poster: Tree Mover's Distance: Bridging Graph Metrics and Stability of Graph Neural Networks »
Ching-Yao Chuang · Stefanie Jegelka -
2021 : Invited talk 1 »
Stefanie Jegelka -
2021 Poster: Learning to Compose Visual Relations »
Nan Liu · Shuang Li · Yilun Du · Josh Tenenbaum · Antonio Torralba -
2021 Poster: Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification »
Alkis Gotovos · Rebekka Burkholz · John Quackenbush · Stefanie Jegelka -
2021 Poster: Can contrastive learning avoid shortcut solutions? »
Joshua Robinson · Li Sun · Ke Yu · Kayhan Batmanghelich · Stefanie Jegelka · Suvrit Sra -
2021 Poster: What training reveals about neural network complexity »
Andreas Loukas · Marinos Poiitis · Stefanie Jegelka -
2021 Poster: EditGAN: High-Precision Semantic Image Editing »
Huan Ling · Karsten Kreis · Daiqing Li · Seung Wook Kim · Antonio Torralba · Sanja Fidler -
2021 Poster: Learning to See by Looking at Noise »
Manel Baradad Jurjo · Jonas Wulff · Tongzhou Wang · Phillip Isola · Antonio Torralba -
2021 Poster: Sliced Mutual Information: A Scalable Measure of Statistical Dependence »
Ziv Goldfeld · Kristjan Greenewald -
2021 Poster: PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning »
Yining Hong · Li Yi · Josh Tenenbaum · Antonio Torralba · Chuang Gan -
2021 Poster: Editing a classifier by rewriting its prediction rules »
Shibani Santurkar · Dimitris Tsipras · Mahalaxmi Elango · David Bau · Antonio Torralba · Aleksander Madry -
2021 Poster: Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics »
Carles Domingo i Enrich · Youssef Mroueh -
2021 Oral: Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics »
Carles Domingo i Enrich · Youssef Mroueh -
2020 Poster: Unbalanced Sobolev Descent »
Youssef Mroueh · Mattia Rigotti -
2020 Poster: Testing Determinantal Point Processes »
Khashayar Gatmiry · Maryam Aliakbarpour · Stefanie Jegelka -
2020 Spotlight: Testing Determinantal Point Processes »
Khashayar Gatmiry · Maryam Aliakbarpour · Stefanie Jegelka -
2020 Poster: A Decentralized Parallel Algorithm for Training Generative Adversarial Nets »
Mingrui Liu · Wei Zhang · Youssef Mroueh · Xiaodong Cui · Jarret Ross · Tianbao Yang · Payel Das -
2020 Poster: IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method »
Yossi Arjevani · Joan Bruna · Bugra Can · Mert Gurbuzbalaban · Stefanie Jegelka · Hongzhou Lin -
2020 Spotlight: IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method »
Yossi Arjevani · Joan Bruna · Bugra Can · Mert Gurbuzbalaban · Stefanie Jegelka · Hongzhou Lin -
2020 Poster: Debiased Contrastive Learning »
Ching-Yao Chuang · Joshua Robinson · Yen-Chen Lin · Antonio Torralba · Stefanie Jegelka -
2020 Spotlight: Debiased Contrastive Learning »
Ching-Yao Chuang · Joshua Robinson · Yen-Chen Lin · Antonio Torralba · Stefanie Jegelka -
2019 : Invited Talk - Stefanie Jegelka - Set Representations in Graph Neural Networks and Reasoning »
Stefanie Jegelka -
2019 : Poster Session »
Lili Yu · Aleksei Kroshnin · Alex Delalande · Andrew Carr · Anthony Tompkins · Aram-Alexandre Pooladian · Arnaud Robert · Ashok Vardhan Makkuva · Aude Genevay · Bangjie Liu · Bo Zeng · Charlie Frogner · Elsa Cazelles · Esteban G Tabak · Fabio Ramos · François-Pierre Paty · Georgios Balikas · Giulio Trigila · Hao Wang · Hinrich Mahler · Jared Nielsen · Karim Lounici · Kyle Swanson · Mukul Bhutani · Pierre Bréchet · Piotr Indyk · Samuel Cohen · Stefanie Jegelka · Tao Wu · Thibault Sejourne · Tudor Manole · Wenjun Zhao · Wenlin Wang · Wenqi Wang · Yonatan Dukler · Zihao Wang · Chaosheng Dong -
2019 : Stefanie Jegelka »
Stefanie Jegelka -
2019 Poster: Distributionally Robust Optimization and Generalization in Kernel Methods »
Matt Staib · Stefanie Jegelka -
2019 Poster: Flexible Modeling of Diversity with Strongly Log-Concave Distributions »
Joshua Robinson · Suvrit Sra · Stefanie Jegelka -
2019 Poster: Sobolev Independence Criterion »
Youssef Mroueh · Tom Sercu · Mattia Rigotti · Inkit Padhi · Cicero Nogueira dos Santos -
2018 Poster: ResNet with one-neuron hidden layers is a Universal Approximator »
Hongzhou Lin · Stefanie Jegelka -
2018 Spotlight: ResNet with one-neuron hidden layers is a Universal Approximator »
Hongzhou Lin · Stefanie Jegelka -
2018 Poster: Provable Variational Inference for Constrained Log-Submodular Models »
Josip Djolonga · Stefanie Jegelka · Andreas Krause -
2018 Poster: Adversarially Robust Optimization with Gaussian Processes »
Ilija Bogunovic · Jonathan Scarlett · Stefanie Jegelka · Volkan Cevher -
2018 Spotlight: Adversarially Robust Optimization with Gaussian Processes »
Ilija Bogunovic · Jonathan Scarlett · Stefanie Jegelka · Volkan Cevher -
2018 Poster: Exponentiated Strongly Rayleigh Distributions »
Zelda Mariet · Suvrit Sra · Stefanie Jegelka -
2018 Tutorial: Negative Dependence, Stable Polynomials, and All That »
Suvrit Sra · Stefanie Jegelka -
2017 : Invited talk: Scaling Bayesian Optimization in High Dimensions »
Stefanie Jegelka -
2017 Workshop: Discrete Structures in Machine Learning »
Yaron Singer · Jeff A Bilmes · Andreas Krause · Stefanie Jegelka · Amin Karbasi -
2017 Poster: Fisher GAN »
Youssef Mroueh · Tom Sercu -
2017 Poster: Parallel Streaming Wasserstein Barycenters »
Matt Staib · Sebastian Claici · Justin Solomon · Stefanie Jegelka -
2017 Poster: Polynomial time algorithms for dual volume sampling »
Chengtao Li · Stefanie Jegelka · Suvrit Sra -
2016 : Invited Talk - Learning to see objects by listening »
Antonio Torralba -
2016 : Submodular Optimization and Nonconvexity »
Stefanie Jegelka -
2016 Workshop: Nonconvex Optimization for Machine Learning: Theory and Practice »
Hossein Mobahi · Anima Anandkumar · Percy Liang · Stefanie Jegelka · Anna Choromanska -
2016 Poster: Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling »
Chengtao Li · Suvrit Sra · Stefanie Jegelka -
2016 Poster: Cooperative Graphical Models »
Josip Djolonga · Stefanie Jegelka · Sebastian Tschiatschek · Andreas Krause -
2014 Workshop: Discrete Optimization in Machine Learning »
Jeffrey A Bilmes · Andreas Krause · Stefanie Jegelka · S Thomas McCormick · Sebastian Nowozin · Yaron Singer · Dhruv Batra · Volkan Cevher -
2014 Poster: Parallel Double Greedy Submodular Maximization »
Xinghao Pan · Stefanie Jegelka · Joseph Gonzalez · Joseph K Bradley · Michael Jordan -
2014 Poster: Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets »
Adarsh Prasad · Stefanie Jegelka · Dhruv Batra -
2014 Spotlight: Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets »
Adarsh Prasad · Stefanie Jegelka · Dhruv Batra -
2014 Poster: On the Convergence Rate of Decomposable Submodular Function Minimization »
Robert Nishihara · Stefanie Jegelka · Michael Jordan -
2014 Poster: Weakly-supervised Discovery of Visual Pattern Configurations »
Hyun Oh Song · Yong Jae Lee · Stefanie Jegelka · Trevor Darrell -
2013 Workshop: Discrete Optimization in Machine Learning: Connecting Theory and Practice »
Stefanie Jegelka · Andreas Krause · Pradeep Ravikumar · Kazuo Murota · Jeffrey A Bilmes · Yisong Yue · Michael Jordan -
2013 Poster: Optimistic Concurrency Control for Distributed Unsupervised Learning »
Xinghao Pan · Joseph Gonzalez · Stefanie Jegelka · Tamara Broderick · Michael Jordan -
2013 Poster: Reflection methods for user-friendly submodular optimization »
Stefanie Jegelka · Francis Bach · Suvrit Sra -
2013 Poster: Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions »
Rishabh K Iyer · Stefanie Jegelka · Jeffrey A Bilmes -
2012 Workshop: Discrete Optimization in Machine Learning (DISCML): Structure and Scalability »
Stefanie Jegelka · Andreas Krause · Jeffrey A Bilmes · Pradeep Ravikumar -
2011 Poster: Fast approximate submodular minimization »
Stefanie Jegelka · Hui Lin · Jeffrey A Bilmes -
2010 Workshop: Discrete Optimization in Machine Learning: Structures, Algorithms and Applications »
Andreas Krause · Pradeep Ravikumar · Jeffrey A Bilmes · Stefanie Jegelka