Timezone: »
Poster
Robust Learning with Progressive Data Expansion Against Spurious Correlation
Yihe Deng · Yu Yang · Baharan Mirzasoleiman · Quanquan Gu
Event URL: https://uclaml.github.io/PDE/ »
While deep learning models have shown remarkable performance in various tasks, they are susceptible to learning non-generalizable _spurious features_ rather than the core features that are genuinely correlated to the true label. In this paper, beyond existing analyses of linear models, we theoretically examine the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features. Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process. In light of this, we propose a new training algorithm called **PDE** that efficiently enhances the model's robustness for a better worst-group performance. PDE begins with a group-balanced subset of training data and progressively expands it to facilitate the learning of the core features. Experiments on synthetic and real-world benchmark datasets confirm the superior performance of our method on models such as ResNets and Transformers. On average, our method achieves a $2.8$ \% improvement in worst-group accuracy compared with the state-of-the-art method, while enjoying up to $10\times$ faster training efficiency.
While deep learning models have shown remarkable performance in various tasks, they are susceptible to learning non-generalizable _spurious features_ rather than the core features that are genuinely correlated to the true label. In this paper, beyond existing analyses of linear models, we theoretically examine the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features. Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process. In light of this, we propose a new training algorithm called **PDE** that efficiently enhances the model's robustness for a better worst-group performance. PDE begins with a group-balanced subset of training data and progressively expands it to facilitate the learning of the core features. Experiments on synthetic and real-world benchmark datasets confirm the superior performance of our method on models such as ResNets and Transformers. On average, our method achieves a $2.8$ \% improvement in worst-group accuracy compared with the state-of-the-art method, while enjoying up to $10\times$ faster training efficiency.
Author Information
Yihe Deng (University of California, Los Angeles)
Yu Yang (University of California, Los Angeles)
Baharan Mirzasoleiman (UCLA)
Quanquan Gu (UCLA)
More from the Same Authors
-
2021 : Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization »
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu -
2021 : Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization »
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu -
2021 : Faster Perturbed Stochastic Gradient Methods for Finding Local Minima »
Zixiang Chen · Dongruo Zhou · Quanquan Gu -
2021 : Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium »
Chris Junchi Li · Dongruo Zhou · Quanquan Gu · Michael Jordan -
2022 : A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning »
Zixiang Chen · Chris Junchi Li · Angela Yuan · Quanquan Gu · Michael Jordan -
2023 : Risk Bounds of Accelerated SGD for Overparameterized Linear Regression »
Xuheng Li · Yihe Deng · Jingfeng Wu · Dongruo Zhou · Quanquan Gu -
2023 : Fast Sampling via De-randomization for Discrete Diffusion Models »
Zixiang Chen · Angela Yuan · Yongqian Li · Yiwen Kou · Junkai Zhang · Quanquan Gu -
2023 : Causal Graph ODE: Continuous Treatment Effect Modeling in Multi-agent Dynamical Systems »
Zijie Huang · Jeehyun Hwang · Junkai Zhang · Jinwoo Baik · Weitong ZHANG · Quanquan Gu · Dominik Wodarz · Yizhou Sun · Wei Wang -
2023 : CryoSTAR: Cryo-EM Heterogeneous Reconstruction of Atomic Models with Structural Regularization »
Yi Zhou · Yilai Li · Jing Yuan · Fei YE · Quanquan Gu -
2023 : MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction »
Weitong ZHANG · Xiaoyun Wang · Weili Nie · Joe Eaton · Brad Rees · Quanquan Gu -
2023 : How does semi-supervised learning with pseudo-labelers work? A case study »
Yiwen Kou · Zixiang Chen · Yuan Cao · Quanquan Gu -
2023 : Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP »
Zixiang Chen · Yihe Deng · Yuanzhi Li · Quanquan Gu -
2023 : Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP »
Zixiang Chen · Yihe Deng · Yuanzhi Li · Quanquan Gu -
2023 : [Paper-Oral 3] Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data »
Yu Yang · Aaditya Singh · Mostafa Elhoushi · Anas Mahmoud · Kushal Tirumala · Fabian Gloeckle · Baptiste Roziere · Carole-Jean Wu · Ari Morcos · Newsha Ardalani -
2023 Workshop: New Frontiers of AI for Drug Discovery and Development »
Animashree Anandkumar · Ilija Bogunovic · Ti-chiun Chang · Quanquan Gu · Jure Leskovec · Michelle Li · Chong Liu · Nataša Tagasovska · Mengdi Wang · Wei Wang -
2023 Poster: Robust Contrastive Language-Image Pretraining against Data Poisoning and Backdoor Attacks »
Wenhan Yang · Jingdong Gao · Baharan Mirzasoleiman -
2023 Poster: Corruption-Robust Offline Reinforcement Learning with General Function Approximation »
Chenlu Ye · Rui Yang · Quanquan Gu · Tong Zhang -
2023 Poster: Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data »
Yiwen Kou · Zixiang Chen · Quanquan Gu -
2023 Poster: Optimal Extragradient-Based Algorithms for Stochastic Variational Inequalities with Separable Structure »
Angela Yuan · Chris Junchi Li · Gauthier Gidel · Michael Jordan · Quanquan Gu · Simon Du -
2023 Poster: Why Does Sharpness-Aware Minimization Generalize Better Than SGD? »
Zixiang Chen · Junkai Zhang · Yiwen Kou · Xiangning Chen · Cho-Jui Hsieh · Quanquan Gu -
2022 Spotlight: Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime »
Difan Zou · Jingfeng Wu · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2022 : Contributed Talks 2 »
Quanquan Gu · Aaron Defazio · Jiajin Li -
2022 Workshop: OPT 2022: Optimization for Machine Learning »
Courtney Paquette · Sebastian Stich · Quanquan Gu · Cristóbal Guzmán · John Duchi -
2022 : Generating High Fidelity Synthetic Data via Coreset selection and Entropic Regularization »
Omead Pooladzandi · Pasha Khosravi · Erik Nijkamp · Baharan Mirzasoleiman -
2022 Poster: Towards Understanding the Mixture-of-Experts Layer in Deep Learning »
Zixiang Chen · Yihe Deng · Yue Wu · Quanquan Gu · Yuanzhi Li -
2022 Poster: Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs »
Dongruo Zhou · Quanquan Gu -
2022 Poster: Benign Overfitting in Two-layer Convolutional Neural Networks »
Yuan Cao · Zixiang Chen · Misha Belkin · Quanquan Gu -
2022 Poster: Data-Efficient Augmentation for Training Neural Networks »
Tian Yu Liu · Baharan Mirzasoleiman -
2022 Poster: Learning Two-Player Markov Games: Neural Function Approximation and Correlated Equilibrium »
Chris Junchi Li · Dongruo Zhou · Quanquan Gu · Michael Jordan -
2022 Poster: A Simple and Provably Efficient Algorithm for Asynchronous Federated Contextual Linear Bandits »
Jiafan He · Tianhao Wang · Yifei Min · Quanquan Gu -
2022 Poster: The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift »
Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2022 Poster: Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime »
Difan Zou · Jingfeng Wu · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2022 Poster: Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attack »
Tian Yu Liu · Yu Yang · Baharan Mirzasoleiman -
2022 Poster: Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions »
Jiafan He · Dongruo Zhou · Tong Zhang · Quanquan Gu -
2022 Poster: Active Ranking without Strong Stochastic Transitivity »
Hao Lou · Tao Jin · Yue Wu · Pan Xu · Quanquan Gu · Farzad Farnoud -
2021 : Contributed talks in Session 4 (Zoom) »
Quanquan Gu · Agnieszka Słowik · Jacques Chen · Neha Wadia · Difan Zou -
2021 : Opening Remarks to Session 4 »
Quanquan Gu -
2021 Workshop: OPT 2021: Optimization for Machine Learning »
Courtney Paquette · Quanquan Gu · Oliver Hinder · Katya Scheinberg · Sebastian Stich · Martin Takac -
2021 Poster: The Benefits of Implicit Regularization from SGD in Least Squares Problems »
Difan Zou · Jingfeng Wu · Vladimir Braverman · Quanquan Gu · Dean Foster · Sham Kakade -
2021 Poster: Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 Poster: Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent »
Spencer Frei · Quanquan Gu -
2021 Poster: Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures »
Yuan Cao · Quanquan Gu · Mikhail Belkin -
2021 Poster: Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 Poster: Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation »
Weitong ZHANG · Dongruo Zhou · Quanquan Gu -
2021 Poster: Variance-Aware Off-Policy Evaluation with Linear Function Approximation »
Yifei Min · Tianhao Wang · Dongruo Zhou · Quanquan Gu -
2021 Poster: Iterative Teacher-Aware Learning »
Luyao Yuan · Dongruo Zhou · Junhong Shen · Jingdong Gao · Jeffrey L Chen · Quanquan Gu · Ying Nian Wu · Song-Chun Zhu -
2021 Poster: Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints »
Tianhao Wang · Dongruo Zhou · Quanquan Gu -
2021 Poster: Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks »
Hanxun Huang · Yisen Wang · Sarah Erfani · Quanquan Gu · James Bailey · Xingjun Ma -
2021 Poster: Do Wider Neural Networks Really Help Adversarial Robustness? »
Boxi Wu · Jinghui Chen · Deng Cai · Xiaofei He · Quanquan Gu -
2021 Poster: Pure Exploration in Kernel and Neural Bandits »
Yinglun Zhu · Dongruo Zhou · Ruoxi Jiang · Quanquan Gu · Rebecca Willett · Robert Nowak -
2020 : Closing remarks »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 : Contributed talks in Session 4 (Zoom) »
Quanquan Gu · sanae lotfi · Charles Guille-Escuret · Tolga Ergen · Dongruo Zhou -
2020 : Live Q&A with Deanna Needell and Hanbake Lyu (Zoom) »
Quanquan Gu -
2020 : Welcome remarks to Session 4 »
Quanquan Gu -
2020 Workshop: OPT2020: Optimization for Machine Learning »
Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac -
2020 : Welcome event (gather.town) »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 Poster: A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks »
Zixiang Chen · Yuan Cao · Quanquan Gu · Tong Zhang -
2020 Poster: Agnostic Learning of a Single Neuron with Gradient Descent »
Spencer Frei · Yuan Cao · Quanquan Gu -
2020 Poster: A Finite-Time Analysis of Two Time-Scale Actor-Critic Methods »
Yue Wu · Weitong ZHANG · Pan Xu · Quanquan Gu -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 Poster: Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks »
Spencer Frei · Yuan Cao · Quanquan Gu -
2019 Poster: Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks »
Difan Zou · Ziniu Hu · Yewen Wang · Song Jiang · Yizhou Sun · Quanquan Gu -
2019 Poster: Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction »
Difan Zou · Pan Xu · Quanquan Gu -
2019 Poster: Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks »
Yuan Cao · Quanquan Gu -
2019 Poster: An Improved Analysis of Training Over-parameterized Deep Neural Networks »
Difan Zou · Quanquan Gu -
2019 Poster: Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks »
Yuan Cao · Quanquan Gu -
2019 Spotlight: Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks »
Yuan Cao · Quanquan Gu -
2018 Poster: Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima »
Yaodong Yu · Pan Xu · Quanquan Gu -
2018 Poster: Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization »
Pan Xu · Jinghui Chen · Difan Zou · Quanquan Gu -
2018 Spotlight: Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization »
Pan Xu · Jinghui Chen · Difan Zou · Quanquan Gu -
2018 Poster: Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization »
Dongruo Zhou · Pan Xu · Quanquan Gu -
2018 Spotlight: Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization »
Dongruo Zhou · Pan Xu · Quanquan Gu -
2018 Poster: Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization »
Bargav Jayaraman · Lingxiao Wang · David Evans · Quanquan Gu -
2017 Poster: Speeding Up Latent Variable Gaussian Graphical Model Estimation via Nonconvex Optimization »
Pan Xu · Jian Ma · Quanquan Gu -
2016 Poster: Semiparametric Differential Graph Models »
Pan Xu · Quanquan Gu -
2015 Poster: High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality »
Zhaoran Wang · Quanquan Gu · Yang Ning · Han Liu -
2014 Poster: Sparse PCA with Oracle Property »
Quanquan Gu · Zhaoran Wang · Han Liu -
2014 Poster: Robust Tensor Decomposition with Gross Corruption »
Quanquan Gu · Huan Gui · Jiawei Han -
2012 Poster: Selective Labeling via Error Bound Minimization »
Quanquan Gu · Tong Zhang · Chris Ding · Jiawei Han