Poster
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
Yuan Cao · Quanquan Gu
Tue Dec 10 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #141
We study the training and generalization of deep neural networks (DNNs) in the over-parameterized regime, where the network width (i.e., the number of hidden nodes per layer) is much larger than the number of training data points. We show that the expected $0$-$1$ loss of a sufficiently wide ReLU network trained with stochastic gradient descent (SGD) from random initialization can be bounded by the training loss of a random feature model induced by the network gradient at initialization, which we call a neural tangent random feature (NTRF) model. For data distributions that can be classified by the NTRF model with sufficiently small error, our result yields a generalization error bound of order $\tilde{\mathcal{O}}(n^{-1/2})$ that is independent of the network width. Our result is more general and sharper than many existing generalization error bounds for over-parameterized neural networks. In addition, we establish a strong connection between our generalization error bound and the neural tangent kernel (NTK) proposed in recent work.
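The idea of a random feature model "induced by the network gradient at initialization" can be illustrated with a small NumPy sketch. This is not the authors' code, and it makes simplifying assumptions that are not in the abstract: a two-layer ReLU network with fixed random second-layer signs and scaling $1/\sqrt{m}$, whereas the paper treats deep networks. All function names (`init_params`, `network`, `ntrf_features`, `ntrf_predict`) are hypothetical.

```python
import numpy as np


def init_params(d, m, rng):
    """Random initialization (assumed: Gaussian first layer, fixed +/-1 second layer)."""
    W0 = rng.normal(size=(m, d))
    a = rng.choice([-1.0, 1.0], size=m)
    return W0, a


def network(X, W, a):
    """Two-layer ReLU network f(x; W) = a^T relu(W x) / sqrt(m), evaluated row-wise on X."""
    m = W.shape[0]
    return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)


def ntrf_features(X, W0, a):
    """Gradient of f(x; W) with respect to W at W = W0, flattened to one row per input."""
    m = W0.shape[0]
    active = (X @ W0.T > 0).astype(float)                          # (n, m) ReLU activation pattern
    grads = (active * a)[:, :, None] * X[:, None, :] / np.sqrt(m)  # (n, m, d)
    return grads.reshape(X.shape[0], -1)                           # (n, m * d)


def ntrf_predict(X, W, W0, a):
    """Linearization around W0: f(x; W0) + <grad_W f(x; W0), W - W0>."""
    return network(X, W0, a) + ntrf_features(X, W0, a) @ (W - W0).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 3))
    W0, a = init_params(d=3, m=256, rng=rng)
    # A weight matrix close to initialization, as in the over-parameterized regime.
    W = W0 + 0.01 * rng.normal(size=W0.shape)
    gap = np.max(np.abs(network(X, W, a) - ntrf_predict(X, W, W0, a)))
    print("max |network - NTRF linearization| =", gap)
```

In this sketch the features are fixed once the initialization is drawn, so the model is linear in the weight offset `W - W0`. That is what makes it a random feature model rather than a trained network.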
Author Information
Yuan Cao (UCLA)
Quanquan Gu (UCLA)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Spotlight: Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
  Tue Dec 10th 06:25 -- 06:30 PM Room West Exhibition Hall C + B3
More from the Same Authors
- 2020 Workshop: OPT2020: Optimization for Machine Learning
  Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac
- 2020 Poster: A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks
  Zixiang Chen · Yuan Cao · Quanquan Gu · Tong Zhang
- 2020 Poster: Agnostic Learning of a Single Neuron with Gradient Descent
  Spencer Frei · Yuan Cao · Quanquan Gu
- 2020 Poster: A Finite-Time Analysis of Two Time-Scale Actor-Critic Methods
  Yue Frank Wu · Weitong ZHANG · Pan Xu · Quanquan Gu
- 2019 Poster: Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks
  Spencer Frei · Yuan Cao · Quanquan Gu
- 2019 Poster: Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
  Difan Zou · Ziniu Hu · Yewen Wang · Song Jiang · Yizhou Sun · Quanquan Gu
- 2019 Poster: Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction
  Difan Zou · Pan Xu · Quanquan Gu
- 2019 Poster: Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks
  Yuan Cao · Quanquan Gu
- 2019 Poster: An Improved Analysis of Training Over-parameterized Deep Neural Networks
  Difan Zou · Quanquan Gu
- 2018 Poster: Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima
  Yaodong Yu · Pan Xu · Quanquan Gu
- 2018 Poster: Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
  Pan Xu · Jinghui Chen · Difan Zou · Quanquan Gu
- 2018 Spotlight: Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
  Pan Xu · Jinghui Chen · Difan Zou · Quanquan Gu
- 2018 Poster: Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization
  Dongruo Zhou · Pan Xu · Quanquan Gu
- 2018 Spotlight: Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization
  Dongruo Zhou · Pan Xu · Quanquan Gu
- 2018 Poster: Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization
  Bargav Jayaraman · Lingxiao Wang · David Evans · Quanquan Gu
- 2017 Poster: Speeding Up Latent Variable Gaussian Graphical Model Estimation via Nonconvex Optimization
  Pan Xu · Jian Ma · Quanquan Gu
- 2016 Poster: Semiparametric Differential Graph Models
  Pan Xu · Quanquan Gu
- 2015 Poster: High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality
  Zhaoran Wang · Quanquan Gu · Yang Ning · Han Liu
- 2014 Poster: Sparse PCA with Oracle Property
  Quanquan Gu · Zhaoran Wang · Han Liu
- 2014 Poster: Robust Tensor Decomposition with Gross Corruption
  Quanquan Gu · Huan Gui · Jiawei Han
- 2012 Poster: Selective Labeling via Error Bound Minimization
  Quanquan Gu · Tong Zhang · Chris Ding · Jiawei Han