Poster
Directional Pruning of Deep Neural Networks
Shih-Kang Chao · Zhanyu Wang · Yue Xing · Guang Cheng
Motivated by the observation that stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method that searches for a sparse minimizer in or close to that flat region. The proposed method requires neither retraining nor expert knowledge of the sparsity level. Because estimating the flat directions is computationally prohibitive, we propose to use a carefully tuned $\ell_1$ proximal gradient algorithm, which provably achieves directional pruning with a small learning rate after sufficient training. Empirically, our solution performs promisingly in the highly sparse regime (92% sparsity) among many existing pruning methods on ResNet50 with ImageNet, while using only slightly more wall time and memory than SGD. Using VGG16 and the wide ResNet 28x10 on CIFAR-10 and CIFAR-100, we demonstrate that our solution reaches the same minima valley as SGD, and that the minima found by our solution and by SGD do not deviate in directions that impact the training loss. The code that reproduces the results of this paper is available at https://github.com/donlan2710/gRDA-Optimizer/tree/master/directional_pruning.
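To illustrate why an $\ell_1$ proximal gradient step yields exact zeros, the following is a minimal sketch of the generic textbook iteration (gradient step followed by soft-thresholding) on a toy quadratic loss. It is not the authors' tuned algorithm, which lives in the repository linked above; the function names, the toy loss, and the values of `lr`, `lam`, and `steps` are all assumptions chosen for illustration.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def l1_proximal_gradient(grad_fn, w0, lr=0.1, lam=0.5, steps=200):
    """Generic proximal gradient iteration for loss(w) + lam * ||w||_1.

    Hypothetical helper for illustration; not the paper's optimizer.
    """
    w = list(w0)
    for _ in range(steps):
        g = grad_fn(w)
        # Gradient step on the smooth loss, then soft-threshold each coordinate.
        w = [soft_threshold(wi - lr * gi, lr * lam) for wi, gi in zip(w, g)]
    return w


# Toy loss 0.5 * sum((w_i - t_i)^2), whose gradient is w - t (an assumption).
target = [3.0, 0.2, -1.5, -0.1]


def grad_fn(w):
    return [wi - ti for wi, ti in zip(w, target)]


w_sparse = l1_proximal_gradient(grad_fn, [0.0] * 4)
sparsity = sum(1 for wi in w_sparse if wi == 0.0) / len(w_sparse)
print(w_sparse, sparsity)
```

In this toy problem the coordinates whose targets are smaller in magnitude than `lam` are set exactly to zero, while the larger ones are shrunk by `lam`. The thresholding step is what produces sparsity without a separate pruning or retraining pass.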
Author Information
Shih-Kang Chao (University of Missouri)
Zhanyu Wang (Purdue University)
Yue Xing (Purdue University)
Guang Cheng (Purdue University)
More from the Same Authors
- 2021: Optimum-statistical Collaboration Towards Efficient Black-box Optimization
  Wenjie Li · Chi-Hua Wang · Guang Cheng
- 2022 Poster: Fair Bayes-Optimal Classifiers Under Predictive Parity
  Xianli Zeng · Edgar Dobriban · Guang Cheng
- 2022 Poster: Why Do Artificially Generated Data Help Adversarial Robustness
  Yue Xing · Qifan Song · Guang Cheng
- 2022 Poster: Phase Transition from Clean Training to Adversarial Training
  Yue Xing · Qifan Song · Guang Cheng
- 2021 Poster: On the Algorithmic Stability of Adversarial Training
  Yue Xing · Qifan Song · Guang Cheng
- 2020 Poster: Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
  Jincheng Bai · Qifan Song · Guang Cheng
- 2020 Poster: Statistical Guarantees of Distributed Nearest Neighbor Classification
  Jiexin Duan · Xingye Qiao · Guang Cheng
- 2019 Poster: Bootstrapping Upper Confidence Bound
  Botao Hao · Yasin Abbasi Yadkori · Zheng Wen · Guang Cheng
- 2019 Poster: Rates of Convergence for Large-scale Nearest Neighbor Classification
  Xingye Qiao · Jiexin Duan · Guang Cheng
- 2018 Poster: Early Stopping for Nonparametric Testing
  Meimei Liu · Guang Cheng
- 2015 Poster: Non-convex Statistical Optimization for Sparse Tensor Graphical Model
  Wei Sun · Zhaoran Wang · Han Liu · Guang Cheng