Network Pruning at Scale: A Discrete Optimization Approach
Wenyu Chen · Riade Benbaki · Xiang Meng · Rahul Mazumder
Event URL: https://openreview.net/forum?id=e9EARLwp2Yw
Due to the ever-growing size of neural network models, there has been growing interest in compressing (i.e., pruning) neural networks by sparsifying the weights of a pre-trained network while maintaining the performance of the dense model as much as possible. In this work, we focus on a neural network pruning framework based on local quadratic models of the loss function. We present an optimization-based approach with an $\ell_0$-regression formulation, and propose novel algorithms to obtain good solutions to the resulting combinatorial optimization problem. In practice, our basic (single-stage) approach, based on a single local quadratic model approximation, is up to $10^3$ times faster than existing methods while achieving similar accuracy. We also propose a multi-stage method that outperforms other methods in terms of accuracy under a given sparsity constraint while remaining computationally efficient. In particular, our approach yields a 98\% sparse MLPNet (i.e., 98\% of the weights in the dense model are set to zero) with 90\% test accuracy (i.e., a 3\% reduction in test accuracy compared to the dense model), an improvement over the previously best reported accuracy of 55\%.
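For context, the local quadratic pruning framework referenced above is commonly set up as follows (a sketch in our own notation; the paper's exact Hessian approximation and algorithms may differ). Given the pre-trained dense weights $\bar{w}$, the loss $L$ is replaced by a second-order Taylor approximation around $\bar{w}$, and pruning to at most $k$ nonzero weights is posed as an $\ell_0$-constrained problem:
$$ \min_{w} \;\; \nabla L(\bar{w})^\top (w - \bar{w}) + \tfrac{1}{2} (w - \bar{w})^\top H (w - \bar{w}) \quad \text{s.t.} \quad \|w\|_0 \le k, $$
where $H$ approximates the Hessian of $L$ at $\bar{w}$ and $k$ is the sparsity budget (the constant term $L(\bar{w})$ is dropped since it does not affect the minimizer). The single-stage method in the abstract solves one such problem; the multi-stage variant, as described, builds on more than one local quadratic approximation.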
Author Information
Wenyu Chen (Massachusetts Institute of Technology)
Riade Benbaki (Massachusetts Institute of Technology)
Xiang Meng (Massachusetts Institute of Technology)
Rahul Mazumder (MIT)
More from the Same Authors
- 2021: Newer is not always better: Rethinking transferability metrics, their peculiarities, stability and performance
  Shibal Ibrahim · Natalia Ponomareva · Rahul Mazumder
- 2022: A Light-speed Linear Program Solver for Personalized Recommendation with Diversity Constraints
  Miao Cheng · Haoyue Wang · Aman Gupta · Rahul Mazumder · Sathiya Selvaraj · Kinjal Basu
- 2022: Improved Deep Neural Network Generalization Using m-Sharpness-Aware Minimization
  Kayhan Behdin · Qingquan Song · Aman Gupta · Sathiya Selvaraj · David Durfee · Ayan Acharya · Rahul Mazumder
- 2023 Poster: On the Convergence of CART under Sufficient Impurity Decrease Condition
  Rahul Mazumder · Haoyue Wang
- 2023 Poster: GRAND-SLAMIN’ Interpretable Additive Modeling with Structural Constraints
  Shibal Ibrahim · Gabriel Afriat · Kayhan Behdin · Rahul Mazumder
- 2022 Poster: Pushing the limits of fairness impossibility: Who's the fairest of them all?
  Brian Hsu · Rahul Mazumder · Preetam Nandy · Kinjal Basu
- 2021 Poster: DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning
  Hussein Hazimeh · Zhe Zhao · Aakanksha Chowdhery · Maheswaran Sathiamoorthy · Yihua Chen · Rahul Mazumder · Lichan Hong · Ed Chi