Skip to yearly menu bar Skip to main content

Workshop: Workshop on Advancing Neural Network Training (WANT): Computational Efficiency, Scalability, and Resource Optimization

LeanFlex-GKP: Advancing Hassle-Free Structured Pruning with Simple Flexible Group Count

Jiamu Zhang · Shaochen (Henry) Zhong · Andrew Ye · Zirui Liu · Kaixiong Zhou · Xia Hu · Shuai Xu · Vipin Chaudhary


Densely structured pruning methods — which generate pruned models in a fully dense format, allowing immediate compression benefits without additional demands — are evolving due to their practical significance. Traditional techniques in this domain mainly revolve around coarser granularities, such as filter pruning, and thereby limit performance due to a restricted pruning freedom.Recent advancements in Grouped Kernel Pruning (GKP) have enabled the utilization of finer granularities while maintaining a densely structured format. We observe that existing GKP methods often introduce dynamic operations to different aspects of their procedures at the cost of adding complications and/or imposing limitations (e.g. requiring an expensive mixture of clustering schemes), or contain dynamic pruning rates and sizes among groups which results in a reliance on custom architecture support for its pruned models.In this work, we argue that the best practice to introduce these dynamic operations to GKP is to make Conv2d(groups) (a.k.a. group count) flexible under an integral optimization, leveraging its ideal alignment with the infrastructure support Grouped Convolution. Pursuing such a direction, we present a one-shot, post-train, data-agnostic GKP method that is more performant, adaptive, and efficient than its predecessors while simultaneously being a lot more user-friendly, with little-to-no hyper-parameter tuning or handcrafting of criteria required.

Chat is not available.