Vision Transformers (ViTs) yield impressive performance across various vision tasks, but their heavy computation and memory footprint make them impractical for edge devices. Previous works prune ViTs using importance criteria determined independently for each individual component. Because the heterogeneous components in ViTs play distinct roles, such approaches lead to suboptimal performance. In this paper, we introduce joint importance, which for the first time integrates essential structure-aware interactions between components, to perform collaborative pruning. Based on a theoretical analysis, we construct a Taylor-based approximation to evaluate the joint importance, which guides pruning toward a more balanced reduction across all components. To further reduce algorithmic complexity, we incorporate the interactions into the optimization objective under mild assumptions. Moreover, the proposed method can be seamlessly applied to other tasks, including object detection. Extensive experiments demonstrate the effectiveness of our method. Notably, the proposed approach outperforms existing state-of-the-art approaches on ImageNet, increasing accuracy by 0.7% over the DeiT-Base baseline while saving 50% of FLOPs. On COCO, we are the first to show that 70% of the FLOPs of Faster R-CNN with a ViT backbone can be removed with only a 0.3% mAP drop. The code is available at https://github.com/hikvision-research/SAViT.
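As a rough illustration of the Taylor-based importance scoring mentioned above, the sketch below (PyTorch assumed, with a toy two-layer MLP standing in for a prunable ViT component) computes the standard first-order criterion |g·w| for each channel group. The model, the grouping, and the helper name `taylor_scores` are illustrative assumptions; the cross-component interaction terms that define SAViT's joint importance are not reproduced here.

```python
# Minimal sketch of a first-order Taylor importance criterion (assumes PyTorch).
# A prunable group is scored by |sum_i g_i * w_i|, an estimate of the loss
# change if the group were removed. SAViT's joint importance additionally
# models interactions between heterogeneous components, which is omitted here.
import torch
import torch.nn as nn

def taylor_scores(loss, weights):
    # Gradients of the loss w.r.t. each weight tensor; the graph is retained
    # so the same loss can be reused to score several groups.
    return torch.autograd.grad(loss, weights, retain_graph=True)

torch.manual_seed(0)
mlp = nn.Sequential(nn.Linear(4, 2), nn.ReLU(), nn.Linear(2, 1))  # toy stand-in
x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = nn.functional.mse_loss(mlp(x), y)

w_in, w_out = mlp[0].weight, mlp[2].weight
g_in, g_out = taylor_scores(loss, [w_in, w_out])

# Score each hidden channel by |g*w| summed over the weights that would be
# removed with it: row c of w_in plus column c of w_out (biases omitted).
for c in range(w_in.shape[0]):
    score = ((g_in[c] * w_in[c]).sum() + (g_out[:, c] * w_out[:, c]).sum()).abs().item()
    print(f"hidden channel {c}: Taylor importance {score:.4f}")
```

In practice such scores would be accumulated over a calibration set and the lowest-scoring groups removed until a target FLOPs budget is met; this sketch only shows the per-group criterion itself.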
Author Information
Chuanyang Zheng (Hikvision Research Institute)
Zheyang Li (Hikvision Research Institute)
Kai Zhang (Hikvision Research Institute)
Zhi Yang (University of Science and Technology of China)
Wenming Tan (Hikvision Research Institute)
Jun Xiao (Zhejiang University)
Ye Ren (Zhejiang University)
Shiliang Pu (Zhejiang University)