Poster
Sparse Structure Search for Delta Tuning
Shengding Hu · Zhen Zhang · Ning Ding · Yadao Wang · Yasheng Wang · Zhiyuan Liu · Maosong Sun
Adapting large pre-trained models (PTMs) through fine-tuning imposes prohibitive computational and storage burdens. Recent studies of delta tuning (DT), i.e., parameter-efficient tuning, find that optimizing only a small portion of parameters conditioned on PTMs can yield performance on par with conventional fine-tuning. Generally, DT methods carefully design delta modules (DT modules), which can be applied to arbitrary fine-grained positions inside PTMs. However, the effectiveness of these fine-grained positions largely relies on sophisticated manual designation, which usually produces sub-optimal results. In contrast to manual designation, we explore constructing DT modules automatically: we automatically \textbf{S}earch for the \textbf{S}parse \textbf{S}tructure of \textbf{Delta} Tuning (S$^3$Delta). Based on a unified framework of various DT methods, S$^3$Delta conducts a differentiable DT structure search through bi-level optimization and proposes a shifted global sigmoid method to explicitly control the number of trainable parameters. Extensive experiments show that S$^3$Delta surpasses manual and random structures with fewer trainable parameters. The searched structures preserve more than 99\% of fine-tuning performance with only 0.01\% trainable parameters. Moreover, the advantage of S$^3$Delta is amplified under extremely low trainable-parameter budgets (0.0009\%$\sim$0.01\%). The searched structures are transferable and explainable, providing suggestions and guidance for the future design of DT methods. Our code is publicly available at \url{https://github.com/thunlp/S3Delta}.
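The abstract mentions a "shifted global sigmoid" used to keep the expected number of trainable parameters within a budget during the differentiable structure search. As a rough illustration only (not the paper's exact formulation; the function name, the bisection procedure, and the toy numbers below are assumptions), the following sketch gates candidate delta-module positions with a sigmoid whose single global shift is tuned so that the expected parameter count matches a given budget:

```python
import torch

def shifted_global_sigmoid(alpha, param_counts, budget, iters=50):
    """Illustrative sketch: map structural logits `alpha` to gates in (0, 1)
    using one global shift, chosen by bisection so that the expected number
    of gated (trainable) parameters matches `budget`."""
    a = alpha.detach()
    lo, hi = a.min() - 20.0, a.max() + 20.0  # bracket for the shift
    for _ in range(iters):
        shift = (lo + hi) / 2
        expected = (torch.sigmoid(a - shift) * param_counts).sum()
        if expected > budget:
            lo = shift  # gates keep too many parameters -> shift further right
        else:
            hi = shift
    # Gates remain differentiable w.r.t. alpha; only the shift is treated as constant.
    return torch.sigmoid(alpha - (lo + hi) / 2)

# Hypothetical example: four candidate delta-module positions of different sizes.
alpha = torch.zeros(4, requires_grad=True)               # structural parameters
param_counts = torch.tensor([1000.0, 2000.0, 500.0, 800.0])
gates = shifted_global_sigmoid(alpha, param_counts, budget=1500.0)
print(gates, (gates * param_counts).sum().item())        # expected count ~= 1500
```

In a bi-level search, the structural logits would be updated on validation data while the gated delta modules are trained on the training split; the global shift keeps the parameter budget satisfied throughout.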
Author Information
Shengding Hu (Tsinghua University)
Zhen Zhang (Tsinghua University)
Ning Ding (Tsinghua University)
Yadao Wang (Huawei Technologies Ltd.)
Yasheng Wang (Huawei Noah's Ark Lab)
Zhiyuan Liu (Tsinghua University)
Maosong Sun (Tsinghua University)
More from the Same Authors
- 2022 Poster: Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models »
  Biru Zhu · Yujia Qin · Ganqu Cui · Yangyi Chen · Weilin Zhao · Chong Fu · Yangdong Deng · Zhiyuan Liu · Jingang Wang · Wei Wu · Maosong Sun · Ming Gu
- 2022 Poster: A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks »
  Ganqu Cui · Lifan Yuan · Bingxiang He · Yangyi Chen · Zhiyuan Liu · Maosong Sun
- 2022 Spotlight: Lightning Talks 5A-4 »
  Yangrui Chen · Zhiyang Chen · Liang Zhang · Hanqing Wang · Jiaqi Han · Shuchen Wu · shaohui peng · Ganqu Cui · Yoav Kolumbus · Noemi Elteto · Xing Hu · Anwen Hu · Wei Liang · Cong Xie · Lifan Yuan · Noam Nisan · Wenbing Huang · Yousong Zhu · Ishita Dasgupta · Luc V Gool · Tingyang Xu · Rui Zhang · Qin Jin · Zhaowen Li · Meng Ma · Bingxiang He · Yangyi Chen · Juncheng Gu · Wenguan Wang · Ke Tang · Yu Rong · Eric Schulz · Fan Yang · Wei Li · Zhiyuan Liu · Jiaming Guo · Yanghua Peng · Haibin Lin · Haixin Wang · Qi Yi · Maosong Sun · Ruizhi Chen · Chuan Wu · Chaoyang Zhao · Yibo Zhu · Liwei Wu · xishan zhang · Zidong Du · Rui Zhao · Jinqiao Wang · Ling Li · Qi Guo · Ming Tang · Yunji Chen
- 2022 Spotlight: A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks »
  Ganqu Cui · Lifan Yuan · Bingxiang He · Yangyi Chen · Zhiyuan Liu · Maosong Sun
- 2020 Poster: Graph Policy Network for Transferable Active Learning on Graphs »
  Shengding Hu · Zheng Xiong · Meng Qu · Xingdi Yuan · Marc-Alexandre Côté · Zhiyuan Liu · Jian Tang
- 2020 Poster: Towards Interpretable Natural Language Understanding with Explanations as Latent Variables »
  Wangchunshu Zhou · Jinyi Hu · Hanlin Zhang · Xiaodan Liang · Maosong Sun · Chenyan Xiong · Jian Tang
- 2012 Poster: Monte Carlo Methods for Maximum Margin Supervised Topic Models »
  Qixia Jiang · Jun Zhu · Maosong Sun · Eric Xing