Poster in Workshop: Generative AI and Biology (GenBio@NeurIPS2023)

Parameter-Efficient Fine-Tune on Open Pre-trained Transformers for Genomic Sequence

Huixin Zhan ⋅ Zijun Frank Zhang

Keywords: language models, open pre-trained transformers, parameter-efficient fine-tuning, genomic data


Abstract

Pre-trained foundation models (PFMs) for DNA have recently made notable progress in capturing the linguistic structure of the genome. As these foundation models grow in parameter count and the number of downstream genomic tasks increases, parameter-efficient fine-tuning (PEFT) has become the de facto approach for fine-tuning PFMs at reduced computational cost. Low-rank adapters and adaptive low-rank adaptation (AdaLoRA) are popular PEFT methods that introduce learnable truncated singular value decomposition modules for efficient fine-tuning. However, both methods are deterministic: once a singular value is pruned, it stays pruned for the rest of fine-tuning. Consequently, deterministic PEFT methods can underperform when the initial states, before pruning, are suboptimal, a challenge frequently encountered in genomics because of data heterogeneity. To address this issue, we propose AdaLoRA with random sampling (AdaLoRA+RS), which prunes singular vectors and stochastically reintroduces pruned ones while adhering to a cubic budget schedule. We evaluate AdaLoRA+RS on PFMs from the genome domain (DNABERT 1/2 and Nucleotide Transformer) and from the language domain (open pre-trained transformers, OPT). Across 13 genomic sequence datasets covering two genome understanding tasks, AdaLoRA+RS performs from on par with to slightly better than full-model fine-tuning (FMFT) while using less than 2% of the trainable parameters. For instance, in human promoter detection, OPT-350M with AdaLoRA+RS achieves a 4.4% AUC increase over its FMFT baseline while using only 1.8% of the trainable parameters. AdaLoRA+RS thus provides a powerful PEFT strategy for modeling genomic sequences.
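
The prune-and-reintroduce idea described above can be illustrated with a minimal sketch. The abstract does not give the exact sampling rule or schedule formula, so everything below is an assumption made for illustration: `cubic_budget` uses the cubic decay form commonly associated with AdaLoRA-style rank budgets, and `select_active` is a hypothetical resampling step in which each kept singular value may be swapped with a randomly chosen pruned one. The function names, the `reintroduce_prob` parameter, and the importance `scores` are not taken from the paper.

```python
import random


def cubic_budget(step, total_steps, init_budget, final_budget, warmup=0, final_hold=0):
    """Rank budget that decays cubically from init_budget to final_budget.

    Assumed form: b(t) = b_final + (b_init - b_final) * (1 - progress)^3,
    held constant during the warmup and final-hold phases.
    """
    if step < warmup:
        return init_budget
    if step >= total_steps - final_hold:
        return final_budget
    progress = (step - warmup) / (total_steps - final_hold - warmup)
    return int(final_budget + (init_budget - final_budget) * (1 - progress) ** 3)


def select_active(scores, budget, reintroduce_prob, rng):
    """Pick which singular values stay active under the current budget.

    Deterministic part: keep the top-`budget` indices by importance score.
    Stochastic part (hypothetical): each kept slot is swapped, with
    probability `reintroduce_prob`, with a randomly chosen pruned index,
    so pruned singular values get a chance to return. The number of
    active indices always equals `budget`.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    active, pruned = order[:budget], order[budget:]
    for slot in range(len(active)):
        if pruned and rng.random() < reintroduce_prob:
            j = rng.randrange(len(pruned))
            active[slot], pruned[j] = pruned[j], active[slot]
    return sorted(active)


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [0.9, 0.1, 0.5, 0.7, 0.3, 0.2]  # made-up importance scores
    for step in (0, 25, 50, 75, 100):
        budget = cubic_budget(step, 100, init_budget=6, final_budget=2)
        print(step, budget, select_active(scores, budget, 0.3, rng))
```

With `reintroduce_prob=0` the selection reduces to plain deterministic top-k pruning, which is the behaviour the abstract contrasts against; any value above zero lets previously pruned indices re-enter without exceeding the scheduled budget.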
