Pretrained protein language model transfer learning: is the final layer representation what we want?
Francesca-Zhoufan Li · Ava Soleimany · Kevin Yang · Alex X Lu

Large pretrained protein language models have improved protein sequence-to-function prediction. This often takes the form of transfer learning, where final-layer representations from large pretrained models are extracted for downstream tasks. Although pretrained models have been empirically successful, there is little current understanding of how the features learned by pretraining relate to and are useful for downstream tasks. In this work, we investigate whether transferring a partial model, by using the output from a middle layer, is as effective as full model transfer, and if so, whether successful transfer depends on the downstream task and model properties. Across datasets and tasks, we evaluate partial model transfer of pretrained transformer and convolutional neural networks of varying sizes. We observe that pretrained representations outperform the one-hot baseline for most tasks. More importantly, we find that representations from middle layers can be as effective as those from later layers. To our knowledge, our work is the first to report the effectiveness of partial model transfer for protein property prediction. Our results point to a mismatch between the pretraining and downstream tasks, indicating a need for more relevant pretraining tasks so that representations from later layers can be better utilized for downstream tasks.
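The transfer setup described above can be sketched in a few lines: run a sequence through a pretrained encoder, keep the hidden states after every layer, and pool either the final layer (full model transfer) or an intermediate layer (partial model transfer) into a fixed-size representation for the downstream predictor. The toy stacked-layer "model" and mean pooling below are illustrative assumptions, not the paper's actual architectures.

```python
import numpy as np

# Toy stand-in for a pretrained protein language model: a stack of
# random nonlinear layers. Real models (the transformers and CNNs
# evaluated in the paper) are far larger, but the extraction logic
# is the same: keep the hidden states produced by every layer.
rng = np.random.default_rng(0)
n_layers, d_model, seq_len = 6, 32, 10
weights = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_layers)]

def embed_all_layers(x):
    """Return the per-residue hidden states after each layer."""
    states, h = [], x
    for W in weights:
        h = np.tanh(h @ W)   # one "layer" of the toy model
        states.append(h)
    return states            # list of (seq_len, d_model) arrays

# Toy embedded input for a sequence of length seq_len.
x = rng.standard_normal((seq_len, d_model))
hidden = embed_all_layers(x)

# Full model transfer: mean-pool the final-layer representation.
final_repr = hidden[-1].mean(axis=0)
# Partial model transfer: mean-pool a middle-layer representation.
middle_repr = hidden[n_layers // 2].mean(axis=0)

print(final_repr.shape, middle_repr.shape)  # (32,) (32,)
```

Either vector would then be fed to a downstream sequence-to-function model; the paper's finding is that the `middle_repr`-style features are often as effective as `final_repr`.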

Author Information

Francesca-Zhoufan Li (Caltech)

With a broad interest in applying AI to science and engineering problems, I am currently focusing on machine learning-assisted protein engineering as a Bioengineering Ph.D. student at [Caltech](https://www.caltech.edu/), co-advised by [Frances Arnold](http://fhalab.caltech.edu/) and [Yisong Yue](http://www.yisongyue.com/group.php). My current project is on multi-modal representation learning for predicting top protein fitness from site-saturation mutagenesis libraries. I have also worked with [Kevin K. Yang](https://yangkky.github.io/about/), [Alex X. Lu](https://www.alexluresearch.com/), and [Ava P. Amini](https://www.mit.edu/~asolei/) during my summer internship at [Microsoft Research New England](https://www.microsoft.com/en-us/research/lab/microsoft-research-new-england/) on transfer learning for pretrained protein language models.

Ava Soleimany (Microsoft Research)
Kevin Yang (Microsoft)
Alex X Lu (Microsoft Research)

I’m a Senior Researcher at Microsoft Research New England, in the BioML group. I’m interested in how machine learning can help us discover new insights from biological data, by finding patterns that are too subtle or too large-scale to identify unassisted. I primarily focus on biological images, and my research often involves designing self-supervised learning methods, as I believe these methods are less biased by prior knowledge.
