Assessing the severity of new pathogenic variants requires an understanding of which mutations will escape the human immune response. Even a single point mutation to an antigen can cause immune escape and infection by abrogating antibody binding. Recent work has modeled the effect of single point mutations on proteins by leveraging the information contained in large-scale, pretrained protein language models. These models are often applied in a zero-shot setting, where the effect of each mutation is predicted from the output of the language model with no additional training. However, this approach cannot appropriately model immune escape, which involves the interaction of two proteins (antibody and antigen) rather than one, and which requires different predictions for the same antigenic mutation in response to different antibodies. Here, we explore several methods for predicting immune escape by building models on top of embeddings from pretrained protein language models. We evaluate our methods on a SARS-CoV-2 deep mutational scanning dataset and show that our embedding-based methods significantly outperform zero-shot methods, which have almost no predictive power. We additionally highlight insights into how best to use embeddings from pretrained protein language models to predict escape.
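The contrast drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings, log-probabilities, and labels below are made-up toy values, the function names are hypothetical, and a tiny logistic regression on concatenated embeddings stands in for whatever models the paper actually builds. The point is only structural: a zero-shot score takes no antibody as input, so it cannot vary across antibodies, whereas a model trained on antibody-antigen embedding pairs can.

```python
import math


def zero_shot_score(log_probs, wildtype_aa, mutant_aa):
    """Log-likelihood ratio of the mutant vs. wild-type residue at one position.

    `log_probs` stands in for a language model's per-residue log-probabilities
    (hypothetical values here). No antibody appears in the signature, so the
    score is identical for every antibody.
    """
    return log_probs[mutant_aa] - log_probs[wildtype_aa]


def pair_features(antibody_emb, mutant_antigen_emb):
    """Concatenate an antibody embedding with a mutant-antigen embedding."""
    return list(antibody_emb) + list(mutant_antigen_emb)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_escape(weights, bias, x):
    """Predicted escape probability for one (antibody, mutant antigen) pair."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = predict_escape(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Toy stand-ins for pretrained embeddings (assumed values, illustration only).
antibody_a = [1.0, 0.0]
antibody_b = [0.0, 1.0]
mutant_antigen = [0.5, 0.5]

# Hypothetical labels: the same mutation escapes antibody A but not antibody B.
X = [pair_features(antibody_a, mutant_antigen),
     pair_features(antibody_b, mutant_antigen)]
y = [1, 0]
weights, bias = train_logistic(X, y)

p_a = predict_escape(weights, bias, X[0])
p_b = predict_escape(weights, bias, X[1])
print(f"escape probability vs. antibody A: {p_a:.3f}")
print(f"escape probability vs. antibody B: {p_b:.3f}")

# The zero-shot score for the same mutation is a single antibody-independent number.
print("zero-shot score:", zero_shot_score({"N": -0.7, "Y": -2.1}, "N", "Y"))
```

In practice the toy vectors would be replaced by embeddings extracted from a pretrained protein language model, and the labels by measured escape values such as those from deep mutational scanning.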
Author Information
Kyle Swanson (Stanford University)

Kyle Swanson is a PhD student at Stanford University advised by James Zou. He is interested in applications of machine learning to biology, medicine, and drug discovery.
Howard Chang
James Zou (Stanford University)
More from the Same Authors
- 2022: Predicting Immune Escape with Pretrained Protein Language Model Embeddings
  Kyle Swanson · Howard Chang · James Zou
- 2022: Protein structure generation via folding diffusion
  Kevin Wu · Kevin Yang · Rianne van den Berg · James Zou · Alex X Lu · Ava Soleimany
- 2022: DrML: Diagnosing and Rectifying Vision Models using Language
  Yuhui Zhang · Jeff Z. HaoChen · Shih-Cheng Huang · Kuan-Chieh Wang · James Zou · Serena Yeung
- 2020 Poster: Neuron Shapley: Discovering the Responsible Neurons
  Amirata Ghorbani · James Zou
- 2020 Poster: FrugalML: How to use ML Prediction APIs more accurately and cheaply
  Lingjiao Chen · Matei Zaharia · James Zou
- 2020 Oral: FrugalML: How to use ML Prediction APIs more accurately and cheaply
  Lingjiao Chen · Matei Zaharia · James Zou
- 2020 Poster: MOPO: Model-based Offline Policy Optimization
  Tianhe Yu · Garrett Thomas · Lantao Yu · Stefano Ermon · James Zou · Sergey Levine · Chelsea Finn · Tengyu Ma
- 2019: Poster Session
  Lili Yu · Aleksei Kroshnin · Alex Delalande · Andrew Carr · Anthony Tompkins · Aram-Alexandre Pooladian · Arnaud Robert · Ashok Vardhan Makkuva · Aude Genevay · Bangjie Liu · Bo Zeng · Charlie Frogner · Elsa Cazelles · Esteban G Tabak · Fabio Ramos · François-Pierre PATY · Georgios Balikas · Giulio Trigila · Hao Wang · Hinrich Mahler · Jared Nielsen · Karim Lounici · Kyle Swanson · Mukul Bhutani · Pierre Bréchet · Piotr Indyk · samuel cohen · Stefanie Jegelka · Tao Wu · Thibault Sejourne · Tudor Manole · Wenjun Zhao · Wenlin Wang · Wenqi Wang · Yonatan Dukler · Zihao Wang · Chaosheng Dong
- 2019: Phenotype
  Nir HaCohen · David Reshef · Matthew Johnson · Sam Morris · Aurel Nagy · Gokcen Eraslan · Meromit Singer · Eliezer Van Allen · Smita Krishnaswamy · Casey Greene · Scott Linderman · Alexander Wiltschko · Dylan Kotliar · James Zou · Brendan Bulik-Sullivan
- 2019 Poster: Towards Automatic Concept-based Explanations
  Amirata Ghorbani · James Wexler · James Zou · Been Kim
- 2018 Poster: Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders
  Abubakar Abid · James Zou