Poster in Workshop: Generative AI and Biology (GenBio@NeurIPS2023)

An Energy Based Model for Incorporating Sequence Priors for Target-Specific Antibody Design

Steffanie Paul · Yining Huang · Debora Marks

Keywords: antibody design GNN protein language model energy-based model

Abstract

With the growing demand for antibody therapeutics, there is a great need for computational methods to accelerate antibody discovery and optimization. Advances in machine learning on graphs have been leveraged to develop generative models of antibody sequence and structure that condition on specific antigen epitopes. However, the structural data available for training such models ($\sim$5k antibody binding complexes) is dwarfed by the amount of antibody sequence data available ($>$550M sequences). Protein language models trained on these sequence corpora can generate expressible antibodies, and expressibility is a necessary criterion for designing real-world binding antibodies. We investigate the performance gap between antibody sequence models and target-specific models and find that target-specific models have lower recovery of middle loop residues in the antibody CDR. Towards a generative model of expressible and target-specific antibodies, we propose an energy-based model framework for combining information from sequence priors with target information, and present preliminary results on the development of this model.
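To make the idea of combining a sequence prior with target information concrete, here is a minimal sketch in Python. It is not the authors' model: the abstract does not specify the energy function, so the additive form, the `weight` parameter, and the function names (`combined_energy`, `energy_to_probs`) are all assumptions for illustration, and random scores stand in for the outputs of a real language model and a real target-conditioned model.

```python
import numpy as np

def log_softmax(logits, axis=-1):
    """Numerically stable log-softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def combined_energy(prior_logp, target_logp, weight=1.0):
    """Assumed per-position energy: lower means favoured by both models.

    prior_logp  : (L, A) log-probabilities from a sequence prior
    target_logp : (L, A) log-probabilities from a target-conditioned model
    weight      : hypothetical knob trading off the two terms
    """
    return -(prior_logp + weight * target_logp)

def energy_to_probs(energy):
    """Boltzmann distribution over residues at each position."""
    return np.exp(log_softmax(-energy, axis=-1))

# Placeholder scores: 6 CDR positions, 20 amino acids.
rng = np.random.default_rng(0)
L, A = 6, 20
prior_logp = log_softmax(rng.normal(size=(L, A)))
target_logp = log_softmax(rng.normal(size=(L, A)))

energy = combined_energy(prior_logp, target_logp, weight=0.5)
probs = energy_to_probs(energy)
design = probs.argmax(axis=-1)  # most favoured residue index per position
```

With `weight=0.0` the sketch reduces to the sequence prior alone, which gives one simple way to see how much the target term shifts the per-position distribution.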
