Poster in Workshop: Generative AI and Biology (GenBio@NeurIPS2023)

Protein generation with evolutionary diffusion

Sarah Alamdari · Nitya Thakkar · Rianne van den Berg · Alex X Lu · Nicolo Fusi · Ava Amini · Kevin Yang

Keywords: [ protein design ] [ protein sequences ] [ sequence-based methods ] [ diffusion models ]


Abstract:

Diffusion models have demonstrated the ability to generate biologically plausible proteins that are dissimilar to any proteins seen in nature, enabling unprecedented capability and control in de novo protein design. However, current state-of-the-art diffusion models generate protein structures, which limits the scope of their training data and restricts generations to a small and biased subset of protein space. We introduce a general-purpose diffusion framework, EvoDiff, that combines evolutionary-scale data with the conditioning capabilities of diffusion models for controllable protein generation in sequence space. EvoDiff generates high-fidelity, diverse, structurally plausible proteins that cover natural sequence and functional space. Critically, EvoDiff can generate proteins inaccessible to structure-based models, such as those with disordered regions, and design scaffolds for functional structural motifs, demonstrating the universality of our sequence-based formulation. We envision that EvoDiff will expand capabilities in protein engineering beyond the structure-function paradigm toward programmable, sequence-first design.
