
Workshop: Machine Learning in Structural Biology Workshop

Protein generation with evolutionary diffusion: sequence is all you need

Sarah Alamdari · Nitya Thakkar · Rianne van den Berg · Alex X Lu · Nicolo Fusi · Ava Amini · Kevin Yang

presentation: Machine Learning in Structural Biology Workshop
Fri 15 Dec 6:30 a.m. PST — 3:05 p.m. PST


Diffusion models have demonstrated the ability to generate biologically plausible proteins that are dissimilar to any proteins seen in nature, enabling unprecedented capability and control in de novo protein design. However, current state-of-the-art diffusion models generate protein structures, which limits the scope of their training data and restricts generations to a small and biased subset of protein space. We introduce a general-purpose diffusion framework, EvoDiff, that combines evolutionary-scale data with the distinct conditioning capabilities of diffusion models for controllable protein generation in sequence space. EvoDiff generates high-fidelity, diverse, and structurally plausible proteins that cover natural sequence and functional space. Critically, EvoDiff can generate proteins inaccessible to structure-based models, such as those with disordered regions, and can design scaffolds for functional structural motifs, demonstrating the universality of our sequence-based formulation. We envision that EvoDiff will expand capabilities in protein engineering beyond the structure-function paradigm toward programmable, sequence-first design. All code and models will be open-source.
