Workshop: Machine Learning in Structural Biology Workshop

Fast non-autoregressive inverse folding with discrete diffusion

John Yang · Jason Yim · Tommi Jaakkola · Regina Barzilay


Generating protein sequences that fold into an intended 3D structure is a fundamental step in de novo protein design. The de facto methods use autoregressive generation, which ignores higher-order interactions that could be exploited to improve inference speed. We describe a non-autoregressive alternative that performs inference using a constant number of calls, resulting in a 23-fold speedup without a loss in performance on the CATH benchmark. Conditioned on the 3D structure, we fine-tune ProteinMPNN to perform discrete diffusion with a purity prior over the index sampling order. Our approach offers flexibility to trade off inference speed against accuracy by modulating the diffusion speed.
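To make the idea of decoding in a constant number of calls concrete, here is a minimal sketch of purity-ordered unmasking. It is not the authors' implementation: `toy_denoiser` is a hypothetical random stand-in for a structure-conditioned network such as the fine-tuned ProteinMPNN, and treating "purity" as the largest predicted probability at a position, as well as the even unmasking schedule, are assumptions made only for illustration.

```python
import numpy as np

MASK = -1          # sentinel for a position that has not been decoded yet
NUM_TOKENS = 20    # the 20 standard amino acids


def toy_denoiser(seq, rng):
    """Hypothetical stand-in for a structure-conditioned network.

    Returns random logits of shape (len(seq), NUM_TOKENS); a real model
    would condition on the backbone and the partially decoded sequence.
    """
    return rng.normal(size=(len(seq), NUM_TOKENS))


def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def purity_decode(length, num_steps, seed=0):
    """Decode a sequence in at most `num_steps` denoiser calls."""
    rng = np.random.default_rng(seed)
    seq = np.full(length, MASK, dtype=int)
    calls = 0
    for step in range(num_steps):
        masked = np.flatnonzero(seq == MASK)
        if masked.size == 0:
            break
        probs = softmax(toy_denoiser(seq, rng))
        calls += 1
        # Assumed purity score: the model's highest probability per position.
        purity = probs[masked].max(axis=-1)
        # Spread the remaining positions evenly over the remaining steps.
        k = int(np.ceil(masked.size / (num_steps - step)))
        chosen = masked[np.argsort(-purity)[:k]]
        # Sample tokens only at the most confident masked positions.
        for i in chosen:
            seq[i] = rng.choice(NUM_TOKENS, p=probs[i])
    return seq, calls


seq, calls = purity_decode(length=100, num_steps=5)
```

In this sketch the number of denoiser calls is bounded by `num_steps` regardless of sequence length, whereas an autoregressive decoder would need one call per residue. Lowering `num_steps` decodes more positions per call, which is one simple way to picture the speed and accuracy trade-off the abstract mentions.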
