D-Flow: Multi-modality Flow Matching for D-peptide Design
Abstract
Proteins play crucial roles in biological processes, and therapeutic peptides are emerging as promising pharmaceutical agents that can engage target binding sites previously considered undruggable. Although deep learning has advanced peptide discovery, generating D-proteins, which are composed of D-amino acids, remains challenging because natural examples are scarce. This paper proposes D-Flow, a full-atom flow-based framework for de novo D-peptide design. D-Flow is conditioned on receptor binding and uses a comprehensive representation of peptide structure that incorporates backbone frames, side-chain angles, and discrete amino acid types. To address the lack of training data for D-proteins, we implement a mirror-image algorithm that converts the chirality of L-receptors. Furthermore, we enhance D-Flow's capacity by integrating large protein language models with structural awareness through a lightweight structural adapter. A two-stage training pipeline and a controlling toolkit also enable D-Flow to transition from general protein design to targeted binder design while preserving pre-training knowledge. Extensive experimental results on the PepMerge benchmark demonstrate D-Flow's effectiveness, particularly in designing peptides composed entirely of D-residues. This approach represents a significant advancement in computational D-peptide design, offering unique opportunities for bioorthogonal and stable molecular tools and diagnostics.
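As a rough illustration of the mirror-image idea, the sketch below reflects atomic coordinates through a plane and checks that the handedness at a stereocenter flips. The abstract does not specify how D-Flow's mirror-image algorithm is implemented, so this is only a minimal sketch under our own assumptions: the function names `mirror` and `signed_volume`, the choice of the yz-plane, and the toy coordinates are all hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def mirror(coords: List[Vec3]) -> List[Vec3]:
    """Reflect every atom through the yz-plane (x -> -x).

    Assumption: any single reflection inverts chirality; the plane is arbitrary.
    """
    return [(-x, y, z) for x, y, z in coords]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def signed_volume(ca: Vec3, n: Vec3, c: Vec3, cb: Vec3) -> float:
    """Triple product (N-CA) . ((C-CA) x (CB-CA)).

    Its sign encodes the handedness of the substituents around CA.
    """
    u, v, w = _sub(n, ca), _sub(c, ca), _sub(cb, ca)
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return u[0] * cross[0] + u[1] * cross[1] + u[2] * cross[2]


# Toy stereocenter (hypothetical coordinates, not a real residue): CA, N, C, CB.
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

before = signed_volume(*toy)
after = signed_volume(*mirror(toy))
print(before, after)  # the two values have opposite signs
```

Because a reflection preserves all interatomic distances while inverting handedness, applying it to an L-receptor complex yields a geometrically valid mirror-image complex, which is the general property that a mirror-image data strategy relies on.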