Probing Functional Plasticity in Peptide–Protein Interaction with Minimal Data
Abstract
Machine learning has accelerated protein design, yet peptide discovery remains constrained by scarce training data. Here, we present MDMI, an ML-assisted, structure-aware pipeline that integrates AlphaFold-Multimer predictions with a compact ML model to guide peptide diversification. Using split GFP as a testbed, MDMI generated functional peptide binders with more than 50% sequence divergence from the wild type. When benchmarked against state-of-the-art models such as RFdiffusion and PepMLM, MDMI produced fourfold more high-confidence binders while requiring only a few dozen labeled examples. By discovering novel functional variants with high sequence divergence, the MDMI pipeline demonstrates the capability to map functional plasticity at peptide–protein interfaces, offering a practical framework for low-N peptide engineering.