Multi-Objective Nanobody Design via Masked Discrete Diffusion with Simplex Refinement
Abstract
Nanobodies are compact, stable, and highly specific binding proteins that can access epitopes inaccessible to conventional antibodies, making them ideal scaffolds for therapeutic design. We present NanoMDLM, a masked discrete denoising framework for nanobody generation that learns to reconstruct CDRs on a fixed humanized scaffold, with region-specific masking that emphasizes diversity in CDR3. Building on this model, we develop NOSIE (Nanobody Optimization for Selective Interaction and Enhanced properties), a platform based on discrete simplex refinement (DSR), a gradient-free, black-box guidance method that samples CDR completions and reweights them using a Pareto-weighted softmax over predicted binding and stability scores. At inference time, DSR steers NanoMDLM toward high-performing sequences without retraining or differentiable reward access. Across multiple antigens, including the GPCR MRGPRX2, NOSIE produces nanobodies with competitive or superior in silico binding, thermostability, and CDR3 quality, as assessed by NanoNet structure prediction, AlphaFold-Multimer co-folding, and feature combination-based ranking. Together, these results provide a scalable, sequence-only framework for multi-objective nanobody design, with potential use across therapeutic applications.
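
The reweighting step in DSR can be illustrated with a minimal sketch. The abstract does not specify the exact form of the Pareto-weighted softmax, so the version below is an assumption: each sampled CDR completion receives a logit equal to a weighted sum of its predicted binding and stability scores, plus a bonus if it lies on the Pareto front, and the logits are normalized with a temperature-scaled softmax. The function names, `objective_weights`, `front_bonus`, `temperature`, and the example scores are all hypothetical and are not taken from NOSIE.

```python
import math

def pareto_front_mask(scores):
    """Flag candidates that no other candidate dominates (higher is better)."""
    mask = []
    for i, si in enumerate(scores):
        dominated = any(
            all(b >= a for a, b in zip(si, sj)) and any(b > a for a, b in zip(si, sj))
            for j, sj in enumerate(scores)
            if j != i
        )
        mask.append(not dominated)
    return mask

def pareto_weighted_softmax(scores, objective_weights=(0.5, 0.5),
                            front_bonus=1.0, temperature=1.0):
    """Assumed reweighting: scalarized score plus a Pareto-front bonus, then softmax."""
    mask = pareto_front_mask(scores)
    logits = [
        (sum(w * s for w, s in zip(objective_weights, sc))
         + (front_bonus if on_front else 0.0)) / temperature
        for sc, on_front in zip(scores, mask)
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical (binding, stability) predictions for four sampled CDR completions.
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.8)]
mask = pareto_front_mask(scores)
weights = pareto_weighted_softmax(scores)
```

In this toy example the third candidate is dominated by the second on both objectives, so it receives the smallest weight. Because the procedure needs only score values, it remains gradient-free and treats the binding and stability predictors as black boxes.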