Property Adherent Molecular Generation with Constrained Discrete Diffusion
Michael Cardei · Jacob K Christopher · Bhavya Kailkhura · Tom Hartvigsen · Ferdinando Fioretto
Abstract
Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. In life science settings, such as the design of molecular strings (SMILES) and other biological sequences, these models have emerged as a promising alternative to autoregressive architectures, presenting an opportunity to enforce sequence-level constraints, a capability that existing left-to-right sequence design cannot natively provide. This paper capitalizes on this opportunity by introducing $\textbf{Constrained Discrete Diffusion}$ (CDD), a novel integration of differentiable constraint optimization within the diffusion process to ensure adherence to biosafety policies and design properties during generation. Unlike conventional generators that often rely on post-hoc filtering or model retraining for controllable generation, CDD imposes constraints directly within the discrete diffusion sampling process, resulting in a training-free and effective approach. Experiments in property-adherent molecular design, toxicity-bounded generation, and novelty enforcement demonstrate that CDD achieves $\textbf{zero constraint violations}$ across a diverse array of tasks, outperforming autoregressive and existing discrete diffusion approaches.