Refine Drugs, Don’t Complete Them: Uniform-Source Discrete Flows for Fragment-Based Drug Discovery
Abstract
We present a versatile molecular generative model for drug discovery that supports de novo generation, fragment-constrained design, and property optimization. Our approach employs a discrete flow model that gradually transforms a uniform source distribution into the target molecular distribution. Unlike continuous-time discrete models or autoregressive approaches, our method decouples the number of sampling steps from the sequence length. This lets us increase the sampling time resolution independently of the molecular representation, improving generation quality without modifying the model architecture. As a result, our model sets a new Pareto frontier in the quality-diversity trade-off for de novo generation and a new state of the art in fragment-constrained generation. Finally, we combine a genetic algorithm, in which our model performs the crossover step, with proximal policy optimization adapted to our discrete flow setting, achieving state-of-the-art results on the Practical Molecular Optimization benchmark.
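To make the decoupling of sampling steps from sequence length concrete, the following is a minimal sketch of uniform-source sampling, not the implementation used in this work. It assumes a mixture probability path and the Euler-style update common in the discrete flow matching literature: at time t with step size h, each position is resampled from the predicted clean-token distribution with probability min(1, h / (1 - t)). The names `make_toy_denoiser` and `sample` are hypothetical, and the toy denoiser stands in for a trained network.

```python
import numpy as np


def make_toy_denoiser(target, vocab_size):
    """Hypothetical stand-in for a trained model (an assumption, not the
    paper's network). Returns per-position probabilities over the vocabulary
    that move from uniform (t = 0) toward `target` (t = 1)."""
    target = np.asarray(target)

    def denoiser(tokens, t):
        probs = np.full((len(tokens), vocab_size), (1.0 - t) / vocab_size)
        probs[np.arange(len(tokens)), target] += t
        return probs

    return denoiser


def sample(seq_len, num_steps, vocab_size, denoiser, rng):
    """Euler-style sampling from a uniform source.

    Every position may change at every step (refinement), and `num_steps`
    is chosen freely, independent of `seq_len`.
    """
    # Source sample: every token drawn uniformly from the vocabulary.
    x = rng.integers(0, vocab_size, size=seq_len)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h
        probs = denoiser(x, t)  # shape (seq_len, vocab_size)
        # Draw a candidate clean token for each position.
        proposal = np.array([rng.choice(vocab_size, p=p) for p in probs])
        # Assumed mixture-path rate: jump with probability h / (1 - t).
        jump_prob = min(1.0, h / (1.0 - t))
        jump = rng.random(seq_len) < jump_prob
        x = np.where(jump, proposal, x)
    return x
```

In this sketch, calling `sample` with `num_steps=8` or `num_steps=200` for the same `seq_len` changes only the time resolution of the sampler. The denoiser and the sequence representation stay the same, which is the property the abstract describes.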