
Workshop: AI for Science: from Theory to Practice

STRIDE: Structure-guided Generation for Inverse Design of Molecules

Shehtab Zaman · Denis Akhiyarov · Mauricio Araya-Polo · Kenneth Chiu

Abstract: Machine learning, and especially deep learning, has had an increasing impact on molecule and materials design. In particular, growing access to abundant high-quality small-molecule data has enabled generative modeling for drug design, with promising results for drug discovery. However, for many important classes of materials, such as catalysts, antioxidants, and metal-organic frameworks, such large datasets are not available. Families of molecules with limited samples and structural similarities are especially prevalent in industrial applications. As is well known, retraining and even fine-tuning are challenging on such small datasets. Novel, practically applicable molecules are most often derivatives of well-known molecules, which suggests an approach to addressing data scarcity. To address this problem, we introduce $\textbf{STRIDE}$, a generative workflow that produces novel molecules with an unconditional generative model guided by known molecules, without any retraining. We generate molecules outside of the training data from a highly specialized set of antioxidant molecules. Our generated molecules have on average 21.7\% lower synthetic accessibility scores, and guiding also reduces the ionization potential of generated molecules by 5.9\%.
