Poster in Workshop: AI for Science: The Reach and Limits of AI for Scientific Discovery

A Multi-Modal Deep Learning Model for Drug Potency Prediction: Leveraging Features from Physics-Based Docking and Advanced Co-Folding Methods

Claire Suen · BoRam Lee · Matthew Adrian · Jeffrey Zhou · Hyeyun Jung · David He · Gean Hu · Kelly Hui · Aditi Jain · Qamil Mirza Bin Abdullah · Milena Novakovic · Joseph Park · Winston Qian · Aarav Shah · Xina Wang · Yunsie Chung · Alan Cheng


Abstract

In drug discovery, accurate prediction of a compound's potency is crucial for the efficient design and optimization of small-molecule drugs. While machine learning and deep learning approaches can be useful, they generally require large amounts of data that are rarely available in practical drug discovery programs. We address this limitation by developing a multi-modal deep learning framework that enhances a graph neural network, Chemprop, by integrating explicit protein-ligand interaction features. We generated protein-ligand poses using both a physics-based docking method and two deep learning-based co-folding methods, Boltz-1 and Boltz-2. Our model demonstrates improved predictive accuracy for $IC_{50}$ values on two diverse targets, CYP2D6 inhibition and EGFR kinase. Additionally, our methods leveraging co-folding consistently outperform the traditional docking-based approach. Feature selection analysis further revealed that pi-stacking interactions were the most informative, appearing in the top-performing feature sets across all methods. In low-data regimes, the models informed by PLIP (Protein-Ligand Interaction Profiler) features consistently outperformed established baselines. This work provides a scalable method to fuse complementary data modalities, offering both enhanced predictive performance and valuable mechanistic insights into drug-target interactions.
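As a rough illustration of what fusing the two modalities could look like, the sketch below concatenates a learned molecule embedding with a vector of per-pose interaction counts, and converts an $IC_{50}$ value to the usual $pIC_{50}$ regression target. This is not the authors' implementation: the abstract does not specify the fusion mechanism, and the interaction type list, the function names, and the example values are all hypothetical. Only pi-stacking is named in the abstract; the other two types are assumptions.

```python
import math

# Hypothetical feature order. Only pi-stacking is named in the abstract;
# the other interaction types are assumptions made for illustration.
INTERACTION_TYPES = ("pi_stacking", "hydrogen_bond", "hydrophobic")


def ic50_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    return -math.log10(ic50_nm * 1e-9)


def interaction_vector(counts: dict) -> list:
    """Build a fixed-order count vector; missing types count as zero."""
    return [float(counts.get(name, 0)) for name in INTERACTION_TYPES]


def fuse(embedding: list, counts: dict) -> list:
    """Fuse by concatenation: [molecule embedding | interaction counts]."""
    return list(embedding) + interaction_vector(counts)


# Made-up values: a 3-dimensional stand-in for a graph neural network
# embedding, and interaction counts extracted from one protein-ligand pose.
embedding = [0.12, -0.40, 0.88]
pose_counts = {"pi_stacking": 2, "hydrogen_bond": 1}

fused = fuse(embedding, pose_counts)  # input to a downstream regression head
target = ic50_to_pic50(1000.0)        # 1000 nM = 1 uM
```

Concatenation is only one simple option for combining modalities; the fused vector would then feed a regression head that predicts the potency target.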
