Poster in Workshop: AI for Science: The Reach and Limits of AI for Scientific Discovery

Revolutionize drug discovery with dense PPI data

Wei Lu · Lixia Yi · Jixian Zhang · Ming Gu · Zhongyue Zhang · Jiahua Rao · Shuangjia Zheng


Abstract

Drug development faces persistent tradeoffs between efficacy, safety, and developability, but existing foundation models cannot reliably predict binding affinity—the central challenge for therapeutic design. This limitation stems from sparse protein–protein interaction (PPI) datasets, which largely reflect natural protein pairs and encourage memorization rather than generalization. We propose dense PPI datasets that systematically sample mutational neighborhoods, compelling models to learn transferable interaction principles. Using scalable FACS and sequencing, billions of labeled data points can be generated at reasonable cost. These datasets would enable PPI-specific foundation models with accurate affinity prediction, improved structure modeling, and efficient exploration of interaction-aware sequence landscapes, with transformative impact on drug discovery, diagnostics, synthetic biology, and the broader life sciences.
