Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift
Abstract
Overconfident mispredictions under distributional shift are a serious concern for Graph Neural Networks (GNNs) used in critical drug discovery tasks, and addressing them demands extensive reliability research. Here we first introduce CardioTox, a real-world benchmark on drug cardiotoxicity, to facilitate such efforts. Our exploratory study shows that overconfident mispredictions are often distant from the training data. This observation leads us to develop distance-aware GNNs, which we call GNN-SNGP. Through evaluation on CardioTox and three established benchmarks, we demonstrate that GNN-SNGP increases distance-awareness, reduces overconfident mispredictions, and makes better-calibrated predictions without sacrificing accuracy. Our ablation study further reveals that the embeddings learned by GNN-SNGP preserve distances better than those of its base architecture, and that this is one major factor behind the improvements.