Workshop: 4th Workshop on Self-Supervised Learning: Theory and Practice

MolSiam: Simple Siamese Self-supervised Representation Learning for Small Molecules

Joshua Yao-Yu Lin · Michael Maser · Nathan Frey · Gabriele Scalia · Omar Mahmood · Pedro O. Pinheiro · Ji Won Park · Stephen Ra · Andrew Watkins · Kyunghyun Cho


We investigate self-supervised learning with the Simple Siamese (SimSiam) representation learning framework on 2D molecular graphs. SimSiam requires no negative samples during training, making it 1) more computationally efficient and 2) less vulnerable to false negatives than contrastive learning. Leveraging unlabeled molecular data, we demonstrate that our approach, MolSiam, effectively captures the underlying features of molecules: molecules with similar properties tend to cluster in UMAP analyses of the learned representations. By fine-tuning pre-trained MolSiam models, we observe performance improvements across four downstream therapeutic property prediction tasks, all without training on negative pairs.
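For readers unfamiliar with SimSiam, a minimal sketch of its symmetrized negative-cosine objective (the loss MolSiam builds on) is below. This is our illustration, not code from the paper: function names are ours, the arrays stand in for the predictor outputs p1, p2 and projector outputs z1, z2 of two augmented views, and the stop-gradient on z is only noted in a comment since NumPy arrays carry no gradients.

```python
import numpy as np

def neg_cosine(p, z):
    # SimSiam loss term D(p, z): negative cosine similarity.
    # In the actual training loop, z is wrapped in stop-gradient
    # (e.g. detached); numerically that is a no-op, so here z is
    # just a plain array.
    p = p / np.linalg.norm(p)
    z = z / np.linalg.norm(z)
    return -float(np.dot(p, z))

def simsiam_loss(p1, z1, p2, z2):
    # Symmetrized SimSiam objective:
    #   L = D(p1, z2) / 2 + D(p2, z1) / 2
    # where (p1, z1) come from view 1 and (p2, z2) from view 2.
    return 0.5 * neg_cosine(p1, z2) + 0.5 * neg_cosine(p2, z1)

# When both views map to the same representation, the loss
# reaches its minimum, close to -1.
v = np.array([1.0, 2.0, 3.0])
print(simsiam_loss(v, v, v, v))
```

Because no negative pairs appear in the loss, each training step only needs the two views of one molecule, which is the source of the efficiency and false-negative robustness noted above.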