Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their predictions for many molecular properties. However, this information is infeasible to compute at the scale required by most real-world applications. We propose pre-training a model to understand the geometry of molecules given only their 2D molecular graph. Using methods from self-supervised learning, we maximize the mutual information between a 3D summary vector and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still generates implicit 3D information and can use it to inform downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of molecular properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Crucially, the learned representations can be effectively transferred between datasets with vastly different molecules.
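To make the pre-training objective above more concrete, the following is a minimal sketch of one common way to maximize a lower bound on mutual information between two views: an InfoNCE-style contrastive loss. The abstract does not state which estimator the authors use, so the choice of InfoNCE, the cosine similarity, the temperature value, and the names `info_nce`, `z2d`, and `z3d` are all assumptions made for illustration, and the toy vectors stand in for real GNN outputs and 3D summary vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(z2d, z3d, temperature=0.1):
    """Mean InfoNCE loss over a batch (illustrative, not the authors' code).

    z2d[i]: hypothetical GNN embedding of molecule i, computed from its 2D graph.
    z3d[i]: hypothetical 3D summary vector of the same molecule.
    The pair (i, i) is the positive; every (i, j) with j != i is a negative.
    """
    n = len(z2d)
    total = 0.0
    for i in range(n):
        logits = [cosine(z2d[i], z3d[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy of picking the matching 3D vector for molecule i.
        total += log_denominator - logits[i]
    return total / n


# Toy batch of three "molecules" with 2-dimensional embeddings.
z3d = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]   # each 2D view matches its 3D view
shuffled = [aligned[1], aligned[2], aligned[0]]   # pairs deliberately mismatched

loss_aligned = info_nce(aligned, z3d)
loss_shuffled = info_nce(shuffled, z3d)
print(loss_aligned < loss_shuffled)
```

Minimizing this loss pushes each 2D embedding toward the 3D summary vector of the same molecule and away from those of the other molecules in the batch. For a batch of size N, log N minus the loss is a lower bound on the mutual information between the two views, which is why a loss of this kind is often used as a stand-in for maximizing mutual information directly.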
Author Information
Hannes Stärk (Technical University of Munich)
Gabriele Corso (MIT)
Christian Dallago (Technical University of Munich)
Stephan Günnemann (Technical University of Munich)
Pietro Lió (University of Cambridge)
More from the Same Authors
-
2021 : FLIP: Benchmark tasks in fitness landscape inference for proteins »
Christian Dallago · Jody Mou · Kadina Johnston · Bruce Wittmann · Nicholas Bhattacharya · Samuel Goldman · Ali Madani · Kevin Yang -
2021 : Whole Brain Vessel Graphs: A Dataset and Benchmark for Graph Learning and Neuroscience »
Johannes C. Paetzold · Julian McGinnis · Suprosanna Shit · Ivan Ezhov · Paul Büschl · Chinmay Prabhakar · Anjany Sekuboyina · Mihail Todorov · Georgios Kaissis · Ali Ertürk · Stephan Günnemann · Bjoern Menze -
2021 : Interpretable Data Analysis for Bench-to-Bedside Research »
Zohreh Shams · Botty Dimanov · Nikola Simidjievski · Helena Andres-Terre · Paul Scherer · Urška Matjašec · Mateja Jamnik · Pietro Lió -
2021 : Structure-aware generation of drug-like molecules »
Pavol Drotar · Arian Jamasb · Ben Day · Catalina Cangea · Pietro Lió -
2021 : 3D Pre-training improves GNNs for Molecular Property Prediction »
Hannes Stärk · Dominique Beaini · Gabriele Corso · Prudencio Tossou · Christian Dallago · Stephan Günnemann · Pietro Lió -
2021 : Approximate Latent Force Model Inference »
Jacob Moss · Felix Opolka · Pietro Lió -
2022 : Learning Feynman Diagrams using Graph Neural Networks »
Alexander Norcliffe · Harrison Mitchell · Pietro Lió -
2022 : DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking »
Gabriele Corso · Hannes Stärk · Bowen Jing · Regina Barzilay · Tommi Jaakkola -
2022 : A physics-informed search for metric solutions to Ricci flow, their embeddings, and visualisation »
Aarjav Jain · Challenger Mishra · Pietro Lió -
2022 : Improving Classification and Data Imputation for Single-Cell Transcriptomics with Graph Neural Networks »
Han-Bo Li · Ramon Viñas Torné · Pietro Lió -
2022 : Structure-based Drug Design with Equivariant Diffusion Models »
Arne Schneuing · Yuanqi Du · Charles Harris · Arian Jamasb · Ilia Igashov · Weitao Du · Tom Blundell · Pietro Lió · Carla Gomes · Max Welling · Michael Bronstein · Bruno Correia -
2022 : A Federated Learning benchmark for Drug-Target Interaction »
Filip Svoboda · Gianluca Mittone · Nicholas Lane · Pietro Lió -
2022 : torchode: A Parallel ODE Solver for PyTorch »
Marten Lienen · Stephan Günnemann -
2022 : Standards, tooling and benchmarks to probe representation learning on proteins »
Joaquin Gomez Sanchez · Sebastian Franz · Michael Heinzinger · Burkhard Rost · Christian Dallago -
2022 : Benchmarking Graph Neural Network-based Imputation Methods on Single-Cell Transcriptomics Data »
Han-Bo Li · Ramon Viñas Torné · Pietro Lió -
2022 : Sheaf Attention Networks »
Federico Barbero · Cristian Bodnar · Haitz Sáez de Ocáriz Borde · Pietro Lió -
2022 : Modeling Temporal Data as Continuous Functions with Process Diffusion »
Marin Biloš · Kashif Rasul · Anderson Schneider · Yuriy Nevmyvaka · Stephan Günnemann -
2022 : Molecular Docking with Diffusion Generative Models »
Gabriele Corso · Hannes Stärk · Bowen Jing · Regina Barzilay · Tommi Jaakkola -
2022 : Training Differentially Private Graph Neural Networks with Random Walk Sampling »
Morgane Ayle · Jan Schuchardt · Lukas Gosch · Daniel Zügner · Stephan Günnemann -
2022 : Revisiting Robustness in Graph Machine Learning »
Lukas Gosch · Daniel Sturm · Simon Geisler · Stephan Günnemann -
2022 : Human Interventions in Concept Graph Networks »
Lucie Charlotte Magister · Pietro Barbiero · Dmitry Kazhdan · Federico Siciliano · Gabriele Ciravegna · Fabrizio Silvestri · Mateja Jamnik · Pietro Lió -
2023 Poster: Graph Denoising Diffusion for Inverse Protein Folding »
Kai Yi · Bingxin Zhou · Yiqing Shen · Pietro Lió · Yuguang Wang -
2023 Poster: Add and Thin: Diffusion for Temporal Point Processes »
David Lüdke · Marin Biloš · Oleksandr Shchur · Marten Lienen · Stephan Günnemann -
2023 Poster: (Provable) Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More »
Jan Schuchardt · Yan Scholten · Stephan Günnemann -
2023 Poster: Interpretable Graph Networks Formulate Universal Algebra Conjectures »
Francesco Giannini · Stefano Fioravanti · Oguzhan Keskin · Alisia Lupidi · Lucie Charlotte Magister · Pietro Lió · Pietro Barbiero -
2023 Poster: Sheaf Hypergraph Networks »
Iulia Duta · Giulia Cassarà · Fabrizio Silvestri · Pietro Lió -
2023 Poster: Adversarial Training for Graph Neural Networks »
Lukas Gosch · Simon Geisler · Daniel Sturm · Bertrand Charpentier · Daniel Zügner · Stephan Günnemann -
2023 Poster: Hierarchical Randomized Smoothing »
Yan Scholten · Jan Schuchardt · Aleksandar Bojchevski · Stephan Günnemann -
2023 Workshop: Machine Learning in Structural Biology Workshop »
Hannah Wayment-Steele · Roshan Rao · Ellen Zhong · Sergey Ovchinnikov · Gabriele Corso · Gina El Nesr -
2022 : Contributed Talk: Revisiting Robustness in Graph Machine Learning »
Lukas Gosch · Daniel Sturm · Simon Geisler · Stephan Günnemann -
2022 : Dynamic outcomes-based clustering of disease progression in mechanically ventilated patients »
Emma Rocheteau · Ioana Bica · Pietro Lió · Ari Ercole -
2022 Poster: Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off »
Mateo Espinosa Zarlenga · Pietro Barbiero · Gabriele Ciravegna · Giuseppe Marra · Francesco Giannini · Michelangelo Diligenti · Zohreh Shams · Frederic Precioso · Stefano Melacci · Adrian Weller · Pietro Lió · Mateja Jamnik -
2022 Poster: Are Defenses for Graph Neural Networks Robust? »
Felix Mujkanovic · Simon Geisler · Stephan Günnemann · Aleksandar Bojchevski -
2022 Poster: Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs »
Cristian Bodnar · Francesco Di Giovanni · Benjamin Chamberlain · Pietro Lió · Michael Bronstein -
2022 Poster: Invariance-Aware Randomized Smoothing Certificates »
Jan Schuchardt · Stephan Günnemann -
2022 Poster: Torsional Diffusion for Molecular Conformer Generation »
Bowen Jing · Gabriele Corso · Jeffrey Chang · Regina Barzilay · Tommi Jaakkola -
2022 Poster: Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Biomolecular Structures and Interaction Networks »
Arian Jamasb · Ramon Viñas Torné · Eric Ma · Yuanqi Du · Charles Harris · Kexin Huang · Dominic Hall · Pietro Lió · Tom Blundell -
2022 Poster: Composite Feature Selection Using Deep Ensembles »
Fergus Imrie · Alexander Norcliffe · Pietro Lió · Mihaela van der Schaar -
2022 Poster: Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution »
Leon Hetzel · Simon Boehm · Niki Kilbertus · Stephan Günnemann · Mohammad Lotfollahi · Fabian Theis -
2022 Poster: Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks »
Yan Scholten · Jan Schuchardt · Simon Geisler · Aleksandar Bojchevski · Stephan Günnemann -
2022 Poster: SizeShiftReg: a Regularization Method for Improving Size-Generalization in Graph Neural Networks »
Davide Buffelli · Pietro Lió · Fabio Vandin -
2021 : Learning Graph Search Heuristics »
Michal Pándy · Rex Ying · Gabriele Corso · Petar Veličković · Jure Leskovec · Pietro Liò -
2021 : Neural ODE Processes: A Short Summary »
Alexander Norcliffe · Cristian Bodnar · Ben Day · Jacob Moss · Pietro Lió -
2021 : On Second Order Behaviour in Augmented Neural ODEs: A Short Summary »
Alexander Norcliffe · Cristian Bodnar · Ben Day · Nikola Simidjievski · Pietro Lió -
2021 Poster: Robustness of Graph Neural Networks at Scale »
Simon Geisler · Tobias Schmidt · Hakan Şirin · Daniel Zügner · Aleksandar Bojchevski · Stephan Günnemann -
2021 Poster: Directional Message Passing on Molecular Graphs via Synthetic Coordinates »
Johannes Gasteiger · Chandan Yeshwanth · Stephan Günnemann -
2021 Poster: Neural Flows: Efficient Alternative to Neural ODEs »
Marin Biloš · Johanna Sommer · Syama Sundar Rangapuram · Tim Januschowski · Stephan Günnemann -
2021 Poster: Detecting Anomalous Event Sequences with Temporal Point Processes »
Oleksandr Shchur · Ali Caner Turkmen · Tim Januschowski · Jan Gasthaus · Stephan Günnemann -
2021 Poster: GemNet: Universal Directional Graph Neural Networks for Molecules »
Johannes Gasteiger · Florian Becker · Stephan Günnemann -
2021 Poster: Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification »
Maximilian Stadler · Bertrand Charpentier · Simon Geisler · Daniel Zügner · Stephan Günnemann -
2020 Poster: Constraining Variational Inference with Geometric Jensen-Shannon Divergence »
Jacob Deasy · Nikola Simidjievski · Pietro Lió -
2020 Poster: Fast and Flexible Temporal Point Processes with Triangular Maps »
Oleksandr Shchur · Nicholas Gao · Marin Biloš · Stephan Günnemann -
2020 Poster: On Second Order Behaviour in Augmented Neural ODEs »
Alexander Norcliffe · Cristian Bodnar · Ben Day · Nikola Simidjievski · Pietro Lió -
2020 Poster: Deep Rao-Blackwellised Particle Filters for Time Series Forecasting »
Richard Kurle · Syama Sundar Rangapuram · Emmanuel de Bézenac · Stephan Günnemann · Jan Gasthaus -
2020 Poster: Reliable Graph Neural Networks via Robust Aggregation »
Simon Geisler · Daniel Zügner · Stephan Günnemann -
2020 Oral: Fast and Flexible Temporal Point Processes with Triangular Maps »
Oleksandr Shchur · Nicholas Gao · Marin Biloš · Stephan Günnemann -
2020 Poster: Posterior Network: Uncertainty Estimation without OOD Samples via Density-Based Pseudo-Counts »
Bertrand Charpentier · Daniel Zügner · Stephan Günnemann -
2019 Poster: Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift »
Stephan Rabanser · Stephan Günnemann · Zachary Lipton -
2019 Poster: Diffusion Improves Graph Learning »
Johannes Gasteiger · Stefan Weißenberger · Stephan Günnemann -
2019 Poster: Uncertainty on Asynchronous Time Event Prediction »
Marin Biloš · Bertrand Charpentier · Stephan Günnemann -
2019 Spotlight: Uncertainty on Asynchronous Time Event Prediction »
Marin Biloš · Bertrand Charpentier · Stephan Günnemann -
2019 Poster: Certifiable Robustness to Graph Perturbations »
Aleksandar Bojchevski · Stephan Günnemann