Workshop: Machine Learning in Structural Biology

Predicting cryptic pocket opening from protein structures using graph neural networks

Artur Meller · Michael Ward · Meghana Kshirsagar · Felipe Oviedo · Jonathan Borowsky · Juan Lavista Ferres · Greg Bowman


Proteins undergo structural fluctuations in vivo, which can lead to the formation of pockets that are absent in the native, folded structural state (i.e., “cryptic pockets”). Inferring cryptic pockets from experimentally determined protein structures is valuable in drug development, since ligands typically require a pocket for tight binding. Toward this end, many studies employ molecular dynamics simulations to model protein structural fluctuations, but these simulations often require hundreds of GPU hours. We hypothesized that machine learning algorithms that predict sites of cryptic pockets directly from folded structures could accelerate cryptic pocket discovery. Here, we adapt a graph neural network architecture, which previously achieved state-of-the-art performance on protein structure learning tasks, to predict sites of cryptic pocket formation from experimental protein structures. We trained this model by repurposing an existing molecular simulation dataset that was generated to identify cryptic pockets in SARS-CoV-2 proteins. Our model achieves an AUC of 0.78 on a held-out test set of protein structures with ligands bound to cryptic sites and requires less than 1 second of compute on a single GPU.
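
As a rough illustration of two ideas mentioned above, the sketch below shows how a protein structure might be turned into a residue graph and how an AUC could be computed from per-residue scores. It is not the authors' implementation: the abstract does not say how the graph is built, so the k-nearest-neighbor construction over one coordinate per residue is an assumption, and the names `knn_residue_graph` and `roc_auc` are hypothetical. The coordinates, labels, and scores are toy values, not data from the study.

```python
import math


def knn_residue_graph(coords, k=3):
    """Link each residue to its k nearest neighbors (assumed graph construction).

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha positions).
    Returns a list of directed edges (i, j).
    """
    edges = []
    for i, ci in enumerate(coords):
        # Sort all other residues by distance to residue i (ties broken by index).
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: four residues on a line, each linked to its single nearest neighbor.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_edges = knn_residue_graph(toy_coords, k=1)

# Toy per-residue labels (1 = residue lines a cryptic pocket) and model scores.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 positive/negative pairs ranked correctly
```

In a graph neural network, node features would then be passed along edges such as `toy_edges` to produce one score per residue, and a metric like `roc_auc` would compare those scores against known cryptic-site residues.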
