State-of-the-art methods for predicting the structures of interacting protein complexes rely on the construction of paired multiple sequence alignments (MSAs), whose rows contain concatenated pairs of homologues of each of the interacting chains. Despite the inherent difficulty of accurately pairing interacting homologues of each chain, most existing methods use simple heuristic strategies for this purpose. The accuracy of these heuristic strategies and the consequences of their widespread usage remain poorly understood, due in large part to the paucity of ground-truth data on correct pairings. To remedy this situation, we propose a novel benchmark setting for interaction partner pairing algorithms, based on domain-domain interactions within single protein chains. The co-existence of pairs of domains within single chains means that ground-truth pairs of homologues are known a priori, allowing both the accuracy of pairing strategies and the influence of inaccurate pairings on downstream inferences to be quantified directly. We provide evidence that the widely used best-hit pairing strategy leads in many cases to very noisy paired MSAs, from which inferences of 3D structure can be significantly less accurate than those made using the correctly paired MSAs. We conclude that further improvements in pairing strategies promise significant benefits for structure predictors capable of exploiting co-evolutionary signal.
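The benchmark idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes one common form of best-hit pairing (within each species, pair the highest-scoring homologue of each domain family), and the function names, the tuple layout, and the toy identifiers are all hypothetical. Because both domains of a ground-truth pair sit on the same chain, the accuracy of a pairing can be scored by checking whether the paired domains share a parent chain.

```python
def best_hit_pairing(hits_a, hits_b):
    """Pair the top-scoring homologue of each family within every species.

    hits_a / hits_b: lists of (species, domain_id, score_to_query) tuples
    (a hypothetical layout chosen for this sketch).
    Returns a list of (species, domain_id_a, domain_id_b), sorted by species.
    """
    def best_per_species(hits):
        best = {}
        for species, domain_id, score in hits:
            if species not in best or score > best[species][1]:
                best[species] = (domain_id, score)
        return best

    best_a = best_per_species(hits_a)
    best_b = best_per_species(hits_b)
    # Only species with a homologue of both domains can contribute a pair.
    shared = sorted(best_a.keys() & best_b.keys())
    return [(sp, best_a[sp][0], best_b[sp][0]) for sp in shared]


def pairing_accuracy(pairs, chain_of):
    """Fraction of pairs whose two domains come from the same chain."""
    if not pairs:
        return 0.0
    correct = sum(chain_of[a] == chain_of[b] for _, a, b in pairs)
    return correct / len(pairs)


# Toy data (invented for illustration): two human chains, one mouse chain.
hits_a = [("human", "A1", 0.9), ("human", "A2", 0.7), ("mouse", "A3", 0.8)]
hits_b = [("human", "B2", 0.95), ("human", "B1", 0.6), ("mouse", "B3", 0.85)]
chain_of = {"A1": "chainX", "B1": "chainX",
            "A2": "chainY", "B2": "chainY",
            "A3": "chainZ", "B3": "chainZ"}

pairs = best_hit_pairing(hits_a, hits_b)
accuracy = pairing_accuracy(pairs, chain_of)
```

In this toy case the human best hits come from different chains, so the pairing is wrong for one of the two species, which is the kind of noise the benchmark is designed to expose.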
Author Information
Alex Hawkins-Hooker (University College London)
David Jones (University College London)
Brooks Paige (University College London)
More from the Same Authors
- 2021: MSA-Conditioned Generative Protein Language Models for Fitness Landscape Modelling and Design
  Alex Hawkins-Hooker · David Jones · Brooks Paige
- 2022: Towards Healing the Blindness of Score Matching
  Mingtian Zhang · Oscar Key · Peter Hayes · David Barber · Brooks Paige · Francois-Xavier Briol
- 2023 Poster: Moment Matching Denoising Gibbs Sampling
  Mingtian Zhang · Alex Hawkins-Hooker · Brooks Paige · David Barber
- 2020 Workshop: Machine Learning for Molecules
  José Miguel Hernández-Lobato · Matt Kusner · Brooks Paige · Marwin Segler · Jennifer Wei
- 2019: Molecules and Genomes
  David Haussler · Djork-Arné Clevert · Michael Keiser · Alan Aspuru-Guzik · David Duvenaud · David Jones · Jennifer Wei · Alexander D'Amour