Workshop: Machine Learning in Structural Biology Workshop

Investigating Protein-DNA Binding Energetic of Mismatched DNA

Ruben Solozabal · Tamir Avioz · Yunxiang LI · Le Song · Martin Takac · Ariel Afek


Transcription factors (TFs) bind to regulatory DNA regions, modulating gene expression. Although various high-throughput techniques have been used to characterize protein binding preferences, this work is the first to extend these studies to non-canonical mismatched bases. The mutagenesis study presented here allows us to determine the binding profile across the double-stranded DNA sequence. Additionally, we leverage deep learning to complete the map of pairwise interactions. In this context, we introduce ShapPWM, a motif strategy that marginalizes individual nucleotide contributions by computing Shapley values. Our model reveals that strongly synergistic interactions appear between nucleotides in the regions flanking the contacts. This information offers valuable insights into the binding mechanism and reaction energy without the need to solve intricate crystal structures.
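To make the Shapley-value idea concrete, the following is a minimal sketch of exact Shapley attribution over sequence positions. It is not the authors' ShapPWM implementation: the function names `shapley_values` and `toy_score`, the three-position sequence, and the energy terms are all hypothetical, chosen only to show how a pairwise interaction is shared between the positions involved.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_positions):
    """Exact Shapley value of each position for a set function value_fn.

    value_fn takes a frozenset of position indices and returns a score.
    The cost is exponential in n_positions, so this only suits short motifs.
    """
    players = list(range(n_positions))
    phi = [0.0] * n_positions
    for i in players:
        others = [p for p in players if p != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain i.
            weight = (
                factorial(k) * factorial(n_positions - k - 1)
                / factorial(n_positions)
            )
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of position i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_score(present):
    """Hypothetical binding score (assumption, not measured data).

    Each position in `present` adds an independent term; positions 0 and 1
    add an extra synergistic term when both are present.
    """
    additive = {0: 1.0, 1: 2.0, 2: 0.5}
    total = sum(additive[p] for p in present)
    if 0 in present and 1 in present:
        total += 1.0  # pairwise interaction between positions 0 and 1
    return total


phi = shapley_values(toy_score, 3)
```

In this toy setting the pairwise term is split equally between positions 0 and 1, and the attributions sum to the difference between the full score and the empty-set score. For realistic sequence lengths, exact enumeration is infeasible and a sampling-based approximation would be needed.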
