Workshop: Machine Learning in Structural Biology Workshop

Preparation Of Labeled Cryo-ET Datasets For Training And Evaluation Of Machine Learning Models

Aygul Ishemgulova · Alex J. Noble · Tristan Bepler · Alex De Marco


We present datasets aimed at improving the efficiency of cryo-electron tomographic data analysis. While cryo-electron tomography (cryo-ET) holds immense promise as a tool for native structural biology, it faces persistent challenges in segmentation and annotation. These challenges primarily stem from the absence of diverse ground truth datasets for efficient model training, evaluation, and benchmarking. To address them, we have collected and are currently annotating datasets spanning a range of complexities. Composed of carefully selected protein mixtures and organisms with small genomes, these datasets offer a broad spectrum of structures for study. The datasets are designed to provide a robust foundation for the development and evaluation of machine learning models for annotation tasks, thereby enhancing the efficacy and applicability of cryo-ET in elucidating complex native biological structures and interactions. This ongoing project will soon make the annotated datasets publicly available, encouraging further innovation and research in the community.
