One of the greatest challenges facing biologists and the statisticians who work with them is the goal of representation learning: discovering and defining appropriate representations of data in order to perform complex, multi-scale machine learning tasks. This workshop is designed to bring together trainee and expert machine learning scientists with those at the very forefront of biological research for this purpose. Our full-day workshop will advance the joint project of the CS and biology communities with the goal of "Learning Meaningful Representations of Life" (LMRL), emphasizing interpretable representation learning of structure and principle.
We will organize around the theme "From Genomes to Phenotype, and Back Again": an extension of a long-standing effort in the biological sciences to assign biochemical and cellular functions to the millions of as-yet uncharacterized gene products discovered by genome sequencing. ML methods to predict phenotype from genotype are rapidly advancing and starting to achieve widespread success. At the same time, large-scale gene synthesis and genome editing technologies have rapidly matured and become the foundation for new scientific insight as well as biomedical and industrial advances. ML-based methods have the potential to accelerate and extend the application of these technologies by providing tools for solving the key problem of going "back again," from a desired phenotype to the genotype necessary to achieve that set of observable characteristics. We will focus on this foundational design problem and its application to areas ranging from protein engineering to phylogeny, immunology, vaccine design, and next-generation therapies.
Generative modeling, semi-supervised learning, optimal experimental design, Bayesian optimization, and many other areas of machine learning have the potential to address the phenotype-to-genotype problem, and we propose to bring together experts from these fields and beyond.
LMRL will take place on Dec 13, 2021.