Skip to yearly menu bar Skip to main content


Poster
in
Workshop: Generative AI and Biology (GenBio@NeurIPS2023)

Through the looking glass: navigating in latent space to optimize over combinatorial synthesis libraries

Aryan Pedawi · Saulo de Oliveira · Henry van den Bedem

Keywords: [ virtual screening ] [ Reinforcement Learning ] [ Generative Models ]


Abstract: Commercially available, synthesis-on-demand virtual libraries contain trillions of readily synthesizable compounds and can serve as a bridge between _in silico_ property optimization and _in vitro_ validation. However, as these libraries continue to grow exponentially in size, traditional enumerative search strategies that scale linearly with the number of compounds encounter significant limitations. Hierarchical enumeration approaches scale more gracefully in library size, but are inherently greedy and implicitly rest on an additivity assumption of the molecular property with respect to its sub-components. In this work, we present a reinforcement learning approach to retrieving compounds from ultra-large libraries that satisfy a set of user-specified constraints. Along the way, we derive what we believe to be a new family of $\alpha$-divergences that may be of general interest in density estimation. Our method first trains a library-constrained generative model over a virtual library and subsequently trains a normalizing flow to learn a distribution over latent space that decodes constraint-satisfying compounds. The proposed approach naturally accommodates specification of multiple molecular property constraints and requires only black box access to the molecular property functions, thereby supporting a broad class of search problems over these libraries.

Chat is not available.