In many natural domains, changing a small part of an entity can transform its semantics; for example, a single word change can alter the meaning of a sentence, or a single amino acid change can mutate a viral protein to escape antiviral treatment or immunity. Although identifying such mutations can be desirable (for example, therapeutic design that anticipates avenues of viral escape), the rules governing semantic change are often hard to quantify. Here, we introduce the problem of identifying mutations with a large effect on semantics, but where valid mutations are under complex constraints (for example, English grammar or biological viability), which we refer to as constrained semantic change search (CSCS). We propose an unsupervised solution based on language models that simultaneously learn continuous latent representations. We report good empirical performance on CSCS of single-word mutations to news headlines, map a continuous semantic space of viral variation, and, notably, show unprecedented zero-shot prediction of single-residue escape mutations to key influenza and HIV proteins, suggesting a productive link between modeling natural language and pathogenic evolution.
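One way to make the CSCS objective concrete is to score each candidate mutation on two axes, how far it moves the latent representation and how plausible the language model finds it, and then combine the two. The sketch below is illustrative only: the abstract does not specify a scoring rule, so the L1 distance, the rank-sum combination, the weight `beta`, and all function names and toy numbers here are assumptions rather than the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Hypothetical candidate record: (label, latent embedding, model plausibility).
Candidate = Tuple[str, Sequence[float], float]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute coordinate differences between two embeddings."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_ascending(values: Sequence[float]) -> List[int]:
    """Return 1-based ranks; the largest value receives the largest rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def cscs_rank(
    original_embedding: Sequence[float],
    candidates: Sequence[Candidate],
    beta: float = 1.0,
) -> List[Tuple[str, float]]:
    """Order candidates by rank(semantic change) + beta * rank(plausibility).

    Assumption: a simple rank sum is used so that the two quantities,
    which live on different scales, can be combined without calibration.
    """
    changes = [l1_distance(original_embedding, emb) for _, emb, _ in candidates]
    plausibilities = [p for _, _, p in candidates]
    change_ranks = rank_ascending(changes)
    plausibility_ranks = rank_ascending(plausibilities)
    scored = [
        (label, change_rank + beta * plausibility_rank)
        for (label, _, _), change_rank, plausibility_rank in zip(
            candidates, change_ranks, plausibility_ranks
        )
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (made up for illustration): a 2-D latent space and four mutations.
original = [0.0, 0.0]
toy_candidates: List[Candidate] = [
    ("synonym", [0.1, 0.0], 0.50),       # plausible, but meaning barely moves
    ("escape", [2.0, 1.0], 0.40),        # plausible and meaning shifts
    ("nonsense", [3.0, 3.0], 0.01),      # large shift, but implausible
    ("rare_synonym", [0.2, 0.0], 0.05),  # small shift and implausible
]
ranked = cscs_rank(original, toy_candidates)
```

On this toy input the candidate labeled `escape` comes first, because it is the only one that ranks reasonably high on both semantic change and plausibility; a mutation that scores well on just one axis is pushed down.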
Author Information
Brian Hie (Massachusetts Institute of Technology)
Ellen Zhong (Massachusetts Institute of Technology)
Bryan Bryson (Massachusetts Institute of Technology)
Bonnie Berger (Massachusetts Institute of Technology)
More from the Same Authors
- 2021: Adapting protein language models for rapid DTI prediction
  Samuel Sledzieski · Rohit Singh · Lenore J Cowen · Bonnie Berger
- 2022: Contrasting drugs from decoys
  Samuel Sledzieski · Rohit Singh · Lenore J Cowen · Bonnie Berger
- 2021 Workshop: Machine Learning in Structural Biology
  Ellen Zhong · Raphael Townshend · Stephan Eismann · Namrata Anand · Roshan Rao · John Ingraham · Wouter Boomsma · Sergey Ovchinnikov · Bonnie Berger
- 2020: Exploring generative atomic models in cryo-EM reconstruction
  Ellen Zhong · Adam Lerer · · Bonnie Berger
- 2020: Contributed Talks Intro
  Ellen Zhong
- 2020: Morning Poster Session
  Ellen Zhong
- 2020: Andrea Thorn Intro
  Ellen Zhong
- 2020 Workshop: Machine Learning for Structural Biology
  Raphael Townshend · Stephan Eismann · Ron Dror · Ellen Zhong · Namrata Anand · John Ingraham · Wouter Boomsma · Sergey Ovchinnikov · Roshan Rao · Per Greisen · Rachel Kolodny · Bonnie Berger
- 2019 Poster: Explicitly disentangling image content from translation and rotation with spatial-VAE
  Tristan Bepler · Ellen Zhong · Kotaro Kelley · Edward Brignole · Bonnie Berger