HotProtein: A Novel Framework for Protein Thermostability Prediction and Editing
Tianlong Chen · Chengyue Gong · Daniel Diaz · Xuxi Chen · Jordan Wells · Qiang Liu · Zhangyang Wang · Andrew Ellington · Alex Dimakis · Adam Klivans
Event URL: https://openreview.net/forum?id=RtV_iEbWeGE
The molecular basis of protein thermal stability is only partially understood and has major significance for drug and vaccine discovery. The lack of datasets and standardized benchmarks considerably limits learning-based discovery methods. We present $\texttt{HotProtein}$, a large-scale protein dataset with \textit{growth temperature} annotations of thermostability, containing $182$K amino acid sequences and $3$K folded structures from $230$ different species with a wide temperature range $-20^{\circ}\texttt{C}\sim 120^{\circ}\texttt{C}$. Due to functional domain differences and data scarcity within each species, existing methods fail to generalize well on our dataset. We address this problem through a novel learning framework, consisting of ($1$) Protein structure-aware pre-training (SAP) which leverages 3D information to enhance sequence-based pre-training; ($2$) Factorized sparse tuning (FST) that utilizes low-rank and sparse priors as an implicit regularization, together with feature augmentations. Extensive empirical studies demonstrate that our framework improves thermostability prediction compared to other deep learning models. Finally, we propose a novel editing algorithm to efficiently generate positive amino acid mutations that improve thermostability.
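The abstract describes factorized sparse tuning (FST) only as using "low-rank and sparse priors". One common way to realize such priors is to keep a pre-trained weight matrix frozen and learn an update made of a low-rank product plus a sparse residual. The following is a minimal sketch under that assumption, not the authors' implementation: the function names, the shapes, the zero initialization, and the randomly chosen sparse support are all hypothetical choices made for illustration.

```python
import numpy as np

def init_factorized_sparse(d_out, d_in, rank, n_sparse, seed=0):
    """Create hypothetical tuning parameters for one weight matrix.

    The update is assumed to have the form U @ V + S * mask, where
    U @ V has rank at most `rank` and `mask` selects `n_sparse` entries.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.01, size=(d_out, rank))
    V = np.zeros((rank, d_in))  # zero init: the update starts at exactly zero
    # Pick a fixed sparse support at random (an assumption; other
    # selection rules, e.g. magnitude-based, are equally plausible).
    flat_idx = rng.choice(d_out * d_in, size=n_sparse, replace=False)
    mask = np.zeros(d_out * d_in, dtype=bool)
    mask[flat_idx] = True
    mask = mask.reshape(d_out, d_in)
    S = np.zeros((d_out, d_in))
    return U, V, S, mask

def tuned_weight(W0, U, V, S, mask):
    """Frozen weight plus low-rank and sparse updates."""
    return W0 + U @ V + S * mask

def trainable_count(U, V, mask):
    """Number of parameters that would be trained under this scheme."""
    return U.size + V.size + int(mask.sum())

# Toy usage with a random 64x64 matrix standing in for a pre-trained weight.
W0 = np.random.default_rng(1).normal(size=(64, 64))
U, V, S, mask = init_factorized_sparse(64, 64, rank=4, n_sparse=128)
W = tuned_weight(W0, U, V, S, mask)
n_trainable = trainable_count(U, V, mask)
```

In this toy setting the scheme trains far fewer values than the full matrix holds, which is the sense in which such a parameterization can act as a regularizer when per-species data is scarce.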
Author Information
Tianlong Chen (University of Texas at Austin)
Chengyue Gong (University of Texas at Austin)
Daniel Diaz (University of Texas at Austin)
Xuxi Chen (University of Texas at Austin)
Jordan Wells (University of Texas at Austin)
Qiang Liu (Dartmouth College)
Zhangyang Wang (University of Texas at Austin)
Andrew Ellington (University of Texas at Austin)
Alex Dimakis (University of Texas, Austin)
Adam Klivans (UT Austin)
More from the Same Authors
-
2021 Spotlight: Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent »
Xingchao Liu · Xin Tong · Qiang Liu -
2022 : Score-based Seismic Inverse Problems »
Sriram Ravula · Dimitri Voytan · Elad Liebman · Ram Tuvi · Yash Gandhi · Hamza Ghani · Alex Ardel · Mrinal Sen · Alex Dimakis -
2022 : BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach »
Mao Ye · Bo Liu · Stephen Wright · Peter Stone · Qiang Liu -
2022 : Diffusion-based Molecule Generation with Informative Prior Bridges »
Chengyue Gong · Lemeng Wu · Xingchao Liu · Mao Ye · Qiang Liu -
2022 : First hitting diffusion models »
Mao Ye · Lemeng Wu · Qiang Liu -
2022 : Discovering the Hidden Vocabulary of DALLE-2 »
Giannis Daras · Alex Dimakis -
2022 : Multiresolution Textual Inversion »
Giannis Daras · Alex Dimakis -
2022 : Neural Volumetric Mesh Generator »
Yan Zheng · Lemeng Wu · Xingchao Liu · Zhen Chen · Qiang Liu · Qixing Huang -
2022 : Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow »
Xingchao Liu · Chengyue Gong · Qiang Liu -
2022 : Let us Build Bridges: Understanding and Extending Diffusion Generative Models »
Xingchao Liu · Lemeng Wu · Mao Ye · Qiang Liu -
2022 Spotlight: Sparse Winning Tickets are Data-Efficient Image Recognizers »
Mukund Varma T · Xuxi Chen · Zhenyu Zhang · Tianlong Chen · Subhashini Venugopalan · Zhangyang Wang -
2022 Poster: Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets »
Ruisi Cai · Zhenyu Zhang · Tianlong Chen · Xiaohan Chen · Zhangyang Wang -
2022 Poster: First Hitting Diffusion Models for Generating Manifold, Graph and Categorical Data »
Mao Ye · Lemeng Wu · Qiang Liu -
2022 Poster: Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative »
Tianxin Wei · Yuning You · Tianlong Chen · Yang Shen · Jingrui He · Zhangyang Wang -
2022 Poster: Signal Processing for Implicit Neural Representations »
Dejia Xu · Peihao Wang · Yifan Jiang · Zhiwen Fan · Zhangyang Wang -
2022 Poster: Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve »
Giannis Daras · Negin Raoof · Zoi Gkalitsiou · Alex Dimakis -
2022 Poster: Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation »
Ziyu Jiang · Xuxi Chen · Xueqin Huang · Xianzhi Du · Denny Zhou · Zhangyang Wang -
2022 Poster: Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork »
Haotao Wang · Junyuan Hong · Aston Zhang · Jiayu Zhou · Zhangyang Wang -
2022 Poster: Zonotope Domains for Lagrangian Neural Network Verification »
Matt Jordan · Jonathan Hayase · Alex Dimakis · Sewoong Oh -
2022 Poster: Sampling in Constrained Domains with Orthogonal-Space Variational Gradient Descent »
Ruqi Zhang · Qiang Liu · Xin Tong -
2022 Poster: Scaling Multimodal Pre-Training via Cross-Modality Gradient Harmonization »
Junru Wu · Yi Liang · feng han · Hassan Akbari · Zhangyang Wang · Cong Yu -
2022 Poster: Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis »
Wuyang Chen · Wei Huang · Xinyu Gong · Boris Hanin · Zhangyang Wang -
2022 Poster: BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach »
Bo Liu · Mao Ye · Stephen Wright · Peter Stone · Qiang Liu -
2022 Poster: Sparse Winning Tickets are Data-Efficient Image Recognizers »
Mukund Varma T · Xuxi Chen · Zhenyu Zhang · Tianlong Chen · Subhashini Venugopalan · Zhangyang Wang -
2022 Poster: Symbolic Distillation for Learned TCP Congestion Control »
S P Sharan · Wenqing Zheng · Kuo-Feng Hsu · Jiarong Xing · Ang Chen · Zhangyang Wang -
2022 Poster: M³ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design »
hanxue liang · Zhiwen Fan · Rishov Sarkar · Ziyu Jiang · Tianlong Chen · Kai Zou · Yu Cheng · Cong Hao · Zhangyang Wang -
2022 Poster: Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again »
AJAY JAISWAL · Peihao Wang · Tianlong Chen · Justin Rousseau · Ying Ding · Zhangyang Wang -
2022 Poster: Diffusion-based Molecule Generation with Informative Prior Bridges »
Lemeng Wu · Chengyue Gong · Xingchao Liu · Mao Ye · Qiang Liu -
2022 Poster: Advancing Model Pruning via Bi-level Optimization »
Yihua Zhang · Yuguang Yao · Parikshit Ram · Pu Zhao · Tianlong Chen · Mingyi Hong · Yanzhi Wang · Sijia Liu -
2022 Poster: Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks »
Sitan Chen · Aravind Gollakota · Adam Klivans · Raghu Meka -
2022 Poster: A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking »
Keyu Duan · Zirui Liu · Peihao Wang · Wenqing Zheng · Kaixiong Zhou · Tianlong Chen · Xia Hu · Zhangyang Wang -
2021 : Alex Dimakis Talk »
Alex Dimakis -
2021 Poster: Improving Contrastive Learning on Imbalanced Data via Open-World Sampling »
Ziyu Jiang · Tianlong Chen · Ting Chen · Zhangyang Wang -
2021 Poster: Sparse Training via Boosting Pruning Plasticity with Neuroregeneration »
Shiwei Liu · Tianlong Chen · Xiaohan Chen · Zahra Atashgahi · Lu Yin · Huanyu Kou · Li Shen · Mykola Pechenizkiy · Zhangyang Wang · Decebal Constantin Mocanu -
2021 Poster: Conflict-Averse Gradient Descent for Multi-task learning »
Bo Liu · Xingchao Liu · Xiaojie Jin · Peter Stone · Qiang Liu -
2021 Poster: Efficiently Learning One Hidden Layer ReLU Networks From Queries »
Sitan Chen · Adam Klivans · Raghu Meka -
2021 Poster: Inverse Problems Leveraging Pre-trained Contrastive Representations »
Sriram Ravula · Georgios Smyrnis · Matt Jordan · Alex Dimakis -
2021 Poster: Sampling with Trustworthy Constraints: A Variational Gradient Framework »
Xingchao Liu · Xin Tong · Qiang Liu -
2021 Poster: Robust Compressed Sensing MRI with Deep Generative Priors »
Ajil Jalal · Marius Arvinte · Giannis Daras · Eric Price · Alex Dimakis · Jon Tamir -
2021 Poster: Chasing Sparsity in Vision Transformers: An End-to-End Exploration »
Tianlong Chen · Yu Cheng · Zhe Gan · Lu Yuan · Lei Zhang · Zhangyang Wang -
2021 Poster: Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective »
Tianlong Chen · Yu Cheng · Zhe Gan · Jingjing Liu · Zhangyang Wang -
2021 Poster: Automatic and Harmless Regularization with Constrained and Lexicographic Optimization: A Dynamic Barrier Approach »
Chengyue Gong · Xingchao Liu · Qiang Liu -
2021 Poster: argmax centroid »
Chengyue Gong · Mao Ye · Qiang Liu -
2021 Poster: Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent »
Xingchao Liu · Xin Tong · Qiang Liu -
2021 Poster: Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot? »
Xiaolong Ma · Geng Yuan · Xuan Shen · Tianlong Chen · Xuxi Chen · Xiaohan Chen · Ning Liu · Minghai Qin · Sijia Liu · Zhangyang Wang · Yanzhi Wang -
2021 Poster: You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership »
Xuxi Chen · Tianlong Chen · Zhenyu Zhang · Zhangyang Wang -
2020 Workshop: Second Workshop on AI for Humanitarian Assistance and Disaster Response »
Ritwik Gupta · Robin Murphy · Eric Heim · Zhangyang Wang · Bryce Goodman · Nirav Patel · Piotr Bilinski · Edoardo Nemni -
2020 Poster: From Boltzmann Machines to Neural Networks and Back Again »
Surbhi Goel · Adam Klivans · Frederic Koehler -
2020 Poster: Implicit Regularization and Convergence for Weight Normalization »
Xiaoxia Wu · Edgar Dobriban · Tongzheng Ren · Shanshan Wu · Zhiyuan Li · Suriya Gunasekar · Rachel Ward · Qiang Liu -
2020 Poster: Graph Contrastive Learning with Augmentations »
Yuning You · Tianlong Chen · Yongduo Sui · Ting Chen · Zhangyang Wang · Yang Shen -
2020 Poster: SMYRF - Efficient Attention using Asymmetric Clustering »
Giannis Daras · Nikita Kitaev · Augustus Odena · Alex Dimakis -
2020 Poster: MATE: Plugging in Model Awareness to Task Embedding for Meta Learning »
Xiaohan Chen · Zhangyang Wang · Siyu Tang · Krikamol Muandet -
2020 Poster: Robust Pre-Training by Adversarial Contrastive Learning »
Ziyu Jiang · Tianlong Chen · Ting Chen · Zhangyang Wang -
2020 Poster: Training Stronger Baselines for Learning to Optimize »
Tianlong Chen · Weiyi Zhang · Zhou Jingyang · Shiyu Chang · Sijia Liu · Lisa Amini · Zhangyang Wang -
2020 Poster: Applications of Common Entropy for Causal Inference »
Murat Kocaoglu · Sanjay Shakkottai · Alex Dimakis · Constantine Caramanis · Sriram Vishwanath -
2020 Spotlight: Training Stronger Baselines for Learning to Optimize »
Tianlong Chen · Weiyi Zhang · Zhou Jingyang · Shiyu Chang · Sijia Liu · Lisa Amini · Zhangyang Wang -
2020 Poster: Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free »
Haotao Wang · Tianlong Chen · Shupeng Gui · TingKuei Hu · Ji Liu · Zhangyang Wang -
2020 Poster: FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training »
Yonggan Fu · Haoran You · Yang Zhao · Yue Wang · Chaojian Li · Kailash Gopalakrishnan · Zhangyang Wang · Yingyan Lin -
2020 Poster: Exactly Computing the Local Lipschitz Constant of ReLU Networks »
Matt Jordan · Alex Dimakis -
2020 Poster: Statistical-Query Lower Bounds via Functional Gradients »
Surbhi Goel · Aravind Gollakota · Adam Klivans -
2020 Poster: The Lottery Ticket Hypothesis for Pre-trained BERT Networks »
Tianlong Chen · Jonathan Frankle · Shiyu Chang · Sijia Liu · Yang Zhang · Zhangyang Wang · Michael Carbin -
2020 Poster: Robust compressed sensing using generative models »
Ajil Jalal · Liu Liu · Alex Dimakis · Constantine Caramanis -
2020 Poster: ShiftAddNet: A Hardware-Inspired Deep Network »
Haoran You · Xiaohan Chen · Yongan Zhang · Chaojian Li · Sicheng Li · Zihao Liu · Zhangyang Wang · Yingyan Lin -
2019 : Opening Remarks »
Reinhard Heckel · Paul Hand · Alex Dimakis · Joan Bruna · Deanna Needell · Richard Baraniuk -
2019 Workshop: Information Theory and Machine Learning »
Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alex Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon -
2019 Workshop: Solving inverse problems with deep networks: New architectures, theoretical foundations, and applications »
Reinhard Heckel · Paul Hand · Richard Baraniuk · Joan Bruna · Alex Dimakis · Deanna Needell -
2019 Poster: Inverting Deep Generative models, One layer at a time »
Qi Lei · Ajil Jalal · Inderjit Dhillon · Alex Dimakis -
2019 Poster: Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes »
Matt Jordan · Justin Lewis · Alex Dimakis -
2019 Poster: Primal-Dual Block Generalized Frank-Wolfe »
Qi Lei · JIACHENG ZHUO · Constantine Caramanis · Inderjit Dhillon · Alex Dimakis -
2019 Poster: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals »
Surbhi Goel · Sushrut Karmalkar · Adam Klivans -
2019 Spotlight: Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals »
Surbhi Goel · Sushrut Karmalkar · Adam Klivans -
2019 Poster: Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models »
Shanshan Wu · Sujay Sanghavi · Alex Dimakis -
2019 Spotlight: Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models »
Shanshan Wu · Sujay Sanghavi · Alex Dimakis -
2019 Poster: Learning Distributions Generated by One-Layer ReLU Networks »
Shanshan Wu · Alex Dimakis · Sujay Sanghavi -
2019 Poster: List-decodable Linear Regression »
Sushrut Karmalkar · Adam Klivans · Pravesh Kothari -
2019 Spotlight: List-decodable Linear Regression »
Sushrut Karmalkar · Adam Klivans · Pravesh Kothari -
2018 Poster: Experimental Design for Cost-Aware Learning of Causal Graphs »
Erik Lindgren · Murat Kocaoglu · Alex Dimakis · Sriram Vishwanath -
2017 Workshop: NIPS Highlights (MLTrain), Learn How to code a paper with state of the art frameworks »
Alex Dimakis · Nikolaos Vasiloglou · Guy Van den Broeck · Alexander Ihler · Assaf Araki -
2017 Poster: Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks »
Surbhi Goel · Adam Klivans -
2017 Poster: Streaming Weak Submodularity: Interpreting Neural Networks on the Fly »
Ethan Elenberg · Alex Dimakis · Moran Feldman · Amin Karbasi -
2017 Oral: Streaming Weak Submodularity: Interpreting Neural Networks on the Fly »
Ethan Elenberg · Alex Dimakis · Moran Feldman · Amin Karbasi -
2017 Poster: Model-Powered Conditional Independence Test »
Rajat Sen · Ananda Theertha Suresh · Karthikeyan Shanmugam · Alex Dimakis · Sanjay Shakkottai -
2016 Poster: Leveraging Sparsity for Efficient Submodular Data Summarization »
Erik Lindgren · Shanshan Wu · Alex Dimakis -
2016 Poster: Single Pass PCA of Matrix Products »
Shanshan Wu · Srinadh Bhojanapalli · Sujay Sanghavi · Alex Dimakis -
2015 Poster: Orthogonal NMF through Subspace Exploration »
Megasthenis Asteris · Dimitris Papailiopoulos · Alex Dimakis -
2015 Poster: Sparse PCA via Bipartite Matchings »
Megasthenis Asteris · Dimitris Papailiopoulos · Anastasios Kyrillidis · Alex Dimakis -
2015 Poster: Learning Causal Graphs with Small Interventions »
Karthikeyan Shanmugam · Murat Kocaoglu · Alex Dimakis · Sriram Vishwanath -
2014 Poster: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alex Dimakis · Adam Klivans -
2014 Poster: On the Information Theoretic Limits of Learning Ising Models »
Rashish Tandon · Karthikeyan Shanmugam · Pradeep Ravikumar · Alex Dimakis -
2014 Oral: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alex Dimakis · Adam Klivans