Poster
Rare Gems: Finding Lottery Tickets at Initialization
Kartik Sreenivasan · Jy-yong Sohn · Liu Yang · Matthew Grinde · Alliot Nagle · Hongyi Wang · Eric Xing · Kangwook Lee · Dimitris Papailiopoulos
Large neural networks can be pruned to a small fraction of their original size, with little loss in accuracy, by following a time-consuming "train, prune, re-train" approach. Frankle & Carbin conjecture that we can avoid this by training lottery tickets, i.e., special sparse subnetworks found at initialization that can be trained to high accuracy. However, a subsequent line of work presents concrete evidence that current algorithms for finding trainable networks at initialization fail simple baseline comparisons, e.g., against training random sparse subnetworks. Finding lottery tickets that train to better accuracy than these simple baselines remains an open problem. In this work, we resolve this open problem by proposing Gem-Miner, which finds lottery tickets at initialization that beat current baselines. Gem-Miner finds lottery tickets trainable to accuracy competitive with or better than Iterative Magnitude Pruning (IMP), and does so up to $19\times$ faster.
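For readers unfamiliar with the magnitude-pruning baseline named in the abstract, the following is a minimal sketch of a single magnitude-pruning step: the weights with the smallest absolute values are masked out. It is not Gem-Miner and not the authors' code; the function names `magnitude_mask` and `apply_mask`, the flat weight list, and the `sparsity` parameter are assumptions made for illustration only.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that drops the `sparsity` fraction of weights
    with the smallest absolute value (hypothetical helper)."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by increasing magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero out the pruned weights (hypothetical helper)."""
    return [w * m for w, m in zip(weights, mask)]


# Toy example: prune half of six weights.
weights = [0.8, -0.05, 0.3, -1.2, 0.01, 0.4]
mask = magnitude_mask(weights, sparsity=0.5)
sparse_weights = apply_mask(weights, mask)
```

In the "train, prune, re-train" loop the abstract describes, a step like this would alternate with training of the surviving weights; the sketch shows only the pruning step and performs no training.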
Author Information
Kartik Sreenivasan (University of Wisconsin-Madison)
Jy-yong Sohn (University of Wisconsin-Madison)
Liu Yang (University of Wisconsin, Madison)
Matthew Grinde (University of Wisconsin - Madison)
Alliot Nagle (University of Texas at Austin)
Hongyi Wang (CMU, Carnegie Mellon University)
Eric Xing (Petuum Inc.)
Kangwook Lee (UW Madison, Krafton)
Dimitris Papailiopoulos (University of Wisconsin-Madison)
More from the Same Authors
-
2021 : Geometric Question Answering Towards Multimodal Numerical Reasoning »
Jiaqi Chen · Jianheng Tang · Jinghui Qin · Xiaodan Liang · Lingbo Liu · Eric Xing · Liang Lin -
2022 : Active Learning is a Strong Baseline for Data Subset Selection »
Dongmin Park · Dimitris Papailiopoulos · Kangwook Lee -
2022 : A Better Way to Decay: Proximal Gradient Training Algorithms for Neural Nets »
Liu Yang · Jifan Zhang · Joseph Shenouda · Dimitris Papailiopoulos · Kangwook Lee · Robert Nowak -
2022 : The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning »
Hanlin Zhang · Yifan Zhang · Li Erran Li · Eric Xing -
2022 : Betty: An Automatic Differentiation Library for Multilevel Optimization »
Sang Choe · Willie Neiswanger · Pengtao Xie · Eric Xing -
2023 Poster: Dissecting Chain-of-Thought: A Study on Compositional In-Context Learning of MLPs »
Yingcong Li · Kartik Sreenivasan · Angeliki Giannou · Dimitris Papailiopoulos · Samet Oymak -
2023 Poster: FedNAR: Federated Optimization with Normalized Annealing Regularization »
Junbo Li · Ang Li · Chong Tian · Qirong Ho · Eric Xing · Hongyi Wang -
2023 Poster: Counterfactual Generation with Identifiability Guarantee »
Hanqi Yan · Lingjing Kong · Lin Gui · Yuejie Chi · Eric Xing · Yulan He · Kun Zhang -
2023 Poster: Making Scalable Meta Learning Practical »
Sang Choe · Sanket Vaibhav Mehta · Hwijeen Ahn · Willie Neiswanger · Pengtao Xie · Emma Strubell · Eric Xing -
2023 Poster: Temporally Disentangled Representation Learning under Unknown Nonstationarity »
Xiangchen Song · Weiran Yao · Yewen Fan · Xinshuai Dong · Guangyi Chen · Juan Carlos Niebles · Eric Xing · Kun Zhang -
2023 Poster: Identification of Nonlinear Latent Hierarchical Models »
Lingjing Kong · Biwei Huang · Feng Xie · Eric Xing · Yuejie Chi · Kun Zhang -
2023 Poster: Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer »
Bowen Tan · Yun Zhu · Lijuan Liu · Eric Xing · Zhiting Hu · Jindong Chen -
2023 Poster: 3D Open-vocabulary Segmentation with Foundation Models »
Kunhao Liu · Fangneng Zhan · Jiahui Zhang · Muyu Xu · Yingchen Yu · Abdulmotaleb El Saddik · Christian Theobalt · Eric Xing · Shijian Lu -
2023 Poster: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models »
Ying Fan · Olivia Watkins · Yuqing Du · Hao Liu · Moonkyung Ryu · Craig Boutilier · Pieter Abbeel · Mohammad Ghavamzadeh · Kangwook Lee · Kimin Lee -
2023 Poster: Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective »
Zeyuan Yin · Eric Xing · Zhiqiang Shen -
2023 Poster: Vicuna Evaluation: Exploring LLM-as-a-Judge and Chatbot Arena »
Lianmin Zheng · Wei-Lin Chiang · Ying Sheng · Siyuan Zhuang · Zhanghao Wu · Yonghao Zhuang · Zi Lin · Zhuohan Li · Dacheng Li · Eric Xing · Hao Zhang · Joseph Gonzalez · Ion Stoica -
2023 Workshop: Machine Learning with New Compute Paradigms »
Jannes Gladrow · Benjamin Scellier · Eric Xing · Babak Rahmani · Francesca Parmigiani · Paul Prucnal · Cheng Zhang -
2022 Spotlight: Masked Generative Adversarial Networks are Data-Efficient Generation Learners »
Jiaxing Huang · Kaiwen Cui · Dayan Guan · Aoran Xiao · Fangneng Zhan · Shijian Lu · Shengcai Liao · Eric Xing -
2022 : Poster Session 2 »
Jinwuk Seok · Bo Liu · Ryotaro Mitsuboshi · David Martinez-Rubio · Weiqiang Zheng · Ilgee Hong · Chen Fan · Kazusato Oko · Bo Tang · Miao Cheng · Aaron Defazio · Tim G. J. Rudner · Gabriele Farina · Vishwak Srinivasan · Ruichen Jiang · Peng Wang · Jane Lee · Nathan Wycoff · Nikhil Ghosh · Yinbin Han · David Mueller · Liu Yang · Amrutha Varshini Ramesh · Siqi Zhang · Kaifeng Lyu · David Yunis · Kumar Kshitij Patel · Fangshuo Liao · Dmitrii Avdiukhin · Xiang Li · Sattar Vakili · Jiaxin Shi -
2022 Poster: LIFT: Language-Interfaced Fine-Tuning for Non-language Machine Learning Tasks »
Tuan Dinh · Yuchen Zeng · Ruisu Zhang · Ziqian Lin · Michael Gira · Shashank Rajput · Jy-yong Sohn · Dimitris Papailiopoulos · Kangwook Lee -
2022 Poster: AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness »
Dacheng Li · Hongyi Wang · Eric Xing · Hao Zhang -
2022 Poster: Score-based Generative Modeling Secretly Minimizes the Wasserstein Distance »
Dohyun Kwon · Ying Fan · Kangwook Lee -
2022 Poster: Masked Generative Adversarial Networks are Data-Efficient Generation Learners »
Jiaxing Huang · Kaiwen Cui · Dayan Guan · Aoran Xiao · Fangneng Zhan · Shijian Lu · Shengcai Liao · Eric Xing -
2021 Poster: An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks »
Shashank Rajput · Kartik Sreenivasan · Dimitris Papailiopoulos · Amin Karbasi -
2020 Poster: Bad Global Minima Exist and SGD Can Reach Them »
Shengchao Liu · Dimitris Papailiopoulos · Dimitris Achlioptas -
2020 Poster: Attack of the Tails: Yes, You Really Can Backdoor Federated Learning »
Hongyi Wang · Kartik Sreenivasan · Shashank Rajput · Harit Vishwakarma · Saurabh Agarwal · Jy-yong Sohn · Kangwook Lee · Dimitris Papailiopoulos -
2020 Poster: Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient »
Ankit Pensia · Shashank Rajput · Alliot Nagle · Harit Vishwakarma · Dimitris Papailiopoulos -
2020 Spotlight: Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient »
Ankit Pensia · Shashank Rajput · Alliot Nagle · Harit Vishwakarma · Dimitris Papailiopoulos -
2019 Poster: DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation »
Shashank Rajput · Hongyi Wang · Zachary Charles · Dimitris Papailiopoulos -
2019 Poster: Specific and Shared Causal Relation Modeling and Mechanism-Based Clustering »
Biwei Huang · Kun Zhang · Pengtao Xie · Mingming Gong · Eric Xing · Clark Glymour -
2018 Poster: The Effect of Network Width on the Performance of Large-batch Training »
Lingjiao Chen · Hongyi Wang · Jinman Zhao · Dimitris Papailiopoulos · Paraschos Koutris -
2018 Poster: ATOMO: Communication-efficient Learning via Atomic Sparsification »
Hongyi Wang · Scott Sievert · Shengchao Liu · Zachary Charles · Dimitris Papailiopoulos · Stephen Wright -
2016 Poster: Cyclades: Conflict-free Asynchronous Machine Learning »
Xinghao Pan · Maximilian Lam · Stephen Tu · Dimitris Papailiopoulos · Ce Zhang · Michael Jordan · Kannan Ramchandran · Christopher Ré · Benjamin Recht -
2015 Poster: Orthogonal NMF through Subspace Exploration »
Megasthenis Asteris · Dimitris Papailiopoulos · Alex Dimakis -
2015 Poster: Sparse PCA via Bipartite Matchings »
Megasthenis Asteris · Dimitris Papailiopoulos · Anastasios Kyrillidis · Alex Dimakis -
2015 Poster: Parallel Correlation Clustering on Big Graphs »
Xinghao Pan · Dimitris Papailiopoulos · Samet Oymak · Benjamin Recht · Kannan Ramchandran · Michael Jordan