Timezone: »
Poster
On Training Implicit Models
Zhengyang Geng · Xin-Yu Zhang · Shaojie Bai · Yisen Wang · Zhouchen Lin
This paper focuses on training implicit models of infinite layers. Specifically, previous works employ implicit differentiation and solve the exact gradient for the backward propagation. However, is it necessary to compute such an exact but expensive gradient for training? In this work, we propose a novel gradient estimate for implicit models, named phantom gradient, that 1) forgoes the costly computation of the exact gradient; and 2) provides an update direction empirically preferable to the implicit model training. We theoretically analyze the condition under which an ascent direction of the loss landscape could be found and provide two specific instantiations of the phantom gradient based on the damped unrolling and Neumann series. Experiments on large-scale tasks demonstrate that these lightweight phantom gradients significantly accelerate the backward passes in training implicit models by roughly 1.7 $\times$ and even boost the performance over approaches based on the exact gradient on ImageNet.
Author Information
Zhengyang Geng (Peking University)
Xin-Yu Zhang (TuSimple)
Xin-Yu Zhang is an algorithm researcher at TuSimple. His research interests mostly lie in machine learning and computer vision.
Shaojie Bai (Carnegie Mellon University)
Yisen Wang (Peking University)
Zhouchen Lin (Peking University)
More from the Same Authors
-
2021 Spotlight: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Yisen Wang · Zhouchen Lin -
2021 Spotlight: Clustering Effect of Adversarial Robust Models »
Yang Bai · Xin Yan · Yong Jiang · Shu-Tao Xia · Yisen Wang -
2022 Poster: Rethinking Knowledge Graph Evaluation Under the Open-World Assumption »
Haotong Yang · Zhouchen Lin · Muhan Zhang -
2022 Poster: Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors »
Qixun Wang · Yifei Wang · Hong Zhu · Yisen Wang -
2022 Poster: When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture »
Yichuan Mo · Dongxian Wu · Yifei Wang · Yiwen Guo · Yisen Wang -
2022 : Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models »
Xingyu Xie · Pan Zhou · Huan Li · Zhouchen Lin · Shuicheng Yan -
2022 Spotlight: Lightning Talks 6A-2 »
Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders »
Qi Zhang · Yifei Wang · Yisen Wang -
2022 Spotlight: When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture »
Yichuan Mo · Dongxian Wu · Yifei Wang · Yiwen Guo · Yisen Wang -
2022 Spotlight: Lightning Talks 4A-3 »
Zhihan Gao · Yabin Wang · Xingyu Qu · Luziwei Leng · Mingqing Xiao · Bohan Wang · Yu Shen · Zhiwu Huang · Xingjian Shi · Qi Meng · Yupeng Lu · Diyang Li · Qingyan Meng · Kaiwei Che · Yang Li · Hao Wang · Huishuai Zhang · Zongpeng Zhang · Kaixuan Zhang · Xiaopeng Hong · Xiaohan Zhao · Di He · Jianguo Zhang · Yaofeng Tu · Bin Gu · Yi Zhu · Ruoyu Sun · Yuyang (Bernie) Wang · Zhouchen Lin · Qinghu Meng · Wei Chen · Wentao Zhang · Bin CUI · Jie Cheng · Zhi-Ming Ma · Mu Li · Qinghai Guo · Dit-Yan Yeung · Tie-Yan Liu · Jianxing Liao -
2022 Spotlight: Online Training Through Time for Spiking Neural Networks »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Di He · Zhouchen Lin -
2022 Spotlight: Lightning Talks 1B-3 »
Chaofei Wang · Qixun Wang · Jing Xu · Long-Kai Huang · Xi Weng · Fei Ye · Harsh Rangwani · shrinivas ramasubramanian · Yifei Wang · Qisen Yang · Xu Luo · Lei Huang · Adrian G. Bors · Ying Wei · Xinglin Pan · Sho Takemori · Hong Zhu · Rui Huang · Lei Zhao · Yisen Wang · Kato Takashi · Shiji Song · Yanan Li · Rao Anwer · Yuhei Umeda · Salman Khan · Gao Huang · Wenjie Pei · Fahad Shahbaz Khan · Venkatesh Babu R · Zenglin Xu -
2022 Spotlight: Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors »
Qixun Wang · Yifei Wang · Hong Zhu · Yisen Wang -
2022 Poster: Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network? »
Yibo Yang · Shixiang Chen · Xiangtai Li · Liang Xie · Zhouchen Lin · Dacheng Tao -
2022 Poster: How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders »
Qi Zhang · Yifei Wang · Yisen Wang -
2022 Poster: Towards Theoretically Inspired Neural Initialization Optimization »
Yibo Yang · Hong Wang · Haobo Yuan · Zhouchen Lin -
2022 Poster: Deep Equilibrium Approaches to Diffusion Models »
Ashwini Pokle · Zhengyang Geng · J. Zico Kolter -
2022 Poster: Online Training Through Time for Spiking Neural Networks »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Di He · Zhouchen Lin -
2021 Poster: Clustering Effect of Adversarial Robust Models »
Yang Bai · Xin Yan · Yong Jiang · Shu-Tao Xia · Yisen Wang -
2021 Poster: Joint inference and input optimization in equilibrium networks »
Swaminathan Gurumurthy · Shaojie Bai · Zachary Manchester · J. Zico Kolter -
2021 Poster: Dissecting the Diffusion Process in Linear Graph Convolutional Networks »
Yifei Wang · Yisen Wang · Jiansheng Yang · Zhouchen Lin -
2021 Poster: $(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations »
Zhichun Huang · Shaojie Bai · J. Zico Kolter -
2021 Poster: Adversarial Neuron Pruning Purifies Backdoored Deep Models »
Dongxian Wu · Yisen Wang -
2021 Poster: Gauge Equivariant Transformer »
Lingshen He · Yiming Dong · Yisen Wang · Dacheng Tao · Zhouchen Lin -
2021 Poster: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State »
Mingqing Xiao · Qingyan Meng · Zongpeng Zhang · Yisen Wang · Zhouchen Lin -
2021 Poster: Efficient Equivariant Network »
Lingshen He · Yuxuan Chen · zhengyang shen · Yiming Dong · Yisen Wang · Zhouchen Lin -
2021 Poster: Towards a Unified Game-Theoretic View of Adversarial Perturbations and Robustness »
Jie Ren · Die Zhang · Yisen Wang · Lu Chen · Zhanpeng Zhou · Yiting Chen · Xu Cheng · Xin Wang · Meng Zhou · Jie Shi · Quanshi Zhang -
2021 Poster: Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks »
Hanxun Huang · Yisen Wang · Sarah Erfani · Quanquan Gu · James Bailey · Xingjun Ma -
2021 Poster: Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks »
Chen Ma · Xiangyu Guo · Li Chen · Jun-Hai Yong · Yisen Wang -
2021 Poster: Residual Relaxation for Multi-view Representation Learning »
Yifei Wang · Zhengyang Geng · Feng Jiang · Chuming Li · Yisen Wang · Jiansheng Yang · Zhouchen Lin -
2021 Poster: MoriĆ© Attack (MA): A New Potential Risk of Screen Photos »
Dantong Niu · Ruohao Guo · Yisen Wang -
2020 Poster: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding »
Yibo Yang · Hongyang Li · Shan You · Fei Wang · Chen Qian · Zhouchen Lin -
2020 Poster: Adversarial Weight Perturbation Helps Robust Generalization »
Dongxian Wu · Shu-Tao Xia · Yisen Wang -
2018 Workshop: NIPS 2018 workshop on Compact Deep Neural Networks with industrial applications »
Lixin Fan · Zhouchen Lin · Max Welling · Yurong Chen · Werner Bailer -
2018 Poster: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Spotlight: SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator »
Cong Fang · Chris Junchi Li · Zhouchen Lin · Tong Zhang -
2018 Poster: Joint Sub-bands Learning with Clique Structures for Wavelet Domain Super-Resolution »
Zhisheng Zhong · Tiancheng Shen · Yibo Yang · Zhouchen Lin · Chao Zhang -
2017 Poster: Faster and Non-ergodic O(1/K) Stochastic Alternating Direction Method of Multipliers »
Cong Fang · Feng Cheng · Zhouchen Lin -
2015 Poster: Accelerated Proximal Gradient Methods for Nonconvex Programming »
Huan Li · Zhouchen Lin