There is an emerging trend to train a network with stochastic architectures so that various architectures can be plugged and played during inference. However, existing investigations are highly entangled with neural architecture search (NAS), which limits their use in other scenarios. In this work, we decouple the training of a network with stochastic architectures (NSA) from NAS and provide the first systematic investigation of it as a stand-alone problem. We first uncover the characteristics of NSA in various aspects, ranging from training stability, convergence, and predictive behaviour to generalization capacity to unseen architectures. We identify several issues with the vanilla NSA, such as training/test disparity and function mode collapse, and propose solutions to these issues based on theoretical and empirical insights. We believe that these results could also serve as good heuristics for NAS. Given these understandings, we further apply the improved NSA to diverse scenarios to fully exploit its promise of inference-time architecture stochasticity, including model ensemble, uncertainty estimation, and semi-supervised learning. Remarkable performance (e.g., 2.75% error rate and 0.0032 expected calibration error on CIFAR-10) validates the effectiveness of such a model and provides new perspectives on exploring the potential of networks with stochastic architectures beyond NAS.
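For readers unfamiliar with the calibration metric quoted above, the following is a minimal sketch of how expected calibration error (ECE) is commonly computed for an ensemble's averaged predictions. It is not the authors' evaluation code: the function names `ensemble_probs` and `expected_calibration_error`, the equal-width binning, and the default of 15 bins are assumptions made for illustration.

```python
from typing import List, Sequence


def ensemble_probs(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities over ensemble members.

    Each member could be, for example, one architecture sampled at
    inference time (an illustrative assumption, not the paper's code).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(m[k] for m in member_probs) / n_members for k in range(n_classes)]


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 15,
) -> float:
    """ECE with equal-width bins over [0, 1].

    ECE = sum over bins of (bin_size / N) * |accuracy(bin) - mean_confidence(bin)|.
    Bins here are half-open [lo, hi), with confidence 1.0 placed in the last bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Per bin: [number of samples, sum of confidences, number of correct samples].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += 1 if ok else 0

    ece = 0.0
    for count, conf_sum, hits in bins:
        if count:
            # (count / n) * |hits / count - conf_sum / count| simplifies to this.
            ece += abs(hits - conf_sum) / n
    return ece
```

In use, the confidence of each test example would be the largest entry of its averaged probability vector, and `correct` would record whether the corresponding class matches the label.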
Author Information
Zhijie Deng (Tsinghua University)
Yinpeng Dong (Tsinghua University)
Shifeng Zhang (Department of Computer Science and Technology, Tsinghua University)
Jun Zhu (Tsinghua University)
More from the Same Authors
- 2020 Poster: Multi-label classification: do Hamming loss and subset accuracy really conflict with each other?
  Guoqiang Wu · Jun Zhu
- 2020 Poster: Bi-level Score Matching for Learning Energy-based Latent Variable Models
  Fan Bao · Chongxuan LI · Kun Xu · Hang Su · Jun Zhu · Bo Zhang
- 2020 Poster: Further Analysis of Outlier Detection with Deep Generative Models
  Ziyu Wang · Bin Dai · David P Wipf · Jun Zhu
- 2020 Poster: Efficient Learning of Generative Models via Finite-Difference Score Matching
  Tianyu Pang · Kun Xu · Chongxuan LI · Yang Song · Stefano Ermon · Jun Zhu
- 2020 Poster: Calibrated Reliable Regression using Maximum Mean Discrepancy
  Peng Cui · Wenbo Hu · Jun Zhu
- 2020 Poster: Boosting Adversarial Training with Hypersphere Embedding
  Tianyu Pang · Xiao Yang · Yinpeng Dong · Kun Xu · Jun Zhu · Hang Su
- 2020 Poster: Adversarial Distributional Training for Robust Deep Learning
  Yinpeng Dong · Zhijie Deng · Tianyu Pang · Jun Zhu · Hang Su
- 2019 Poster: Improving Black-box Adversarial Attacks with a Transfer-based Prior
  Shuyu Cheng · Yinpeng Dong · Tianyu Pang · Hang Su · Jun Zhu
- 2019 Poster: Generative Well-intentioned Networks
  Justin Cosentino · Jun Zhu
- 2019 Poster: Multi-objects Generation with Amortized Structural Regularization
  Kun Xu · Chongxuan LI · Jun Zhu · Bo Zhang
- 2018 Poster: Towards Robust Detection of Adversarial Examples
  Tianyu Pang · Chao Du · Yinpeng Dong · Jun Zhu
- 2018 Spotlight: Towards Robust Detection of Adversarial Examples
  Tianyu Pang · Chao Du · Yinpeng Dong · Jun Zhu
- 2018 Poster: Graphical Generative Adversarial Networks
  Chongxuan LI · Max Welling · Jun Zhu · Bo Zhang
- 2017 Poster: Triple Generative Adversarial Nets
  Chongxuan LI · Kun Xu · Jun Zhu · Bo Zhang
- 2017 Poster: Population Matching Discrepancy and Applications in Deep Learning
  Jianfei Chen · Chongxuan LI · Yizhong Ru · Jun Zhu
- 2016 Poster: Kernel Bayesian Inference with Posterior Regularization
  Yang Song · Jun Zhu · Yong Ren
- 2016 Poster: Stochastic Gradient Geodesic MCMC Methods
  Chang Liu · Jun Zhu · Yang Song
- 2016 Poster: Conditional Generative Moment-Matching Networks
  Yong Ren · Jun Zhu · Jialian Li · Yucen Luo
- 2015 Poster: Max-Margin Majority Voting for Learning from Crowds
  TIAN TIAN · Jun Zhu
- 2015 Poster: Max-Margin Deep Generative Models
  Chongxuan Li · Jun Zhu · Tim Shi · Bo Zhang
- 2014 Poster: Distributed Bayesian Posterior Sampling via Moment Sharing
  Minjie Xu · Balaji Lakshminarayanan · Yee Whye Teh · Jun Zhu · Bo Zhang
- 2014 Poster: Spectral Methods for Supervised Topic Models
  Yining Wang · Jun Zhu
- 2014 Poster: Robust Bayesian Max-Margin Clustering
  Changyou Chen · Jun Zhu · Xinhua Zhang
- 2013 Poster: Scalable Inference for Logistic-Normal Topic Models
  Jianfei Chen · Jun Zhu · Zi Wang · Xun Zheng · Bo Zhang
- 2012 Poster: Monte Carlo Methods for Maximum Margin Supervised Topic Models
  Qixia Jiang · Jun Zhu · Maosong Sun · Eric Xing
- 2012 Poster: Bayesian Nonparametric Maximum Margin Matrix Factorization for Collaborative Prediction
  Minjie Xu · Jun Zhu · Bo Zhang
- 2011 Poster: Infinite Latent SVM for Classification and Multi-task Learning
  Jun Zhu · Ning Chen · Eric Xing
- 2010 Poster: Large Margin Learning of Upstream Scene Understanding Models
  Jun Zhu · Li-Jia Li · Li Fei-Fei · Eric Xing
- 2010 Poster: Predictive Subspace Learning for Multi-view Data: a Large Margin Approach
  Ning Chen · Jun Zhu · Eric Xing
- 2010 Poster: Adaptive Multi-Task Lasso: with Application to eQTL Detection
  Seunghak Lee · Jun Zhu · Eric Xing
- 2010 Poster: Efficient Relational Learning with Hidden Variable Detection
  Ni Lao · Jun Zhu · Liu Xinwang · Yandong Liu · William Cohen
- 2008 Poster: Partially Observed Maximum Entropy Discrimination Markov Networks
  Jun Zhu · Eric Xing · Bo Zhang