One-shot weight sharing methods have recently drawn great attention in neural architecture search due to their high efficiency and competitive performance. However, weight sharing across models has an inherent deficiency, i.e., insufficient training of subnetworks within the hypernetwork. To alleviate this problem, we present a simple yet effective architecture distillation method. The central idea is that subnetworks can learn collaboratively and teach each other throughout the training process, aiming to boost the convergence of individual models. We introduce the concept of the prioritized path, which refers to architecture candidates that exhibit superior performance during training. Distilling knowledge from the prioritized paths boosts the training of subnetworks. Since the prioritized paths are updated on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop. We directly select the most promising one from the prioritized paths as the final architecture, without resorting to other complex search methods such as reinforcement learning or evolutionary algorithms. Experiments on ImageNet verify that such path distillation improves the convergence rate and performance of the hypernetwork and boosts the training of subnetworks. The discovered architectures achieve superior performance compared to the recent MobileNetV3 and EfficientNet families under aligned settings. Moreover, experiments on object detection and a more challenging search space show the generality and robustness of the proposed method. Code and models are available at \url{https://github.com/neurips-20/cream.git}.
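To make the prioritized-path idea concrete, below is a minimal sketch of how a board of top-performing subnetworks might be maintained on the fly during supernet training and used to pick a teacher for distillation. All names (PrioritizedPathBoard, PathRecord, pick_teacher) and the accuracy-per-complexity score are illustrative assumptions for this sketch, not the released Cream implementation.

```python
# Hypothetical sketch of a prioritized-path board for one-shot NAS distillation.
# Class and method names are illustrative, not the official Cream API.
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PathRecord:
    path: Tuple[int, ...]   # encoded architecture (one op choice per layer)
    accuracy: float         # proxy validation accuracy measured during training
    flops: float            # model complexity in MFLOPs


@dataclass
class PrioritizedPathBoard:
    """Keeps the top-K subnetworks seen so far, ranked by performance vs. complexity."""
    capacity: int = 10
    records: List[PathRecord] = field(default_factory=list)

    def score(self, r: PathRecord) -> float:
        # Simple accuracy/complexity trade-off; the paper's criterion may differ.
        return r.accuracy / max(r.flops, 1e-8)

    def update(self, path, accuracy, flops) -> None:
        """Insert a newly evaluated path and keep only the best `capacity` entries."""
        self.records.append(PathRecord(tuple(path), accuracy, flops))
        self.records.sort(key=self.score, reverse=True)
        del self.records[self.capacity:]

    def pick_teacher(self) -> PathRecord:
        """Choose a prioritized path to distill from (here: the current best)."""
        return self.records[0]


if __name__ == "__main__":
    board = PrioritizedPathBoard(capacity=3)
    for step in range(20):
        path = [random.randrange(4) for _ in range(6)]  # randomly sampled subnetwork
        acc = random.uniform(0.5, 0.8)                  # stand-in for measured accuracy
        flops = 100 + 50 * sum(path)                    # stand-in for measured FLOPs
        board.update(path, acc, flops)
    teacher = board.pick_teacher()
    print("teacher path:", teacher.path, "acc:", round(teacher.accuracy, 3))
```

In this sketch the board doubles as the search result: once training ends, the most promising entry on the board is taken directly as the final architecture, which is why no separate reinforcement-learning or evolutionary search stage is needed.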
Author Information
Houwen Peng (Microsoft Research)
Hao Du (Microsoft Research)
Hongyuan Yu (MSRA)
Qi Li (Tsinghua University)
Jing Liao (City University of Hong Kong)
Jianlong Fu (Microsoft Research)
More from the Same Authors
- 2022 Poster: Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning »
  Yuchong Sun · Hongwei Xue · Ruihua Song · Bei Liu · Huan Yang · Jianlong Fu
- 2022 Poster: PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies »
  Guocheng Qian · Yuchen Li · Houwen Peng · Jinjie Mai · Hasan Hammoud · Mohamed Elhoseiny · Bernard Ghanem
- 2021 Poster: Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers »
  Yanhong Zeng · Huan Yang · Hongyang Chao · Jianbo Wang · Jianlong Fu
- 2021 Poster: Searching the Search Space of Vision Transformer »
  Minghao Chen · Kan Wu · Bolin Ni · Houwen Peng · Bei Liu · Jianlong Fu · Hongyang Chao · Haibin Ling
- 2021 Poster: Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training »
  Hongwei Xue · Yupan Huang · Bei Liu · Houwen Peng · Jianlong Fu · Houqiang Li · Jiebo Luo
- 2020 Poster: Passport-aware Normalization for Deep Model Protection »
  Jie Zhang · Dongdong Chen · Jing Liao · Weiming Zhang · Gang Hua · Nenghai Yu
- 2020 Poster: Learning Semantic-aware Normalization for Generative Adversarial Networks »
  Heliang Zheng · Jianlong Fu · Yanhong Zeng · Jiebo Luo · Zheng-Jun Zha
- 2020 Spotlight: Learning Semantic-aware Normalization for Generative Adversarial Networks »
  Heliang Zheng · Jianlong Fu · Yanhong Zeng · Jiebo Luo · Zheng-Jun Zha
- 2019 Poster: Learning Deep Bilinear Transformation for Fine-grained Image Representation »
  Heliang Zheng · Jianlong Fu · Zheng-Jun Zha · Jiebo Luo
- 2019 Poster: Transductive Zero-Shot Learning with Visual Structure Constraint »
  Ziyu Wan · Dongdong Chen · Yan Li · Xingguang Yan · Junge Zhang · Yizhou Yu · Jing Liao