
Layer-Parallel Training of Residual Networks with Auxiliary Variables
Qi Sun · Hexin Dong · Zewei Chen · WeiZhen Dian · Jiacheng Sun · Yitong Sun · Zhenguo Li · Bin Dong

Tue Dec 14 06:45 AM -- 07:30 AM (PST)
Event URL: https://openreview.net/forum?id=IQnyk7w1BC

The backpropagation algorithm is indispensable for training modern residual networks (ResNets) but tends to be time-consuming due to its inherent algorithmic lockings. Auxiliary-variable methods, e.g., the penalty and augmented Lagrangian (AL) methods, have attracted much interest lately for their ability to exploit layer-wise parallelism. However, we find that large communication overhead and the lack of data augmentation are two key challenges of these approaches, which may lead to low speedup and a drop in accuracy. Inspired by the continuous-time formulation of ResNets, we propose a novel serial-parallel hybrid (SPH) training strategy that enables the use of data augmentation during training, together with downsampling (DS) filters to reduce the communication cost. This strategy first trains the network by solving a succession of independent sub-problems in parallel and then improves the trained network through a full serial forward-backward propagation of data. We validate our methods on modern ResNets across benchmark datasets, achieving speedup over backpropagation while maintaining comparable accuracy.
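The two-phase structure described in the abstract can be sketched on a toy problem. The snippet below is our own minimal reading of the idea, not the authors' code: a chain of scalar residual blocks f_k(a) = a + w_k·a is first trained layer-parallel against fixed auxiliary activations (each stage's sub-problem is decoupled, so the K stages could run on K workers), and then refined with an ordinary serial forward-backward pass. All names (`w`, `a`, `forward`) and the choice of fixed linearly interpolated auxiliaries (a crude stand-in for the penalty/AL machinery) are illustrative assumptions.

```python
import numpy as np

K = 4                                # number of residual blocks / stages
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=K)    # hypothetical block weights

x, y = 1.0, 2.0                      # single toy input/target
lr = 0.1

# Phase 1: layer-parallel training with auxiliary variables.
# Fix auxiliary activations a_0..a_K (here: linear interpolation from x
# to y); each stage k then fits its local target independently, with no
# data flowing between stages.
a = np.linspace(x, y, K + 1)
for _ in range(200):
    for k in range(K):               # the K updates are independent
        pred = a[k] + w[k] * a[k]
        grad = 2.0 * (pred - a[k + 1]) * a[k]
        w[k] -= lr * grad

# Phase 2: a full serial forward-backward pass over the whole network
# (this is where the abstract's serial propagation, and hence data
# augmentation, would enter; here just plain end-to-end gradient steps).
def forward(x, w):
    acts = [x]
    for wk in w:
        acts.append(acts[-1] + wk * acts[-1])
    return acts

for _ in range(50):
    acts = forward(x, w)
    g = 2.0 * (acts[-1] - y)         # dL/d(output) for squared loss
    grads = np.empty(K)
    for k in reversed(range(K)):
        grads[k] = g * acts[k]       # dL/dw_k
        g = g * (1.0 + w[k])         # backprop through a + w*a
    w -= lr * grads
```

After phase 1 alone the chained blocks already reach the target on this toy problem (each stage matched its auxiliary target exactly); phase 2 then plays the role of the serial correction that the actual method needs when the auxiliary variables are imperfect.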

Author Information

Qi Sun (Peking University, Tsinghua University)
Hexin Dong (Peking University)
Zewei Chen (The Hong Kong University of Science and Technology)
WeiZhen Dian (Peking University)
Jiacheng Sun (Huawei Technologies Co., Ltd)
Yitong Sun (University of Michigan)
Zhenguo Li (Noah's Ark Lab, Huawei Tech Investment Co Ltd)
Bin Dong (Peking University)
