Poster
Implicit Regularization of Decentralized Gradient Descent for Sparse Regression
Tongle Wu · Ying Sun
West Ballroom A-D #6006
Thu 12 Dec 11 a.m. PST — 2 p.m. PST
Abstract:
We consider learning a sparse model from linear measurements taken by a network of agents. Unlike existing decentralized methods built on LASSO regression with an explicit $\ell_1$-norm penalty, we exploit the implicit regularization of gradient descent and analyze its decentralized counterpart applied to an over-parameterized, nonconvex least-squares formulation without penalization. Our first result shows that, despite nonconvexity, the classical decentralized gradient descent (DGD) algorithm with small initialization and early stopping computes the statistically optimal solution. We further provide sufficient conditions on the initialization scale, step size, network connectivity, and stopping time that guarantee convergence. Our result recovers the convergence rate of gradient descent in the centralized setting, showing its tightness. Building on the analysis of DGD, we propose a communication-efficient variant, termed T-DGD, which truncates the iterates before transmission. In the high signal-to-noise ratio (SNR) regime, T-DGD achieves statistical accuracy comparable to DGD while its communication cost is only logarithmic in the number of parameters. Numerical results validate the effectiveness of DGD and T-DGD for sparse learning through implicit regularization.
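The setting described above can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration rather than the paper's exact construction: the over-parameterization $\theta_i = u_i \odot u_i - v_i \odot v_i$, the 4-agent ring mixing matrix, the problem sizes, and the chosen initialization scale, step size, and stopping time are all placeholders standing in for the conditions the paper derives.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Illustrative problem sizes (assumptions, not the paper's setup) ---
m, d, s, n_agents = 50, 40, 3, 4      # samples/agent, dimension, sparsity, agents
theta_star = np.zeros(d)
theta_star[:s] = 1.0                  # sparse ground-truth signal
A = [rng.standard_normal((m, d)) / np.sqrt(m) for _ in range(n_agents)]
y = [A[i] @ theta_star + 0.01 * rng.standard_normal(m) for i in range(n_agents)]

# Symmetric, doubly stochastic mixing matrix for a 4-agent ring (assumption)
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])

# Over-parameterize theta_i = u_i*u_i - v_i*v_i; no explicit l1 penalty anywhere
alpha, eta, T = 1e-3, 0.05, 1000      # small init scale, step size, stopping time

u = alpha * np.ones((n_agents, d))
v = alpha * np.ones((n_agents, d))

for t in range(T):
    theta = u * u - v * v             # per-agent estimates, shape (n_agents, d)
    # gradient of 0.5*||A_i theta_i - y_i||^2 with respect to theta_i
    g = np.stack([A[i].T @ (A[i] @ theta[i] - y[i]) for i in range(n_agents)])
    # DGD: average with neighbors (W-mixing), then take a local gradient step;
    # the chain rule gives d/du = 2u*g and d/dv = -2v*g
    u = W @ u - eta * 2 * u * g
    v = W @ v + eta * 2 * v * g
    # (T-DGD would transmit only the largest entries of u, v before mixing)

theta_hat = (u * u - v * v).mean(axis=0)
err = np.linalg.norm(theta_hat - theta_star) / np.linalg.norm(theta_star)
print(f"relative error at early stopping: {err:.3f}")
```

With small initialization, the support coordinates of $u$ grow geometrically while the off-support coordinates stay near the initialization scale, so stopping early yields a sparse estimate without any penalty term; the comment in the loop marks where T-DGD's truncation before transmission would go.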