

Poster

On the Optimal Weighted $\ell_2$ Regularization in Overparameterized Linear Regression

Denny Wu · Ji Xu

Poster Session 6 #1726

Abstract: We consider the linear model $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$ with $\mathbf{X} \in \mathbb{R}^{n \times p}$ in the overparameterized regime $p > n$. We estimate $\boldsymbol{\beta}$ via generalized (weighted) ridge regression: $\hat{\boldsymbol{\beta}}_\lambda = (\mathbf{X}^\top\mathbf{X} + \lambda\boldsymbol{\Sigma}_w)^{\dagger}\mathbf{X}^\top\mathbf{y}$, where $\boldsymbol{\Sigma}_w$ is the weighting matrix. Under a random design setting with general data covariance $\boldsymbol{\Sigma}_x$ and anisotropic prior on the true coefficients $\mathbb{E}[\boldsymbol{\beta}\boldsymbol{\beta}^\top] = \boldsymbol{\Sigma}_\beta$, we provide an exact characterization of the prediction risk $\mathbb{E}(y - \mathbf{x}^\top\hat{\boldsymbol{\beta}}_\lambda)^2$ in the proportional asymptotic limit $p/n \to \gamma \in (1, \infty)$. Our general setup leads to a number of interesting findings. We outline precise conditions that decide the sign of the optimal setting $\lambda_{\mathrm{opt}}$ for the ridge parameter $\lambda$, which suggests an implicit $\ell_2$ regularization effect of overparameterization, and theoretically justifies the surprising empirical observation that $\lambda_{\mathrm{opt}}$ can be \textit{negative} in the overparameterized regime. We also characterize the double descent phenomenon for principal component regression (PCR) when $\mathbf{X}$ and $\boldsymbol{\beta}$ are non-isotropic. Finally, we determine the optimal $\boldsymbol{\Sigma}_w$ for both the ridgeless ($\lambda \to 0$) and optimally regularized ($\lambda = \lambda_{\mathrm{opt}}$) case, and demonstrate the advantage of the weighted objective over standard ridge regression and PCR.
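The weighted ridge estimator above is simple to compute directly. Below is a minimal NumPy sketch (not the authors' code) of $\hat{\boldsymbol{\beta}}_\lambda = (\mathbf{X}^\top\mathbf{X} + \lambda\boldsymbol{\Sigma}_w)^{\dagger}\mathbf{X}^\top\mathbf{y}$ in the overparameterized regime $p > n$; the dimensions, noise level, and choice of $\boldsymbol{\Sigma}_w$ are illustrative assumptions, and the prediction risk is estimated empirically on fresh samples rather than via the paper's exact asymptotic characterization.

```python
import numpy as np

# Illustrative setup (assumed, not from the paper): isotropic design,
# p > n so the model is overparameterized.
rng = np.random.default_rng(0)
n, p, lam, noise = 100, 300, 0.5, 0.1
Sigma_w = np.eye(p)  # weighting matrix; identity recovers standard ridge

X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + noise * rng.standard_normal(n)

# Generalized (weighted) ridge estimator; the pseudoinverse also
# covers the ridgeless limit lambda -> 0.
beta_hat = np.linalg.pinv(X.T @ X + lam * Sigma_w) @ X.T @ y

# Empirical estimate of the prediction risk E(y - x^T beta_hat)^2
# on fresh test data.
X_test = rng.standard_normal((2000, p))
y_test = X_test @ beta + noise * rng.standard_normal(2000)
risk = np.mean((y_test - X_test @ beta_hat) ** 2)
print(f"estimated prediction risk: {risk:.4f}")
```

Sweeping `lam` over a grid (including small negative values) in this sketch is one way to observe empirically that the risk-minimizing ridge parameter need not be positive when $p > n$.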
