NIPS Poster Fast Algorithms for Robust PCA via Gradient Descent

Xinyang Yi · Dohyung Park · Yudong Chen · Constantine Caramanis

Area 5+6+7+8 #28

Keywords: [ Learning Theory ] [ Matrix Factorization ] [ (Other) Optimization ] [ Sparsity and Feature Selection ] [ (Other) Machine Learning Topics ] [ Component Analysis (ICA,PCA,CCA, FLDA) ]

Abstract: We consider the problem of Robust PCA in the fully and partially observed settings. Without corruptions, this is the well-known matrix completion problem. From a statistical standpoint this problem has been recently well-studied, and the conditions under which recovery is possible via polynomial-time algorithms (how many observations we need, how many corruptions we can tolerate) are by now understood. This paper presents and analyzes a non-convex optimization approach that greatly reduces the computational complexity of the above problems, compared to the best available algorithms. In particular, in the fully observed case, with $r$ denoting rank and $d$ dimension, we reduce the complexity from $O(r^2d^2\log(1/\epsilon))$ to $O(rd^2\log(1/\epsilon))$, a big savings when the rank is big. For the partially observed case, we show the complexity of our algorithm is no more than $O(r^4d\log(d)\log(1/\epsilon))$. Not only is this the best-known run-time for a provable algorithm under partial observation, but in the setting where $r$ is small compared to $d$, it also allows for near-linear-in-$d$ run-time that can be exploited in the fully-observed case as well, by simply running our algorithm on a subset of the observations.
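To make the scaling claims in the abstract concrete, the following is a minimal sketch that evaluates only the leading terms of the three stated bounds. It is an illustration, not the paper's algorithm: big-O notation hides constant factors, so the numbers indicate growth rates rather than actual run-times. The function names and the sample values of `r`, `d`, and `eps` are assumptions chosen for illustration.

```python
import math

# Leading terms of the bounds quoted in the abstract, with all hidden
# constants set to 1 (an assumption; real run-times depend on them).

def full_previous(r, d, eps):
    """Prior fully observed bound: r^2 d^2 log(1/eps)."""
    return r**2 * d**2 * math.log(1 / eps)

def full_new(r, d, eps):
    """Fully observed bound from the abstract: r d^2 log(1/eps)."""
    return r * d**2 * math.log(1 / eps)

def partial_new(r, d, eps):
    """Partially observed bound: r^4 d log(d) log(1/eps)."""
    return r**4 * d * math.log(d) * math.log(1 / eps)

# Hypothetical problem size: low rank relative to dimension.
r, d, eps = 10, 100_000, 1e-6

# Ratio of the two fully observed bounds; algebraically this is r.
speedup = full_previous(r, d, eps) / full_new(r, d, eps)

# Ratio of the fully observed bound to the partially observed one;
# algebraically d / (r^3 log d), which exceeds 1 when r is small versus d.
subsample_gain = full_new(r, d, eps) / partial_new(r, d, eps)

print(f"fully observed improvement factor: {speedup:.1f}")
print(f"fully observed vs. subsampled bound ratio: {subsample_gain:.2f}")
```

Under these assumptions, the first ratio reduces to $r$ and the second to $d/(r^3\log d)$, which is why the subsampling route is attractive only when $r$ is small compared to $d$.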
