NIPS Poster Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering

Dogyoon Song · Christina Lee · Yihua Li · Devavrat Shah

Area 5+6+7+8 #39

Keywords: [ Learning Theory ] [ (Other) Probabilistic Models and Methods ] [ Similarity and Distance Learning ] [ (Application) Collaborative Filtering and Recommender Systems ] [ Ranking and Preference Learning ]

Abstract: We introduce the framework of blind regression motivated by matrix completion for recommendation systems: given $m$ users, $n$ movies, and a subset of user-movie ratings, the goal is to predict the unobserved user-movie ratings given the data, i.e., to complete the partially observed matrix. Following the framework of non-parametric statistics, we posit that user $u$ and movie $i$ have features $x_1(u)$ and $x_2(i)$ respectively, and their corresponding rating $y(u,i)$ is a noisy measurement of $f(x_1(u), x_2(i))$ for some unknown function $f$. In contrast with classical regression, the features $x = (x_1(u), x_2(i))$ are not observed, making it challenging to apply standard regression methods to predict the unobserved ratings. Inspired by the classical Taylor's expansion for differentiable functions, we provide a prediction algorithm that is consistent for all Lipschitz functions. In fact, the analysis through our framework naturally leads to a variant of collaborative filtering, shedding insight into the widespread success of collaborative filtering in practice. Assuming each entry is sampled independently with probability at least $\max(m^{-1+\delta}, n^{-1/2+\delta})$ with $\delta > 0$, we prove that the expected fraction of our estimates with error greater than $\epsilon$ is less than $\gamma^2 / \epsilon^2$ plus a polynomially decaying term, where $\gamma^2$ is the variance of the additive entry-wise noise term. Experiments with the MovieLens and Netflix datasets suggest that our algorithm provides principled improvements over basic collaborative filtering and is competitive with matrix factorization methods.
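To make the Taylor-expansion intuition concrete, here is a minimal sketch of a user-user, mean-adjusted neighbor estimate. It is an illustration under stated assumptions, not the authors' algorithm: the function name `predict_entry`, the overlap threshold `beta`, the choice to use only the single neighbor with the smallest variance of rating differences, and the toy additive rating matrix are all hypothetical and do not come from the abstract. The idea is that if users $u$ and $v$ have both rated movie $j$, then $y(v,i) + y(u,j) - y(v,j)$ is a first-order proxy for the unobserved $y(u,i)$.

```python
import numpy as np

def predict_entry(Y, mask, u, i, beta=2):
    """Estimate the unobserved Y[u, i] (hypothetical helper, not from the paper).

    Y    : 2-D float array of ratings (values where mask is False are ignored)
    mask : 2-D bool array, True where a rating is observed
    beta : minimum number of commonly rated columns required (assumed parameter)
    """
    best_var, best_est = np.inf, np.nan
    for v in range(Y.shape[0]):
        # A neighbor must be a different user who has rated movie i.
        if v == u or not mask[v, i]:
            continue
        # Columns rated by both users, excluding the target column.
        common = mask[u] & mask[v]
        common[i] = False
        if common.sum() < beta:
            continue
        # Rating differences on the overlap; low variance suggests the
        # first-order approximation is reliable for this neighbor.
        diff = Y[u, common] - Y[v, common]
        var = diff.var()
        if var < best_var:
            best_var = var
            # Neighbor's rating for i, shifted by the average offset.
            best_est = Y[v, i] + diff.mean()
    return best_est

# Toy noise-free example with an additive latent function f(a, b) = a + b.
a = np.array([1.0, 2.0, 3.0])          # assumed latent user features
b = np.array([0.5, 1.5, 2.5, 3.5])     # assumed latent movie features
Y = a[:, None] + b[None, :]
mask = np.ones_like(Y, dtype=bool)
mask[0, 0] = False                     # hide one entry and try to recover it
est = predict_entry(Y, mask, 0, 0)
```

In this toy setting the rating differences between any two users are constant, so the hidden entry is recovered exactly. With noisy ratings or a non-additive $f$, the estimate would only be approximate, and averaging over several low-variance neighbors would likely be more robust than picking one.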