Timezone: »
Matrix factorization (MF) is a versatile learning method that has found wide applications in various data-driven disciplines. Still, many MF algorithms do not adequately scale with the size of available datasets and/or lack interpretability. To improve the computational efficiency of the method, an online (streaming) MF algorithm was proposed in Mairal et al., 2010. To enable data interpretability, a constrained version of MF, termed convex MF, was introduced in Ding et al., 2010. In the latter work, the basis vectors are required to lie in the convex hull of the data samples, thereby ensuring that every basis can be interpreted as a weighted combination of data samples. No current algorithmic solutions for online convex MF are known as it is challenging to find adequate convex bases without having access to the complete dataset. We address both problems by proposing the first online convex MF algorithm that maintains a collection of constant-size sets of representative data samples needed for interpreting each of the basis (Ding et al., 2010) and has the same almost sure convergence guarantees as the online learning algorithm of Mairal et al., 2010. Our proof techniques combine random coordinate descent algorithms with specialized quasi-martingale convergence analysis. Experiments on synthetic and real world datasets show significant computational savings of the proposed online convex MF method compared to classical convex MF. Since the proposed method maintains small representative sets of data samples needed for convex interpretations, it is related to a body of work in theoretical computer science, pertaining to generating point sets (Blum et al., 2016), and in computer vision, pertaining to archetypal analysis (Mei et al., 2018). Nevertheless, it differs from these lines of work both in terms of the objective and algorithmic implementations.
Author Information
Jianhao Peng (University of Illinois at Urbana Champaign)
Olgica Milenkovic (University of Illinois at Urbana-Champaign)
Abhishek Agarwal (University of Illinois at Urbana Champaign)
More from the Same Authors
-
2022 : Certified Graph Unlearning »
Eli Chien · Chao Pan · Olgica Milenkovic -
2023 Poster: Differentially Private Decoupled Graph Convolutions for Multigranular Topology Protection »
Eli Chien · Wei-Ning Chen · Chao Pan · Pan Li · Ayfer Ozgur · Olgica Milenkovic -
2019 Poster: Optimizing Generalized PageRank Methods for Seed-Expansion Community Detection »
Pan Li · Eli Chien · Olgica Milenkovic -
2018 Poster: Query K-means Clustering and the Double Dixie Cup Problem »
Eli Chien · Chao Pan · Olgica Milenkovic -
2018 Poster: Revisiting Decomposable Submodular Function Minimization with Incidence Relations »
Pan Li · Olgica Milenkovic -
2018 Poster: Quadratic Decomposable Submodular Function Minimization »
Pan Li · Niao He · Olgica Milenkovic -
2017 Poster: Inhomogeneous Hypergraph Clustering with Applications »
Pan Li · Olgica Milenkovic -
2017 Spotlight: Inhomogoenous Hypergraph Clustering with Applications »
Pan Li · Olgica Milenkovic -
2012 Workshop: Social Choice: Theory and Practice »
Behrouz Touri · Olgica Milenkovic · Faramarz Fekri