Combining Online CUR Decomposition and Matrix Sketching for Data Streams in Open Feature Spaces
Abstract
Online learning is crucial for real-time prediction on streaming data. Two key approaches are sparse online learning, which promotes model efficiency and interpretability, and online active learning, which reduces labeling costs by querying only the most informative instances. However, existing methods typically treat the two separately and assume a static feature space, ignoring the fact that features may appear or disappear over time. To bridge this gap, we propose Online-CUR-MS, an online learning framework for data streams in open feature spaces that combines sparsity constraints with active learning under labeling budgets. Its core is a novel online CUR decomposition, strengthened by Huber regularization for robustness to outliers and by matrix sketching for memory efficiency. Empirical results on dynamic benchmarks show that it achieves higher accuracy and sparser models than state-of-the-art baselines.