

Poster
in
Workshop: Mathematics of Modern Machine Learning (M3L)

Complexity of Vector-valued Prediction: From Linear Models to Stochastic Convex Optimization

Matan Schliserman · Tomer Koren

Keywords: [ Sample complexity ] [ Vector-valued predictors ]


Abstract: We study the problem of learning vector-valued linear predictors: prediction rules parameterized by a matrix that maps an m-dimensional feature vector to a k-dimensional target. We focus on the fundamental case of a convex and Lipschitz loss function, and present several new theoretical results that shed light on the complexity of this problem and on its connection to related learning models. First, we give a tight characterization of the sample complexity of Empirical Risk Minimization (ERM) in this setting, establishing that Ω̃(k/ε²) examples are necessary for ERM to reach ε excess (population) risk; this is an exponential improvement over recent results of Magen and Shamir (2023) in terms of the dependence on the target dimension k, and matches a classical upper bound due to Maurer (2016). Second, we present a black-box conversion from general d-dimensional Stochastic Convex Optimization (SCO) to vector-valued linear prediction, showing that any SCO problem can be embedded as a prediction problem with k = Θ(d) outputs. These results portray vector-valued linear prediction as a bridge between two extensively studied yet disparate learning models: linear models (corresponding to k = 1) and general d-dimensional SCO (with k = Θ(d)).
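To make the setting concrete, here is a minimal LaTeX sketch of the learning problem the abstract describes. The symbols below (W, x, y, ℓ, B, 𝒟) are our illustrative notation and assumptions, not taken from the paper itself:

```latex
% Vector-valued linear prediction: the hypothesis class consists of
% matrices W mapping features x in R^m to predictions Wx in R^k,
% here assumed Frobenius-norm bounded by some B > 0.
\[
  \mathcal{W} = \bigl\{\, W \in \mathbb{R}^{k \times m} : \|W\|_F \le B \,\bigr\},
  \qquad
  F(W) = \mathbb{E}_{(x,y) \sim \mathcal{D}}\bigl[\ell(Wx, y)\bigr],
\]
% where \ell(\cdot, y) is convex and Lipschitz in its first argument.
% ERM returns \widehat{W} minimizing the empirical average of \ell over
% n i.i.d. samples; the abstract's lower bound states that
% n = \tilde{\Omega}(k/\epsilon^2) samples are necessary to guarantee
% excess population risk
% F(\widehat{W}) - \min_{W \in \mathcal{W}} F(W) \le \epsilon.
```

For k = 1 this reduces to ordinary linear prediction with a convex Lipschitz loss, while the abstract's embedding result shows that with k = Θ(d) outputs the problem is as hard as general d-dimensional SCO.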
