
Learning single-index models with shallow neural networks
Alberto Bietti · Joan Bruna · Clayton Sanford · Min Jae Song

Tue Nov 29 09:00 AM -- 11:00 AM (PST) @ Hall J #508

Single-index models are a class of functions given by an unknown univariate "link" function applied to an unknown one-dimensional projection of the input. These models are particularly relevant in high dimension, when the data might present low-dimensional structure that learning algorithms should adapt to. While several statistical aspects of this model, such as the sample complexity of recovering the relevant (one-dimensional) subspace, are well-understood, they rely on tailored algorithms that exploit the specific structure of the target function. In this work, we introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow. More precisely, we consider shallow networks in which biases of the neurons are frozen at random initialization. We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
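As an illustration of the setup described in the abstract, the following is a minimal sketch (not the paper's exact construction): data is drawn from a single-index target y = g(⟨w*, x⟩), and a shallow ReLU network is fit by gradient descent on the squared loss while the neuron biases stay frozen at their random initialization. The choice of link function (tanh), the dimensions, the learning rate, and the use of discrete gradient descent rather than gradient flow are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 20, 2000, 100  # input dimension, sample size, hidden width (illustrative)

# Single-index target: y = g(<w*, x>) with an unknown unit direction w* and link g.
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
g = np.tanh  # illustrative stand-in for the unknown link function

X = rng.standard_normal((n, d))
y = g(X @ w_star)

# Shallow ReLU network f(x) = sum_j a_j * relu(<w_j, x> + b_j).
# The biases b are frozen at random initialization; only (W, a) are trained.
W = rng.standard_normal((m, d)) / np.sqrt(d)
b = rng.standard_normal(m)              # frozen throughout training
a = rng.standard_normal(m) / np.sqrt(m)

def forward(X, W, a, b):
    H = np.maximum(X @ W.T + b, 0.0)    # (n, m) hidden activations
    return H @ a, H

lr, steps = 0.02, 1500
mse0 = np.mean((forward(X, W, a, b)[0] - y) ** 2)  # error at initialization

for _ in range(steps):
    pred, H = forward(X, W, a, b)
    r = pred - y                         # residuals; loss = 0.5 * mean(r**2)
    grad_a = H.T @ r / n
    mask = (H > 0.0).astype(float)       # ReLU derivative
    grad_W = ((mask * r[:, None]) * a[None, :]).T @ X / n
    a -= lr * grad_a                     # train outer weights
    W -= lr * grad_W                     # train inner weights; b never updated

mse = np.mean((forward(X, W, a, b)[0] - y) ** 2)
```

After training, `mse` should be substantially below the error at initialization `mse0`, consistent with the benign-landscape picture; recovering the near-optimal sample complexity discussed in the paper requires the specific analysis there, not this toy run.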

Author Information

Alberto Bietti (Meta AI / NYU)
Joan Bruna (NYU)
Clayton Sanford (Columbia University)
Min Jae Song (New York University)

I am a fifth-year PhD candidate advised by Prof. Joan Bruna and Prof. Oded Regev at the Courant Institute, NYU. I am a member of the CILVR (Computational Intelligence, Learning, Vision and Robotics) group and the MaD (Math and Data) group. I am interested in theoretical computer science and machine learning. My focus is on understanding the limitations of learning using computational intractability assumptions and the power of learning with neural networks.
