On the interplay between data structure and loss function in classification problems
Stéphane d'Ascoli · Marylou Gabrié · Levent Sagun · Giulio Biroli

Thu Dec 09 08:30 AM -- 10:00 AM (PST)

One of the central features of modern machine learning models, including deep neural networks, is their generalization ability on structured data in the over-parametrized regime. In this work, we consider an analytically solvable setup to investigate how properties of data impact learning in classification problems, and compare the results obtained for quadratic loss and logistic loss. Using methods from statistical physics, we obtain a precise asymptotic expression for the train and test errors of random feature models trained on a simple model of structured data. The input covariance is built from independent blocks, allowing us to tune the saliency of low-dimensional structures and their alignment with respect to the target function. Our results show in particular that in the over-parametrized regime, the impact of data structure on both train and test error curves is greater for logistic loss than for mean-squared loss: the easier the task, the wider the performance gap between the two losses, in favor of the logistic loss. Numerical experiments on MNIST and CIFAR10 confirm our insights.
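To make the setup concrete, here is a minimal numerical sketch of the kind of experiment the abstract describes: Gaussian inputs with a two-block covariance, a random feature model, and a comparison of quadratic and logistic loss. It is an illustration under stated assumptions, not the authors' code or their asymptotic computation. The linear sign teacher, the tanh activation, the block sizes and variances, the ridge penalty `lam`, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def block_vector(d, frac, values):
    """Per-coordinate values for two blocks: the first frac*d coordinates, then the rest."""
    d1 = int(frac * d)
    return np.concatenate([np.full(d1, values[0]), np.full(d - d1, values[1])])

def sample(n, std, beta, rng):
    """Gaussian inputs with diagonal two-block covariance; labels from an assumed linear teacher."""
    x = rng.standard_normal((n, std.size)) * std
    return x, np.sign(x @ beta)

def features(x, F):
    """Random feature map with fixed projection F (tanh is an assumed activation)."""
    return np.tanh(x @ F / np.sqrt(x.shape[1]))

def fit_square(z, y, lam):
    """Closed-form minimizer of the mean quadratic loss with an L2 penalty."""
    n, p = z.shape
    return np.linalg.solve(z.T @ z / n + lam * np.eye(p), z.T @ y / n)

def fit_logistic(z, y, lam, steps=500):
    """Gradient descent on the mean logistic loss with an L2 penalty."""
    n, p = z.shape
    # Step size from an upper bound on the curvature of the objective.
    lr = 1.0 / (0.25 * np.linalg.norm(z, 2) ** 2 / n + lam)
    w = np.zeros(p)
    for _ in range(steps):
        m = y * (z @ w)
        s = 0.5 * (1.0 - np.tanh(0.5 * m))  # equals sigmoid(-m), numerically stable
        w -= lr * (-(z * (y * s)[:, None]).mean(axis=0) + lam * w)
    return w

def error(z, y, w):
    """Fraction of misclassified points."""
    return float(np.mean(np.sign(z @ w) != y))

def run(seed=0, d=50, p=200, n=400, n_test=2000, lam=1e-2):
    rng = np.random.default_rng(seed)
    # Saliency: a small block (10% of coordinates) with large variance.
    std = np.sqrt(block_vector(d, 0.1, (5.0, 0.5)))
    # Alignment: the teacher puts most of its weight on the salient block.
    beta = block_vector(d, 0.1, (1.0, 0.1)) * rng.standard_normal(d)
    F = rng.standard_normal((d, p))
    x_tr, y_tr = sample(n, std, beta, rng)
    x_te, y_te = sample(n_test, std, beta, rng)
    z_tr, z_te = features(x_tr, F), features(x_te, F)
    out = {}
    for name, fit in (("square", fit_square), ("logistic", fit_logistic)):
        w = fit(z_tr, y_tr, lam)
        out[name] = (error(z_tr, y_tr, w), error(z_te, y_te, w))
    return out

errors = run()
print(errors)  # {loss name: (train error, test error)}
```

Changing the block variances or the teacher weights passed to `block_vector` inside `run` varies the saliency and alignment, and changing `p` relative to `n` moves the model between the under- and over-parametrized regimes. A finite-size run like this is noisy and only gestures at the asymptotic curves derived in the paper.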

Author Information

Stéphane d'Ascoli (ENS Paris / Meta AI)

Currently a joint Ph.D. student between ENS (supervised by Giulio Biroli) and FAIR (supervised by Levent Sagun). Working on theory of deep learning.

Marylou Gabrié (NYU)
Levent Sagun (EPFL)
Giulio Biroli (Ecole Normale Superieure)
