Coresets for Robust Training of Deep Neural Networks against Noisy Labels
Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec
Abstract: Modern neural networks have the capacity to overfit the noisy labels frequently found in real-world datasets. Although great progress has been made, existing techniques provide very limited theoretical guarantees for the performance of neural networks trained with noisy labels. To tackle this challenge, we propose a novel approach with strong theoretical guarantees for robustly training neural networks on datasets with noisy labels. The key idea behind our method is to select subsets of clean data points that provide an approximately low-rank Jacobian matrix. We then prove that gradient descent applied to these subsets cannot overfit the noisy labels, even without regularization or early stopping. Our extensive experiments corroborate our theory and demonstrate that deep networks trained on our subsets achieve significantly superior performance, e.g., a 7% increase in accuracy on mini Webvision with 50% noisy labels, compared to state-of-the-art methods.
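To make the key idea concrete, below is a minimal sketch of one way such a subset could be selected: per class, pick medoids of per-example gradients so that the retained points have tightly clustered gradients, a proxy for an approximately low-rank Jacobian. This is an illustrative assumption, not the authors' implementation; the helper names `select_coreset` and `greedy_k_medoids`, the gradient inputs, and the 50% selection fraction are all hypothetical.

```python
import numpy as np

def greedy_k_medoids(dissimilarity, k):
    """Greedily pick k medoids minimizing total distance of all points to
    their nearest medoid (a standard facility-location-style heuristic;
    hypothetical helper, not the paper's exact algorithm)."""
    n = dissimilarity.shape[0]
    selected = []
    nearest = np.full(n, np.inf)  # distance of each point to nearest medoid so far
    for _ in range(k):
        # total distance if candidate i were added as the next medoid
        gains = np.minimum(nearest[None, :], dissimilarity).sum(axis=1)
        best = int(np.argmin(gains))
        selected.append(best)
        nearest = np.minimum(nearest, dissimilarity[best])
    return selected

def select_coreset(grads, labels, frac=0.5):
    """Select a per-class coreset of points whose per-example gradients
    cluster tightly; outlying (likely noisy-label) gradients are skipped."""
    indices = []
    for c in np.unique(labels):
        cls_idx = np.where(labels == c)[0]
        g = grads[cls_idx]
        # pairwise Euclidean distances between per-example gradient vectors
        d = np.linalg.norm(g[:, None, :] - g[None, :, :], axis=-1)
        k = max(1, int(frac * len(cls_idx)))
        indices.extend(cls_idx[greedy_k_medoids(d, k)].tolist())
    return np.array(indices)

# Toy usage with random stand-ins for per-example gradients.
rng = np.random.default_rng(0)
grads = rng.normal(size=(100, 16))
labels = rng.integers(0, 5, size=100)
coreset = select_coreset(grads, labels, frac=0.5)
print(len(coreset), "examples selected")
```

Training would then proceed by running ordinary gradient descent only on the selected indices, re-selecting the coreset periodically as the network's gradients evolve.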