Spotlight
Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang

Thu Dec 12th 10:20 -- 10:25 AM @ West Exhibition Hall B

Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in an adversarial training setting to boost the provable robustness of smoothed classifiers. We demonstrate through extensive experimentation that our method consistently outperforms all existing provably $\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we find that pre-training and semi-supervised learning boost adversarially trained smoothed classifiers even further. Our code and trained models are available at http://github.com/Hadisalman/smoothing-adversarial.
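For readers new to randomized smoothing, the following is a minimal sketch of the general idea the abstract builds on; it is not the authors' implementation and does not cover their attack or training procedure. It assumes the standard Gaussian-smoothing setup: the smoothed classifier returns the class most often predicted by a base classifier under isotropic Gaussian noise with standard deviation sigma, and the $\ell_2$ radius is sigma/2 times the gap between the inverse normal CDF values of the top and runner-up class probabilities. The names `smoothed_predict`, `l2_radius`, and `toy_classifier` are hypothetical, and the base classifier is a toy threshold rule rather than a neural network.

```python
import random
from collections import Counter
from statistics import NormalDist


def smoothed_predict(base_classifier, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the smoothed prediction at x.

    Returns the majority class and the empirical vote fraction per class.
    """
    votes = Counter()
    for _ in range(n_samples):
        # Perturb every coordinate with independent N(0, sigma^2) noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    top_class, _ = votes.most_common(1)[0]
    fractions = {c: k / n_samples for c, k in votes.items()}
    return top_class, fractions


def l2_radius(p_top, p_runner_up, sigma):
    """Radius sigma/2 * (Phi^-1(p_top) - Phi^-1(p_runner_up)).

    Both probabilities must lie strictly between 0 and 1. A real
    certificate would plug in a statistical lower bound on p_top and an
    upper bound on p_runner_up, not raw empirical fractions.
    """
    inv = NormalDist().inv_cdf
    return 0.5 * sigma * (inv(p_top) - inv(p_runner_up))


def toy_classifier(x):
    # Hypothetical base classifier: class 1 if the first coordinate is positive.
    return int(x[0] > 0.0)


rng = random.Random(0)
label, fractions = smoothed_predict(toy_classifier, [1.0, 0.0], 0.5, 1000, rng)
```

In this toy case the exact top-class probability is the normal CDF at 2, so the radius formula gives 1.0, which matches the distance from the input to the decision boundary.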

Author Information

Hadi Salman (Microsoft Research AI)
Jerry Li (Microsoft)
Ilya Razenshteyn (Microsoft Research)
Pengchuan Zhang (Microsoft Research)
Huan Zhang (Microsoft Research AI)
Sebastien Bubeck (Microsoft Research)
Greg Yang (Microsoft Research)
