NeurIPS Poster Connecting Certified and Adversarial Training

Yuhao Mao · Mark Müller · Marc Fischer · Martin Vechev

Great Hall & Hall B1+B2 (level 1) #1708


Abstract: Training certifiably robust neural networks remains a notoriously hard problem. While adversarial training optimizes under-approximations of the worst-case loss, which leads to insufficient regularization for certification, sound certified training methods optimize loose over-approximations, leading to over-regularization and poor (standard) accuracy. In this work, we propose TAPS, an (unsound) certified training method that combines IBP and PGD training to optimize more precise, although not necessarily sound, worst-case loss approximations, reducing over-regularization and increasing certified and standard accuracies. Empirically, TAPS achieves a new state-of-the-art in many settings, e.g., reaching a certified accuracy of $22\%$ on TinyImageNet for $\ell_\infty$-perturbations with radius $\epsilon=1/255$. We make our implementation and networks public at https://github.com/eth-sri/taps.
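To make the gap between under- and over-approximation concrete, the following minimal sketch compares the two on a toy problem. It is not TAPS and is not taken from the authors' repository. The two-layer ReLU network, its weights, the input `x0`, the radius `eps`, and all function names are hypothetical. The scalar network output stands in for the loss, and a crude evaluation at the corners of the $\ell_\infty$ box stands in for a PGD attack. Any value found by an attack is a lower bound on the true worst case, while interval bound propagation (IBP) yields a sound but possibly loose upper bound.

```python
from itertools import product

# Hypothetical toy network: out = W2 @ relu(W1 @ x + b1) + b2 (scalar output).
W1 = [[1.0, -1.0], [1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, -1.0]]
b2 = [0.0]


def relu(v):
    return [max(0.0, t) for t in v]


def affine(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def network(x):
    # Scalar output, used here as a stand-in for the loss.
    return affine(W2, b2, relu(affine(W1, b1, x)))[0]


def affine_bounds(W, b, lo, hi):
    # Interval arithmetic: a positive weight takes the matching bound,
    # a negative weight takes the opposite one.
    out_lo, out_hi = [], []
    for row, bi in zip(W, b):
        out_lo.append(bi + sum(w * (l if w >= 0 else h)
                               for w, l, h in zip(row, lo, hi)))
        out_hi.append(bi + sum(w * (h if w >= 0 else l)
                               for w, l, h in zip(row, lo, hi)))
    return out_lo, out_hi


def ibp_upper(x, eps):
    # Sound over-approximation of the maximum output over the l_inf ball.
    lo = [xi - eps for xi in x]
    hi = [xi + eps for xi in x]
    lo, hi = affine_bounds(W1, b1, lo, hi)
    lo, hi = relu(lo), relu(hi)  # ReLU is monotone, so bounds map to bounds
    lo, hi = affine_bounds(W2, b2, lo, hi)
    return hi[0]


def corner_attack(x, eps):
    # Under-approximation: the best value among the corners of the box.
    # A real attack such as PGD would search with gradients instead.
    return max(
        network([xi + s * eps for xi, s in zip(x, signs)])
        for signs in product((-1.0, 1.0), repeat=len(x))
    )


x0, eps = [0.5, 0.5], 0.1
lower = corner_attack(x0, eps)
upper = ibp_upper(x0, eps)
print(f"attack lower bound: {lower:.3f}, IBP upper bound: {upper:.3f}")
```

On this toy input the two bounds differ, because IBP treats the two hidden neurons as independent even though they depend on the same inputs. This is the kind of looseness that, per the abstract, causes over-regularization when such bounds are optimized during training.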
