NIPS Poster Lower bounds on the robustness to adversarial perturbations

Poster

Lower bounds on the robustness to adversarial perturbations

Jonathan Peck · Joris Roels · Bart Goossens · Yvan Saeys

Pacific Ballroom #138

Abstract:

The input-output mappings learned by state-of-the-art neural networks are significantly discontinuous. It is possible to cause a neural network used for image recognition to misclassify its input by applying very specific, hardly perceptible perturbations to the input, called adversarial perturbations. Many hypotheses have been proposed to explain the existence of these peculiar samples, and several methods have been proposed to mitigate them. A proven explanation remains elusive, however. In this work, we take steps towards a formal characterization of adversarial perturbations by deriving lower bounds on the magnitudes of perturbations necessary to change the classification of neural networks. The bounds are experimentally verified on the MNIST and CIFAR-10 data sets.
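
To make the idea of a lower bound on perturbation magnitude concrete, the sketch below works through the simplest case, a binary linear classifier f(x) = w·x + b. It is an illustration only and not the bounds derived in the paper: for a linear model, the L2 distance from x to the decision boundary is |f(x)| / ||w||, so no perturbation with a smaller L2 norm can change the predicted class. The function names and the example weights are assumptions chosen for illustration.

```python
import math


def linear_score(w, b, x):
    """Score of a binary linear classifier; the predicted class is the sign."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def min_l2_perturbation(w, b, x):
    """L2 distance from x to the hyperplane w.x + b = 0.

    Any perturbation with a smaller L2 norm leaves the sign of the score,
    and therefore the predicted class, unchanged.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(linear_score(w, b, x)) / norm_w


def perturb_towards_boundary(w, b, x, magnitude):
    """Move x by `magnitude` (L2 norm) along the shortest path to the boundary."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    direction = -1.0 if linear_score(w, b, x) > 0 else 1.0
    return [xi + direction * magnitude * wi / norm_w for xi, wi in zip(x, w)]


# Hypothetical toy classifier and input (not taken from the paper).
w = [3.0, 4.0]
b = -5.0
x = [3.0, 4.0]

bound = min_l2_perturbation(w, b, x)

# A perturbation slightly below the bound keeps the original class,
# while one slightly above it crosses the decision boundary.
below = perturb_towards_boundary(w, b, x, 0.9 * bound)
above = perturb_towards_boundary(w, b, x, 1.1 * bound)

print("lower bound on L2 perturbation:", bound)
print("score below bound:", linear_score(w, b, below))
print("score above bound:", linear_score(w, b, above))
```

For deep networks no such closed form exists, which is why bounds of the kind the abstract describes have to be derived from the structure of the network rather than read off directly.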
