Existing literature on adversarial training has largely focused on improving models' robustness. In this paper, we demonstrate an intriguing phenomenon about adversarial training: adversarial robustness, unlike clean accuracy, is highly sensitive to the input data distribution. Even a semantics-preserving transformation of the input data distribution can drastically change the robustness of an adversarially trained model that is both trained and evaluated on the new distribution.

We discover this sensitivity by analyzing the Bayes classifier's clean accuracy and robust accuracy. Extensive empirical investigation confirms our finding. Numerous neural networks trained on variants of MNIST and CIFAR10 achieve comparable clean accuracies, yet exhibit very different robustness when adversarially trained. This counter-intuitive phenomenon suggests that the input data distribution alone, not necessarily the task itself, can affect the adversarial robustness of trained neural networks. Lastly, we discuss practical implications for evaluating adversarial robustness, and make initial attempts to understand this complex phenomenon.
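
To make the experimental setup concrete, below is a minimal, hypothetical sketch (in PyTorch, not the paper's code) of this kind of evaluation: a saturation-style pixel transform that preserves image semantics, an L-infinity PGD attack, and a routine that measures clean and robust accuracy on the transformed distribution. The function names, hyperparameters, and the particular transform are illustrative assumptions, not the method used in the paper.

```python
import torch
import torch.nn.functional as F

def saturate(x, p=2.0):
    """Semantics-preserving pixel transform: p=2 is the identity,
    larger p pushes pixel values toward 0/1 while keeping the image
    recognizable (an illustrative choice of distribution shift)."""
    x = torch.clamp(x, 0.0, 1.0)
    return torch.sign(2 * x - 1) * torch.abs(2 * x - 1).pow(2.0 / p) / 2 + 0.5

def pgd_attack(model, x, y, eps=0.3, alpha=0.01, steps=40):
    """Standard L_inf PGD: iteratively ascend the loss within an eps-ball."""
    x_adv = torch.clamp(x + torch.empty_like(x).uniform_(-eps, eps), 0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)
    return x_adv.detach()

def evaluate(model, loader, transform=saturate):
    """Clean and robust accuracy, both measured on the transformed distribution."""
    clean = robust = total = 0
    for x, y in loader:
        x = transform(x)  # the model is assumed to be trained on this same distribution
        with torch.no_grad():
            clean += (model(x).argmax(1) == y).sum().item()
        x_adv = pgd_attack(model, x, y)
        with torch.no_grad():
            robust += (model(x_adv).argmax(1) == y).sum().item()
        total += y.numel()
    return clean / total, robust / total
```

Under this kind of setup, the abstract's claim is that sweeping the transform parameter (here, `p`) leaves clean accuracy roughly unchanged while robust accuracy can vary substantially.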

Related Research