Adversarial Examples on Segmentation Models Can be Easy to Transfer
Deep neural network-based image classifiers can be misled by adversarial
examples with small, quasi-imperceptible perturbations. Furthermore,
adversarial examples created on one classification model can also fool other,
different models. This transferability of adversarial examples has recently
attracted growing interest, since it makes black-box attacks on classification
models feasible. As an extension of classification, semantic segmentation has
also received much attention regarding its adversarial robustness. However, the
transferability of adversarial examples on segmentation models has not been
systematically studied. In this work, we study this topic intensively. First,
we explore the overfitting phenomenon of adversarial examples on classification
and segmentation models. In contrast to the observation on classification
models that transferability is limited by overfitting to the source model,
we find that adversarial examples on segmentation models do not always overfit
their source models. Even when no overfitting is present, the transferability
of the adversarial examples is still limited. We attribute this limitation to an
architectural trait of segmentation models, namely multi-scale object
recognition. We then propose a simple and effective method, dubbed dynamic
scaling, to overcome this limitation. The high transferability achieved by our
method shows that, in contrast to observations in previous work,
adversarial examples on a segmentation model can be easy to transfer to other
segmentation models. Our analysis and proposals are supported by extensive
experiments.
Authors
Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr
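The abstract names "dynamic scaling" but does not describe it. Below is a minimal sketch, assuming a PyTorch segmentation model that returns per-pixel logits at the input resolution, of how random input rescaling could be folded into an iterative attack so the perturbation is not tuned to a single scale. The function name scaled_pgd_attack, the scale set, and all hyperparameters are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch of an iterative attack on a segmentation model in which the
# loss is evaluated at a randomly chosen input scale each step, so the perturbation
# is not over-specialized to one resolution. Not the authors' implementation of
# "dynamic scaling"; names, scale range, and step sizes are assumptions.
import random
import torch
import torch.nn.functional as F


def scaled_pgd_attack(model, image, label, eps=8 / 255, alpha=2 / 255, steps=40,
                      scales=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Craft an L_inf-bounded adversarial example for a segmentation model.

    Assumes `image` is (N, 3, H, W) in [0, 1], `label` is (N, H, W) with class ids,
    and `model(x)` returns raw logits of shape (N, C, h, w) matching the input size.
    """
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)

        # Pick a scale for this step and resize input and target accordingly.
        s = random.choice(scales)
        h, w = image.shape[-2:]
        size = (max(1, int(h * s)), max(1, int(w * s)))
        adv_s = F.interpolate(adv, size=size, mode="bilinear", align_corners=False)
        label_s = F.interpolate(label.unsqueeze(1).float(), size=size,
                                mode="nearest").squeeze(1).long()

        # Per-pixel cross-entropy on the rescaled input.
        logits = model(adv_s)
        loss = F.cross_entropy(logits, label_s, ignore_index=255)

        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()                # ascend the loss
            adv = image + (adv - image).clamp(-eps, eps)   # project into L_inf ball
            adv = adv.clamp(0, 1).detach()                 # keep a valid image
    return adv
```

Evaluating the loss at varying scales is one plausible way to keep the perturbation effective across the multi-scale recognition behavior the abstract points to; the actual dynamic scaling procedure is defined in the paper itself.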