Gradient Masking of Label Smoothing in Adversarial Robustness

Hyungyu Lee, Ho Bae, Sungroh Yoon

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Deep neural networks (DNNs) have achieved impressive results in several image classification tasks. However, these architectures are vulnerable to adversarial examples (AEs), that is, inputs crafted with a hardly perceptible perturbation intended to cause neural networks to make errors. AEs must be considered to prevent accidents in areas such as autonomous driving that relies on visual object detection in Internet of Things (IoT) networks. Gaussian noise combined with label smoothing or logit squeezing can be used during DNN training to increase robustness against AEs. However, from a model interpretability perspective, Gaussian noise with label smoothing does not increase the adversarial robustness of the model. To resolve this problem, we examine the AEs themselves instead of only measuring the accuracy of the model against AEs. Considering that a robust model shows a small curvature of the loss surface, we propose a metric to measure the strength of the AEs and the robustness of the model. Furthermore, we introduce a method, based on the black-box attack sanity check, to verify whether the model has obfuscated gradients. The proposed method enables us to identify the gradient masking problem, wherein the model does not provide useful gradients, and to exploit false defenses. We evaluate our technique against representative adversarially trained models using the CIFAR10, CIFAR100, SVHN, and Restricted ImageNet datasets. Our results show that the measured performance of some false defense models decreases by up to 32% compared with previous evaluation metrics. Moreover, our metric reveals that traditional metrics used to measure the robustness of a model may produce false results.
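
The three training ingredients named in the abstract (Gaussian input noise, label smoothing, and logit squeezing) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it assumes the common formulations, namely a smoothed target of (1 - alpha) * one_hot + alpha / K, a squared-L2 penalty on the logits, and additive noise clipped to the [0, 1] pixel range. All function names and default values (`alpha`, `weight`, `sigma`) are hypothetical and chosen only for illustration.

```python
import math
import random


def smooth_labels(true_class, num_classes, alpha=0.1):
    """Smoothed target: (1 - alpha) * one_hot + alpha / num_classes (assumed variant)."""
    uniform = alpha / num_classes
    return [
        (1.0 - alpha) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]


def cross_entropy(logits, target):
    """Cross-entropy between a (possibly smoothed) target distribution and the logits."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


def logit_squeezing_penalty(logits, weight=0.05):
    """Squared-L2 penalty on the logits; the weight is an illustrative value."""
    return weight * sum(z * z for z in logits)


def add_gaussian_noise(pixels, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise to pixel values and clip the result to [0, 1]."""
    rng = rng or random.Random(0)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def training_loss(logits, true_class, alpha=0.1, squeeze_weight=0.05):
    """Label-smoothed cross-entropy plus the logit-squeezing penalty."""
    target = smooth_labels(true_class, len(logits), alpha)
    return cross_entropy(logits, target) + logit_squeezing_penalty(logits, squeeze_weight)
```

In a training loop under these assumptions, `add_gaussian_noise` would be applied to each input before the forward pass and `training_loss` to the resulting logits. Both smoothing and squeezing discourage large, confident logits, which may be relevant to the abstract's point that such defenses can flatten gradients without making the model genuinely robust.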

Original language: English
Article number: 9311250
Pages (from-to): 6453-6464
Number of pages: 12
Journal: IEEE Access
Volume: 9
State: Published - 2021

Keywords

  • Adversarial learning
  • IoT
  • IoT security
  • deep learning
  • evasion attack
  • gradient masking
  • interpretability
  • label smoothing
