Regularizing Hard Examples Improves Adversarial Robustness

Hyungyu Lee, Saehyung Lee, Ho Bae, Sungroh Yoon

Research output: Contribution to journal › Article › peer-review

Abstract

Recent studies have shown that pruning hard-to-learn examples from training improves the generalization performance of neural networks (NNs). In this study, we investigate this intriguing phenomenon, namely the negative effect of hard examples on generalization, in the context of adversarial training. In particular, we theoretically demonstrate that the increase in the difficulty of hard examples under adversarial training is significantly greater than that of easy examples. Furthermore, we verify that hard examples are fitted only through memorization of their labels in adversarial training. We conduct both theoretical and empirical analyses of this memorization phenomenon, showing that pruning hard examples in adversarial training can enhance the model's robustness. However, finding the optimal threshold for removing hard examples that degrade robustness remains a challenge. Based on these observations, we propose a new approach, difficulty proportional label smoothing (DPLS), which adaptively mitigates the negative effect of hard examples, thereby improving the adversarial robustness of NNs. Notably, our experimental results indicate that our method can successfully leverage hard examples while circumventing their negative effect.
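The abstract does not spell out how the smoothing strength is tied to example difficulty, so the snippet below is only a rough illustration of the general idea of difficulty proportional label smoothing: softening a training label more strongly the harder the example is. It assumes the per-example loss on the (adversarial) input as the difficulty proxy and a simple linear mapping from normalized difficulty to smoothing strength; both choices, and all function and parameter names, are hypothetical rather than taken from the paper.

```python
# Illustrative sketch of difficulty-proportional label smoothing.
# Assumption: difficulty is approximated by the per-example cross-entropy loss
# (e.g., on adversarial examples), rescaled to [0, 1]; the smoothing coefficient
# grows linearly with that difficulty up to max_smoothing.
import torch
import torch.nn.functional as F


def dpls_targets(logits, labels, num_classes, max_smoothing=0.5):
    """Build soft targets whose smoothing grows with example difficulty."""
    with torch.no_grad():
        # Per-example loss as a difficulty proxy, normalized to [0, 1].
        per_example_loss = F.cross_entropy(logits, labels, reduction="none")
        difficulty = per_example_loss / (per_example_loss.max() + 1e-12)
        eps = max_smoothing * difficulty  # per-example smoothing strength
    one_hot = F.one_hot(labels, num_classes).float()
    uniform = torch.full_like(one_hot, 1.0 / num_classes)
    # Harder examples are pulled more strongly toward the uniform distribution.
    return (1.0 - eps).unsqueeze(1) * one_hot + eps.unsqueeze(1) * uniform


def dpls_loss(logits, labels, num_classes, max_smoothing=0.5):
    """Cross-entropy against the difficulty-smoothed targets."""
    targets = dpls_targets(logits, labels, num_classes, max_smoothing)
    log_probs = F.log_softmax(logits, dim=1)
    return -(targets * log_probs).sum(dim=1).mean()


if __name__ == "__main__":
    logits = torch.randn(8, 10)            # e.g., logits on adversarial inputs
    labels = torch.randint(0, 10, (8,))
    print(dpls_loss(logits, labels, num_classes=10).item())
```

Under these assumptions, easy examples keep near one-hot targets while hard examples receive heavily smoothed targets, which regularizes them instead of pruning them outright.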

Original language: English
Journal: Journal of Machine Learning Research
Volume: 26
State: Published - 2025

Bibliographical note

Publisher Copyright:
©2025 Hyungyu Lee, Saehyung Lee, Ho Bae, and Sungroh Yoon.

Keywords

  • adversarial examples
  • adversarial robustness
  • hard examples
  • memorization
  • robust overfitting
