Adversarial training has recently emerged as an important defense mechanism for robustifying machine learning models against adversarial examples. Although adversarial training can substantially boost the robustness of machine learning algorithms, little research has examined whether adversarial training remains effective in the long term. Deployed machine learning models are inherently dynamic: they evolve over time as new training data is introduced and their parameters drift. In this paper, we examine the limitations of adversarial training caused by the temporal changes of machine learning models. Using a natural language task, we conduct experiments on a variety of datasets to measure the impact of concept drift on the efficacy of adversarial training. Our analysis shows that certain adversarially trained models are more prone to drift than others: WordCNN- and LSTM-based models are more susceptible to temporal changes than models such as BERT. We validate our findings using multiple real-world datasets and different network architectures. Our work calls for further research into the temporal aspects of adversarial training.
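The evaluation protocol the abstract describes can be sketched as follows: train on an early time slice, then measure accuracy on a later slice to expose concept drift. This is a minimal illustrative sketch, not the paper's method; the keyword-counting "classifier" and the toy sentiment snippets are hypothetical stand-ins for the paper's WordCNN/LSTM/BERT models and real datasets.

```python
def train_keyword_classifier(samples):
    """Toy 'model': score each word by how often it co-occurs with each label."""
    scores = {}
    for text, label in samples:
        for word in text.split():
            scores[word] = scores.get(word, 0) + (1 if label == 1 else -1)
    return scores

def predict(scores, text):
    # Positive total word score -> positive sentiment (label 1).
    return 1 if sum(scores.get(w, 0) for w in text.split()) >= 0 else 0

def accuracy(scores, samples):
    return sum(predict(scores, t) == y for t, y in samples) / len(samples)

# Hypothetical timeline: sentiment vocabulary shifts between periods.
period_early = [("great movie", 1), ("awful movie", 0),
                ("great plot", 1), ("awful plot", 0)]
period_late = [("fire movie", 1), ("mid movie", 0),  # drifted slang
               ("fire plot", 1), ("mid plot", 0)]

model = train_keyword_classifier(period_early)
acc_old = accuracy(model, period_early)  # accuracy on the training period
acc_new = accuracy(model, period_late)   # accuracy after vocabulary drift
```

The gap between `acc_old` and `acc_new` is the kind of temporal degradation the paper measures; the same protocol applies whether the underlying model is adversarially trained or not.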
|Title of host publication||CySSS 2022 - Proceedings of the 1st Workshop on Cybersecurity and Social Sciences|
|Publisher||Association for Computing Machinery, Inc|
|Number of pages||7|
|State||Published - 30 May 2022|
|Event||1st International Workshop on Cybersecurity and Social Sciences, CySSS 2022 - Virtual, Online, Japan|
Duration: 30 May 2022 → …
|Name||CySSS 2022 - Proceedings of the 1st Workshop on Cybersecurity and Social Sciences|
|Conference||1st International Workshop on Cybersecurity and Social Sciences, CySSS 2022|
|Period||30/05/22 → …|
Bibliographical note
Funding Information: Supported by the Global Research Laboratory (GRL) Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2016K1A1A2912757). The first and second authors contributed equally.
© 2022 ACM.
- adversarial training
- concept drift
- sentiment analysis