PuVAE: A Variational Autoencoder to Purify Adversarial Examples

Uiwon Hwang, Jaewoo Park, Hyemi Jang, Sungroh Yoon, Nam Ik Cho

Research output: Contribution to journal › Article › peer-review

59 Scopus citations

Abstract

Deep neural networks are widely used and exhibit excellent performance in many areas. However, they are vulnerable to adversarial attacks that compromise networks at inference time by applying carefully designed perturbations to input data. Although several defense methods have been proposed to address specific attacks, other types of attacks can circumvent these defense mechanisms. Therefore, we propose the Purifying Variational AutoEncoder (PuVAE), a method to purify adversarial examples. The proposed method eliminates an adversarial perturbation by projecting an adversarial example onto the manifold of each class and selecting the closest projection as the purified sample. We experimentally demonstrate the robustness of PuVAE against various attack methods without any prior knowledge of the attacks. In our experiments, the proposed method exhibits performance competitive with state-of-the-art defense methods, and its inference time is approximately 130 times faster than that of Defense-GAN, a state-of-the-art purification method.
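To make the projection step concrete, the following is a minimal PyTorch sketch of the purification mechanism the abstract describes: a conditional VAE reconstructs the input under every candidate class label, and the reconstruction nearest to the input is returned as the purified sample. The network sizes, class count, L2 distance, and the use of the posterior mean at inference time are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10   # assumption: an MNIST-style 10-class task
INPUT_DIM = 784    # assumption: flattened 28x28 images
LATENT_DIM = 16    # assumption: small latent space for illustration

class CVAE(nn.Module):
    """Conditional VAE; assumed to be trained on clean (x, y) pairs."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Linear(INPUT_DIM + NUM_CLASSES, 256), nn.ReLU(),
        )
        self.mu = nn.Linear(256, LATENT_DIM)
        self.logvar = nn.Linear(256, LATENT_DIM)
        self.dec = nn.Sequential(
            nn.Linear(LATENT_DIM + NUM_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, INPUT_DIM), nn.Sigmoid(),
        )

    def reconstruct(self, x, y_onehot):
        # Project x onto the manifold of the class given by y_onehot:
        # encode conditioned on the class, then decode the latent code.
        h = self.enc(torch.cat([x, y_onehot], dim=1))
        z = self.mu(h)  # use the posterior mean at inference time
        return self.dec(torch.cat([z, y_onehot], dim=1))

@torch.no_grad()
def purify(cvae, x):
    """Return the per-class reconstruction closest to x (the purified sample)."""
    recons, dists = [], []
    for c in range(NUM_CLASSES):
        y = F.one_hot(torch.full((x.size(0),), c), NUM_CLASSES).float()
        r = cvae.reconstruct(x, y)
        recons.append(r)
        dists.append(((r - x) ** 2).sum(dim=1))  # L2 distance to each projection
    recons = torch.stack(recons)            # (C, B, D)
    dists = torch.stack(dists)              # (C, B)
    best = dists.argmin(dim=0)              # closest class manifold per sample
    return recons[best, torch.arange(x.size(0))]

if __name__ == "__main__":
    cvae = CVAE()  # untrained here; shown only to make the sketch runnable
    x_adv = torch.rand(4, INPUT_DIM)  # stand-in for adversarial inputs
    x_pure = purify(cvae, x_adv)
    print(x_pure.shape)  # torch.Size([4, 784])

The purified sample would then be passed to the downstream classifier in place of the raw input; because purification is a fixed number of encoder-decoder passes rather than an iterative optimization, it avoids the per-sample search that makes GAN-based purifiers such as Defense-GAN slow at inference.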

Original language: English
Article number: 8824108
Pages (from-to): 126582-126593
Number of pages: 12
Journal: IEEE Access
Volume: 7
DOIs
State: Published - 2019

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Keywords

  • Adversarial attack
  • Deep learning
  • Variational autoencoder
