Adversarial Fine-tune with Dynamically Regulated Adversary
Pengyue Hou, Ming Zhou, Jie Han, Petr Musílek, Xingyu Li
- Publication year
- 2022
- Citations
- 4
Abstract
Adversarial training is an effective method for boosting model robustness against malicious adversarial attacks. However, this gain in robustness often comes at a significant cost to standard performance on clean images. In many real-world applications, such as health diagnosis and autonomous surgical robotics, standard performance is valued more than robustness against such extremely malicious attacks. This leads to the question: to what extent can we improve the robustness of a model without sacrificing its standard performance? This work tackles this problem and proposes a simple yet effective transfer-learning-based adversarial training strategy that disentangles the negative effects of adversarial samples on the model's standard performance. In addition, we introduce a training-friendly adversarial attack algorithm that boosts adversarial robustness without adding significant training complexity. Extensive experiments show that the proposed approach outperforms previous adversarial training algorithms on the following objective: improving the robustness of the model while preserving its standard accuracy on clean data.