Investigating Adversarial Examples for Deep Residual Networks
CST Part II Deep Neural Networks Mini-Project
In this project, I investigated the one-shot and iterative fast gradient sign method (FGSM) attacks, both untargeted and targeted, to generate adversarial examples for ResNet-50, and experimented with the transferability of each attack to other neural networks, namely ResNet-18 and VGG-19, as well as to other image inputs. The following conclusions can be drawn from these experiments:
- Targeted FGSM is more effective than untargeted FGSM;
- In white-box attacks, the one-shot FGSMs (both untargeted and targeted) are less effective than their iterative counterparts;
- When it comes to transferability to other neural networks or inputs, the one-shot FGSMs turn out to be much more effective than the iterative FGSMs: the iterative FGSMs overfit to the model and input they were generated against, and are hence completely unable to transfer to other networks or inputs;
- Since transferability is the foundation of black-box attacks, the one-shot FGSMs are likely to be more effective than the iterative FGSMs in black-box attacks as well.
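For concreteness, below is a minimal sketch of the attack variants in PyTorch, using a pretrained torchvision ResNet-50. The epsilon and step-size values, the number of iterations, and the assumption that inputs lie in [0, 1] (omitting the usual ImageNet normalisation) are illustrative simplifications, not the exact experimental setup.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

# Pretrained victim model. Inputs are treated as raw [0, 1] images for
# brevity; a real pipeline would also apply ImageNet normalisation.
model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()


def fgsm(x, y, epsilon=8 / 255, targeted=False):
    """One-shot FGSM: a single signed-gradient step of size epsilon.

    Untargeted: ascend the loss on the true label y.
    Targeted:   descend the loss on the desired target label y.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    sign = -1 if targeted else 1
    return (x + sign * epsilon * grad.sign()).clamp(0, 1).detach()


def iterative_fgsm(x, y, epsilon=8 / 255, alpha=2 / 255, steps=10,
                   targeted=False):
    """Iterative FGSM: repeated small steps of size alpha, projected back
    into the L-infinity epsilon-ball around the original input."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    sign = -1 if targeted else 1
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + sign * alpha * grad.sign()
        # Projection: keep the total perturbation within epsilon of the
        # original image, and keep pixels in the valid range.
        x_adv = (x_orig + (x_adv - x_orig).clamp(-epsilon, epsilon)).clamp(0, 1)
    return x_adv.detach()


# Example usage with a placeholder input batch and an arbitrary target class.
x = torch.rand(1, 3, 224, 224)
y_target = torch.tensor([281])  # ImageNet class 281 ("tabby cat")
x_adv = iterative_fgsm(x, y_target, targeted=True)
```

The repeated refinement of the perturbation against a single fixed model and input is a plausible source of the overfitting behaviour noted in the conclusions above.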