Adversarial Attacks for PyTorch
Adversarial-Attacks-Pytorch
Torchattacks is a PyTorch library that contains adversarial attacks to generate adversarial examples.
(Figure: a clean image alongside the corresponding adversarial image.)
Table of Contents
- Usage
- Attacks and Papers
- Documentation
- Expanding the Usage
- Contribution
- Recommended Sites and Packages
Usage
:clipboard: Dependencies
- torch 1.2.0
- python 3.6
:hammer: Installation
pip install torchattacks
or git clone https://github.com/Harry24k/adversairal-attacks-pytorch
```python
import torchattacks

# Untargeted (default)
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
adversarial_images = atk(images, labels)

# Targeted (user-defined target labels)
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
target_map_function = lambda images, labels: labels.fill_(300)
atk.set_attack_mode("targeted", target_map_function=target_map_function)
adversarial_images = atk(images, labels)

# Targeted (least-likely class)
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
atk.set_attack_mode("least_likely")
adversarial_images = atk(images, labels)

# Type of return (float images in [0, 1] by default, or int images in [0, 255])
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
atk.set_return_type('int')

# Save adversarial images and show the accuracy
atk.save(data_loader=test_loader, save_path="./data/cifar10_pgd.pt", verbose=True)
```
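Putting the pieces together, robust accuracy can be measured with a short evaluation loop. The snippet below is only a sketch: `model`, `test_loader`, and `device` are assumed to be defined elsewhere, and the model is expected to take images in [0, 1] and return `(N, C)` logits (see the precautions below).

```python
import torch
import torchattacks

# Minimal sketch of a robust-accuracy evaluation loop.
# Assumptions: `model`, `test_loader`, and `device` are defined elsewhere.
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)

model.eval()
correct, total = 0, 0
for images, labels in test_loader:
    images, labels = images.to(device), labels.to(device)
    adv_images = atk(images, labels)           # generate adversarial examples
    with torch.no_grad():
        outputs = model(adv_images)            # (N, C) logits on perturbed inputs
    correct += (outputs.argmax(dim=1) == labels).sum().item()
    total += labels.size(0)

print(f"Robust accuracy: {100 * correct / total:.2f}%")
```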
:warning: Precautions
- All images should be scaled to [0, 1] with `transforms.ToTensor()` before being passed to an attack. To keep the attacks easy to use, reverse normalization is not included in the attack process. To apply input normalization, please add a normalization layer to the model (see the sketch after this list) and refer to the demo.
- All models should return ONLY ONE vector of shape `(N, C)`, where `N` is the number of inputs and `C` is the number of classes. Since most models in `torchvision.models` return a single `(N, C)` vector, torchattacks supports only this form of output. Please check the shape of the model's output carefully.
- Some operations are non-deterministic with float tensors on GPU [discuss]. To get the same adversarial examples for the same inputs with a fixed random seed, please set `torch.backends.cudnn.deterministic = True` [ref].
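The first and last precautions can be handled with a thin wrapper and one flag. The following is a minimal sketch, not part of torchattacks itself: the `Normalize` module, `base_model`, and the CIFAR10 mean/std values are assumptions used only for illustration.

```python
import torch
import torch.nn as nn

class Normalize(nn.Module):
    """Apply input normalization inside the model so attacks see [0, 1] images."""
    def __init__(self, mean, std):
        super().__init__()
        self.register_buffer('mean', torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer('std', torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std

# `base_model` (assumed to return a single (N, C) vector of logits) now accepts
# un-normalized inputs in [0, 1], as the attacks expect.
model = nn.Sequential(
    Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.2471, 0.2435, 0.2616]),
    base_model,
)

# For reproducible adversarial examples with a fixed random seed:
torch.manual_seed(0)
torch.backends.cudnn.deterministic = True
```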
Attacks and Papers
The adversarial attacks implemented in this package are listed below, together with the papers they come from.
The distance measure used by each attack is shown in parentheses.
- Explaining and harnessing adversarial examples (Dec 2014): Paper
  - FGSM (Linf)
- DeepFool: a simple and accurate method to fool deep neural networks (Nov 2015): Paper
  - DeepFool (L2)
- Adversarial Examples in the Physical World (Jul 2016): Paper
  - BIM or iterative-FGSM (Linf)
- Towards Evaluating the Robustness of Neural Networks (Aug 2016): Paper
  - CW (L2)
- Ensemble Adversarial Training: Attacks and Defenses (May 2017): Paper
  - RFGSM (Linf)
- Towards Deep Learning Models Resistant to Adversarial Attacks (Jun 2017): Paper
  - PGD (Linf)
- Boosting Adversarial Attacks with Momentum (Oct 2017): Paper
  - MIFGSM (Linf) - :heart_eyes: Contributors zhuangzi926, huitailangyz
- Theoretically Principled Trade-off between Robustness and Accuracy (Jan 2019): Paper
  - TPGD (Linf)
- Comment on "Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network" (Jul 2019): Paper
  - APGD or EOT + PGD (Linf)
- Fast is better than free: Revisiting adversarial training (Jan 2020): Paper
  - FFGSM (Linf)
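Each of these attacks is constructed in the same way as PGD in the usage section. The snippet below is only a sketch: the hyperparameter names shown (`eps`, `alpha`, `steps`, `c`, `kappa`, `lr`, `decay`) and their defaults may differ between releases, so please check the documentation for the installed version.

```python
import torchattacks

# Illustrative constructors; `model`, `images`, and `labels` are assumed to be
# defined elsewhere, and keyword arguments may differ across versions.
atk = torchattacks.FGSM(model, eps=8/255)
atk = torchattacks.BIM(model, eps=8/255, alpha=2/255, steps=7)
atk = torchattacks.CW(model, c=1, kappa=0, steps=1000, lr=0.01)
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7)
atk = torchattacks.DeepFool(model, steps=50)
atk = torchattacks.MIFGSM(model, eps=8/255, steps=5, decay=1.0)

adversarial_images = atk(images, labels)
```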
Performance Comparison
All experiments were done on a GeForce RTX 2080.
For a fair comparison, Robustbench is used.
As comparison packages, the most actively maintained and most widely cited ones, Foolbox and ART, were selected.
For other packages, please refer to each project's GitHub page listed under Recommended Sites and Packages.
The code is here (code, nbviewer).
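For orientation, loading the benchmark models and data from RobustBench looks roughly like the snippet below. This is a sketch rather than the linked code: the `load_model` keyword for the threat model (`threat_model` vs. `norm`) and the attack hyperparameters depend on the installed versions.

```python
import torch
import torchattacks
from robustbench.data import load_cifar10
from robustbench.utils import load_model

# Sketch: attack a RobustBench model on the first 50 CIFAR10 test images.
# The load_model keyword may be `threat_model` or `norm` depending on the
# RobustBench version, and the PGD settings here are only an example.
images, labels = load_cifar10(n_examples=50)
model = load_model(model_name='Wong2020Fast', dataset='cifar10', threat_model='Linf')

atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7)
adv_images = atk(images, labels)

with torch.no_grad():
    robust_acc = (model(adv_images).argmax(dim=1) == labels).float().mean().item()
print(f"Robust accuracy: {100 * robust_acc:.1f}%")
```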
The table reports accuracy and elapsed time on the first 50 images of CIFAR10. For L2 attacks, the average L2 distance between the adversarial and original images is also recorded.
| Attack | Package | Wong2020Fast | Rice2020Overfitting | Carmon2019Unlabeled | Remark |
|---|---|---|---|---|---|
| FGSM (Linf) | torchattacks | 48% (15 ms) | 62% (88 ms) | 68% (11 ms) | |
| | foolbox | 48% (15 ms) | 62% (55 ms) | 68% (24 ms) | |
| | ART | 48% (64 ms) | 62% (750 ms) | 68% (223 ms) | |
| BIM (Linf) | torchattacks | 46% (83 ms) | 58% (671 ms) | 64% (119 ms) | |
| | foolbox | 46% (80 ms) | 58% (1169 ms) | 64% (256 ms) | |
| | ART | 46% (248 ms) | 58% (2571 ms) | 64% (760 ms) | |
| PGD (Linf) | torchattacks | 46% (64 ms) | 58% (593 ms) | 64% (95 ms) | |
| | foolbox | 46% (70 ms) | 58% (1177 ms) | 64% (264 ms) | |
| | ART | 46% (243 ms) | 58% (2569 ms) | 64% (759 ms) | |
| CW (L2) | torchattacks | 14% / 0.00016 (4361 ms) | 22% / 0.00013 (4361 ms) | 26% / 8.5e-05 (13052 ms) | Different Results |
| | foolbox | 32% / 0.00016 (4564 ms) | 34% / 0.00017 (4361 ms) | 32% / 0.00016 (13332 ms) | |
| | ART | 32% / 0.00016 (72684 ms) | 34% / 0.00017 (4361 ms) | 32% / 0.00016 (206290 ms) | Slower than others |
| DeepFool (L2) | torchattacks | 20% / 0.00063 (12942 ms) | 14% / 0.00094 (46856 ms) | 10% / 0.0021 (14232 ms) | Different Results / Slower than others |
| | foolbox | 40% / 0.00018 (1959 ms) | 36% / 0.00019 (20410 ms) | 46% / 0.00021 (5936 ms) | |
| | ART | 40% / 0.00018 (2193 ms) | 36% / 0.00019 (19941 ms) | 46% / 0.00021 (5905 ms) | |
- Note:
  - In torchattacks, there is no binary search algorithm for the constant `c` in CW; it will be added in the future. Until then, using MultiAttack is recommended (see the sketch below).
  - In torchattacks, DeepFool takes longer than in the other packages. Although it produces stronger adversarial examples, please use the other packages until this is fixed.
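Until the binary search is implemented, one way to approximate it is to run CW with several values of `c` and keep the best result through MultiAttack. The snippet below is a sketch: the `c` values are arbitrary, and depending on the torchattacks version MultiAttack may take only the list of attacks or additional arguments.

```python
import torchattacks

# Sketch: approximate the binary search over c in CW by combining several CW
# attacks with different c values via MultiAttack (arguments may vary by version).
cw_attacks = [torchattacks.CW(model, c=c, kappa=0, steps=1000, lr=0.01)
              for c in [0.1, 1, 10]]
atk = torchattacks.MultiAttack(cw_attacks)
adversarial_images = atk(images, labels)
```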
Documentation
:book: ReadTheDocs
Here is the documentation for this package.
:mag_right: Update Records
Here are the update records for this package.
:bell: Citation
If you want to cite this package, please use the following BibTeX:
```
@article{kim2020torchattacks,
  title={Torchattacks: A Pytorch Repository for Adversarial Attacks},
  author={Kim, Hoki},
  journal={arXiv preprint arXiv:2010.01950},
  year={2020}
}
```
:rocket: Demos
- White Box Attack with ImageNet (code, nbviewer): Using torchattacks to make adversarial examples with the ImageNet dataset to fool Inception v3.
- Black Box Attack with CIFAR10 (code, nbviewer): This demo provides an example of a black-box attack with two different models. First, adversarial datasets are generated from a holdout model on CIFAR10 and saved as a torch dataset. Second, the adversarial datasets are used to attack a target model.
- Adversarial Training with MNIST (code, nbviewer): This code shows how to do adversarial training with this repository. The MNIST dataset and a custom model are used. The adversarial training is performed with PGD, and then FGSM is applied to evaluate the model.
- Applications of MultiAttack with CIFAR10 (code, nbviewer): This code shows applications of MultiAttack. It can be used to implement (1) an attack with random restarts and (2) an attack on only correctly classified examples.
Expanding the Usage
Torchattacks supports collaboration with other attack packages.
By expanding the usage in this way, torchattacks functions such as save and MultiAttack can also be used with attacks from those packages.
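All of the examples in the following sections follow the same pattern: subclass `torchattacks.attack.Attack`, build the external adversary in `__init__`, and return adversarial images from `forward`. The skeleton below is only a sketch; `WrappedAttack` is a hypothetical name and the body of `forward` is a placeholder.

```python
from torchattacks.attack import Attack

class WrappedAttack(Attack):
    def __init__(self, model):
        super(WrappedAttack, self).__init__("WrappedAttack", model)
        # Build the external package's adversary here (see the examples below).
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        images, labels = images.to(self.device), labels.to(self.device)
        # Call the external package here and return adversarial images with the
        # same shape as `images`; replace this placeholder with the real call.
        adv_images = images.clone()
        return adv_images
```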
:milky_way: AutoAttack
- https://github.com/fra31/auto-attack
pip install git+https://github.com/fra31/auto-attack
```python
from torchattacks.attack import Attack
import autoattack

class AutoAttack(Attack):
    def __init__(self, model, eps):
        super(AutoAttack, self).__init__("AutoAttack", model)
        self.adversary = autoattack.AutoAttack(self.model, norm='Linf', eps=eps,
                                               version='standard', verbose=False)
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        adv_images = self.adversary.run_standard_evaluation(images.cuda(), labels.cuda(),
                                                            bs=images.shape[0])
        return adv_images

atk = AutoAttack(model, eps=0.3)
atk.save(data_loader=test_loader, file_name="_temp.pt", accuracy=True)
```
:milky_way: FoolBox
- https://github.com/bethgelab/foolbox
pip install foolbox
- e.g., L2BrendelBethge
```python
from torchattacks.attack import Attack
import foolbox as fb

class L2BrendelBethge(Attack):
    def __init__(self, model):
        super(L2BrendelBethge, self).__init__("L2BrendelBethge", model)
        self.fmodel = fb.PyTorchModel(self.model, bounds=(0, 1), device=self.device)
        self.init_attack = fb.attacks.DatasetAttack()
        self.adversary = fb.attacks.L2BrendelBethgeAttack(init_attack=self.init_attack)
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        images, labels = images.to(self.device), labels.to(self.device)

        # DatasetAttack: feed the inputs in two batches and run it to get starting points.
        batch_size = len(images)
        batches = [(images[:batch_size//2], labels[:batch_size//2]),
                   (images[batch_size//2:], labels[batch_size//2:])]
        self.init_attack.feed(model=self.fmodel, inputs=batches[0][0])  # feed 1st batch of inputs
        self.init_attack.feed(model=self.fmodel, inputs=batches[1][0])  # feed 2nd batch of inputs
        criterion = fb.Misclassification(labels)
        init_advs = self.init_attack.run(self.fmodel, images, criterion)

        # L2BrendelBethge
        adv_images = self.adversary.run(self.fmodel, images, labels, starting_points=init_advs)
        return adv_images

atk = L2BrendelBethge(model)
atk.save(data_loader=test_loader, file_name="_temp.pt", accuracy=True)
```
:milky_way: Adversarial-Robustness-Toolbox (ART)
- https://github.com/IBM/adversarial-robustness-toolbox
git clone https://github.com/IBM/adversarial-robustness-toolbox
- e.g., SaliencyMapMethod (or Jacobian based saliency map attack)
```python
import torch
import torch.nn as nn
import torch.optim as optim

from torchattacks.attack import Attack
import art.attacks.evasion as evasion
from art.classifiers import PyTorchClassifier

class JSMA(Attack):
    def __init__(self, model, theta=1/255, gamma=0.15, batch_size=128):
        super(JSMA, self).__init__("JSMA", model)
        self.classifier = PyTorchClassifier(
            model=self.model, clip_values=(0, 1),
            loss=nn.CrossEntropyLoss(),
            optimizer=optim.Adam(self.model.parameters(), lr=0.01),
            input_shape=(1, 28, 28), nb_classes=10)
        self.adversary = evasion.SaliencyMapMethod(classifier=self.classifier,
                                                   theta=theta, gamma=gamma,
                                                   batch_size=batch_size)
        self.target_map_function = lambda labels: (labels + 1) % 10
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        adv_images = self.adversary.generate(images, self.target_map_function(labels))
        return torch.tensor(adv_images).to(self.device)

atk = JSMA(model)
atk.save(data_loader=test_loader, file_name="_temp.pt", accuracy=True)
```
Contribution
Contributions are always welcome! Please use pull requests :blush:
Recommended Sites and Packages
- Adversarial Attack Packages:
  - https://github.com/IBM/adversarial-robustness-toolbox: Adversarial attack and defense package made by IBM. TensorFlow, Keras, and PyTorch available.
  - https://github.com/bethgelab/foolbox: Adversarial attack package made by Bethge Lab. TensorFlow and PyTorch available.
  - https://github.com/tensorflow/cleverhans: Adversarial attack package made by Google Brain. TensorFlow available.
  - https://github.com/BorealisAI/advertorch: Adversarial attack package made by BorealisAI. PyTorch available.
  - https://github.com/DSE-MSU/DeepRobust: Adversarial attack package (especially for attacks on graph neural networks) made by the DSE Lab at Michigan State University. PyTorch available.
  - https://github.com/fra31/auto-attack: A set of attacks that is believed to be the strongest in existence. TensorFlow and PyTorch available.
- Adversarial Defense Leaderboard: e.g., Robustbench, which is used for the performance comparison above.
- Adversarial Attack and Defense Papers:
  - https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html: A complete list of all (arXiv) adversarial example papers, maintained by Nicholas Carlini.
  - https://github.com/chawins/Adversarial-Examples-Reading-List: Adversarial Examples Reading List, maintained by Chawin Sitawarin.