
Adversarial-Attacks-PyTorch


Torchattacks is a PyTorch library that contains adversarial attacks to generate adversarial examples.

[Figure: clean image vs. adversarial image]

Table of Contents

  1. Usage
  2. Attacks and Papers
  3. Performance Comparison
  4. Documentation
  5. Citation
  6. Expanding the Usage
  7. Contribution
  8. Recommended Sites and Packages

Usage

:clipboard: Dependencies

  • torch==1.4.0
  • python==3.6

:hammer: Installation

  • pip install torchattacks, or
  • git clone https://github.com/Harry24k/adversarial-attacks-pytorch

import torchattacks

# Build a PGD attack (L-inf budget 8/255, step size 2/255, 4 steps) and run it.
atk = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=4)
adversarial_images = atk(images, labels)

:warning: Precautions

  • All images should be scaled to [0, 1] with transforms.ToTensor() before being used in attacks. To keep the attacks easy to use, reverse normalization is not included in the attack process. To apply input normalization, please add a normalization layer to the model (see the sketch after this list). Please refer to the demo.

  • All models should return ONLY ONE vector of shape (N, C), where N is the number of inputs and C is the number of classes. Most models in torchvision.models already return a single (N, C) vector, and torchattacks supports only this form of output. Please check the shape of the model's output carefully. If the model returns multiple outputs, please refer to the demo.

  • Set torch.backends.cudnn.deterministic = True to get the same adversarial examples with a fixed random seed. Some operations on GPU float tensors are non-deterministic [discuss]. If you want to get the same results for the same inputs, please run torch.backends.cudnn.deterministic = True [ref].
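
For example, a normalization layer can be prepended to the model so that the attacks always receive raw [0, 1] images. Below is a minimal sketch, assuming a CIFAR-10 classifier named base_model; the name base_model and the mean/std values are illustrative, not part of torchattacks.

import torch
import torch.nn as nn

class Normalize(nn.Module):
    """Applies channel-wise input normalization inside the model."""
    def __init__(self, mean, std):
        super(Normalize, self).__init__()
        self.register_buffer('mean', torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer('std', torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std

# base_model is any classifier that returns a single (N, C) logit tensor.
model = nn.Sequential(
    Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.2471, 0.2435, 0.2616]),
    base_model
).eval()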

Attacks and Papers

The following adversarial attacks from the literature are implemented.

The distance measure is shown in parentheses.

| Name | Paper | Remark |
|---|---|---|
| FGSM (Linf) | Explaining and Harnessing Adversarial Examples (Goodfellow et al., 2014) | |
| BIM (Linf) | Adversarial Examples in the Physical World (Kurakin et al., 2016) | Basic iterative method or Iterative-FGSM |
| CW (L2) | Towards Evaluating the Robustness of Neural Networks (Carlini et al., 2016) | |
| RFGSM (Linf) | Ensemble Adversarial Training: Attacks and Defenses (Tramèr et al., 2017) | Random initialization + FGSM |
| PGD (Linf) | Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al., 2017) | Projected Gradient Method |
| MIFGSM (Linf) | Boosting Adversarial Attacks with Momentum (Dong et al., 2017) | :heart_eyes: Contributors zhuangzi926, huitailangyz |
| TPGD (Linf) | Theoretically Principled Trade-off between Robustness and Accuracy (Zhang et al., 2019) | |
| APGD (Linf) | Comment on "Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network" (Zimmermann, 2019) | EOT + PGD |
| FFGSM (Linf) | Fast is better than free: Revisiting adversarial training (Wong et al., 2020) | Random initialization + FGSM |
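
Each attack above is constructed and called the same way as PGD in the Usage section. The snippet below is an illustrative sketch; the keyword arguments shown are typical values and may differ slightly across versions.

import torchattacks

# Illustrative constructions (exact signatures may vary by version).
fgsm  = torchattacks.FGSM(model, eps=8/255)
bim   = torchattacks.BIM(model, eps=8/255, alpha=2/255, steps=7)
cw    = torchattacks.CW(model, c=1, kappa=0, steps=1000, lr=0.01)
pgd   = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7)
ffgsm = torchattacks.FFGSM(model, eps=8/255, alpha=10/255)

# Every attack is called the same way.
adv_images = pgd(images, labels)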

Performance Comparison

All experiments were done on a GeForce RTX 2080.

For a fair comparison, Robustbench is used.

For comparison, the most recently updated and most cited packages were selected:

  • Foolbox: 178 citations and last update 2020.10.19.

  • ART: 102 citations and last update 2020.12.11.

For other packages, please refer to each project's GitHub page listed in Recommended Sites and Packages.

The code is here (code, nbviewer).

Robust accuracy against each attack and elapsed time on the first 50 images of CIFAR-10. For L2 attacks, the average L2 distance between the adversarial images and the original images is also reported; those cells read accuracy / distance, with the elapsed time in parentheses.
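
As a rough sketch of how these numbers are obtained (a hypothetical illustration, not the exact benchmark code; it assumes a model, an attack atk, and the first 50 CIFAR-10 images and labels scaled to [0, 1]):

# Robust accuracy = share of images still classified correctly after the attack.
adv_images = atk(images, labels)
preds = model(adv_images).argmax(dim=1)
robust_acc = (preds == labels).float().mean().item()
# For L2 attacks, also record the mean L2 distance to the original images.
l2_dist = (adv_images - images).flatten(1).norm(p=2, dim=1).mean().item()
print(f"robust accuracy: {robust_acc:.0%}, mean L2 distance: {l2_dist:.5f}")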

| Attack | Package | Wong2020 | Rice2020 | Carmon2019 | Remark |
|---|---|---|---|---|---|
| FGSM (Linf) | torchattacks | 48% (15 ms) | 62% (88 ms) | 68% (11 ms) | |
| | foolbox | 48% (15 ms) | 62% (55 ms) | 68% (24 ms) | |
| | ART | 48% (64 ms) | 62% (750 ms) | 68% (223 ms) | |
| BIM (Linf) | torchattacks | 46% (83 ms) | 58% (671 ms) | 64% (119 ms) | |
| | foolbox | 46% (80 ms) | 58% (1169 ms) | 64% (256 ms) | |
| | ART | 46% (248 ms) | 58% (2571 ms) | 64% (760 ms) | |
| PGD (Linf) | torchattacks | 46% (64 ms) | 58% (593 ms) | 64% (95 ms) | |
| | foolbox | 46% (70 ms) | 58% (1177 ms) | 64% (264 ms) | |
| | ART | 46% (243 ms) | 58% (2569 ms) | 64% (759 ms) | |
| CW (L2) | torchattacks | 14% / 0.00016 (4361 ms) | 22% / 0.00013 (44572 ms) | 26% / 8.5e-05 (13052 ms) | Different results |
| | foolbox | 32% / 0.00016 (4564 ms) | 34% / 0.00017 (45034 ms) | 32% / 0.00016 (13332 ms) | |
| | ART | 32% / 0.00016 (72684 ms) | 34% / 0.00017 (711699 ms) | 32% / 0.00016 (206290 ms) | Slower than others |

In torchattacks, CW has no binary search over the constant c. Instead of binary search, torchattacks supports a customized search, as shown in code, nbviewer.
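
A customized search can be as simple as looping over a few values of c and keeping, for each image, the first perturbation that fools the model. The loop below is a hedged sketch of that idea, not the exact code from the linked notebook; the grid of c values is illustrative.

import torch
import torchattacks

best_adv = images.clone()
found = torch.zeros(len(images), dtype=torch.bool, device=images.device)
for c in [0.1, 1.0, 10.0]:
    atk = torchattacks.CW(model, c=c, kappa=0, steps=1000, lr=0.01)
    adv = atk(images, labels)
    # Keep adversarial images that succeed and were not already found.
    success = (model(adv).argmax(dim=1) != labels) & ~found
    best_adv[success] = adv[success]
    found |= success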

Documentation

:book: ReadTheDocs

Here is the documentation for this package.

:mag_right: Update Records

Here are the update records of this package.

:rocket: Demos

  • White Box Attack with ImageNet (code, nbviewer): Using torchattacks to make adversarial examples with the ImageNet dataset to fool Inception v3.
  • Black Box Attack with CIFAR10 (code, nbviewer): This demo provides an example of a black-box attack with two different models. First, make adversarial datasets from a holdout model with CIFAR10 and save them as a torch dataset. Second, use the adversarial datasets to attack a target model.
  • Adversarial Training with MNIST (code, nbviewer): This code shows how to do adversarial training with this repository. The MNIST dataset and a custom model are used. Adversarial training is performed with PGD, and then FGSM is applied to evaluate the model.
  • Applications of MultiAttack with CIFAR10 (code, nbviewer): This code shows applications of MultiAttack. It can be used to implement (1) an attack with random restarts (see the sketch below), and (2) an attack on only correct examples.
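
For instance, an attack with random restarts can be expressed as a MultiAttack over several randomly initialized PGD instances. A minimal sketch, assuming the model and eps budget from the Usage section:

import torchattacks

# MultiAttack runs the attacks in order and keeps, per image, the first success.
atk = torchattacks.MultiAttack([
    torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7, random_start=True),
    torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7, random_start=True),
    torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=7, random_start=True),
])
adv_images = atk(images, labels)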

Citation

If you use this package, please cite the following BibTeX entry:

@article{kim2020torchattacks,
  title={Torchattacks: A Pytorch Repository for Adversarial Attacks},
  author={Kim, Hoki},
  journal={arXiv preprint arXiv:2010.01950},
  year={2020}
}

Expanding the Usage

Torchattacks supports collaboration with other attack packages.

By expanding the usage this way, functions in torchattacks such as save and MultiAttack can be used with attacks from other packages.

:milky_way: AutoAttack

from torchattacks.attack import Attack
import autoattack

class AutoAttack(Attack):
    def __init__(self, model, eps):
        super(AutoAttack, self).__init__("AutoAttack", model)
        self.adversary = autoattack.AutoAttack(self.model, norm='Linf',
                                               eps=eps, version='standard', verbose=False)
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        adv_images = self.adversary.run_standard_evaluation(images.cuda(), labels.cuda(),
                                                            bs=images.shape[0])
        return adv_images

atk = AutoAttack(model, eps=0.3)
atk.save(data_loader=test_loader, save_path="_temp.pt", verbose=True)

:milky_way: FoolBox

from torchattacks.attack import Attack
import foolbox as fb

class L2BrendelBethge(Attack):
    def __init__(self, model):
        super(L2BrendelBethge, self).__init__("L2BrendelBethge", model)
        self.fmodel = fb.PyTorchModel(self.model, bounds=(0,1), device=self.device)
        self.init_attack = fb.attacks.DatasetAttack()
        self.adversary = fb.attacks.L2BrendelBethgeAttack(init_attack=self.init_attack)
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        images, labels = images.to(self.device), labels.to(self.device)

        # DatasetAttack
        batch_size = len(images)
        batches = [(images[:batch_size//2], labels[:batch_size//2]),
                   (images[batch_size//2:], labels[batch_size//2:])]
        self.init_attack.feed(model=self.fmodel, inputs=batches[0][0]) # feed 1st batch of inputs
        self.init_attack.feed(model=self.fmodel, inputs=batches[1][0]) # feed 2nd batch of inputs
        criterion = fb.Misclassification(labels)
        init_advs = self.init_attack.run(self.fmodel, images, criterion)

        # L2BrendelBethge
        adv_images = self.adversary.run(self.fmodel, images, labels, starting_points=init_advs)
        return adv_images

atk = L2BrendelBethge(model)
atk.save(data_loader=test_loader, save_path="_temp.pt", verbose=True)

:milky_way: Adversarial-Robustness-Toolbox (ART)

import torch
import torch.nn as nn
import torch.optim as optim

from torchattacks.attack import Attack

import art.attacks.evasion as evasion
from art.classifiers import PyTorchClassifier

class JSMA(Attack):
    def __init__(self, model, theta=1/255, gamma=0.15, batch_size=128):
        super(JSMA, self).__init__("JSMA", model)
        self.classifier = PyTorchClassifier(
                            model=self.model, clip_values=(0, 1),
                            loss=nn.CrossEntropyLoss(),
                            optimizer=optim.Adam(self.model.parameters(), lr=0.01),
                            input_shape=(1, 28, 28), nb_classes=10)
        self.adversary = evasion.SaliencyMapMethod(classifier=self.classifier,
                                                   theta=theta, gamma=gamma,
                                                   batch_size=batch_size)
        self.target_map_function = lambda labels: (labels+1)%10
        self._attack_mode = 'only_default'

    def forward(self, images, labels):
        # ART expects numpy arrays; convert back to a torch tensor afterwards.
        adv_images = self.adversary.generate(images.cpu().numpy(),
                                             self.target_map_function(labels).cpu().numpy())
        return torch.tensor(adv_images).to(self.device)

atk = JSMA(model)
atk.save(data_loader=test_loader, save_path="_temp.pt", verbose=True)

Contribution

All kinds of contributions are always welcome! :blush:

If you are interested in adding a new attack in this repo or fixing some issues, please have a look at contribution.md.

Recommended Sites and Packages
