Many Class Activation Map methods implemented in PyTorch, including Grad-CAM, Grad-CAM++, Score-CAM, Ablation-CAM and XGrad-CAM.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Class Activation Map methods implemented in PyTorch

Tested on common CNNs and Vision Transformers!

Method	What it does
GradCAM	Weights the 2D activations by the average gradient
GradCAM++	Like GradCAM, but uses second-order gradients
XGradCAM	Like GradCAM, but scales the gradients by the normalized activations
AblationCAM	Zeroes out activations and measures how the output drops (this repository includes a fast batched implementation)
ScoreCAM	Perturbs the image by the scaled activations and measures how the output drops
EigenCAM	Takes the first principal component of the 2D activations (no class discrimination, but seems to give great results)
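
The GradCAM row above can be sketched in a few lines of NumPy. This is a minimal illustration of the weighting idea, not the library's implementation: the function name grad_cam_map, the (channels, height, width) layout and the final scaling to [0, 1] are assumptions made for the example.

```python
import numpy as np

def grad_cam_map(activations, gradients):
    """Combine one layer's activations and gradients into a 2D heatmap.

    activations, gradients: arrays of shape (channels, height, width).
    """
    # One weight per channel: the gradient averaged over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the activation maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only positive evidence for the class.
    cam = np.maximum(cam, 0)
    # Scale to [0, 1] for display (skipped if the map is all zeros).
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

# Random arrays stand in for values captured from a real network.
rng = np.random.default_rng(0)
acts = rng.random((8, 7, 7))
grads = rng.standard_normal((8, 7, 7))
heatmap = grad_cam_map(acts, grads)  # shape (7, 7)
```

In a real model the two arrays would come from forward and backward hooks on the target layer, and the heatmap would be resized to the input image before overlaying it.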

What makes the network think the image label is 'pug, pug-dog' and 'tabby, tabby cat':

Dog Cat

Combining Grad-CAM with Guided Backpropagation for the 'pug, pug-dog' class:

Combined

More Visual Examples

Resnet50:

Category	Image	GradCAM	AblationCAM	ScoreCAM
Dog
Cat

Vision Transformer (DeiT Tiny):

Category	Image	GradCAM	AblationCAM	ScoreCAM
Dog
Cat

GradCAM++ seems to give almost the same results as GradCAM in most networks, except VGG, where its advantage is larger.

Network	Image	GradCAM	GradCAM++	Score-CAM	Ablation-CAM	Eigen-CAM
VGG16
Resnet50

Choosing the Target Layer

You need to choose the target layer to compute CAM for. Some common choices are:

Resnet18 and 50: model.layer4[-1]
VGG and densenet161: model.features[-1]
mnasnet1_0: model.layers[-1]
ViT: model.blocks[-1].norm1
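
If you switch between architectures often, the choices listed above can be kept in a lookup table. The sketch below is only an illustration: COMMON_TARGET_LAYERS and resolve_layer are hypothetical names that are not part of this package, and a plain namespace object stands in for a real model so the snippet runs without PyTorch.

```python
import re
from types import SimpleNamespace

# Layer paths copied from the list above, keyed by architecture name.
COMMON_TARGET_LAYERS = {
    "resnet18": "layer4[-1]",
    "resnet50": "layer4[-1]",
    "vgg": "features[-1]",
    "densenet161": "features[-1]",
    "mnasnet1_0": "layers[-1]",
    "vit": "blocks[-1].norm1",
}

def resolve_layer(model, path):
    """Follow a dotted path such as 'blocks[-1].norm1' on a model object."""
    obj = model
    for part in path.split("."):
        match = re.fullmatch(r"(\w+)(?:\[(-?\d+)\])?", part)
        if match is None:
            raise ValueError(f"Unsupported path component: {part!r}")
        name, index = match.groups()
        obj = getattr(obj, name)
        if index is not None:
            obj = obj[int(index)]
    return obj

# Stand-in object with the same attribute layout as a ResNet, for illustration only.
fake_resnet = SimpleNamespace(layer4=["block0", "block1", "block2"])
target_layer = resolve_layer(fake_resnet, COMMON_TARGET_LAYERS["resnet50"])
```

With a real torchvision model in place of fake_resnet, the same call would return the module that model.layer4[-1] refers to.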

Using from code as a library

pip install grad-cam

from pytorch_grad_cam import GradCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM
from pytorch_grad_cam.utils.image import show_cam_on_image
from torchvision.models import resnet50

model = resnet50(pretrained=True)
target_layer = model.layer4[-1]
input_tensor = ...  # Placeholder: create an input tensor image for your model
rgb_img = ...  # Placeholder: the same image as a float array in [0, 1], used for the overlay

# This should be constructed once (set use_cuda=True to run on a GPU):
cam = GradCAM(model=model, target_layer=target_layer, use_cuda=False)

# And then it can be used on many images:
grayscale_cam = cam(input_tensor=input_tensor, target_category=1)
visualization = show_cam_on_image(rgb_img, grayscale_cam)
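
The snippet above leaves the creation of the input tensor and rgb_img open. One possible way to prepare both is sketched here in NumPy, assuming the model expects ImageNet-normalized input (as torchvision's pretrained ResNet-50 does) and that the overlay image is a float RGB array in [0, 1]. The helper name prepare_image is hypothetical, and a random image stands in for one loaded from disk.

```python
import numpy as np

# ImageNet channel statistics used by torchvision's pretrained models.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def prepare_image(uint8_rgb):
    """Return (rgb_img, input_array) for an H x W x 3 uint8 RGB image."""
    # Float image in [0, 1], the form used for the overlay.
    rgb_img = uint8_rgb.astype(np.float32) / 255.0
    # Normalize per channel, then reorder to 1 x 3 x H x W for the model.
    normalized = (rgb_img - IMAGENET_MEAN) / IMAGENET_STD
    input_array = normalized.transpose(2, 0, 1)[np.newaxis, ...]
    return rgb_img, np.ascontiguousarray(input_array)

# A random image stands in for one loaded from disk.
image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
rgb_img, input_array = prepare_image(image)
```

Calling torch.from_numpy(input_array) would then give a tensor that can be passed as input_tensor.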

Running the example script:

Usage: python cam.py --image-path <path_to_image> --method <method>

To use with CUDA: python cam.py --image-path <path_to_image> --use-cuda

You can choose between:

GradCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM and EigenCAM.

Some methods like ScoreCAM and AblationCAM require a large number of forward passes, and have a batched implementation.

You can control the batch size with cam.batch_size =
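
To see why batching helps, here is a toy NumPy sketch of the ablation idea: each channel is zeroed in its own copy of the activations, and the copies are scored several at a time. It is an illustration under stated assumptions, not the library's code. ablation_scores and toy_score are hypothetical names, and toy_score stands in for the part of the network that follows the target layer.

```python
import numpy as np

def ablation_scores(activations, score_fn, batch_size=32):
    """Score drop caused by zeroing each channel of a (C, H, W) activation array.

    score_fn maps a batch of shape (N, C, H, W) to N scores.
    """
    num_channels = activations.shape[0]
    baseline = score_fn(activations[np.newaxis])[0]
    drops = np.empty(num_channels)
    for start in range(0, num_channels, batch_size):
        stop = min(start + batch_size, num_channels)
        # One copy of the activations per channel in this batch.
        batch = np.repeat(activations[np.newaxis], stop - start, axis=0)
        for i, channel in enumerate(range(start, stop)):
            batch[i, channel] = 0  # ablate a single channel
        # A single forward pass scores the whole batch.
        drops[start:stop] = baseline - score_fn(batch)
    return drops

# Toy "network head": a weighted sum of each channel's mean activation.
channel_weights = np.array([1.0, 2.0, 0.0, -1.0])

def toy_score(batch):
    return batch.mean(axis=(2, 3)) @ channel_weights

acts = np.ones((4, 3, 3))
drops = ablation_scores(acts, toy_score, batch_size=2)
```

Because every activation in this toy case equals one, the drop for each channel equals that channel's weight. A larger batch size means fewer forward passes at the cost of more memory.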

How does it work with Vision Transformers

See vit_example.py

In ViT, the output of each layer is typically BATCH x 197 x 192 (192 being the embedding size of the DeiT Tiny model used here). Along the dimension of size 197, the first element is the class token and the remaining 196 elements are the 14x14 image patches. We can treat those last 196 elements as a 14x14 spatial image with 192 channels.

To reshape the activations and gradients to 2D spatial images, we can pass the CAM constructor a reshape_transform function.

This can also be a starting point for other architectures that will come in the future.

def reshape_transform(tensor, height=14, width=14):
    # Drop the class token and lay the patch tokens out on a height x width grid.
    result = tensor[:, 1:, :].reshape(tensor.size(0),
        height, width, tensor.size(2))

    # Bring the channels to the first dimension,
    # like in CNNs.
    result = result.transpose(2, 3).transpose(1, 2)
    return result

cam = GradCAM(model=model, target_layer=target_layer, reshape_transform=reshape_transform)
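
The shape bookkeeping described above can be checked with a small NumPy walk-through. It mirrors the same steps (drop the class token, reshape to a grid, move channels forward) on a synthetic array, with NumPy used as a stand-in for PyTorch so that it runs without a model. The encoding of token and channel positions into the values is only there to make the result easy to inspect.

```python
import numpy as np

batch, tokens, channels = 2, 197, 192
side = 14  # 14 x 14 = 196 patch tokens

# Fill the array so each value encodes its own (token, channel) position.
token_ids = np.arange(tokens).reshape(1, tokens, 1)
channel_ids = np.arange(channels).reshape(1, 1, channels)
activations = np.broadcast_to(token_ids * 1000 + channel_ids, (batch, tokens, channels))

# Drop the class token, then lay the remaining 196 tokens out on a grid.
patches = activations[:, 1:, :].reshape(batch, side, side, channels)
# Move channels ahead of the spatial axes: (batch, channels, height, width).
spatial = patches.transpose(0, 3, 1, 2)
```

Token 1 (the first patch) ends up at grid position (0, 0), and token 196 (the last patch) at (13, 13), which is the row-major layout the 2D CAM code expects.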

Which target_layer should we choose for Vision Transformers?

Since the final classification is done on the class token computed in the last attention block, the output will not be affected by the 14x14 patch tokens in the last layer, so the gradient of the output with respect to them will be 0!

We should choose a layer before the final attention block, for example:

target_layer = model.blocks[-1].norm1
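
The zero-gradient argument above can be illustrated with a toy example. The sketch assumes a classification head that reads only the class token (row 0) and estimates gradients by finite differences. The names class_score and finite_difference and the random weights are hypothetical and only serve the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = rng.standard_normal((197, 192))  # class token + 196 patch tokens
head_weights = rng.standard_normal(192)   # toy classification head

def class_score(token_matrix):
    # The toy head reads only the class token (row 0).
    return float(token_matrix[0] @ head_weights)

def finite_difference(token_index, channel, eps=1e-3):
    """Central-difference estimate of d(score) / d(tokens[token_index, channel])."""
    plus, minus = tokens.copy(), tokens.copy()
    plus[token_index, channel] += eps
    minus[token_index, channel] -= eps
    return (class_score(plus) - class_score(minus)) / (2 * eps)

patch_gradient = finite_difference(token_index=50, channel=3)  # a patch token
class_gradient = finite_difference(token_index=0, channel=3)   # the class token
```

Here patch_gradient is exactly zero because the patch tokens of the final output never reach the head, while class_gradient matches head_weights[3] up to rounding. Picking a layer inside or before the last attention block avoids this, since its patch tokens still feed into the class token.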

Citation

If you use this for research, please cite. Here is an example BibTeX entry:

@misc{jacobgilpytorchcam,
  title={pytorch-cam},
  author={Jacob Gildenblat and contributors},
  year={2021},
  publisher={GitHub},
  howpublished={\url{https://github.com/jacobgil/pytorch-grad-cam}},
}

References

https://arxiv.org/abs/1610.02391 Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra

https://arxiv.org/abs/1710.11063 Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks Aditya Chattopadhyay, Anirban Sarkar, Prantik Howlader, Vineeth N Balasubramanian

https://arxiv.org/abs/1910.01279 Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu

https://ieeexplore.ieee.org/abstract/document/9093360/ Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization Saurabh Desai, Harish G Ramaswamy (WACV 2020, pages 972–980)

https://arxiv.org/abs/2008.02312 Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs Ruigang Fu, Qingyong Hu, Xiaohu Dong, Yulan Guo, Yinghui Gao, Biao Li

https://arxiv.org/abs/2008.00299 Eigen-CAM: Class Activation Map using Principal Components Mohammed Bany Muhammad, Mohammed Yeasin

Release history

Version	Release date
1.5.5	Apr 7, 2025
1.5.4	Oct 9, 2024
1.5.3	Aug 18, 2024
1.5.2	May 31, 2024
1.5.0	Dec 19, 2023
1.4.8	Jun 16, 2023
1.4.7	Jun 15, 2023
1.4.6	Oct 18, 2022
1.4.5	Aug 25, 2022
1.4.4	Aug 24, 2022
1.4.3	Aug 13, 2022
1.4.2	Aug 1, 2022
1.4.1	Aug 1, 2022
1.4.0	Jul 9, 2022
1.3.9	May 20, 2022
1.3.7	Dec 30, 2021
1.3.6	Dec 21, 2021
1.3.5	Nov 5, 2021
1.3.4	Nov 5, 2021
1.3.3	Oct 30, 2021
1.3.1	Aug 20, 2021
1.2.9	May 16, 2021
1.2.8	May 14, 2021
1.2.7	May 1, 2021
1.2.6	Apr 30, 2021
1.2.5	Apr 30, 2021
1.2.4 (this version)	Apr 28, 2021
1.2.3	Apr 25, 2021
1.2.2	Apr 25, 2021
1.2.1	Apr 25, 2021
1.1.1	Apr 16, 2021
1.1.0	Apr 15, 2021
1.0.0	Apr 6, 2021

Download files

Source Distribution

grad-cam-1.2.4.tar.gz (1.6 MB)

Uploaded Apr 28, 2021 Source

File details

Details for the file grad-cam-1.2.4.tar.gz.

File metadata

Download URL: grad-cam-1.2.4.tar.gz
Upload date: Apr 28, 2021
Size: 1.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.21.0 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3

File hashes

Hashes for grad-cam-1.2.4.tar.gz
Algorithm	Hash digest
SHA256	`be44e4a3412bfb18ee3b8a89acb6fef88f02fb720ce43a10a706a01789acf20c`
MD5	`a1207dece9a2f976bf51133106b14ab9`
BLAKE2b-256	`e2a11f0099b2eb6538e5f91fcc57472be66388231b91aa399d51ca255b0a4114`

