VFormer
A modular PyTorch library for vision transformer models
Library Features
- Contains implementations of prominent ViT architectures broken down into modular components like encoder, attention mechanism, and decoder
- Makes it easy to develop custom models by composing components of different architectures
- Utilities for visualizing attention using techniques such as gradient rollout
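As background on the rollout idea mentioned above: attention rollout recursively chains per-layer attention maps by matrix multiplication, adding the identity to account for residual connections (the gradient-weighted variant additionally scales attention by gradients). This is a minimal plain-PyTorch sketch of the plain attention-rollout computation, not VFormer's own API; the function name is our own.

```python
import torch

def attention_rollout(attns):
    """Chain per-layer attention maps into one token-to-token map.

    attns: list of per-layer attention matrices, each (num_tokens, num_tokens),
    already averaged over heads. Each layer is mixed with the identity (residual
    connection), re-normalized so rows sum to 1, then composed by matrix product.
    """
    n = attns[0].shape[-1]
    eye = torch.eye(n)
    rollout = eye
    for a in attns:
        a = (a + eye) / 2                      # account for the residual path
        a = a / a.sum(dim=-1, keepdim=True)    # keep rows as distributions
        rollout = a @ rollout                  # compose with earlier layers
    return rollout

# Three layers of uniform attention over 5 tokens: the rolled-out map
# stays row-stochastic (each row still sums to 1).
layers = [torch.full((5, 5), 1 / 5) for _ in range(3)]
r = attention_rollout(layers)
```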
Installation
From source (recommended)
git clone https://github.com/SforAiDl/vformer.git
cd vformer/
python setup.py install
From PyPI
pip install vformer
Models supported
- Vanilla ViT
- Swin Transformer
- Pyramid Vision Transformer
- CrossViT
- Compact Vision Transformer
- Compact Convolutional Transformer
- Visformer
- Vision Transformers for Dense Prediction
- CvT
- ConViT
- ViViT
- Perceiver IO
- Memory Efficient Attention
Example usage
To instantiate and use a Swin Transformer model:

import torch
from vformer.models.classification import SwinTransformer

image = torch.randn(1, 3, 224, 224)  # example batch: one 224x224 RGB image

model = SwinTransformer(
    img_size=224,
    patch_size=4,
    in_channels=3,
    n_classes=10,
    embed_dim=96,
    depths=[2, 2, 6, 2],
    num_heads=[3, 6, 12, 24],
    window_size=7,
    drop_rate=0.2,
)
logits = model(image)  # shape (1, 10): one score per class
VFormer has a modular design and allows for easy experimentation using blocks/modules of different architectures. For example, you can use just the encoder or the windowed attention layer of the Swin Transformer model.
from vformer.attention import WindowAttention

window_attn = WindowAttention(
    dim=128,
    window_size=7,
    num_heads=2,
    # ... plus any further optional keyword arguments
)
from vformer.encoder import SwinEncoder

swin_encoder = SwinEncoder(
    dim=128,
    input_resolution=(224, 224),
    depth=2,
    num_heads=2,
    window_size=7,
    # ... plus any further optional keyword arguments
)
Please refer to our documentation to learn more.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution: vformer-0.1.3.tar.gz (60.3 kB)
Built Distribution: vformer-0.1.3-py2.py3-none-any.whl (73.9 kB)
File details
Details for the file vformer-0.1.3.tar.gz.
File metadata
- Download URL: vformer-0.1.3.tar.gz
- Upload date:
- Size: 60.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | ee43cb736e9cc155ea470f58241f6d15b3c21df8410a4fa01f9f053f0733e750
MD5 | 179d229e78b5b846232f109da737f44b
BLAKE2b-256 | f58b511ece4f144d8bb2f84588702dd7a661446b970d8a24bb6f2fd13f0f8327
File details
Details for the file vformer-0.1.3-py2.py3-none-any.whl.
File metadata
- Download URL: vformer-0.1.3-py2.py3-none-any.whl
- Upload date:
- Size: 73.9 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 198ad9ee15c0b09ea1d68fc181c022697eaf1fa7903bcee19a1e3f1f90676fc6
MD5 | dc2f0487625fee1a63c8fbfd38389cb3
BLAKE2b-256 | 29e8b156db70933b2d166f6096e2e3d1ffc0f0ff984d9ac8567e43c8aaabe0a9