
Project description

VFormer

A modular PyTorch library for vision transformer models

Library Features

  • Contains implementations of prominent ViT architectures broken down into modular components like encoder, attention mechanism, and decoder
  • Makes it easy to develop custom models by composing components of different architectures
  • Utilities for visualizing attention using techniques such as gradient rollout

Installation

From source (recommended)

git clone https://github.com/SforAiDl/vformer.git
cd vformer/
pip install .

From PyPI

pip install vformer

Models supported

Example usage

To instantiate and use a Swin Transformer model:

import torch
from vformer.models.classification import SwinTransformer

image = torch.randn(1, 3, 224, 224)  # Example data
model = SwinTransformer(
    img_size=224,
    patch_size=4,
    in_channels=3,
    n_classes=10,
    embed_dim=96,
    depths=[2, 2, 6, 2],
    num_heads=[3, 6, 12, 24],
    window_size=7,
    drop_rate=0.2,
)
logits = model(image)
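As a sanity check on the hyperparameters above (plain arithmetic, no library calls; these are the standard Swin shape conventions): patch_size=4 turns a 224x224 image into a 56x56 grid of patch tokens, and window_size=7 tiles that grid into 8x8 non-overlapping windows of 49 tokens each at the first stage.

```python
img_size, patch_size, window_size = 224, 4, 7

# Patch embedding: the image side must divide evenly by the patch size
assert img_size % patch_size == 0
patches_per_side = img_size // patch_size       # 224 / 4 = 56

# Window partition: the patch grid must divide evenly by the window size
assert patches_per_side % window_size == 0
windows_per_side = patches_per_side // window_size  # 56 / 7 = 8

num_windows = windows_per_side ** 2             # 64 windows at stage 1
tokens_per_window = window_size ** 2            # 49 tokens per window

print(patches_per_side, num_windows, tokens_per_window)  # 56 64 49
```

If either divisibility assertion fails for your chosen sizes, the model cannot partition the feature map cleanly, which is why img_size, patch_size, and window_size are usually chosen together.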

VFormer has a modular design and allows for easy experimentation using blocks/modules of different architectures. For example, if desired, you can use just the encoder or the windowed attention layer of the Swin Transformer model.

from vformer.attention import WindowAttention

window_attn = WindowAttention(
    dim=128,
    window_size=7,
    num_heads=2,
    # further optional arguments can be passed as keyword arguments
)
from vformer.encoder import SwinEncoder

swin_encoder = SwinEncoder(
    dim=128,
    input_resolution=(224, 224),
    depth=2,
    num_heads=2,
    window_size=7,
    # further optional arguments can be passed as keyword arguments
)
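To make the windowing concrete, here is a minimal pure-Python sketch of the non-overlapping window partition that windowed attention operates on (illustrative only; the library's internals use tensor reshapes, not nested lists):

```python
def window_partition(grid, window_size):
    """Split an H x W grid (nested lists) into non-overlapping
    window_size x window_size windows, in row-major order."""
    h, w = len(grid), len(grid[0])
    assert h % window_size == 0 and w % window_size == 0
    windows = []
    for wi in range(0, h, window_size):
        for wj in range(0, w, window_size):
            windows.append([row[wj:wj + window_size]
                            for row in grid[wi:wi + window_size]])
    return windows

# A 4x4 grid split into 2x2 windows yields 4 windows of 4 tokens each.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
windows = window_partition(grid, 2)
```

Self-attention is then computed independently within each window, which is what keeps the cost linear in image size rather than quadratic.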

Please refer to our documentation to learn more.



Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vformer-0.1.3.tar.gz (60.3 kB)

Uploaded: Source

Built Distribution

vformer-0.1.3-py2.py3-none-any.whl (73.9 kB)

Uploaded: Python 2, Python 3

File details

Details for the file vformer-0.1.3.tar.gz.

File metadata

  • Download URL: vformer-0.1.3.tar.gz
  • Upload date:
  • Size: 60.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for vformer-0.1.3.tar.gz

  • SHA256: ee43cb736e9cc155ea470f58241f6d15b3c21df8410a4fa01f9f053f0733e750
  • MD5: 179d229e78b5b846232f109da737f44b
  • BLAKE2b-256: f58b511ece4f144d8bb2f84588702dd7a661446b970d8a24bb6f2fd13f0f8327

See more details on using hashes here.
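The hashes above can be checked locally after download. A minimal sketch using Python's standard hashlib (the filename and expected digest are the ones listed above; the self-contained demo at the end hashes an in-memory byte string rather than the real archive):

```python
import hashlib

def file_sha256(path, chunk_size=8192):
    """Compute the SHA-256 digest of a file, streaming it in chunks
    so arbitrarily large archives fit in constant memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage against the downloaded archive, with the digest from the table above:
# assert file_sha256("vformer-0.1.3.tar.gz") == (
#     "ee43cb736e9cc155ea470f58241f6d15b3c21df8410a4fa01f9f053f0733e750"
# )

# Self-contained demo on known bytes:
demo = hashlib.sha256(b"hello").hexdigest()
```

pip can also enforce hashes at install time via `--require-hashes` in a requirements file.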

File details

Details for the file vformer-0.1.3-py2.py3-none-any.whl.

File metadata

  • Download URL: vformer-0.1.3-py2.py3-none-any.whl
  • Upload date:
  • Size: 73.9 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for vformer-0.1.3-py2.py3-none-any.whl

  • SHA256: 198ad9ee15c0b09ea1d68fc181c022697eaf1fa7903bcee19a1e3f1f90676fc6
  • MD5: dc2f0487625fee1a63c8fbfd38389cb3
  • BLAKE2b-256: 29e8b156db70933b2d166f6096e2e3d1ffc0f0ff984d9ac8567e43c8aaabe0a9

See more details on using hashes here.
