Spatial Shift Vision Transformer

Spatial Shift ViT

S²-ViT is a hierarchical vision transformer with shifted window attention. In contrast to Swin, its shift operation is based on S²-MLP: it shifts in all four directions simultaneously and does not use the roll or unroll operations. Additionally, it leverages the patch embedding and positional encoding methods from Twins-SVT.
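
To make the four-direction shift concrete, here is a minimal sketch of a spatial shift in the style of S²-MLP. It is an illustration only and is not taken from the s2vit source: the function name `spatial_shift`, the `(B, C, H, W)` layout, the one-pixel shift distance, and the assignment of directions to channel groups are all assumptions.

```python
import torch


def spatial_shift(x: torch.Tensor) -> torch.Tensor:
    """Shift four channel groups of a (B, C, H, W) tensor by one pixel each.

    Hypothetical helper for illustration; not part of the s2vit API.
    """
    _, c, _, _ = x.shape
    g = c // 4  # channels per group (the last group absorbs any remainder)
    out = x.clone()  # border pixels keep their original values
    out[:, 0 * g:1 * g, :, 1:] = x[:, 0 * g:1 * g, :, :-1]  # shift right
    out[:, 1 * g:2 * g, :, :-1] = x[:, 1 * g:2 * g, :, 1:]  # shift left
    out[:, 2 * g:3 * g, 1:, :] = x[:, 2 * g:3 * g, :-1, :]  # shift down
    out[:, 3 * g:, :-1, :] = x[:, 3 * g:, 1:, :]            # shift up
    return out


feats = torch.randn(1, 64, 8, 8)
shifted = spatial_shift(feats)  # same shape as the input
```

Because each group is moved with plain slice assignment, no `torch.roll` call (and no reverse roll afterwards) is needed, and values do not wrap around the image border.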

Prerequisites

  • Python 3.10+
  • PyTorch 2.0+

Installation

pip install s2vit

Usage

import torch
from s2vit import S2ViT

vit = S2ViT(
    depths=(2, 2, 6, 2),
    dims=(64, 128, 160, 320),
    global_pool=True,
    num_classes=1000,
)

img = torch.randn(1, 3, 256, 256)
vit(img) # (1, 1000)
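
The `(1, 1000)` output above is a tensor of per-class scores. Assuming these are unnormalized logits (the usual convention for classifiers, but not stated in this README), a sketch like the following turns them into top-k predictions. The helper name `top_k_classes` is hypothetical and not part of s2vit.

```python
import torch


def top_k_classes(logits: torch.Tensor, k: int = 5):
    """Return (probabilities, class indices) of the k most likely classes.

    Hypothetical helper; assumes `logits` has shape (batch, num_classes).
    """
    probs = logits.softmax(dim=-1)              # normalize scores per sample
    top_probs, top_idx = probs.topk(k, dim=-1)  # sorted in descending order
    return top_probs, top_idx


# Stand-in for the model output; in practice pass the result of vit(img).
logits = torch.randn(1, 1000)
top_probs, top_idx = top_k_classes(logits, k=5)
```

For inference, it is common to call `vit.eval()` and wrap the forward pass in `torch.no_grad()` before applying a helper like this.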

Citations

@article{Yu2021S2MLPSM,
  title={S2-MLP: Spatial-Shift MLP Architecture for Vision},
  author={Tan Yu and Xu Li and Yunfeng Cai and Mingming Sun and Ping Li},
  journal={2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2021},
  pages={3615-3624},
  url={https://api.semanticscholar.org/CorpusID:235422259}
}
@article{Liu2021SwinTH,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Ze Liu and Yutong Lin and Yue Cao and Han Hu and Yixuan Wei and Zheng Zhang and Stephen Lin and Baining Guo},
  journal={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021},
  pages={9992-10002},
  url={https://api.semanticscholar.org/CorpusID:232352874}
}
@inproceedings{Chu2021TwinsRT,
  title={Twins: Revisiting the Design of Spatial Attention in Vision Transformers},
  author={Xiangxiang Chu and Zhi Tian and Yuqing Wang and Bo Zhang and Haibing Ren and Xiaolin Wei and Huaxia Xia and Chunhua Shen},
  booktitle={Neural Information Processing Systems},
  year={2021},
  url={https://api.semanticscholar.org/CorpusID:234364557}
}
