Spatial Shift Vision Transformer
Spatial Shift ViT
S²-ViT is a hierarchical vision transformer with shifted window attention. In contrast to Swin, its shift operation is based on S²-MLP: it shifts in all four directions simultaneously and does not use the roll or unroll operations. Additionally, it leverages the patch embedding and positional encoding methods from Twins-SVT.
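To illustrate the kind of shift described above, here is a minimal sketch of an S²-MLP-style spatial shift. It is not taken from the `s2vit` source: the function name `spatial_shift`, the channels-last `(B, H, W, C)` layout, and the one-pixel step are assumptions made for illustration. The channels are split into four groups, and each group is shifted one step in a different direction, with border positions keeping their original values rather than wrapping around as a roll would.

```python
import torch


def spatial_shift(x: torch.Tensor) -> torch.Tensor:
    """Shift four channel groups of a (B, H, W, C) tensor, one direction each.

    Hypothetical helper for illustration; not the s2vit implementation.
    """
    c = x.shape[-1]
    g = c // 4  # size of each channel group (last group takes any remainder)
    out = x.clone()  # write into a copy so the reads below see unshifted data

    # Group 0: shift down along H (row i receives row i - 1).
    out[:, 1:, :, :g] = x[:, :-1, :, :g]
    # Group 1: shift up along H (row i receives row i + 1).
    out[:, :-1, :, g:2 * g] = x[:, 1:, :, g:2 * g]
    # Group 2: shift right along W (column j receives column j - 1).
    out[:, :, 1:, 2 * g:3 * g] = x[:, :, :-1, 2 * g:3 * g]
    # Group 3: shift left along W (column j receives column j + 1).
    out[:, :, :-1, 3 * g:] = x[:, :, 1:, 3 * g:]
    return out


x = torch.randn(1, 8, 8, 64)
y = spatial_shift(x)  # same shape as x: (1, 8, 8, 64)
```

Because every group moves with plain slice assignments, all four directions are handled in a single pass and no cyclic `torch.roll` (or a matching reverse roll) is needed.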
Prerequisites
- Python 3.10+
- PyTorch 2.0+
Installation
pip install s2vit
Usage
import torch
from s2vit import S2ViT
vit = S2ViT(
    depths=(2, 2, 6, 2),
    dims=(64, 128, 160, 320),
    global_pool=True,
    num_classes=1000,
)
img = torch.randn(1, 3, 256, 256)
vit(img) # (1, 1000)
Citations
@article{Yu2021S2MLPSM,
title={S2-MLP: Spatial-Shift MLP Architecture for Vision},
author={Tan Yu and Xu Li and Yunfeng Cai and Mingming Sun and Ping Li},
journal={2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year={2021},
pages={3615-3624},
url={https://api.semanticscholar.org/CorpusID:235422259}
}
@article{Liu2021SwinTH,
title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
author={Ze Liu and Yutong Lin and Yue Cao and Han Hu and Yixuan Wei and Zheng Zhang and Stephen Lin and Baining Guo},
journal={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2021},
pages={9992-10002},
url={https://api.semanticscholar.org/CorpusID:232352874}
}
@inproceedings{Chu2021TwinsRT,
title={Twins: Revisiting the Design of Spatial Attention in Vision Transformers},
author={Xiangxiang Chu and Zhi Tian and Yuqing Wang and Bo Zhang and Haibing Ren and Xiaolin Wei and Huaxia Xia and Chunhua Shen},
booktitle={Neural Information Processing Systems},
year={2021},
url={https://api.semanticscholar.org/CorpusID:234364557}
}