Vision Transformer (ViT) - Pytorch
Project description
PyTorch implementation of ViT.
Original paper: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Alexey Dosovitskiy et al.)
Install
$ pip install vit-pytorch-implementation
Usage
import torch
from vit_pytorch import lilViT

# Build the model; these values match the defaults listed under Parameters.
v = lilViT(
    img_size=224,
    in_channels=3,
    patch_size=16,
    num_transformer_layers=12,
    embedding_dim=768,
    mlp_size=3072,
    num_heads=12,
    attn_dropout=0,
    mlp_dropout=0.1,
    embedding_dropout=0.1,
    num_classes=1000,
)

# One random RGB image: (batch, channels, height, width).
img = torch.randn(1, 3, 224, 224)
preds = v(img)  # (1, 1000)
print(preds.shape)
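The values in the usage example match the ViT-Base configuration from the paper (12 layers, hidden size 768, MLP size 3072, 12 heads). As a rough sanity check on model size, the sketch below estimates the number of trainable parameters for such a configuration. It assumes the standard ViT layout (a convolutional patch projection with bias, one class token, learned position embeddings, pre-norm encoder blocks, and a LayerNorm followed by a linear classification head). The function `vit_param_count` is hypothetical and not part of this package, and the package's actual layer layout is not documented here, so `sum(p.numel() for p in v.parameters())` may give a different number.

```python
def vit_param_count(
    img_size=224,
    in_channels=3,
    patch_size=16,
    num_transformer_layers=12,
    embedding_dim=768,
    mlp_size=3072,
    num_classes=1000,
):
    """Estimate trainable parameters for a standard ViT (assumed layout)."""
    d = embedding_dim
    num_patches = (img_size // patch_size) ** 2

    # Patch projection: one weight per input pixel of a patch, per output dim, plus bias.
    patch_embed = in_channels * patch_size * patch_size * d + d
    class_token = d
    # Learned position embedding for every patch plus the class token.
    pos_embed = (num_patches + 1) * d

    layer_norm = 2 * d  # scale and shift
    # Fused Q/K/V projection plus output projection, both with bias.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two dense layers: d -> mlp_size -> d, both with bias.
    mlp = (d * mlp_size + mlp_size) + (mlp_size * d + d)
    block = 2 * layer_norm + attention + mlp

    # Assumed head: LayerNorm followed by a linear classifier.
    head = layer_norm + (d * num_classes + num_classes)

    return (
        patch_embed
        + class_token
        + pos_embed
        + num_transformer_layers * block
        + head
    )


print(f"{vit_param_count():,}")  # 86,567,656
```

Under these assumptions the count does not depend on `num_heads`, because the heads only partition the same projection matrices.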
Parameters
- img_size: int. Image resolution. Default: 224 (224x224).
- in_channels: int. Number of image channels. Default: 3.
- patch_size: int. Size of patches. img_size must be divisible by patch_size. The number of patches is n = (img_size // patch_size) ** 2, and n must be greater than 16. Default: 16.
- num_transformer_layers: int. Depth (number of transformer blocks). Default: 12.
- embedding_dim: int. Embedding dimension. Default: 768.
- mlp_size: int. MLP size. Default: 3072.
- num_heads: int. Number of heads in the multi-head attention layer. Default: 12.
- attn_dropout: float. Dropout for attention projection. Default: 0.
- mlp_dropout: float. Dropout for dense/MLP layers. Default: 0.1.
- embedding_dropout: float. Dropout for patch and position embeddings. Default: 0.1.
- num_classes: int. Number of classes to classify. Default: 1000.
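The constraints on the patch size can be checked before building a model. The sketch below is a minimal, hypothetical helper (`ViTShapeSketch` is not part of this package) that applies the divisibility rule and the patch-count formula from the parameter descriptions. Two further points are assumptions rather than documented behaviour: that one class token is prepended to the patch sequence, as in the paper, and that embedding_dim must be divisible by num_heads, as in standard multi-head attention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTShapeSketch:
    """Hypothetical helper for ViT shape arithmetic (not the package's API)."""

    img_size: int = 224
    patch_size: int = 16
    embedding_dim: int = 768
    num_heads: int = 12

    def __post_init__(self) -> None:
        if self.img_size % self.patch_size != 0:
            raise ValueError("img_size must be divisible by patch_size")
        if self.num_patches <= 16:
            raise ValueError("the number of patches must be greater than 16")
        # Assumption: heads split the embedding evenly.
        if self.embedding_dim % self.num_heads != 0:
            raise ValueError("embedding_dim must be divisible by num_heads")

    @property
    def num_patches(self) -> int:
        # n = (img_size // patch_size) ** 2, as stated for patch_size.
        return (self.img_size // self.patch_size) ** 2

    @property
    def seq_len(self) -> int:
        # Assumption: one class token is prepended to the patch sequence.
        return self.num_patches + 1

    @property
    def head_dim(self) -> int:
        return self.embedding_dim // self.num_heads


shapes = ViTShapeSketch()
print(shapes.num_patches, shapes.seq_len, shapes.head_dim)  # 196 197 64
```

With the defaults, a 224x224 image is split into a 14x14 grid of 16x16 patches. A 64x64 image with the same patch size gives exactly 16 patches, so it fails the "greater than 16" rule.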
File details
Details for the file vit-pytorch-implementation-1.0.2.tar.gz.
File metadata
- Download URL: vit-pytorch-implementation-1.0.2.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 60cd8447dec9f445cadd634efc7bcdc0aedc40cee77784f4e2da8275c920cddd
MD5 | b03a20aa63edbe07232c7cd77dc5e05b
BLAKE2b-256 | cbf290f9ce6d7371f7fad08a6b5195998685dbb69c386fe0f235212a7eb49cbe
File details
Details for the file vit_pytorch_implementation-1.0.2-py3-none-any.whl.
File metadata
- Download URL: vit_pytorch_implementation-1.0.2-py3-none-any.whl
- Upload date:
- Size: 4.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6a2c3ba6101229fbd19e4ca7d0b0de308a5d0b8cf89c6cfad6bdf4af662a82d8
MD5 | effe1b95176a18783642ef741492133f
BLAKE2b-256 | 22c6ed1e04c5230d8f04afc66828e4aec0bcd0d29b9a4486b538a863dc29253a