A library for creating vision transformers for computer vision
Project description
transcv
transcv creates vision transformers for visual recognition which can be pre-trained using self-supervised learning
Acknowledgement
- Ross Wightman's repository for vision transformer
- Hugging Face
- DETR repository
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- fastai
- nbdev
Install
pip install transcv
Along with transcv, we also need fastai and nbdev. So, it is recommended to use :
pip install fastai nbdev transcv -q --upgrade
How to use
An example for using the VisRecTrans class for getting a custom ViT model :
from transcv.visrectrans import VisRecTrans
vis_rec_ob = VisRecTrans('vit_small_patch16_224', 10, False)
model = vis_rec_ob.create_model()
vis_rec_ob.initialize(model)
embed_callback = vis_rec_ob.get_callback()
Now, the model
, along with the embed_callback
, can be used with the Learner class, of fastai, and can be fine-tuned on any image classification dataset.
An example for using the SwinT class for building a Swin transformer model :
from transcv.swin import SwinT
swint_ob = SwinT('swin_base_patch4_window7_224', pretrained = False, num_classes = 10)
swin_model = swint_ob.get_model()
assert isinstance(swin_model, nn.Sequential)
Now, the swin_model
can be used with the Learner class, of fastai, and can be fine-tuned on any dataset for visual recognition task.
For a detailed description of the classes and methods, please refer to the documentation.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.