Qwen VL - PyTorch

Qwen-VL

My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities". The authors haven't released model code yet, so I wrote my own.

Install

pip3 install qwen


Usage

import torch
from qwen.model import QwenVL

# Dummy inputs: one 256x256 RGB image and one sequence of 1024 token ids
img = torch.randn(1, 3, 256, 256)
caption = torch.randint(0, 20000, (1, 1024))

model = QwenVL()
output = model(img, caption)
print(output.shape)

Training

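import os

import torch.distributed as dist
from qwen.train import Train


def train():
    # These variables must already be set by the launcher.
    # Reading them here raises KeyError early if one is missing.
    os.environ['MASTER_ADDR']  # e.g. 'localhost'
    os.environ['MASTER_PORT']  # e.g. '9994'

    # [CRITICAL] Pay attention to these when scaling to multiple GPUs and clusters
    os.environ['RANK']        # e.g. str(0); global rank of this process
    os.environ['WORLD_SIZE']  # e.g. str(torch.cuda.device_count()); total number of processes

    # With no init_method given, the process group is initialized from the
    # environment variables above (init_method="env://").
    dist.init_process_group(backend='nccl')

    Train()


if __name__ == '__main__':
    train()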
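
When scaling to multiple GPUs and nodes, RANK and WORLD_SIZE have to be consistent across every process. The sketch below shows the usual torch.distributed convention, assuming one process per GPU and the same number of GPUs on every node. The names node_rank, local_rank, and gpus_per_node are illustrative and not part of this package.

```python
def global_rank(node_rank: int, local_rank: int, gpus_per_node: int) -> int:
    """Global rank of a process, assuming one process per GPU
    and the same number of GPUs on every node."""
    if node_rank < 0:
        raise ValueError("node_rank must be non-negative")
    if not 0 <= local_rank < gpus_per_node:
        raise ValueError("local_rank must be in [0, gpus_per_node)")
    return node_rank * gpus_per_node + local_rank


def world_size(num_nodes: int, gpus_per_node: int) -> int:
    """Total number of processes across all nodes."""
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must be positive")
    return num_nodes * gpus_per_node


if __name__ == "__main__":
    # Hypothetical cluster: 2 nodes with 4 GPUs each.
    nodes, gpus = 2, 4
    print("WORLD_SIZE =", world_size(nodes, gpus))
    for node in range(nodes):
        for local in range(gpus):
            print(f"node {node}, gpu {local} -> RANK = {global_rank(node, local, gpus)}")
```

On a single machine this reduces to RANK equal to the GPU index and WORLD_SIZE equal to the GPU count, which matches the commented-out defaults in the script above.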

  1. Set the environment variables:

    • ENTITY_NAME: Your wandb project name
    • OUTPUT_DIR: Directory to save the weights (e.g., ./weights)
    • MASTER_ADDR: Address of the master node for distributed training
    • MASTER_PORT: Port on the master node for distributed training
    • RANK: Global rank of the current process (0 on the master)
    • WORLD_SIZE: Total number of processes (typically one per GPU)
  2. Configure the training:

    • Run accelerate config
    • Enable DeepSpeed ZeRO stage 3 when prompted
    • Run accelerate launch train_distributed_accelerate.py
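
Before launching, it can help to verify that the variables from step 1 are actually set. This is a minimal sketch using only the standard library. The helper names missing_env_vars and check_env are hypothetical and not part of the qwen package, and the check that RANK lies below WORLD_SIZE assumes the usual torch.distributed meaning of those variables.

```python
import os
from typing import List, Mapping, Optional

# Variables listed in step 1 above.
REQUIRED_VARS = (
    "ENTITY_NAME",
    "OUTPUT_DIR",
    "MASTER_ADDR",
    "MASTER_PORT",
    "RANK",
    "WORLD_SIZE",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def check_env(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise if a variable is missing or RANK/WORLD_SIZE are inconsistent."""
    env = os.environ if env is None else env
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    rank, world = int(env["RANK"]), int(env["WORLD_SIZE"])
    if not 0 <= rank < world:
        raise ValueError(f"RANK={rank} must be in [0, WORLD_SIZE={world})")


if __name__ == "__main__":
    print("missing:", missing_env_vars())
```

Calling check_env() at the top of the training script turns a confusing hang or KeyError during process-group setup into an explicit error message.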

For more information, refer to the Training SOP.


Citations

Please use the following to cite this work:

@article{bai2023qwen,
  title={Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities},
  author={Bai, Jinze and Bai, Shuai and Yang, Shusheng and Wang, Shijie and Tan, Sinan and Wang, Peng and Lin, Junyang and Zhou, Chang and Zhou, Jingren},
  journal={arXiv preprint arXiv:2308.12966},
  year={2023},
  url={https://doi.org/10.48550/arXiv.2308.12966}
}

For more details, please refer to the full paper.
