
Qwen VL - Pytorch



Qwen-VL

My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities". They haven't released the model code yet, so...

Install

pip3 install qwen


Usage

import torch
from qwen.model import QwenVL

# Example inputs: one random 3-channel 256x256 image and 1024 random token IDs in [0, 20000)
img = torch.randn(1, 3, 256, 256)
caption = torch.randint(0, 20000, (1, 1024))

model = QwenVL()
output = model(img, caption)
print(output.shape)

Training

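import os

import torch.distributed as dist
from qwen.train import Train


def train():
    # These lookups raise KeyError if a variable is unset, so a misconfigured launch fails early.
    os.environ['MASTER_ADDR']  # e.g. 'localhost'
    os.environ['MASTER_PORT']  # e.g. '9994'

    # [CRITICAL] Pay attention to these when scaling to multiple GPUs and clusters
    os.environ['RANK']        # global rank of this process, e.g. str(0)
    os.environ['WORLD_SIZE']  # total number of processes, e.g. str(torch.cuda.device_count()) on one node

    # With no init_method given, the process group is initialized from the environment ("env://")
    dist.init_process_group(backend='nccl')

    Train()


if __name__ == '__main__':
    train()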
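
The RANK and WORLD_SIZE values matter most once training spans several machines. As a minimal sketch, assuming one process per GPU and the same number of GPUs on every node, the two values could be derived as shown here. The helpers `world_size`, `global_rank`, and `export_dist_env`, and the parameters `node_rank`, `local_rank`, `num_nodes`, and `gpus_per_node`, are hypothetical names for illustration and are not part of the qwen package.

```python
import os


def world_size(num_nodes: int, gpus_per_node: int) -> int:
    """Total number of processes, assuming one process per GPU."""
    return num_nodes * gpus_per_node


def global_rank(node_rank: int, local_rank: int, gpus_per_node: int) -> int:
    """Unique index of a process across all nodes (0 .. world_size - 1)."""
    if not 0 <= local_rank < gpus_per_node:
        raise ValueError("local_rank must be in [0, gpus_per_node)")
    return node_rank * gpus_per_node + local_rank


def export_dist_env(node_rank, local_rank, num_nodes, gpus_per_node, env=None):
    """Write RANK and WORLD_SIZE as strings into env (defaults to os.environ)."""
    env = os.environ if env is None else env
    env["RANK"] = str(global_rank(node_rank, local_rank, gpus_per_node))
    env["WORLD_SIZE"] = str(world_size(num_nodes, gpus_per_node))
    return env


if __name__ == "__main__":
    # Hypothetical layout: 2 nodes with 4 GPUs each; this is GPU 1 on node 1.
    demo = export_dist_env(node_rank=1, local_rank=1, num_nodes=2, gpus_per_node=4, env={})
    print(demo)
```

Passing a plain dict as `env` keeps the sketch free of side effects; leaving it out writes to the real process environment before `dist.init_process_group` is called.
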

  1. Set the environment variables:

    • ENTITY_NAME: Your wandb project name
    • OUTPUT_DIR: Directory to save the weights (e.g., ./weights)
    • MASTER_ADDR: Address of the master node for distributed training
    • MASTER_PORT: Port on the master node for distributed training
    • RANK: Global rank of the current process (0 for the first process)
    • WORLD_SIZE: Total number of processes (typically one per GPU)
  2. Configure the training:

    • Run accelerate config
    • Enable DeepSpeed ZeRO stage 3
    • Run accelerate launch train_distributed_accelerate.py

For more information, refer to the Training SOP.
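
The two steps above could be checked from Python before launching. This is a minimal sketch, assuming the six variables listed in step 1 are all required and that the script name from step 2 is used unchanged. `REQUIRED_VARS`, `missing_vars`, and `launch_command` are hypothetical helpers for illustration, not part of the qwen package or of Accelerate.

```python
import os
import shlex

# Variable names taken from step 1 of the training instructions.
REQUIRED_VARS = ("ENTITY_NAME", "OUTPUT_DIR", "MASTER_ADDR", "MASTER_PORT", "RANK", "WORLD_SIZE")


def missing_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def launch_command(script="train_distributed_accelerate.py"):
    """Build the `accelerate launch` command from step 2 as an argument list."""
    return ["accelerate", "launch", script]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Set these variables first:", ", ".join(missing))
    else:
        # Printed rather than executed, so the sketch has no side effects.
        print(shlex.join(launch_command()))
```

The command is returned as a list so it can be handed to `subprocess.run` without shell quoting concerns once the environment check passes.
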


Citations

Please use the following to cite this work:

@article{bai2023qwen,
  title={Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities},
  author={Bai, Jinze and Bai, Shuai and Yang, Shusheng and Wang, Shijie and Tan, Sinan and Wang, Peng and Lin, Junyang and Zhou, Chang and Zhou, Jingren},
  journal={arXiv preprint arXiv:2308.12966},
  year={2023},
  url={https://doi.org/10.48550/arXiv.2308.12966}
}

For more details, please refer to the full paper.
