Vision Llama - Pytorch

Vision LLama

Implementation of VisionLLaMA from the paper "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" (arXiv:2403.00522) in PyTorch and Zeta.

Install

$ pip install vision-llama

Usage

import torch
from vision_llama import VisionLlamaBlock

# Create a random tensor of shape (1, 3, 224, 224)
x = torch.randn(1, 3, 224, 224)

# Create an instance of the VisionLlamaBlock model with the specified parameters
model = VisionLlamaBlock(768, 12, 3, 12)

# Run the forward pass once and reuse the result
out = model(x)

# Print the shape of the output tensor, then the tensor itself
print(out.shape)
print(out)

License

MIT

Citation

@misc{chu2024visionllama,
    title={VisionLLaMA: A Unified LLaMA Interface for Vision Tasks}, 
    author={Xiangxiang Chu and Jianlin Su and Bo Zhang and Chunhua Shen},
    year={2024},
    eprint={2403.00522},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Todo

  • Implement AS2DRoPE (auto-scaled 2D RoPE)
  • Implement GSA (global sub-sampled attention)
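For orientation, here is a minimal plain-Python sketch of the idea behind the AS2DRoPE item. It is not this package's implementation (the feature is still on the todo list) and it is not taken from the paper's code. The function names (`rope_angles`, `rotate_pairs`, `as2d_rope`), the channel split (first half for rows, second half for columns), and the scaling rule (coordinates multiplied by `anchor_grid / grid`) are all assumptions made for illustration.

```python
import math


def rope_angles(pos, dim, base=10000.0):
    """Rotation angles for one scalar coordinate; returns dim // 2 angles."""
    return [pos / base ** (2 * k / dim) for k in range(dim // 2)]


def rotate_pairs(vec, angles):
    """Rotate each consecutive pair (vec[2k], vec[2k + 1]) by angles[k]."""
    out = []
    for k, theta in enumerate(angles):
        a, b = vec[2 * k], vec[2 * k + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([a * c - b * s, a * s + b * c])
    return out


def as2d_rope(vec, row, col, grid, anchor_grid):
    """Apply a 2D rotary embedding to one per-head vector at patch (row, col).

    Assumptions (hypothetical, for illustration only): the first half of the
    channels encodes the row, the second half encodes the column, and both
    coordinates are rescaled by anchor_grid / grid before rotation.
    """
    if len(vec) % 4 != 0:
        raise ValueError("vector length must be divisible by 4")
    scale = anchor_grid / grid
    half = len(vec) // 2
    rows = rotate_pairs(vec[:half], rope_angles(row * scale, half))
    cols = rotate_pairs(vec[half:], rope_angles(col * scale, half))
    return rows + cols


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


q = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.1, 0.9]
k = [1.0, 0.3, -0.6, 0.8, -1.2, 0.4, 0.7, -0.2]

# Both pairs of positions share the same relative (row, col) offset of (2, 3),
# so the rotated query/key dot products should agree.
s1 = dot(as2d_rope(q, 1, 2, 14, 14), as2d_rope(k, 3, 5, 14, 14))
s2 = dot(as2d_rope(q, 4, 0, 14, 14), as2d_rope(k, 6, 3, 14, 14))
print(math.isclose(s1, s2, abs_tol=1e-9))
```

Under these assumptions, patch (2, 4) on a 28×28 grid with a 14×14 anchor receives the same rotation as patch (1, 2) on the anchor grid, which is the "auto-scaled" part. A PyTorch version would vectorize the same arithmetic over batch, head, and token dimensions.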
