
Vision Llama - Pytorch

Vision LLama

Implementation of VisionLLaMA from the paper "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" (arXiv:2403.00522) in PyTorch and Zeta.

Install

$ pip install vision-llama
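
To confirm that the install succeeded, one option is to ask Python's packaging metadata for the installed version. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    `installed_version` is an illustrative helper, not part of vision-llama.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None
print(installed_version("vision-llama"))
```

A `None` result usually means the package was installed into a different environment than the one running the interpreter.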

Usage

import torch
from vision_llama import VisionLlamaBlock

# Create a random image-shaped tensor of shape (1, 3, 224, 224)
x = torch.randn(1, 3, 224, 224)

# Create an instance of VisionLlamaBlock with the specified positional parameters
model = VisionLlamaBlock(768, 12, 3, 12)

# Run the forward pass once and reuse the result
out = model(x)

# Print the shape of the output tensor, then the tensor itself
print(out.shape)
print(out)

License

MIT

Citation

@misc{chu2024visionllama,
    title={VisionLLaMA: A Unified LLaMA Interface for Vision Tasks}, 
    author={Xiangxiang Chu and Jianlin Su and Bo Zhang and Chunhua Shen},
    year={2024},
    eprint={2403.00522},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Todo

  • Implement AS2DRoPE (auto-scaled 2D rotary positional embedding)
  • Implement GSA (global sub-sampled attention)
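
Since AS2DRoPE is not implemented here yet, the following is a minimal plain-Python sketch of one reading of the idea, not this package's code. It assumes a square patch grid, an anchor grid of 14 (for example 224 / 16, which this README does not state), rotation of the first half of the features by the row coordinate and of the second half by the column coordinate, and rescaling of coordinates by anchor / grid. All function names are hypothetical.

```python
import math


def rope_angles(pos, dim, base=10000.0):
    """Rotation angles for one scalar position over dim // 2 feature pairs."""
    return [pos / (base ** (2 * k / dim)) for k in range(dim // 2)]


def rotate_pairs(vec, angles):
    """Rotate consecutive feature pairs (vec[2k], vec[2k+1]) by angles[k]."""
    out = []
    for k, theta in enumerate(angles):
        a, b = vec[2 * k], vec[2 * k + 1]
        c, s = math.cos(theta), math.sin(theta)
        out += [a * c - b * s, a * s + b * c]
    return out


def as2d_rope(vec, row, col, grid, anchor=14):
    """Apply 2D RoPE to one token at (row, col) of a grid x grid patch map.

    Assumption: coordinates are rescaled by anchor / grid so that any grid
    size maps onto the anchor coordinate range. len(vec) must be divisible by 4.
    """
    half = len(vec) // 2
    scale = anchor / grid
    rows = rotate_pairs(vec[:half], rope_angles(row * scale, half))
    cols = rotate_pairs(vec[half:], rope_angles(col * scale, half))
    return rows + cols


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [0.5, -1.0, 2.0, 0.25, 1.5, 0.0, -0.75, 1.0]
k = [1.0, 0.5, -0.5, 2.0, 0.25, -1.25, 1.0, 0.5]

# Two query/key placements that share the same relative offset (2 rows, 3 columns)
s1 = dot(as2d_rope(q, 3, 5, grid=14), as2d_rope(k, 1, 2, grid=14))
s2 = dot(as2d_rope(q, 9, 7, grid=14), as2d_rope(k, 7, 4, grid=14))
print(abs(s1 - s2) < 1e-9)

# Position (6, 10) on a 28 x 28 grid lands on anchor coordinates (3, 5)
hi_res = as2d_rope(q, 6, 10, grid=28)
lo_res = as2d_rope(q, 3, 5, grid=14)
print(all(abs(a - b) < 1e-12 for a, b in zip(hi_res, lo_res)))
```

The first check illustrates that the rotated dot product depends only on the relative offset between query and key; the second shows that a finer grid reuses the rotation angles of the anchor grid. A PyTorch version would vectorize the same arithmetic over all tokens and heads.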


