pali3 - PyTorch
Pali3
"Figure 1: Overview of the PaLI-3 (5B) model: images are encoded into visual tokens individually by the contrastively pretrained 2B SigLIP vision model. Along with a query, these visual tokens are passed to a 3B encoder-decoder UL2 Transformer which produces the desired answer."
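To make the data flow in the caption concrete, here is a minimal sketch of the sequence-length arithmetic: each image is split into patches, every patch becomes one visual token, and those tokens are placed in front of the query tokens before reaching the encoder. The function names, the 224-pixel image size, the 14-pixel patch size, and the 32-token query are illustrative assumptions, not values taken from this package.

```python
def num_visual_tokens(image_size: int, patch_size: int) -> int:
    """Number of patch tokens a ViT-style encoder emits for a square image.

    Assumes non-overlapping square patches and no extra class token.
    """
    if image_size <= 0 or patch_size <= 0:
        raise ValueError("image_size and patch_size must be positive")
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


def encoder_sequence_length(
    image_size: int, patch_size: int, num_query_tokens: int, num_images: int = 1
) -> int:
    """Length of the encoder input: visual tokens for every image, then the query.

    Images are encoded individually, so their token counts simply add up.
    """
    if num_query_tokens < 0 or num_images < 1:
        raise ValueError("need at least one image and a non-negative query length")
    visual = num_images * num_visual_tokens(image_size, patch_size)
    return visual + num_query_tokens


if __name__ == "__main__":
    # Hypothetical settings: 224x224 image, 14x14 patches, 32 query tokens.
    print(num_visual_tokens(224, 14))
    print(encoder_sequence_length(224, 14, 32))
```

Raising the image resolution grows the visual token count quadratically, which is why the encoder input length is dominated by the image rather than the query at higher resolutions.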
Installation
pip install pali3
Usage:
License
MIT
Todo
- Implement the SigLIP ViT model with its training recipe
- Implement the text tokenizer, possibly using TokenMonster
- Implement the UL2 Transformer Encoder and Decoder
- Implement training scripts
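As a starting point for the SigLIP item above, here is a rough sketch of the pairwise sigmoid loss that SigLIP uses for contrastive pretraining, written in plain Python so the arithmetic is easy to follow. It is not code from this package: the function names are hypothetical, and the default temperature of 10 and bias of -10 are assumed to match the initial values reported for SigLIP. A real implementation would operate on batched PyTorch tensors and learn both parameters.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def sigmoid_contrastive_loss(
    image_embs: Sequence[Sequence[float]],
    text_embs: Sequence[Sequence[float]],
    temperature: float = 10.0,  # assumed initial value, learnable in practice
    bias: float = -10.0,        # assumed initial value, learnable in practice
) -> float:
    """Pairwise sigmoid loss over a batch of image/text embeddings.

    Pair (i, j) is a positive when i == j and a negative otherwise. Every pair
    contributes an independent binary term, so no softmax over the batch is needed.
    """
    n = len(image_embs)
    if n == 0 or n != len(text_embs):
        raise ValueError("need equally sized, non-empty batches")
    images = [l2_normalize(e) for e in image_embs]
    texts = [l2_normalize(e) for e in text_embs]
    total = 0.0
    for i, img in enumerate(images):
        for j, txt in enumerate(texts):
            similarity = sum(a * b for a, b in zip(img, txt))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    return -total / n


if __name__ == "__main__":
    aligned = sigmoid_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    swapped = sigmoid_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < swapped)  # matched pairs should give the lower loss
```

The negative bias offsets the heavy imbalance between the few positive pairs and the many negative pairs in a batch, so the loss does not start out dominated by negatives.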