Gaia2 - Pytorch (wip)
PyTorch implementation of GAIA-2, the world model architecture for autonomous driving proposed by Wayve.
Install
$ pip install gaia2-pytorch
Usage
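import torch
from gaia2_pytorch import VideoTokenizer, Gaia2

# dummy video, laid out as (batch, channels, frames, height, width)
video = torch.randn(1, 3, 10, 16, 16)

# the tokenizer returns a loss when called on a video
tokenizer = VideoTokenizer()
loss = tokenizer(video)
loss.backward()

# the world model wraps the tokenizer and also returns a loss
gaia2 = Gaia2(tokenizer)
loss = gaia2(video)
loss.backward()

# generate a video with the requested (frames, height, width)
generated = gaia2.generate((10, 16, 16))
assert generated.shape == video.shape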
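The usage snippet above only runs a single forward and backward pass. As a rough sketch of how those calls could sit inside a training loop, the hypothetical helper below (`train_steps` is not part of `gaia2_pytorch`) is duck-typed: it assumes the model returns a scalar loss with `.backward()` and `.item()`, as a PyTorch tensor does, and that the optimizer exposes `.zero_grad()` and `.step()`, as `torch.optim` optimizers do.

```python
from typing import Any, Callable, Iterable, List


def train_steps(
    model: Callable[[Any], Any],
    optimizer: Any,
    batches: Iterable[Any],
) -> List[float]:
    """Run one optimizer step per batch and return the recorded losses.

    Hypothetical helper, not part of gaia2_pytorch. It assumes
    `model(batch)` returns a scalar loss with `.backward()` and `.item()`,
    and that `optimizer` has `.zero_grad()` and `.step()`.
    """
    losses: List[float] = []
    for batch in batches:
        optimizer.zero_grad()        # clear gradients from the previous step
        loss = model(batch)          # forward pass, returning a loss
        loss.backward()              # backpropagate
        optimizer.step()             # apply the parameter update
        losses.append(loss.item())   # keep the loss as a plain Python number
    return losses
```

Assuming `VideoTokenizer` is a regular `torch.nn.Module` (so that `tokenizer.parameters()` is available), a call might look like `train_steps(tokenizer, torch.optim.Adam(tokenizer.parameters(), lr=3e-4), batches)`, where `batches` is any iterable of video tensors shaped like `video` above. The learning rate here is an arbitrary placeholder, not a recommended setting.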
Contributing
$ pip install '.[test]'
Then add a test to the tests directory and run the following
$ pytest tests
That's it
Citations
@article{Russell2025GAIA2AC,
title = {GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving},
author = {Lloyd Russell and Anthony Hu and Lorenzo Bertoni and George Fedoseev and Jamie Shotton and Elahe Arani and Gianluca Corrado},
journal = {ArXiv},
year = {2025},
volume = {abs/2503.20523},
url = {https://api.semanticscholar.org/CorpusID:277321454}
}
@article{Rombach2021HighResolutionIS,
title = {High-Resolution Image Synthesis with Latent Diffusion Models},
author = {Robin Rombach and A. Blattmann and Dominik Lorenz and Patrick Esser and Bj{\"o}rn Ommer},
journal = {2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021},
pages = {10674-10685},
url = {https://api.semanticscholar.org/CorpusID:245335280}
}
@article{Zhu2025FracConnectionsFE,
title = {Frac-Connections: Fractional Extension of Hyper-Connections},
author = {Defa Zhu and Hongzhi Huang and Jundong Zhou and Zihao Huang and Yutao Zeng and Banggu Wu and Qiyang Min and Xun Zhou},
journal = {ArXiv},
year = {2025},
volume = {abs/2503.14125},
url = {https://api.semanticscholar.org/CorpusID:277104144}
}
@inproceedings{Huang2025TheGI,
title = {The GAN is dead; long live the GAN! A Modern GAN Baseline},
author = {Yiwen Huang and Aaron Gokaslan and Volodymyr Kuleshov and James Tompkin},
year = {2025},
url = {https://api.semanticscholar.org/CorpusID:275405495}
}
@inproceedings{Darcet2023VisionTN,
title = {Vision Transformers Need Registers},
author = {Timoth{\'e}e Darcet and Maxime Oquab and Julien Mairal and Piotr Bojanowski},
year = {2023},
url = {https://api.semanticscholar.org/CorpusID:263134283}
}
@misc{chen2025deepcompressionautoencoderefficient,
title = {Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models},
author = {Junyu Chen and Han Cai and Junsong Chen and Enze Xie and Shang Yang and Haotian Tang and Muyang Li and Yao Lu and Song Han},
year = {2025},
eprint = {2410.10733},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2410.10733},
}