VeOmni: Scaling any Modality Model Training to any Accelerators with PyTorch native Training Framework
🍪 Overview
VeOmni is a versatile framework for both single- and multi-modal pre-training and post-training. It empowers users to seamlessly scale models of any modality across various accelerators, offering both flexibility and user-friendliness.
Our guiding principles when building VeOmni are:
- Flexibility and Modularity: VeOmni is built with a modular design, allowing users to decouple most components and replace them with their own implementations as needed.
- Trainer-free: VeOmni supports linear training scripts that avoid rigid, structured trainer classes (e.g., PyTorch Lightning or the HuggingFace Trainer). These scripts expose the entire training logic to users for maximum transparency and control. VeOmni also provides a basic trainer for text-only and VLM/omni model training, and an RL trainer that serves as a trainer backend for reinforcement learning.
- Omni model native: VeOmni enables users to effortlessly scale any omni-model across devices and accelerators.
- Torch native: VeOmni is designed to leverage PyTorch’s native functions to the fullest extent, ensuring maximum compatibility and performance.
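To make the "trainer-free" idea concrete, here is a minimal sketch of a linear training script in plain PyTorch. It is not taken from VeOmni and uses no VeOmni APIs; the toy regression data, the `main` function name, and the hyperparameters are all assumptions chosen for illustration. The point is only that every step (forward, loss, backward, optimizer update) is visible in one flat loop rather than hidden inside a trainer class.

```python
import torch
from torch import nn


def main(steps: int = 50, seed: int = 0) -> list[float]:
    """Run a flat, trainer-free loop on toy data and return the loss per step."""
    torch.manual_seed(seed)

    # Toy regression data standing in for a real dataloader (assumption).
    inputs = torch.randn(256, 8)
    targets = inputs @ torch.randn(8, 1)

    model = nn.Linear(8, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()

    losses = []
    for _ in range(steps):
        # Full-batch step to keep the sketch deterministic; a real script
        # would iterate over a dataloader here.
        loss = loss_fn(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


losses = main()
```

Because the loop is ordinary Python, adding logging, gradient clipping, or a custom schedule is a matter of inserting a line at the relevant step.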
🔥 Latest News
- [2025/11] Our paper OmniScale: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo was accepted by AAAI 2026.
- [2025/09] We released v0.1.0, the first official release of VeOmni.
- [2025/08] We released the VeOmni tech report and opened the WeChat group. Feel free to join us!
- [2025/04] We released VeOmni!
📚 Key Features
- FSDP and FSDP2 backends for training.
- Sequence parallelism with DeepSpeed Ulysses, in both non-async and async modes.
- Expert parallelism for training large MoE models, such as Qwen3-MoE.
- Efficient GroupGemm kernel for MoE models, and Liger-Kernel.
- Compatible with HuggingFace Transformers models such as Qwen3, Qwen3-VL, and Qwen3-MoE.
- Dynamic batching strategy and Omnidata processing.
- Torch Distributed Checkpoint for checkpointing.
- Support for both Nvidia-GPU and Ascend-NPU training.
- Experiment tracking with wandb
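The feature list above does not describe how VeOmni's dynamic batching works, so the following is only a generic sketch of one common approach: grouping variable-length samples under a fixed token budget per batch. The function name `batch_by_token_budget` and the sample lengths are hypothetical and are not part of VeOmni.

```python
from typing import Iterable, Iterator, List


def batch_by_token_budget(lengths: Iterable[int], max_tokens: int) -> Iterator[List[int]]:
    """Yield lists of sample indices whose total length fits within max_tokens.

    Samples are taken in order; a new batch starts whenever adding the next
    sample would exceed the budget.
    """
    batch: List[int] = []
    used = 0
    for index, length in enumerate(lengths):
        if length > max_tokens:
            raise ValueError(f"sample {index} has {length} tokens, budget is {max_tokens}")
        if batch and used + length > max_tokens:
            yield batch
            batch, used = [], 0
        batch.append(index)
        used += length
    if batch:
        yield batch


# Hypothetical token counts for six samples.
sample_lengths = [512, 128, 900, 64, 300, 1024]
batches = list(batch_by_token_budget(sample_lengths, max_tokens=1024))
print(batches)  # [[0, 1], [2, 3], [4], [5]]
```

Compared with a fixed number of samples per batch, a token budget keeps the compute per step roughly even when sample lengths vary widely.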
📝 Upcoming Features and Changes
- VeOmni v0.2 Roadmap https://github.com/ByteDance-Seed/VeOmni/issues/268, https://github.com/ByteDance-Seed/VeOmni/issues/271
- ViT balance tool https://github.com/ByteDance-Seed/VeOmni/issues/280
- Validation dataset during training https://github.com/ByteDance-Seed/VeOmni/issues/247
- RL post training for omni-modality models with VeRL https://github.com/ByteDance-Seed/VeOmni/issues/262
🚀 Getting Started
Quick Start
✏️ Supported Models
| Model | Model size | Example config File |
|---|---|---|
| DeepSeek2.5/3/R1 | 236B/671B | deepseek.yaml |
| Llama3-3.3 | 1B/3B/8B/70B | llama3.yaml |
| Qwen2-3 | 0.5B/1.5B/3B/7B/14B/32B/72B | qwen2_5.yaml |
| Qwen2-3 VL/QVQ | 2B/3B/7B/32B/72B | qwen3_vl_dense.yaml |
| Qwen3-VL MoE | 30BA3B/235BA22B | qwen3_vl_moe.yaml |
| Qwen3-MoE | 30BA3B/235BA22B | qwen3-moe.yaml |
| Qwen2-3 Omni | 7B/30BA3B | qwen25_omni.yaml |
| Wan | Wan2.1-I2V-14B-480P | wan_sft.yaml |
| Omni Model | Any Modality Training | seed_omni.yaml |
To add support for a new model in VeOmni, see Support New Models.
⛰️ Performance
For more details, please refer to our paper.
💡 Awesome work using VeOmni
- dFactory: Easy and Efficient dLLM Fine-Tuning
- LMMs-Engine
- UI-TARS: Pioneering Automated GUI Interaction with Native Agents
- OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft
- UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
- Open-dLLM: Open Diffusion Large Language Models
- LingBot-VLA: A Pragmatic VLA Foundation Model
🎨 Contributing
Contributions from the community are welcome! Please check out CONTRIBUTING.md and our project roadmap (to be updated).
📝 Citation and Acknowledgement
If you find VeOmni useful for your research and applications, feel free to give us a star ⭐ or cite us using:
@article{ma2025veomni,
title={VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo},
author={Ma, Qianli and Zheng, Yaowei and Shi, Zhelun and Zhao, Zhongkai and Jia, Bin and Huang, Ziyue and Lin, Zhiqi and Li, Youjie and Yang, Jiacheng and Peng, Yanghua and others},
journal={arXiv preprint arXiv:2508.02317},
year={2025}
}
Thanks to the following projects for their excellent work:
🌱 About ByteDance Seed Team
Founded in 2023, ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. The team aspires to become a world-class research team and make significant contributions to the advancement of science and society. You can get to know ByteDance Seed better through the following channels👇