ONE for all, Optimal generator with No Exception.
Project description
MindSpore ONE
This repository contains SoTA algorithms, models, and interesting projects in the area of multimodal understanding and content generation.
ONE is short for "ONE for all"
News
- [2025.12.24] We release v0.5.0, adding compatibility with 🤗 Transformers v4.57.1 (70+ new models) and 🤗 Diffusers v0.35.2, plus previews of v0.36 pipelines such as Flux2, QwenImageEditPlus, Lucy, and Kandinsky5. This release also introduces initial ComfyUI integration. Happy exploring!
- [2025.11.02] v0.4.0 is released, with 280+ transformers models and 70+ diffusers pipelines supported. See here
- [2025.04.10] We release v0.3.0. More than 15 SoTA generative models are added, including Flux, CogView4, OpenSora2.0, Movie Gen 30B, CogVideoX 5B~30B. Have fun!
- [2025.02.21] We support DeepSeek Janus-Pro, a SoTA multimodal understanding and generation model. See here
- [2024.11.06] v0.2.0 is released
Quick tour
To install v0.5.0, first install MindSpore (any version from 2.6.0 to 2.7.1), then run pip install mindone
Alternatively, to install the latest version from the master branch, please run:
git clone https://github.com/mindspore-lab/mindone.git
cd mindone
pip install -e .
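Before installing, it can help to confirm that the local MindSpore version falls inside the range stated above. The following is a minimal sketch, not part of mindone: the helper names `parse_version` and `is_supported_mindspore` are hypothetical, and the bounds are simply copied from the install note for v0.5.0.

```python
import re

# Supported range copied from the install note for mindone v0.5.0
# (MindSpore 2.6.0 to 2.7.1). Adjust if you install a different release.
MIN_SUPPORTED = (2, 6, 0)
MAX_SUPPORTED = (2, 7, 1)


def parse_version(version: str) -> tuple:
    """Extract the leading major.minor[.patch] numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_supported_mindspore(version: str) -> bool:
    """Return True if the version lies inside the tested range (inclusive)."""
    return MIN_SUPPORTED <= parse_version(version) <= MAX_SUPPORTED


if __name__ == "__main__":
    try:
        import mindspore
    except ImportError:
        print("MindSpore is not installed")
    else:
        ok = is_supported_mindspore(mindspore.__version__)
        status = "inside the tested range" if ok else "outside the tested range"
        print(f"MindSpore {mindspore.__version__} is {status}")
```

A version outside the range may still work; the check only reflects the versions the release notes say were tested.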
We support state-of-the-art diffusion models for generating images, audio, and video. Let's get started using Stable Diffusion 3 as an example.
Hello MindSpore from Stable Diffusion 3!
import mindspore
from mindone.diffusers import StableDiffusion3Pipeline

# Load the Stable Diffusion 3 pipeline with float16 weights.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    mindspore_dtype=mindspore.float16,
)
prompt = "A cat holding a sign that says 'Hello MindSpore'"
# The first element of the pipeline output holds the generated images.
image = pipe(prompt)[0][0]
image.save("sd3.png")
run hf diffusers on mindspore
- mindone diffusers is under active development; most tasks were tested with MindSpore 2.6.0-2.7.1 on Ascend Atlas 800T A2 machines
- compatible with 🤗 diffusers v0.35.2, with preview support for SoTA v0.36 pipelines; see the support list
- 18+ training examples - controlnet, dreambooth, lora and more
run hf transformers on mindspore
- mindone transformers is under active development; most tasks were tested with MindSpore 2.6.0-2.7.1 on Ascend Atlas 800T A2 machines
- compatible with 🤗 transformers v4.57.1
- provides 350+ state-of-the-art machine learning models for text, computer vision, audio, video, and multimodal inference; see the support list
supported models under mindone/examples
| task | model | inference | finetune | pretrain | institute |
|---|---|---|---|---|---|
| Text/Image-to-Video | wan2.1 🔥 | ✅ | ✖️ | ✖️ | Alibaba |
| Text/Image-to-Video | wan2.2 🔥🔥 | ✅ | ✅ | ✖️ | Alibaba |
| Audio/Image-Text-to-Text | qwen2_5_omni 🔥🔥 | ✅ | ✅ | ✖️ | Alibaba |
| Image/Video-Text-to-Text | qwen2_5_vl 🔥🔥 | ✅ | ✅ | ✖️ | Alibaba |
| Any-to-Any | qwen3_omni_moe 🔥🔥🔥 | ✅ | ✖️ | ✖️ | Alibaba |
| Image-Text-to-Text | qwen3_vl/qwen3_vl_moe 🔥🔥🔥 | ✅ | ✖️ | ✖️ | Alibaba |
| Text-to-Image | qwen_image 🔥🔥🔥 | ✅ | ✅ | ✖️ | Alibaba |
| Text-to-Text | minicpm 🔥🔥 | ✅ | ✖️ | ✖️ | OpenBMB |
| Any-to-Any | janus | ✅ | ✅ | ✅ | DeepSeek |
| Any-to-Any | emu3 | ✅ | ✅ | ✅ | BAAI |
| Class-to-Image | var | ✅ | ✅ | ✅ | ByteDance |
| Text-to-Image | omnigen2 🔥 | ✅ | ✅ | ✖️ | VectorSpaceLab |
| Text/Image-to-Video | hpcai open sora 1.2/2.0 | ✅ | ✅ | ✅ | HPC-AI Tech |
| Text/Image-to-Video | cogvideox 1.5 5B~30B | ✅ | ✅ | ✅ | Zhipu |
| Image/Text-to-Text | glm4v 🔥 | ✅ | ✖️ | ✖️ | Zhipu |
| Text-to-Video | open sora plan 1.3 | ✅ | ✅ | ✅ | PKU |
| Text-to-Video | hunyuanvideo | ✅ | ✅ | ✅ | Tencent |
| Image-to-Video | hunyuanvideo-i2v 🔥 | ✅ | ✖️ | ✖️ | Tencent |
| Text-to-Video | movie gen 30B | ✅ | ✅ | ✅ | Meta |
| Segmentation | lang_sam 🔥 | ✅ | ✖️ | ✖️ | Meta |
| Segmentation | sam2 | ✅ | ✖️ | ✖️ | Meta |
| Text-to-Video | step_video_t2v | ✅ | ✖️ | ✖️ | StepFun |
| Text-to-Speech | sparktts | ✅ | ✖️ | ✖️ | Spark Audio |
| Text-to-Image | flux | ✅ | ✅ | ✖️ | Black Forest Lab |
| Text-to-Image | stable diffusion 3 | ✅ | ✅ | ✖️ | Stability AI |
supported captioner
| task | model | inference | finetune | pretrain | features |
|---|---|---|---|---|---|
| Image-Text-to-Text | pllava | ✅ | ✖️ | ✖️ | supports video and image captioning |
training-free acceleration
Training-free DiT inference acceleration is available through DiTCache, PromptGate, and FBCache with Taylorseer, tested on SD3 and FLUX.1.
Download files
File details
Details for the file mindone-0.5.0.tar.gz.
File metadata
- Download URL: mindone-0.5.0.tar.gz
- Upload date:
- Size: 8.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9e4317cc3aaa934d7f3b84f088cf18c1493f9f1835fab3ad8c20cc9b2c8a95ca |
| MD5 | 17d963ab2807f8e14af7a0cf9cf94d12 |
| BLAKE2b-256 | 1c6245c794115ccad71f3bb5fc8b1925ad9672ee3e210be20c8fae8059379253 |
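The SHA256 digest listed above can be compared against a locally downloaded copy of the source distribution. This is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib
import hmac

# SHA256 digest published above for mindone-0.5.0.tar.gz.
EXPECTED_SHA256 = "9e4317cc3aaa934d7f3b84f088cf18c1493f9f1835fab3ad8c20cc9b2c8a95ca"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large archives are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

For example, `matches_expected("mindone-0.5.0.tar.gz")` returns True only if the downloaded file has exactly the published digest. To check the wheel, pass its SHA256 digest as `expected_hex`.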
Provenance
The following attestation bundles were made for mindone-0.5.0.tar.gz:
Publisher: publish.yml on mindspore-lab/mindone
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mindone-0.5.0.tar.gz
- Subject digest: 9e4317cc3aaa934d7f3b84f088cf18c1493f9f1835fab3ad8c20cc9b2c8a95ca
- Sigstore transparency entry: 779067521
- Sigstore integration time:
- Permalink: mindspore-lab/mindone@d246095bcd6acae7502fa7f7f08c0019731c160f
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/mindspore-lab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d246095bcd6acae7502fa7f7f08c0019731c160f
- Trigger Event: release
File details
Details for the file mindone-0.5.0-py3-none-any.whl.
File metadata
- Download URL: mindone-0.5.0-py3-none-any.whl
- Upload date:
- Size: 9.8 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ebea916c3034d324f1f43d0e10cc526a0b73bc955ea93ce91fca039157c13b2c |
| MD5 | e0d0db8f626251e85da2ab8fe5b1e00f |
| BLAKE2b-256 | 0d059597fa739248bb9867265f4c816503717ece44912513f8be06f32fdc47a5 |
Provenance
The following attestation bundles were made for mindone-0.5.0-py3-none-any.whl:
Publisher: publish.yml on mindspore-lab/mindone
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mindone-0.5.0-py3-none-any.whl
- Subject digest: ebea916c3034d324f1f43d0e10cc526a0b73bc955ea93ce91fca039157c13b2c
- Sigstore transparency entry: 779067524
- Sigstore integration time:
- Permalink: mindspore-lab/mindone@d246095bcd6acae7502fa7f7f08c0019731c160f
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/mindspore-lab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d246095bcd6acae7502fa7f7f08c0019731c160f
- Trigger Event: release