
ONE for all, Optimal generator with No Exception.


MindSpore ONE

This repository contains SoTA algorithms, models, and interesting projects in the area of multimodal understanding and content generation.

ONE is short for "ONE for all"

News

  • [2025.04.10] We release v0.3.0. More than 15 SoTA generative models are added, including Flux, CogView4, OpenSora2.0, Movie Gen 30B, and CogVideoX 5B~30B. Have fun!
  • [2025.02.21] We support DeepSeek Janus-Pro, a SoTA multimodal understanding and generation model. See here
  • [2024.11.06] v0.2.0 is released

Quick tour

To install v0.3.0, first install MindSpore 2.5.0, then run pip install mindone.

Alternatively, to install the latest version from the master branch, run:

git clone https://github.com/mindspore-lab/mindone.git
cd mindone
pip install -e .
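
After either install route, it can help to confirm which versions ended up in the environment. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is made up for this illustration and is not part of mindone or MindSpore.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed.

    Hypothetical helper for illustration only; not part of mindone.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The quick tour above pairs mindone v0.3.0 with MindSpore 2.5.0.
    for name in ("mindspore", "mindone"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If the printed MindSpore version differs from 2.5.0, the examples below may still run, but that combination is not the one the quick tour describes.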

We support state-of-the-art diffusion models for generating images, audio, and video. Let's get started using Stable Diffusion 3 as an example.

Hello MindSpore from Stable Diffusion 3!

import mindspore
from mindone.diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    mindspore_dtype=mindspore.float16,
)
prompt = "A cat holding a sign that says 'Hello MindSpore'"
image = pipe(prompt)[0][0]
image.save("sd3.png")
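
The double index in `pipe(prompt)[0][0]` is easy to misread. Judging from the snippet above, the call appears to return a sequence whose first element is the list of generated images, and the second `[0]` picks the first image from that list. The sketch below imitates only that nesting with stand-in objects; `FakeImage` and `fake_pipe` are hypothetical names invented for this illustration and do not load or run any model.

```python
from typing import List, Tuple


class FakeImage:
    """Stand-in for an image object; it only records where it would be saved."""

    def __init__(self, label: str):
        self.label = label
        self.saved_to = None

    def save(self, path: str) -> None:
        self.saved_to = path


def fake_pipe(prompt: str, num_images: int = 1) -> Tuple[List[FakeImage]]:
    """Hypothetical stand-in that mimics only the assumed output nesting: (images,)."""
    images = [FakeImage(f"{prompt} #{i}") for i in range(num_images)]
    return (images,)


output = fake_pipe("A cat holding a sign that says 'Hello MindSpore'", num_images=2)
images = output[0]  # first element of the output: the list of generated images
image = images[0]   # first image in that list, i.e. output[0][0]
image.save("sd3.png")
```

Under that assumption, iterating over `output[0]` instead of taking only its first entry would save every generated image.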

run hf diffusers on mindspore

  • mindone diffusers is under active development; most tasks were tested with MindSpore 2.5.0 on Ascend Atlas 800T A2 machines.
  • compatible with hf diffusers 0.32.2

| component | features |
| --- | --- |
| pipeline | support text-to-image, text-to-video, text-to-audio tasks (160+) |
| models | support autoencoder & transformers base models, same as hf diffusers (50+) |
| schedulers | support diffusion schedulers (e.g., ddpm and dpm solver), same as hf diffusers (35+) |
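
To give a feel for what a diffusion scheduler such as ddpm computes, here is a minimal pure-Python sketch of the textbook DDPM forward (noising) process with a linear beta schedule. It is not the mindone or hf diffusers implementation; the function names and the defaults (1000 steps, betas from 1e-4 to 0.02) are assumptions taken from the common DDPM setup.

```python
import math
import random


def linear_betas(num_steps: int = 1000, beta_start: float = 1e-4, beta_end: float = 0.02):
    """Linearly spaced noise variances beta_0 .. beta_{T-1} (assumed defaults)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product abar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, t, alphas_cumprod, rng):
    """Sample x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I) for a flat list x0."""
    abar = alphas_cumprod[t]
    return [math.sqrt(abar) * v + math.sqrt(1.0 - abar) * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_betas()
alphas_cumprod = cumulative_alphas(betas)

rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
x_early = add_noise(x0, 0, alphas_cumprod, rng)    # almost the clean signal
x_late = add_noise(x0, 999, alphas_cumprod, rng)   # almost pure Gaussian noise
```

The cumulative product shrinks toward zero as the step index grows, which is why late timesteps carry almost none of the original signal. A sampler such as dpm solver differs mainly in how it steps back from noise to data, not in this forward definition.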

supported models under mindone/examples

| task | model | inference | finetune | pretrain | institute |
| --- | --- | --- | --- | --- | --- |
| Image-to-Video | hunyuanvideo-i2v 🔥🔥 | ✅ | ✖️ | ✖️ | Tencent |
| Text/Image-to-Video | wan2.1 🔥🔥🔥 | ✅ | ✖️ | ✖️ | Alibaba |
| Text/Image/Speech-to-Video | wan2.2 🔥🔥🔥 | ✅ | ✖️ | ✖️ | Alibaba |
| Text-to-Image | cogview4 🔥🔥🔥 | ✅ | ✖️ | ✖️ | Zhipuai |
| Text-to-Video | step_video_t2v 🔥🔥 | ✅ | ✖️ | ✖️ | StepFun |
| Image-Text-to-Text | qwen2_vl 🔥🔥🔥 | ✅ | ✖️ | ✖️ | Alibaba |
| Any-to-Any | janus 🔥🔥🔥 | ✅ | ✅ | ✅ | DeepSeek |
| Any-to-Any | emu3 🔥🔥 | ✅ | ✅ | ✅ | BAAI |
| Class-to-Image | var 🔥🔥 | ✅ | ✅ | ✅ | ByteDance |
| Text/Image-to-Video | hpcai open sora 1.2/2.0 🔥🔥 | ✅ | ✅ | ✅ | HPC-AI Tech |
| Text/Image-to-Video | cogvideox 1.5 5B~30B 🔥🔥 | ✅ | ✅ | ✅ | Zhipu |
| Text-to-Video | open sora plan 1.3 🔥🔥 | ✅ | ✅ | ✅ | PKU |
| Text-to-Video | hunyuanvideo 🔥🔥 | ✅ | ✅ | ✅ | Tencent |
| Text-to-Video | movie gen 30B 🔥🔥 | ✅ | ✅ | ✅ | Meta |
| Video-Encode-Decode | magvit | ✅ | ✅ | ✅ | Google |
| Text-to-Image | story_diffusion | ✅ | ✖️ | ✖️ | ByteDance |
| Image-to-Video | dynamicrafter | ✅ | ✖️ | ✖️ | Tencent |
| Video-to-Video | venhancer | ✅ | ✖️ | ✖️ | Shanghai AI Lab |
| Text-to-Video | t2v_turbo | ✅ | ✅ | ✅ | Google |
| Image-to-Video | svd | ✅ | ✅ | ✅ | Stability AI |
| Text-to-Video | animate diff | ✅ | ✅ | ✅ | CUHK |
| Text/Image-to-Video | video composer | ✅ | ✅ | ✅ | Alibaba |
| Text-to-Image | flux 🔥 | ✅ | ✅ | ✖️ | Black Forest Lab |
| Text-to-Image | stable diffusion 3 🔥 | ✅ | ✅ | ✖️ | Stability AI |
| Text-to-Image | kohya_sd_scripts | ✅ | ✅ | ✖️ | kohya |
| Text-to-Image | stable diffusion xl | ✅ | ✅ | ✅ | Stability AI |
| Text-to-Image | stable diffusion | ✅ | ✅ | ✅ | Stability AI |
| Text-to-Image | hunyuan_dit | ✅ | ✅ | ✅ | Tencent |
| Text-to-Image | pixart_sigma | ✅ | ✅ | ✅ | Huawei |
| Text-to-Image | fit | ✅ | ✅ | ✅ | Shanghai AI Lab |
| Class-to-Video | latte | ✅ | ✅ | ✅ | Shanghai AI Lab |
| Class-to-Image | dit | ✅ | ✅ | ✅ | Meta |
| Text-to-Image | t2i-adapter | ✅ | ✅ | ✅ | Shanghai AI Lab |
| Text-to-Image | ip adapter | ✅ | ✅ | ✅ | Tencent |
| Text-to-3D | mvdream | ✅ | ✅ | ✅ | ByteDance |
| Image-to-3D | instantmesh | ✅ | ✅ | ✅ | Tencent |
| Image-to-3D | sv3d | ✅ | ✅ | ✅ | Stability AI |
| Text/Image-to-3D | hunyuan3d-1.0 | ✅ | ✅ | ✅ | Tencent |

supported captioner

| task | model | inference | finetune | pretrain | features |
| --- | --- | --- | --- | --- | --- |
| Image-Text-to-Text | pllava 🔥 | ✅ | ✖️ | ✖️ | support video and image captioning |
