

Project description

MindSpore Transformers (MindFormers)


1. Introduction

The MindSpore Transformers suite aims to build a comprehensive development toolkit covering the entire lifecycle of large-scale models, including pre-training, fine-tuning, evaluation, inference, and deployment. It provides industry-leading large Transformer-based language models, multimodal understanding models, and omni-modal models, enabling users to easily achieve end-to-end development of large-scale models.

Based on MindSpore's built-in multi-dimensional hybrid parallelism technology and modular design, the MindSpore Transformers suite offers the following features:

  • Configurable one-click launch for pre-training, fine-tuning, evaluation, inference, and deployment of large-scale models.
  • Integration with mainstream ecosystems such as Hugging Face, Megatron-LM, vLLM, and OpenCompass.
  • Rich multi-dimensional hybrid parallelism and debugging/tuning capabilities, supporting training for trillion-parameter models.
  • System-level deep optimization for training and inference, enhancing performance for hundred-billion dense and trillion sparse large-scale models.
  • High availability in training, ensuring stable operation of large models on clusters with tens of thousands of NPUs.
  • Fine-grained, multi-level training monitoring to facilitate anomaly detection and analysis.
  • Simplified model integration through Mcore architecture upgrades and modular design, offering broader standardization and stronger ecosystem support.

For details about MindSpore Transformers tutorials and API documents, see the MindSpore Transformers Documentation.

If you have any suggestions for MindSpore Transformers, contact us by submitting an issue, and we will address it promptly.

If you're interested in MindSpore Transformers technology or wish to contribute code, we welcome you to join the MindSpore Transformers SIG.

Models List

The following table lists models supported by MindSpore Transformers.

Model | Specifications | Model Type | Model Architecture | Latest Version
TeleChat3 🔥HOT | 36B | Dense LLM | Mcore | 1.9.0
TeleChat3-MoE 🔥HOT | 105B-A4.7B | Sparse LLM | Mcore | 1.9.0
Qwen3 🔥HOT | 0.6B/1.7B/4B/8B/14B/32B | Dense LLM | Mcore | 1.9.0
Qwen3-MoE 🔥HOT | 30B-A3B/235B-A22B | Sparse LLM | Mcore | 1.9.0
DeepSeek-V3 🔥HOT | 671B | Sparse LLM | Mcore/Legacy | 1.9.0
GLM4.5 🔥HOT | 106B-A12B/355B-A32B | Sparse LLM | Mcore | 1.9.0
GLM4 🔥HOT | 9B | Dense LLM | Mcore/Legacy | 1.9.0
Qwen2.5 🔥HOT | 0.5B/1.5B/7B/14B/32B/72B | Dense LLM | Legacy | 1.9.0
TeleChat2 🔥HOT | 7B/35B/115B | Dense LLM | Mcore/Legacy | 1.9.0
Llama3.1 ⚠️EOL | 8B/70B | Dense LLM | Legacy | 1.7.0
Mixtral ⚠️EOL | 8x7B | Sparse LLM | Legacy | 1.7.0
CodeLlama ⚠️EOL | 34B | Dense LLM | Legacy | 1.5.0
CogVLM2-Image ⚠️EOL | 19B | MM | Legacy | 1.5.0
CogVLM2-Video ⚠️EOL | 13B | MM | Legacy | 1.5.0
DeepSeek-V2 ⚠️EOL | 236B | Sparse LLM | Legacy | 1.5.0
DeepSeek-Coder-V1.5 ⚠️EOL | 7B | Dense LLM | Legacy | 1.5.0
DeepSeek-Coder ⚠️EOL | 33B | Dense LLM | Legacy | 1.5.0
GLM3-32K ⚠️EOL | 6B | Dense LLM | Legacy | 1.5.0
GLM3 ⚠️EOL | 6B | Dense LLM | Legacy | 1.5.0
InternLM2 ⚠️EOL | 7B/20B | Dense LLM | Legacy | 1.5.0
Llama3.2 ⚠️EOL | 3B | Dense LLM | Legacy | 1.5.0
Llama3.2-Vision ⚠️EOL | 11B | MM | Legacy | 1.5.0
Llama3 ⚠️EOL | 8B/70B | Dense LLM | Legacy | 1.5.0
Qwen2 ⚠️EOL | 0.5B/1.5B/7B/57B/57B-A14B/72B | Dense/Sparse LLM | Legacy | 1.5.0
Qwen1.5 ⚠️EOL | 7B/14B/72B | Dense LLM | Legacy | 1.5.0
Qwen-VL ⚠️EOL | 9.6B | MM | Legacy | 1.5.0
TeleChat ⚠️EOL | 7B/12B/52B | Dense LLM | Legacy | 1.5.0
Whisper ⚠️EOL | 1.5B | MM | Legacy | 1.5.0
Yi ⚠️EOL | 6B/34B | Dense LLM | Legacy | 1.5.0
YiZhao ⚠️EOL | 12B | Dense LLM | Legacy | 1.5.0
Llama2 ⚠️EOL | 7B/13B/70B | Dense LLM | Legacy | 1.3.2
Baichuan2 ⚠️EOL | 7B/13B | Dense LLM | Legacy | 1.3.2
GLM2 ⚠️EOL | 6B | Dense LLM | Legacy | 1.3.2
GPT2 ⚠️EOL | 124M/13B | Dense LLM | Legacy | 1.3.2
InternLM ⚠️EOL | 7B/20B | Dense LLM | Legacy | 1.3.2
Qwen ⚠️EOL | 7B/14B | Dense LLM | Legacy | 1.3.2
CodeGeex2 ⚠️EOL | 6B | Dense LLM | Legacy | 1.1.0
WizardCoder ⚠️EOL | 15B | Dense LLM | Legacy | 1.1.0
Baichuan ⚠️EOL | 7B/13B | Dense LLM | Legacy | 1.0
Blip2 ⚠️EOL | 8.1B | MM | Legacy | 1.0
Bloom ⚠️EOL | 560M/7.1B/65B/176B | Dense LLM | Legacy | 1.0
Clip ⚠️EOL | 149M/428M | MM | Legacy | 1.0
CodeGeex ⚠️EOL | 13B | Dense LLM | Legacy | 1.0
GLM ⚠️EOL | 6B | Dense LLM | Legacy | 1.0
iFlytekSpark ⚠️EOL | 13B | Dense LLM | Legacy | 1.0
Llama ⚠️EOL | 7B/13B | Dense LLM | Legacy | 1.0
MAE ⚠️EOL | 86M | MM | Legacy | 1.0
Mengzi3 ⚠️EOL | 13B | Dense LLM | Legacy | 1.0
PanguAlpha ⚠️EOL | 2.6B/13B | Dense LLM | Legacy | 1.0
SAM ⚠️EOL | 91M/308M/636M | MM | Legacy | 1.0
Skywork ⚠️EOL | 13B | Dense LLM | Legacy | 1.0
Swin ⚠️EOL | 88M | MM | Legacy | 1.0
T5 ⚠️EOL | 14M/60M | Dense LLM | Legacy | 1.0
VisualGLM ⚠️EOL | 6B | MM | Legacy | 1.0
Ziya ⚠️EOL | 13B | Dense LLM | Legacy | 1.0
Bert ⚠️EOL | 4M/110M | Dense LLM | Legacy | 0.8

⚠️EOL indicates that the model has been taken offline from the main branch; it can still be used with its latest supported version (e.g., 1.7.0).

The model maintenance strategy follows the Life Cycle And Version Matching Strategy of the corresponding latest supported version.

Model Level Introduction

Mcore architecture models are classified into five levels for training and for inference respectively, each representing a different standard of model readiness. For the level assigned to each model specification in the library, refer to the model documentation.

Training

  • Released: Passed testing team verification, with loss and grad norm accuracy meeting benchmark alignment standards under deterministic conditions;
  • Validated: Passed self-verification by the development team, with loss and grad norm accuracy meeting benchmark alignment standards under deterministic conditions;
  • Preliminary: Passed preliminary self-verification by developers, with complete functionality and usability, normal convergence of training, but accuracy not strictly verified;
  • Untested: Functionality is available but has not undergone systematic testing; accuracy and convergence have not been verified; provided to support user-defined development;
  • Community: Community-contributed MindSpore native models, developed and maintained by the community.

Inference

  • Released: Passed testing team acceptance, with evaluation accuracy aligned with benchmark standards;
  • Validated: Passed developer self-verification, with evaluation accuracy aligned with benchmark standards;
  • Preliminary: Passed preliminary self-verification by developers, with complete functionality and usable for testing; inference outputs are logically consistent but accuracy has not been strictly verified;
  • Untested: Functionality is available but has not undergone systematic testing; accuracy has not been verified; provided to support user-defined development;
  • Community: Community-contributed MindSpore native models, developed and maintained by the community.

2. Installation

Version Mapping

Currently supported hardware includes Atlas 800T A2, Atlas 800I A2, and Atlas 900 A3 SuperPoD.

Python 3.11.4 is recommended for the current suite.

MindSpore Transformers | MindSpore | CANN | Driver/Firmware
1.9.0 | 2.9.0 | 9.0.0 | 26.0.RC1

Historical version compatibility:

MindSpore Transformers | MindSpore | CANN | Driver/Firmware
1.8.0 | 2.7.2 | 8.5.0 | 25.5.0
1.7.0 | 2.7.1 | 8.3.RC1 | 25.3.RC1
1.6.0 | 2.7.0 | 8.2.RC1 | 25.2.0
1.5.0 | 2.6.0-rc1 | 8.1.RC1 | 25.0.RC1
1.3.2 | 2.4.10 | 8.0.0 | 24.1.0
1.3.0 | 2.4.0 | 8.0.RC3 | 24.1.RC3
1.2.0 | 2.3.0 | 8.0.RC2 | 24.1.RC2
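
Before installing, it may help to confirm that the environment already provides compatible Python, MindSpore, and driver/firmware versions from the tables above. The commands below are only a sketch; npu-smi is the Ascend device management tool, and output formats vary by environment.

# Check the Python interpreter version (3.11.4 is recommended)
python --version

# Check the installed MindSpore version against the mapping table
python -c "import mindspore; print(mindspore.__version__)"

# Check the Ascend NPU driver/firmware status (npu-smi ships with the Ascend driver tools)
npu-smi info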

Installation Using the Source Code

Currently, MindSpore Transformers is compiled and installed from the source code. Run the following commands to install it:

git clone -b r1.9.0 https://atomgit.com/mindspore/mindformers.git
cd mindformers
bash build.sh
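
A built wheel for this release is also published on PyPI (see the Download files section below), so installing with pip may work as an alternative; this is only a sketch, and the source build above remains the documented path. The import check assumes the package exposes a __version__ attribute.

pip install mindformers==1.9.0

# Sanity-check the installation (assumes mindformers exposes __version__)
python -c "import mindformers; print(mindformers.__version__)"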

3. User Guide

MindSpore Transformers supports one-click launch of distributed pre-training, supervised fine-tuning, and inference tasks for large models. You can click each model's link in the Models List above to see the corresponding documentation.
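
As an illustration, a one-click launch typically boils down to a single command that points the suite's entry script at a model-specific YAML configuration. The sketch below is hypothetical: the script name (run_mindformer.py), the --config and --run_mode flags, and the config path are assumptions that may differ between models and releases, so follow the per-model documentation linked above.

# Hypothetical one-click fine-tuning launch; script name, flags, and config path are assumptions
python run_mindformer.py \
  --config configs/qwen3/finetune_qwen3_8b.yaml \
  --run_mode finetune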

For more information about the functions of MindSpore Transformers, please refer to MindSpore Transformers Documentation.

4. Life Cycle And Version Matching Strategy

Each MindSpore Transformers version goes through the following five maintenance phases:

Status | Duration | Description
Plan | 1-3 months | Feature planning.
Develop | 3 months | Feature development.
Preserve | 6 months | All resolved issues are incorporated and new versions are released.
No Preserve | 0-3 months | Resolved issues are still incorporated, but there is no full-time maintenance team and no plan to release a new version.
End of Life (EOL) | N/A | The branch is closed and no longer accepts any modifications.

MindSpore Transformers released version preservation policy

5. Disclaimer

  1. The contents of the scripts/examples directories are provided as reference examples and do not form part of the commercially released product; they are for users' reference only. If they are to be used, the user is responsible for transforming them into a product suitable for commercial use and for ensuring security protection. MindSpore Transformers assumes no responsibility for any resulting security problems.
  2. Regarding datasets, MindSpore Transformers only provides suggestions for datasets that can be used for training. MindSpore Transformers does not provide any datasets. Users who use any dataset for training must ensure the legality and security of the training data and assume the following risks:
    1. Data poisoning: Maliciously tampered training data may cause the model to produce bias, security vulnerabilities, or incorrect outputs.
    2. Data compliance: Users must ensure that data collection and processing comply with relevant laws, regulations, and privacy protection requirements.
  3. If you do not want your dataset to be mentioned in MindSpore Transformers, or if you want to update the description of your dataset in MindSpore Transformers, please submit an issue to AtomGit, and we will remove or update the description of your dataset according to your issue request. We sincerely appreciate your understanding and contribution to MindSpore Transformers.
  4. Regarding model weights, users must verify the authenticity of downloaded and distributed model weights from trusted sources. MindSpore Transformers cannot guarantee the security of third-party weights. Weight files may be tampered with during transmission or loading, leading to unexpected model outputs or security vulnerabilities. Users should assume the risk of using third-party weights and ensure that weight files are verified for security before use.
  5. Regarding weights, vocabularies, scripts, and other files downloaded from sources such as openmind, users must verify the authenticity of the downloaded and distributed files and obtain them from trusted sources. MindSpore Transformers cannot guarantee the security of third-party files. Users should assume the risks arising from unexpected functional issues, outputs, or security vulnerabilities when using these files.
  6. MindSpore Transformers saves weights or logs based on the path set by the user. Users should avoid using system file directories when configuring paths. If unexpected system issues arise due to improper path settings, users shall bear the risks themselves.

6. Contribution

We welcome contributions to the community. For details, see MindSpore Transformers Contribution Guidelines.

7. License

Apache 2.0 License

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mindformers-1.9.0-py3-none-any.whl (1.7 MB)


File details

Details for the file mindformers-1.9.0-py3-none-any.whl.

File metadata

  • Download URL: mindformers-1.9.0-py3-none-any.whl
  • Upload date:
  • Size: 1.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.10

File hashes

Hashes for mindformers-1.9.0-py3-none-any.whl
Algorithm | Hash digest
SHA256 | 7a67f1845a7896641d777c4ef230a637f700e9bb2f0d5f942f569096e389bbcf
MD5 | 2edb68d027498e0b18a2ac873263ea68
BLAKE2b-256 | 8e08ebb509b848f8cf9b3abc956b0b3c58337346ff749cf7a7c85d4e69ebf5ac

See more details on using hashes here.
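
For example, on Linux the downloaded wheel can be checked against the published SHA256 digest:

# Compute the SHA256 digest of the downloaded wheel and compare it with the value above
sha256sum mindformers-1.9.0-py3-none-any.whl
# Expected:
# 7a67f1845a7896641d777c4ef230a637f700e9bb2f0d5f942f569096e389bbcf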
