paddleformers

An easy-to-use and powerful NLP library with an extensive model zoo, supporting a wide range of NLP tasks from research to industrial applications, including end-to-end systems for neural search, question answering, information extraction, and sentiment analysis.

Project description

PaddleFormers is a Transformer model library built on the PaddlePaddle deep learning framework, delivering both ease of use and high-performance capabilities. It provides a unified model definition interface, modular training components, and comprehensive distributed training strategies specifically designed for large language model development pipelines. This enables developers to train large models efficiently with minimal complexity, making it suitable for diverse scenarios ranging from academic research to industrial applications.

News

[2025/06/28] 🎉 PaddleFormers 0.1 is officially released! This initial version supports the SFT and DPO training paradigms, offers configurable distributed training through a unified Trainer API, and integrates PEFT, MergeKit, and Quantization APIs for diverse LLM applications.

Highlights

⚙️ Simplified Distributed Training

Implements 4D parallel strategies through a unified Trainer API, lowering the barrier to distributed LLM training.
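
As a rough illustration of what "4D" can mean, the sketch below assumes the four dimensions are data, sharding, tensor, and pipeline parallelism (the description above does not enumerate them) and that their degrees multiply to the total device count. The function and parameter names are hypothetical and are not PaddleFormers API.

```python
def data_parallel_degree(world_size, tensor=1, pipeline=1, sharding=1):
    """Return the data-parallel degree left over after the other three axes.

    Assumption: the four axes are data, sharding, tensor, and pipeline
    parallelism, and their degrees multiply to the total device count.
    """
    for name, degree in (("tensor", tensor), ("pipeline", pipeline), ("sharding", sharding)):
        if degree < 1:
            raise ValueError(f"{name} degree must be >= 1, got {degree}")
    used = tensor * pipeline * sharding
    if world_size % used != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor*pipeline*sharding={used}"
        )
    return world_size // used


# 32 devices split as 4-way tensor, 2-way pipeline, and 2-way sharding
# leave 2-way data parallelism, because 4 * 2 * 2 * 2 == 32.
dp = data_parallel_degree(32, tensor=4, pipeline=2, sharding=2)
print(dp)  # 2
```

A check like this is useful before launching a job, since a degree combination that does not divide the device count cannot be laid out at all.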

🛠 Efficient Post-Training

Integrates Packing dataflow and FlashMask operators for SFT/DPO training, eliminating padding waste and boosting throughput.
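
To make the padding argument concrete, here is a minimal sketch of sequence packing using a greedy first-fit-decreasing heuristic. It is an illustration only: the heuristic, the function names, and the example lengths are assumptions, not the PaddleFormers dataflow, and the sketch ignores the attention masking needed to keep sequences in the same row from attending to each other.

```python
def pack_sequences(lengths, max_len):
    """Greedily pack sequence lengths into rows holding at most max_len tokens."""
    rows = []  # each row is a list of sequence lengths whose sum is <= max_len
    for length in sorted(lengths, reverse=True):
        if length > max_len:
            raise ValueError(f"sequence of length {length} exceeds max_len={max_len}")
        for row in rows:
            if sum(row) + length <= max_len:
                row.append(length)  # first row with enough room wins
                break
        else:
            rows.append([length])  # no row had room, so open a new one
    return rows


def padding_fraction(num_rows, max_len, total_tokens):
    """Share of token slots that hold padding instead of real tokens."""
    return 1 - total_tokens / (num_rows * max_len)


lengths = [100, 300, 50, 400, 120, 30]  # hypothetical sample lengths
max_len = 512
total = sum(lengths)

packed = pack_sequences(lengths, max_len)
print(packed)  # [[400, 100], [300, 120, 50, 30]]
print(f"{padding_fraction(len(lengths), max_len, total):.1%}")  # 67.4% (one sequence per row)
print(f"{padding_fraction(len(packed), max_len, total):.1%}")   # 2.3% (packed rows)
```

In this toy case, six padded rows collapse into two nearly full ones, which is the kind of waste packing is meant to remove.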

💾 Industrial Storage Solution

Features Unified Checkpoint storage tools for LLMs, enabling training resumption and dynamic resource scaling. It also implements asynchronous storage (up to 95% faster) and Optimizer State Quantization (78% storage reduction), so that industrial training meets both efficiency and stability requirements.
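
The text above does not say how optimizer states are quantized. As a generic illustration of why quantizing them shrinks checkpoints, the sketch below applies block-wise absmax int8 quantization to a list of float values and computes the resulting byte savings relative to float32. The scheme, block size, and function names are assumptions chosen for the example; it is not expected to reproduce the 78% figure.

```python
def quantize_int8(values, block_size=64):
    """Block-wise absmax quantization: int8 codes plus one scale per block."""
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block)
        scale = absmax / 127 or 1.0  # fall back to 1.0 for an all-zero block
        scales.append(scale)
        for v in block:
            # Clamp guards against floating-point overshoot at the block maximum.
            codes.append(max(-127, min(127, round(v / scale))))
    return codes, scales


def dequantize_int8(codes, scales, block_size=64):
    """Recover approximate float values from codes and per-block scales."""
    return [c * scales[i // block_size] for i, c in enumerate(codes)]


def storage_reduction(num_values, block_size=64):
    """Fraction of bytes saved versus float32, counting one float32 scale per block."""
    num_blocks = -(-num_values // block_size)  # ceiling division
    quantized_bytes = num_values * 1 + num_blocks * 4
    return 1 - quantized_bytes / (num_values * 4)


values = [0.5, -1.0, 0.25, 0.0, 3.0, -0.1]  # hypothetical optimizer-state values
codes, scales = quantize_int8(values, block_size=4)
restored = dequantize_int8(codes, scales, block_size=4)
print(max(abs(a - b) for a, b in zip(values, restored)))  # small rounding error
print(storage_reduction(1024, block_size=64))  # 0.734375
```

Each value's reconstruction error is bounded by half of its block's scale, so smaller blocks trade a little extra storage for better accuracy.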

Installation

Requires Python 3.8+ and PaddlePaddle 3.1+.

# Install via pip
pip install paddleformers

# Install development version
git clone https://github.com/PaddlePaddle/PaddleFormers.git
cd PaddleFormers
pip install -e .
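
Before installing, it can help to confirm that the environment meets the stated requirements. The sketch below uses only the standard library. It assumes PaddlePaddle is installed under the distribution name "paddlepaddle" or "paddlepaddle-gpu", and the helper names are hypothetical.

```python
import sys
from importlib import metadata  # available from Python 3.8 onward


def version_tuple(version, width=2):
    """Turn '3.1.0' or '3.1.0rc1' into a tuple of its leading integers, e.g. (3, 1)."""
    parts = []
    for piece in version.split(".")[:width]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < width:
        parts.append(0)  # treat a bare "3" as "3.0"
    return tuple(parts)


def installed_version(*dist_names):
    """Return the version of the first installed distribution in dist_names, or None."""
    for name in dist_names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


python_ok = sys.version_info >= (3, 8)
# Assumption: these are the distribution names PaddlePaddle is installed under.
paddle_version = installed_version("paddlepaddle", "paddlepaddle-gpu")
paddle_ok = paddle_version is not None and version_tuple(paddle_version) >= (3, 1)

print(f"Python >= 3.8: {python_ok}")
print(f"PaddlePaddle >= 3.1: {paddle_ok} (found: {paddle_version})")
```

The version parsing is deliberately simple and only compares the major and minor components, which is enough for a "3.1+" style requirement.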

Quickstart

Text Generation

This example shows how to load a Qwen model for text generation with the PaddleFormers Auto API:

from paddleformers.transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model weights in bfloat16.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="bfloat16")

# Tokenize the prompt into Paddle tensors ("pd").
input_features = tokenizer("Give me a short introduction to large language model.", return_tensors="pd")

# Generate up to 128 new tokens, then decode the generated IDs back into text.
outputs = model.generate(**input_features, max_new_tokens=128)
print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))

SFT Training

This example shows how to get started with supervised fine-tuning (SFT) using PaddleFormers:

from paddleformers.trl import SFTConfig, SFTTrainer
from datasets import load_dataset

# Load the training split of the demo dataset.
dataset = load_dataset("ZHUI/alpaca_demo", split="train")

# Training outputs are written under output_dir.
training_args = SFTConfig(output_dir="Qwen/Qwen2.5-0.5B-SFT", device="gpu")
trainer = SFTTrainer(
    args=training_args,
    model="Qwen/Qwen2.5-0.5B-Instruct",  # base model to fine-tune
    train_dataset=dataset,
)
trainer.train()

Community

We welcome all contributions! See CONTRIBUTING.md for guidelines.

License

This repository's source code is available under the Apache 2.0 License.

Release history

1.1.1: Mar 30, 2026
1.1.0: Feb 2, 2026 (yanked; reason given: "paddlefleet not yet release")
1.0.0: Jan 21, 2026
0.4.1: Jan 15, 2026
0.4.0: Nov 14, 2025
0.3.2: Oct 14, 2025
0.3.1: Oct 13, 2025
0.3.0: Sep 18, 2025
0.2.6: Sep 28, 2025
0.2.5: Sep 18, 2025
0.2.4: Sep 11, 2025
0.2.3: Sep 7, 2025
0.2.2: Sep 6, 2025
0.2.1: Sep 5, 2025
0.2.0: Sep 3, 2025
0.1.6: Sep 28, 2025
0.1.5: Sep 7, 2025
0.1.4: Aug 28, 2025
0.1.3: Aug 25, 2025 (this version)
0.1.2: Aug 4, 2025
0.1.1: Jul 15, 2025
0.1: Jun 28, 2025

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

paddleformers-0.1.3-py3-none-any.whl (1.2 MB)

Uploaded Aug 25, 2025 Python 3

File details

Details for the file paddleformers-0.1.3-py3-none-any.whl.

File metadata

Download URL: paddleformers-0.1.3-py3-none-any.whl
Upload date: Aug 25, 2025
Size: 1.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for paddleformers-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`577170a1b7cdbca3d211a6ff5af8585c04ab8205e6840ac05508b88d0a605782`
MD5	`0d2f6bc06fd8fe34cad215239b3c7a20`
BLAKE2b-256	`c9ba616dc3e5d3f1bbb7e7415377e8df05a5b7340a9603fa5847295bba511e9c`
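
The digests above can be used to check a downloaded wheel. The following is a minimal sketch using Python's standard `hashlib` module. The helper names and the local file path in the usage comment are assumptions, while the expected digest is the SHA256 value listed above.

```python
import hashlib

# SHA256 digest listed above for paddleformers-0.1.3-py3-none-any.whl.
EXPECTED_SHA256 = "577170a1b7cdbca3d211a6ff5af8585c04ab8205e6840ac05508b88d0a605782"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected_sha256=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of_file(path) == expected_sha256.lower()


# Example usage (assumes the wheel was downloaded to the current directory):
#     verify_wheel("paddleformers-0.1.3-py3-none-any.whl")
```

pip can perform the same check during installation when a requirements file pins hashes and `--require-hashes` is used.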
