Skip to main content

Easy-to-use and powerful NLP library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Neural Search, Question Answering, Information Extraction and Sentiment Analysis end-to-end system.

Project description

简体中文🀄 | English🌎


Features | Supported Models | Installation | Quick Start | Community

PaddleNLP is a Large Language Model (LLM) development suite based on the PaddlePaddle deep learning framework, supporting efficient large model training, lossless compression, and high-performance inference on various hardware devices. With its simplicity and ultimate performance, PaddleNLP is dedicated to helping developers achieve efficient industrial applications of large models.

News 📢

  • 2024.06.27 PaddleNLP v3.0 Beta:Embrace large models and experience a complete upgrade. With a unified large model suite, we achieve full-process access to domestically produced computing chips. We fully support industrial-level application processes for large models, such as PaddlePaddle's 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance reasoning. Our developed RsLoRA+ algorithm, full checkpoint storage mechanism Unified Checkpoint, and generalized support for FastFNN and FusedQKV all contribute to the training and inference of large models. We continuously support updates to mainstream models for providing efficient solutions.

  • 2024.04.24 PaddleNLP v2.8:Our self-developed RsLoRA+ algorithm with extreme convergence significantly improves the convergence speed and training effectiveness of PEFT training. By introducing high-performance generation acceleration into the RLHF PPO algorithm, we have broken through the generation speed bottleneck in PPO training, achieving a significant lead in PPO training performance. We generally support multiple large model training performance optimization methods such as FastFFN and FusedQKV, making large model training faster and more stable.

Features

🔧 Integrated training and inference on multiple hardware platforms

Our development suit supports large model training and inference on multiple hardware platforms, including NVIDIA GPUs, Kunlun XPUs, Ascend NPUs, Enflame GCUs, and Hygon DCUs. The toolkit's interface allows for quick hardware switching, significantly reducing research and development costs associated with hardware transitions.

🚀 Efficient and easy-to-use pre-training

We support 4D high-performance training with data parallelism, sharding parallelism, tensor parallelism, and pipeline parallelism. The Trainer supports configurable distributed strategies, reducing the cost associated with complex distributed combinations. The Unified Checkpoint large model storage format supports dynamic scaling of model parameter distribution during training, thereby reducing the migration cost caused by hardware switching.

🤗 Efficient fine-tuning

The fine-tuning algorithms are deeply integrated with zero-padding data streams and high-performance FlashMask operators, reducing invalid data padding and computation during training, and significantly improving the throughput of fine-tuning training.

🎛️ Lossless compression and high-performance inference

The high-performance inference module of the large model toolkit incorporates dynamic insertion and operator fusion strategies throughout the entire process, greatly accelerating parallel inference speed. The underlying implementation details are encapsulated, enabling out-of-the-box high-performance parallel inference capabilities.


Support Models

Detailed list 👉 Supported Model List

Installation

Prerequisites

  • python >= 3.8
  • paddlepaddle >= 3.0.0b0

Pip Installation

pip install --upgrade paddlenlp==3.0.0b0

or you can install the latest develop branch code with the following command:

pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html

More information about PaddlePaddle installation please refer to PaddlePaddle's Website.


Quick Start

Text generation with large language model

PaddleNLP provides a convenient and easy-to-use Auto API, which can quickly load models and Tokenizers. Here, we use the Qwen/Qwen2-0.5B large model as an example for text generation:

>>> from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B")
>>> model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="float16")
>>> input_features = tokenizer("你好!请自我介绍一下。", return_tensors="pd")
>>> outputs = model.generate(**input_features, max_length=128)
>>> print(tokenizer.batch_decode(outputs[0]))
['我是一个AI语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?']

Pre-training for large language model

mkdir -p llm/data && cd llm/data
wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.bin
wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.idx
cd .. # change folder to PaddleNLP/llm
python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_pretrain.py ./config/llama/pretrain_argument.json

SFT finetuning forlarge language model

mkdir -p llm/data && cd llm/data
wget https://bj.bcebos.com/paddlenlp/datasets/examples/AdvertiseGen.tar.gz && tar -zxvf AdvertiseGen.tar.gz
cd .. # change folder to PaddleNLP/llm
python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_finetune.py ./config/llama/sft_argument.json

For more steps in the entire large model process, please refer to theLarge Model Full-Process Suite.

For more PaddleNLP content, please refer to:

  • Model Library,which includes end-to-end usage of high-quality pre-trained models.
  • Multi-scenario Examples,to understand how to use PaddleNLP to solve various NLP technical problems, including basic techniques, system applications, and extended applications.
  • Interactive Tutorial,to quickly learn PaddleNLP on the free computing platform AI Studio.

Community

Slack

To connect with other users and contributors, welcome to join our Slack channel.

WeChat

Scan the QR code below with your Wechat⬇️. You can access to official technical exchange group. Look forward to your participation.

Citation

If you find PaddleNLP useful in your research, please consider citing

@misc{=paddlenlp,
    title={PaddleNLP: An Easy-to-use and High Performance NLP Library},
    author={PaddleNLP Contributors},
    howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}},
    year={2021}
}

Acknowledge

We have borrowed from Hugging Face's Transformers🤗 excellent design on pretrained models usage, and we would like to express our gratitude to the authors of Hugging Face and its open source community.

License

PaddleNLP is provided under the Apache-2.0 License.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

paddlenlp-3.0.0b1-py3-none-any.whl (2.8 MB view details)

Uploaded Python 3

File details

Details for the file paddlenlp-3.0.0b1-py3-none-any.whl.

File metadata

  • Download URL: paddlenlp-3.0.0b1-py3-none-any.whl
  • Upload date:
  • Size: 2.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.10

File hashes

Hashes for paddlenlp-3.0.0b1-py3-none-any.whl
Algorithm Hash digest
SHA256 f9d7a92b9a17796341ad906d3b82a61afdda26fcc6d3fa886cd4efda9fa86949
MD5 ce7bee547346a72150549f035fac77d3
BLAKE2b-256 e5fb4e0abc1d23498bfd21d4e3c95ef058982069b557855bfb52c79525d22fcf

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page