NeMo-Aligner - a toolkit for model alignment

NVIDIA NeMo-Aligner

Latest News

  • We released Nemotron-4-340B Base, Instruct, and Reward. The Instruct and Reward variants were trained with NeMo-Aligner. Please see the HelpSteer2 paper for more details on the reward model training.
  • We are excited to announce the release of accelerated generation support in our RLHF pipeline using TensorRT-LLM. For more information, please refer to our RLHF documentation.
  • The NeMo-Aligner paper is now out on arXiv!

Introduction

NeMo-Aligner is a scalable toolkit for efficient model alignment. The toolkit supports state-of-the-art model alignment algorithms such as SteerLM, DPO, and Reinforcement Learning from Human Feedback (RLHF). These algorithms enable users to align language models to be safer, more harmless, and more helpful. Users can perform end-to-end model alignment on a wide range of model sizes and take advantage of the available parallelism techniques so that alignment is performant and resource-efficient. For more technical details, please refer to our paper.

The NeMo-Aligner toolkit is built using the NeMo Framework, which enables scalable training across thousands of GPUs using tensor, data, and pipeline parallelism for all alignment components. Additionally, our checkpoints are cross-compatible with the NeMo ecosystem, facilitating inference deployment and further customization (https://github.com/NVIDIA/NeMo-Aligner).
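
To make the relationship between the three parallelism dimensions concrete, here is a minimal sketch of the arithmetic. It assumes the common Megatron-style layout in which the total GPU count equals tensor-parallel size × pipeline-parallel size × data-parallel size, and it ignores other dimensions such as context or expert parallelism. The function name data_parallel_size is hypothetical and is not part of the NeMo-Aligner API.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return the data-parallel size implied by a GPU count and model-parallel sizes.

    Assumption: world_size == tensor_parallel * pipeline_parallel * data_parallel.
    This helper is illustrative only; it is not a NeMo or NeMo-Aligner function.
    """
    for name, value in (
        ("world_size", world_size),
        ("tensor_parallel", tensor_parallel),
        ("pipeline_parallel", pipeline_parallel),
    ):
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value}")

    # GPUs needed to hold one full replica of the model.
    model_parallel = tensor_parallel * pipeline_parallel
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {model_parallel}"
        )
    # Each group of model_parallel GPUs is one replica; replicas split the data.
    return world_size // model_parallel
```

For example, under this assumption 64 GPUs with a tensor-parallel size of 8 and a pipeline-parallel size of 2 leave room for 4 data-parallel replicas.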

The toolkit is currently in its early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.

Key Features

Learn More

Latest Release

For the latest stable release, please see the releases page. All releases come with a pre-built container. Changes in each release are documented in the CHANGELOG.

Install Your Own Environment

Requirements

NeMo-Aligner has the same requirements as the NeMo Toolkit, with the addition of PyTriton.

Quick start inside NeMo container

NeMo-Aligner is included in the NeMo containers. On a machine with NVIDIA GPUs and drivers installed, run the NeMo container:

docker run --gpus all -it --rm --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/nemo:24.07

Once you are inside the container, NeMo-Aligner is already installed. It can be found, together with NeMo and other tools, under the /opt/ folder.

Install NeMo-Aligner

Please follow the same steps as outlined in the NeMo Toolkit Installation Guide. After installing NeMo, execute the following additional command:

pip install nemo-aligner

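Alternatively, if you prefer to install the latest commit, clone the repository (https://github.com/NVIDIA/NeMo-Aligner) and run the following from the root of your local copy: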

pip install .
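
After either install path, it can be useful to confirm that the package is visible to the Python interpreter you intend to use. The following is a minimal sketch using only the standard library. It assumes that the distribution name is nemo-aligner (as in the pip command above) and that the importable module is named nemo_aligner (inferred from the wheel filename, not confirmed here). The helper installed_version is hypothetical.

```python
from importlib import metadata, util
from typing import Optional


def installed_version(dist_name: str = "nemo-aligner",
                      module_name: str = "nemo_aligner") -> Optional[str]:
    """Return the installed version string, or None if the package is missing.

    Assumptions: the module name `nemo_aligner` is inferred from the wheel
    filename; this helper is illustrative and not part of NeMo-Aligner.
    """
    # find_spec locates the top-level module without importing it, so heavy
    # dependencies are not loaded just to check for presence.
    if util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The module is importable but no distribution metadata was found.
        return None
```

Calling installed_version() returns a version string when the package is installed and None otherwise.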

Docker Containers

We provide an official NeMo-Aligner Dockerfile which is based on stable, tested versions of NeMo, Megatron-LM, and TransformerEngine. The primary objective of this Dockerfile is to ensure stability, although it might not always reflect the very latest versions of those three packages. You can access our Dockerfile here.

Alternatively, you can build the NeMo Dockerfile and add RUN pip install nemo-aligner at the end.

Future work

  • Continue improving the stability of the PPO learning phase.
  • Improve the performance of RLHF.
  • Add TRT-LLM inference support for Rejection Sampling.

Contribute to NeMo-Aligner

We welcome community contributions! Please refer to CONTRIBUTING.md for guidelines.

Cite NeMo-Aligner in Your Work

@misc{shen2024nemoaligner,
      title={NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment},
      author={Gerald Shen and Zhilin Wang and Olivier Delalleau and Jiaqi Zeng and Yi Dong and Daniel Egert and Shengyang Sun and Jimmy Zhang and Sahil Jain and Ali Taghibakhshi and Markel Sanz Ausin and Ashwath Aithal and Oleksii Kuchaiev},
      year={2024},
      eprint={2405.01481},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

This toolkit is licensed under the Apache License, Version 2.0.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nemo_aligner-0.6.0.tar.gz (133.9 kB)

Uploaded Source

Built Distribution

nemo_aligner-0.6.0-py3-none-any.whl (196.9 kB)

Uploaded Python 3

File details

Details for the file nemo_aligner-0.6.0.tar.gz.

File metadata

  • Download URL: nemo_aligner-0.6.0.tar.gz
  • Upload date:
  • Size: 133.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.10.12

File hashes

Hashes for nemo_aligner-0.6.0.tar.gz
Algorithm     Hash digest
SHA256        ca2f7734240335af083c12caf62f5a219afa7b3f7d94b30081d7c40f0c618a76
MD5           aba1481c036355715ac3b8535368fc02
BLAKE2b-256   7f67bb501a8cdd1d5ecd172f2b7f20ca53c56f81514e1c7d62b5364025495170

See more details on using hashes here.

File details

Details for the file nemo_aligner-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: nemo_aligner-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 196.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.10.12

File hashes

Hashes for nemo_aligner-0.6.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        e8d87324b5223817c9b93de3b6c584548fb53baafd2554b8b0a3b792fe173b91
MD5           03de873443c9318d414938e599163f25
BLAKE2b-256   bccb8bc6ee2afc2186fff5265a1005951991b5c3b2a3bf90b39b87bd33d6a643

See more details on using hashes here.
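
If you download one of these files manually, you may want to check it against the published SHA256 digest before installing. The sketch below uses only the Python standard library. The digests are copied from the listings above. The names PUBLISHED_SHA256, sha256_of, and matches_published are hypothetical helpers introduced for illustration.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digests copied from the file listings above.
PUBLISHED_SHA256 = {
    "nemo_aligner-0.6.0.tar.gz":
        "ca2f7734240335af083c12caf62f5a219afa7b3f7d94b30081d7c40f0c618a76",
    "nemo_aligner-0.6.0-py3-none-any.whl":
        "e8d87324b5223817c9b93de3b6c584548fb53baafd2554b8b0a3b792fe173b91",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file's SHA256 equals the digest listed for its name."""
    name = Path(path).name
    if name not in PUBLISHED_SHA256:
        raise KeyError(f"no published digest recorded for {name!r}")
    return hmac.compare_digest(sha256_of(path), PUBLISHED_SHA256[name])
```

For example, matches_published("nemo_aligner-0.6.0.tar.gz") should return True for an intact download in the current directory. A False result suggests the file is corrupted or is not the file listed here.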
