
🚀 Turbo-Alignment

Library for industrial alignment.


🌟 What is Turbo-Alignment?

Turbo-Alignment is a library designed to streamline the fine-tuning and alignment of large language models, leveraging advanced techniques to enhance efficiency and scalability.

✨ Key Features

  • 📊 Comprehensive Metrics and Logging: Includes a wide range of metrics, such as Self-BLEU, KL divergence, and diversity, all supported out of the box.
  • 🛠️ Streamlined Method Deployment: Simplifies deploying new methods, allowing quick development and integration of new datasets and trainers into your pipelines.
  • 📚 Ready-to-Use Examples: Convenient examples with configurations and instructions for basic tasks.
  • ⚡ Fast Inference: Optimized for quick inference using vLLM.
  • 🔄 End-to-End Pipelines: From data preprocessing to model alignment.
  • 🌐 Multimodal Capabilities: Extensive support for multimodal tasks such as Vision Language Modeling.
  • 🔍 RAG Pipeline: A dedicated pipeline for end-to-end retrieval-augmented generation training.

🛠️ Supported Methods

Turbo-Alignment supports a wide range of methods for model training and alignment, including:

  • 🎯 Supervised Fine-Tuning (SFT)
  • 🏆 Reward Modeling (RM)
  • 👍 Direct Preference Optimization (DPO)
  • 🧠 Kahneman & Tversky Optimization (KTO), paired and unpaired
  • 🔄 Contrastive Preference Optimization (CPO)
  • 🎭 Identity Preference Optimization (IPO)
  • 🌟 Sequence Likelihood Calibration with Human Feedback (SLiC-HF)
  • 📊 Statistical Rejection Sampling Optimization (RSO)
  • 🌐 Vision Language Modeling with a trainable projection model: MLP (from LLaVA) or C-Abstractor (from HoneyBee)
  • 🗂️ Retrieval-Augmented Generation (RAG)

🧮 Implemented Metrics

  • 🔠 Distinctness
  • 🌈 Diversity
  • 🔵 Self-BLEU
  • ➗ KL-divergence
  • 🏆 Reward
  • 📏 Length
  • 🌀 Perplexity
  • 🌟 METEOR
  • 🔍 Retrieval Utility
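
Most of these metrics reduce to simple statistics over generated text. As a rough illustration of the idea behind the distinctness metric (a sketch, not the library's implementation), a distinct-n score is the fraction of unique n-grams across a set of generations:

```python
from collections import Counter


def distinct_n(texts, n=2):
    """Fraction of unique n-grams across a set of generations.

    Counts every n-gram occurrence over whitespace tokens, then
    divides the number of distinct n-grams by the total count.
    """
    ngrams = Counter()
    for text in texts:
        tokens = text.split()
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i : i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0
```

A value near 1.0 indicates diverse output; repeated generations drive the score toward 0.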

🤖 How to Use

Turbo-Alignment offers an intuitive interface for training and aligning large language models. Refer to the detailed examples and configuration files in the documentation to get started quickly with your specific use case. A user-friendly guide is available here.

The most crucial aspect is to prepare the dataset in the required format, after which the pipeline will handle everything automatically. Examples of datasets are available here.
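
For illustration only, a chat-style record is typically one JSON object per line. The field names below are assumptions for the sketch, not the library's actual schema; consult the linked dataset examples for the real format:

```python
import json
import os
import tempfile

# Hypothetical chat-format record; field names are illustrative only.
record = {
    "id": "0",
    "messages": [
        {"role": "user", "content": "What is alignment?"},
        {"role": "bot", "content": "Training a model to follow human preferences."},
    ],
}

# Write one JSON object per line (JSONL).
path = os.path.join(tempfile.gettempdir(), "chat_dataset_example.jsonl")
with open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```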

Table of Use Cases

Train

Supervised Fine-Tuning

  • 📚 Dataset type: prepare your dataset in the ChatDataset format; examples are available here.
  • 📝 Configs Example: sft.json
  • 🖥️ CLI launch command:
python -m turbo_alignment train_sft --experiment_settings_path configs/exp/train/sft/sft.json

Preference Tuning

Reward Modeling

  • 📚 Dataset type: prepare your dataset in the PairPreferencesDataset format; examples are available here.
  • 📝 Configs Example: rm.json
  • 🖥️ CLI launch command:
python -m turbo_alignment train_rm --experiment_settings_path configs/exp/train/rm/rm.json

DPO, IPO, CPO, KTO (Paired)

  • 📚 Dataset type: prepare your dataset in the PairPreferencesDataset format; examples are available here.
  • 📝 Configs Example: dpo.json
  • 🖥️ CLI launch command:
python -m turbo_alignment train_dpo --experiment_settings_path configs/exp/train/dpo/dpo.json
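
For intuition, the sigmoid DPO objective can be sketched per preference pair. This is the textbook formulation, not the library's trainer code:

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for a single preference pair.

    The loss rewards the policy for increasing the log-probability
    margin of the chosen response over the rejected one, relative to
    a frozen reference model; beta scales the margin.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(logits))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy and reference agree, the loss sits at -log(0.5); widening the chosen-over-rejected margin drives it toward zero.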

KTO (Unpaired)

  • 📚 Dataset type: prepare your dataset in the KTODataset format; examples are available here.
  • 📝 Configs Example: kto.json
  • 🖥️ CLI launch command:
python -m turbo_alignment train_kto --experiment_settings_path configs/exp/train/kto/kto.json
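
The difference from the paired setting is the data shape: instead of chosen/rejected pairs for the same prompt, each completion carries a standalone desirability label. A sketch with hypothetical field names (not the actual KTODataset schema):

```python
import json

# Hypothetical unpaired-feedback records; field names are illustrative
# only. Each completion is labeled desirable or undesirable on its own,
# with no matched alternative for the same prompt.
records = [
    {"prompt": "Summarize the report.",
     "completion": "Here is a concise summary of the key points.",
     "is_desirable": True},
    {"prompt": "Summarize the report.",
     "completion": "I refuse.",
     "is_desirable": False},
]

# Serialize as JSONL, one object per line.
lines = [json.dumps(r) for r in records]
```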

Multimodal Training

โŒ›๏ธ in progress..

RAG (Retrieval-Augmented Generation)

SFT-RAG

  • 📚 Dataset type: prepare your dataset in the ChatDataset format; examples are available here.
  • 📝 Configs Example: sft_with_retrieval_utility
  • 🖥️ CLI launch command:
python -m turbo_alignment train_sft --experiment_settings_path configs/exp/train/sft/llama/sft_with_retrieval_utility.json

End2End-RAG

  • 📚 Dataset type: prepare your dataset in the ChatDataset format; examples are available here.
  • 📝 Configs Example: end2end_rag
  • 🖥️ CLI launch command:
python -m turbo_alignment train_rag --experiment_settings_path configs/exp/train/rag/end2end_rag.json

Inference

Chat Inference

  • 📚 Dataset type: prepare your dataset in the ChatDataset format; examples are available here.
  • 📝 Configs Example: sft.json
  • 🖥️ CLI launch command:
python -m turbo_alignment inference_chat --inference_settings_path configs/exp/inference/generation/default_llama_adapter.json

Classification Inference

  • 📚 Dataset type: prepare your dataset in the ClassificationDataset format; examples are available here.
  • 📝 Configs Example: classification_inference.json
  • 🖥️ CLI launch command:
python -m turbo_alignment inference_classification --inference_settings_path configs/exp/train/sft/sft.json

Multimodal Inference

  • 📚 Dataset type: prepare your dataset in the MultimodalDataset format; examples are available here.
  • 📝 Configs Example: mlp.json
  • 🖥️ CLI launch command:
python -m turbo_alignment inference_multimodal --inference_settings_path configs/exp/inference/multimodal/mlp.json

RAG Inference

  • 📚 Dataset type: prepare your dataset in the ChatDataset format; examples are available here.
  • 📝 Configs Example: rag_inference.json
  • 🖥️ CLI launch command:
python -m turbo_alignment inference_rag --inference_settings_path configs/exp/inference/rag/rag_inference.json

Sampling

Random Sampling

  • 📚 Dataset type: prepare your dataset in the SamplingRMDataset format; examples are available here.
  • 📝 Configs Example: random.json
  • 🖥️ CLI launch command:
python -m turbo_alignment random_sample --experiment_settings_path tests/fixtures/configs/sampling/base.json

RSO Sampling

  • 📚 Dataset type: prepare your dataset in the SamplingRMDataset format; examples are available here.
  • 📝 Configs Example: rso.json
  • 🖥️ CLI launch command:
python -m turbo_alignment rso_sample --experiment_settings_path tests/fixtures/configs/sampling/rso.json
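
The idea behind RSO sampling is to keep candidates with probability that grows exponentially in their reward, approximating samples from a reward-tilted policy. A minimal sketch (not the library's implementation), with beta as an assumed temperature parameter:

```python
import math
import random


def rso_accept(candidates, rewards, beta=0.5, rng=None):
    """Rejection-sample candidates toward a reward-tilted distribution.

    Each candidate is accepted with probability exp((r - r_max) / beta),
    so the top-reward candidate is always kept and low-reward ones are
    kept exponentially less often.
    """
    rng = rng or random.Random(0)  # seeded for reproducibility
    r_max = max(rewards)
    accepted = []
    for cand, r in zip(candidates, rewards):
        if rng.random() < math.exp((r - r_max) / beta):
            accepted.append(cand)
    return accepted
```

Smaller beta sharpens the filter toward only the highest-reward candidates; larger beta accepts more of the distribution.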

Reward Model Sampling

  • 📚 Dataset type: prepare your dataset in the SamplingRMDataset format; examples are available here.
  • 📝 Configs Example: rm.json
  • 🖥️ CLI launch command:
python -m turbo_alignment rm_sample --experiment_settings_path tests/fixtures/configs/sampling/rm.json
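
Reward-model sampling generally means scoring candidate completions with the trained reward model and keeping the best ones. A minimal best-of-n sketch, where reward_fn stands in for the reward model (any callable mapping a completion to a score):

```python
def best_of_n(candidates, reward_fn):
    """Return the candidate the reward function scores highest.

    `reward_fn` is a stand-in for a trained reward model; in practice
    it would run a forward pass and return a scalar reward.
    """
    return max(candidates, key=reward_fn)
```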

Common

Merge Adapters into the Base Model

  • 📝 Configs Example: llama.json
  • 🖥️ CLI launch command:
python -m turbo_alignment merge_adapters_to_base --settings_path configs/utils/merge_adapters_to_base/llama.json

Preprocess Multimodal Dataset

python -m turbo_alignment preprocess_multimodal_dataset --settings_path configs/utils/preprocess/coco2014_clip.json

🚀 Installation

📦 Python Package

pip install turbo-alignment

🛠️ From Source

For the latest features before an official release:

pip install git+https://github.com/turbo-llm/turbo-alignment.git

📂 Repository

Clone the repository for access to examples:

git clone https://github.com/turbo-llm/turbo-alignment.git

🌱 Development

Contributions are welcome! Read the contribution guide and set up the development environment:

git clone https://github.com/turbo-llm/turbo-alignment.git
cd turbo-alignment
poetry install

๐Ÿ“ Library Roadmap

  • Increasing the number of tutorials
  • Enhancing test coverage
  • Implementing online RL methods such as PPO and REINFORCE
  • Facilitating distributed training
  • Incorporating low-memory training approaches

โ“ FAQ

How do I install Turbo-Alignment?

See the Installation section for detailed instructions.

Where can I find docs?

Guides and docs are available here.

Where can I find tutorials?

Tutorials are available here.

๐Ÿ“ License

This project is licensed; see the LICENSE file for details.

References

  • DPO Trainer implementation inspired by Leandro von Werra et al. (2020), TRL: Transformer Reinforcement Learning. GitHub repository. Available at: https://github.com/huggingface/trl.

  • Registry implementation inspired by Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson F. Liu, Matthew Peters, Michael Schmitz, and Luke S. Zettlemoyer. 2017. AllenNLP: A Deep Semantic Natural Language Processing Platform. Available at: arXiv:1803.07640.

  • Liger Kernels implementation inspired by Hsu, Pin-Lun, Dai, Yun, Kothapalli, Vignesh, Song, Qingquan, Tang, Shao, and Zhu, Siyu, 2024. Liger-Kernel: Efficient Triton Kernels for LLM Training. Available at: https://github.com/linkedin/Liger-Kernel.
