aligntune

AlignTune: Multi-backend alignment and fine-tuning library. Features TRL and Unsloth backends with complete RL coverage (DPO, PPO, GRPO, BOLT), 27+ reward functions, production-ready evaluation system, and unified configuration interface for LLM alignment.


AlignTune is a production-ready library for fine-tuning and aligning Large Language Models (LLMs) with both Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) methods. Its high-level, unified API abstracts away backend selection, algorithm configuration, and training loops, so you can focus on results.

Core Features

Multi-Backend Architecture: Choose between TRL (reliable, battle-tested) and Unsloth (faster) backends with intelligent auto-selection.

Complete RLHF Coverage: 12+ RL algorithms including DPO, PPO, GRPO, GSPO, DAPO, Dr. GRPO, GBMPO, Counterfactual GRPO, and PACE.

Production-Ready: No mock code, comprehensive error handling, extensive testing, and robust validation.
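To make the auto-selection idea concrete, here is a minimal sketch of how a "prefer Unsloth, fall back to TRL" choice could be made. It is not AlignTune's actual implementation: the function names `is_installed` and `select_backend` and the preference order are assumptions made for illustration.

```python
import importlib.util

# Hypothetical preference order: try the faster backend first, then the fallback.
PREFERRED_BACKENDS = ("unsloth", "trl")


def is_installed(package: str) -> bool:
    """Return True if `package` can be found in the current environment."""
    return importlib.util.find_spec(package) is not None


def select_backend(requested: str = "auto", available=is_installed) -> str:
    """Pick a backend name.

    `requested` is "auto", "trl", or "unsloth". `available` is a callable that
    reports whether a package is importable; it can be swapped out in tests.
    """
    if requested != "auto":
        if requested not in PREFERRED_BACKENDS:
            raise ValueError(f"unknown backend: {requested!r}")
        if not available(requested):
            raise RuntimeError(f"backend {requested!r} is not installed")
        return requested
    # Auto mode: return the first preferred backend that is importable.
    for name in PREFERRED_BACKENDS:
        if available(name):
            return name
    raise RuntimeError("neither unsloth nor trl is installed")
```

Passing the availability check in as a parameter keeps the selection logic testable on machines where neither package is installed.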

Quick Start

Supervised Fine-Tuning (SFT)

from aligntune.core.backend_factory import create_sft_trainer

# Configure an SFT trainer on the TRL backend
trainer = create_sft_trainer(
    model_name="microsoft/DialoGPT-small",
    dataset_name="tatsu-lab/alpaca",
    backend="trl",
    num_epochs=3,
    max_steps=-1,
    batch_size=4,
    learning_rate=5e-5,
)

# Train the model
trainer.train()

# Evaluate
metrics = trainer.evaluate()
print(metrics)

Reinforcement Learning (DPO)

from aligntune.core.backend_factory import create_rl_trainer

# Configure a DPO trainer on the TRL backend
trainer = create_rl_trainer(
    model_name="Qwen/Qwen3-0.6B",
    dataset_name="Anthropic/hh-rlhf",
    algorithm="dpo",
    backend="trl",
    num_epochs=1,
    batch_size=4,
    learning_rate=5e-5
)

# Train the model
trainer.train()
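For readers new to DPO, the objective that a trainer of this kind optimizes can be sketched for a single preference pair in plain Python. This is the standard DPO loss from the literature, not code taken from AlignTune; the function name `dpo_loss` and the default `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under the
    policy or the frozen reference model. `beta` scales the implicit reward.
    """
    # Implicit rewards: how much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and reference agree, the loss equals log 2; it falls as the policy raises the chosen response's likelihood relative to the rejected one.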

Supported Algorithms

Algorithm	TRL Backend	Unsloth Backend	Description
SFT	Yes	Yes	Supervised Fine-Tuning
DPO	Yes	Yes	Direct Preference Optimization
PPO	Yes	Yes	Proximal Policy Optimization
GRPO	Yes	Yes	Group Relative Policy Optimization
GSPO	Yes	Yes	Group Sequence Policy Optimization
DAPO	Yes	Yes	Decoupled Clip and Dynamic sAmpling Policy Optimization
Dr. GRPO	Yes	Yes	GRPO Done Right (unbiased variant)
GBMPO	Yes	No	Group-Based Mirror Policy Optimization
Counterfactual GRPO	Yes	Yes	Counterfactual GRPO variant
PACE	Yes	Yes	Baseline-Optimized Learning Technique
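Several of the algorithms above share the group-relative baseline introduced by GRPO: sample several completions per prompt, then score each one against the group's mean reward. The sketch below illustrates that step and the Dr. GRPO variant, which, as described in its paper, drops the division by the group standard deviation. The function name `group_advantages` is hypothetical, and the population standard deviation is used here as a simplification; this is not AlignTune's code.

```python
from statistics import mean, pstdev


def group_advantages(rewards, normalize_std=True, eps=1e-8):
    """Compute per-completion advantages for one prompt's group of samples.

    normalize_std=True  -> GRPO-style: (r - mean) / (std + eps)
    normalize_std=False -> Dr. GRPO-style: (r - mean), no std scaling
    """
    baseline = mean(rewards)
    centered = [r - baseline for r in rewards]
    if not normalize_std:
        return centered
    scale = pstdev(rewards) + eps  # eps avoids division by zero for identical rewards
    return [c / scale for c in centered]
```

In both variants the advantages within a group sum to zero (up to rounding), so completions are rewarded only for beating their siblings.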

Installation

# Install from PyPI
pip install aligntune

# Or install from source
git clone https://github.com/Lexsi-Labs/aligntune.git
cd aligntune
pip install -e .

Requirements

Python 3.12+
PyTorch 2.0+
CUDA-compatible GPU (recommended for faster training)
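A quick way to check these requirements before installing is a small script like the one below. It is a generic sketch, not part of AlignTune: the function name `check_environment` is hypothetical, and it only relies on `torch.__version__` and `torch.cuda.is_available()` from PyTorch.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 12)):
    """Report whether the interpreter, PyTorch, and CUDA look usable."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            import torch  # imported lazily so the check works without PyTorch
            report["torch_version"] = torch.__version__
            report["cuda_available"] = torch.cuda.is_available()
        except (ImportError, OSError):
            # A broken install is treated the same as a missing one.
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A missing GPU is reported but not treated as an error, since CUDA is recommended rather than required.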

Demo Notebooks

Interactive Colab notebooks demonstrating various AlignTune workflows:

Supervised Fine-Tuning (SFT)

Backend	Model	Dataset
TRL	Qwen/Qwen3-4B-Instruct-2507	sohamb37lexsi/bitext-wealth-management-llm-chatbot-splits
TRL	Qwen3-4B-Instruct	sohamb37lexsi/bitext-retail-banking-llm-chatbot-splits
Unsloth	Qwen/Qwen2.5-0.5B-Instruct	bebechien/MobileGameNPC
TRL	google/txgemma-2b-predict	trialbench_adverse-event-rate-prediction
Unsloth	Qwen/Qwen2.5-0.5B-Instruct	bebechien/MobileGameNPC

Reinforcement Learning (RL)

Backend	Algorithm	Model	Dataset
Unsloth	DPO	microsoft/phi-2	argilla/distilabel-intel-orca-dpo-pairs
TRL	DPO	google/gemma-2-2b-it	Anthropic/hh-rlhf
TRL	DPO	sohamb37lexsi/wealth_management_Qwen3-4B-Instruct-2507	sohamb37lexsi/bitext_wealth_management_preference_data
Unsloth	PPO	Qwen/Qwen2.5-0.5B-Instruct	HuggingFaceH4/ultrachat_200k
TRL	PPO	EleutherAI/pythia-1.4b	CarperAI/openai_summarize_tldr
TRL	GRPO (Coding)	Qwen/Qwen3-4B	google-research-datasets/mbpp
Unsloth	GRPO (Math)	meta-llama/Llama-3.2-3B-Instruct	openai/gsm8k
TRL	GRPO	meta-llama/Llama-3.2-3B-Instruct	openai/gsm8k
Unsloth	Dr. GRPO	Qwen/Qwen2.5-3B-Instruct	yahma/alpaca-cleaned
TRL	Dr. GRPO	Qwen/Qwen2-0.5B-Instruct	AI-MO/NuminaMath-TIR
Unsloth	GSPO	Qwen/Qwen3-1.7B	CyberNative/Code_Vulnerability_Security_DPO
TRL	GSPO	meta-llama/Llama-3.2-3B-Instruct	HuggingFaceH4/ultrachat_200k
Unsloth	DAPO	microsoft/Phi-3.5-mini-instruct	HuggingFaceH4/ultrachat_200k
TRL	DAPO	meta-llama/Llama-3.2-3B-Instruct	google-research-datasets/mbpp

Documentation

Getting Started: Installation, setup, and basic usage
User Guide: In-depth tutorials for SFT and RL training
API Reference: Complete Python API and class/method details
Examples: End-to-end code examples
Advanced Topics: Architecture, custom backends, and performance optimization
Notebooks: Interactive Colab notebooks and local Jupyter notebooks

Key Capabilities

Multiple Training Paradigms: Supports SFT, DPO, PPO, GRPO, and advanced RL algorithms
Backend Flexibility: TRL and Unsloth backends with automatic fallback
Reward Model Training: Train custom reward models from rule-based functions
Comprehensive Evaluation: Multi-level evaluation with lm-eval integration
Production Ready: Model serialization, reproducible training, and deployment-ready pipelines
Extensible Architecture: Modular design for easy integration of custom algorithms and backends
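To illustrate what "rule-based functions" can look like as a reward signal, here is a small sketch that scores a completion on answer format and length, then combines the two with weights. The function names, the `Answer: <number>` convention, and the weights are all assumptions chosen for the example; they are not AlignTune's built-in reward functions.

```python
import re

# Hypothetical convention: a good completion ends with "Answer: <number>".
ANSWER_PATTERN = re.compile(r"Answer:\s*-?\d+(\.\d+)?\s*$")


def format_reward(completion: str) -> float:
    """1.0 if the completion ends with a numeric answer line, else 0.0."""
    return 1.0 if ANSWER_PATTERN.search(completion.strip()) else 0.0


def length_reward(completion: str, max_words: int = 50) -> float:
    """1.0 up to `max_words`, then decays linearly to 0.0 at twice that length."""
    n_words = len(completion.split())
    if n_words <= max_words:
        return 1.0
    return max(0.0, 1.0 - (n_words - max_words) / max_words)


def combined_reward(completion: str, weights=(0.7, 0.3)) -> float:
    """Weighted sum of the format and length rewards."""
    w_format, w_length = weights
    return w_format * format_reward(completion) + w_length * length_reward(completion)
```

Scores from functions like these can label sampled completions, and those labels can in turn serve as training targets for a learned reward model.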

Architecture

AlignTune uses a flexible backend architecture:

flowchart TD
    Factory[Backend Factory] --> TRL[TRL Backend]
    Factory --> Unsloth[Unsloth Backend]
    TRL --> TRL_Algos[TRL Algorithms]
    Unsloth --> Unsloth_Algos[Unsloth Algorithms]

TRL Backend: SFT, DPO, PPO, GRPO, GSPO, DAPO, Dr. GRPO, GBMPO, Counterfactual GRPO, PACE

Unsloth Backend: SFT, DPO, PPO, GRPO, GSPO, DAPO, Dr. GRPO, Counterfactual GRPO, PACE

See Architecture for details.
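The factory-plus-backends layout can be sketched as a lookup against a support matrix. The example below transcribes the Supported Algorithms table into a dictionary and resolves an (algorithm, backend) request against it. The names `SUPPORTED`, `TrainerSpec`, and `resolve`, and the lowercase algorithm keys other than "dpo", are hypothetical; AlignTune's real factory lives in `aligntune.core.backend_factory` and may work differently.

```python
from dataclasses import dataclass

# Support matrix transcribed from the Supported Algorithms table (keys are illustrative).
_COMMON = {"sft", "dpo", "ppo", "grpo", "gspo", "dapo", "dr_grpo",
           "counterfactual_grpo", "pace"}
SUPPORTED = {
    "trl": _COMMON | {"gbmpo"},  # GBMPO is listed as TRL-only
    "unsloth": set(_COMMON),
}


@dataclass(frozen=True)
class TrainerSpec:
    """The resolved backend/algorithm pair a factory would go on to build."""
    backend: str
    algorithm: str


def resolve(algorithm: str, backend: str = "auto") -> TrainerSpec:
    """Pick a backend that supports `algorithm`, preferring Unsloth in auto mode."""
    algorithm = algorithm.lower()
    candidates = ("unsloth", "trl") if backend == "auto" else (backend,)
    for name in candidates:
        if name not in SUPPORTED:
            raise ValueError(f"unknown backend: {name!r}")
        if algorithm in SUPPORTED[name]:
            return TrainerSpec(backend=name, algorithm=algorithm)
    raise ValueError(f"{algorithm!r} is not supported by backend {backend!r}")
```

With this matrix, an auto request for GBMPO falls through to TRL, while an explicit Unsloth request for GBMPO fails fast with a clear error.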

Contributing

We welcome contributions! See our Contributing Guide for details.

License

This project is released under the MIT License. Please cite appropriately if used in academic or production projects. See the LICENSE file for details.

Key Points:

Free for Research & Learning: Use, modify, and study for personal, academic, or research purposes
Source Available: Full access to source code
Commercial Use Restricted: Requires separate commercial license
Contact: For commercial licensing, partnership, or redistribution rights, contact support@lexsi.ai

This is not an open-source license as defined by OSI, but provides broad access for non-commercial use.

Citation

If you use AlignTune in your research, please cite:

BibTeX:

@software{alignTune2025,
  title        = {{AlignTune}: Modular Toolkit for Post-Training Alignment of Large Language Models},
  author       = {Lyngkhoi, R E Zera Marveen and Chawla, Chirag and Seth, Pratinav and Avaiya, Utsav and Bhattacharjee, Soham and Khandoga, Mykola and Yuan, Rui and Sankarapu, Vinay Kumar},
  year         = {2025},
  note         = {Equal contribution: R E Zera Marveen Lyngkhoi, Chirag Chawla, Pratinav Seth},
  organization = {Lexsi Labs},
  url          = {https://github.com/Lexsi-Labs/aligntune},
  version      = {0.0.0}
}

Plain Text:

Lyngkhoi, R. E. Z. M., Chawla, C., Seth, P., Avaiya, U., Bhattacharjee, S., Khandoga, M.,
Yuan, R., & Sankarapu, V. K. (2025). AlignTune: Modular Toolkit for Post-Training Alignment
of Large Language Models. Lexsi Labs. https://github.com/Lexsi-Labs/aligntune

*Equal contribution: R E Zera Marveen Lyngkhoi, Chirag Chawla, Pratinav Seth

Acknowledgments

AlignTune is built upon the excellent work of the following projects:

HuggingFace Transformers - Model architectures and tokenizers
TRL - Transformer Reinforcement Learning library
Unsloth - Fast and memory-efficient training
HuggingFace Datasets - Dataset loading and processing

Support

Documentation: aligntune.lexsi.ai/
GitHub Issues: github.com/Lexsi-Labs/aligntune/issues
Discussions: github.com/Lexsi-Labs/aligntune/discussions
Email: hello@lexsi.ai
Discord: Lexsi Labs

Get started with AlignTune and accelerate your LLM fine-tuning workflows today!

Contact

https://www.lexsi.ai

Paris 🇫🇷 · Mumbai 🇮🇳 · London 🇬🇧


This version: 0.1.7 (Feb 23, 2026)

