A lightweight JAX-native LLM post-training framework.

Project description

Tunix: A JAX-native LLM Post-Training Library

Tunix (Tune-in-JAX) is a JAX-based library designed to streamline the post-training of Large Language Models. It provides efficient and scalable support for:

  • Supervised Fine-Tuning
  • Reinforcement Learning (RL)
  • Knowledge Distillation

Tunix leverages the power of JAX for accelerated computation and integrates seamlessly with Flax NNX, the JAX-based modeling framework.

Current Status: Early Development

Tunix is in early development. We're actively working to expand its capabilities and improve its usability and performance. Stay tuned for upcoming updates and new features!

Key Features & Highlights

Tunix is still under development, but here's a glimpse of the current features:

  • Supervised Fine-Tuning:
    • Full-Weight Fine-Tuning
    • Parameter-Efficient Fine-Tuning (PEFT) with LoRA/Q-LoRA Layers
  • Reinforcement Learning (RL):
    • Proximal Policy Optimization (PPO)
    • Group Relative Policy Optimization (GRPO)
    • Token-level Group Sequence Policy Optimization (GSPO-token)
  • Preference Fine-Tuning:
    • Preference alignment with Direct Preference Optimization (DPO)
  • Knowledge Distillation:
    • Logit Strategy: A classic approach where the student learns to match the teacher's output probability distribution.
    • Attention Transfer & Projection Strategies: Methods to align the attention mechanisms between the student and teacher models.
    • Feature Pooling & Projection Strategies: General techniques for matching intermediate feature representations, even between models of different architectures.
  • Modularity:
    • Components are designed to be reusable and composable
    • Easy to customize and extend
  • Efficiency:
    • Native support for common model sharding strategies such as data parallelism (DP), fully sharded data parallelism (FSDP), and tensor parallelism (TP)
    • Designed for distributed training on accelerators (TPU)
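
To make the Logit Strategy listed above concrete, here is a minimal JAX sketch of a logit-matching distillation loss. It is not Tunix's implementation or API: the function name `logit_distillation_loss`, the `temperature` parameter, and the `[batch, seq_len, vocab]` shapes are assumptions chosen for illustration.

```python
import jax
import jax.numpy as jnp


def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over temperature-softened distributions.

    Hypothetical helper for illustration; inputs have shape
    [batch, seq_len, vocab]. `temperature` is not a Tunix setting.
    """
    # Soften both output distributions with the same temperature.
    teacher_log_probs = jax.nn.log_softmax(teacher_logits / temperature, axis=-1)
    student_log_probs = jax.nn.log_softmax(student_logits / temperature, axis=-1)
    teacher_probs = jnp.exp(teacher_log_probs)
    # KL divergence per token, summed over the vocabulary axis.
    kl_per_token = jnp.sum(
        teacher_probs * (teacher_log_probs - student_log_probs), axis=-1
    )
    # Scaling by temperature**2 is a common convention that keeps gradient
    # magnitudes comparable across temperatures.
    return (temperature ** 2) * jnp.mean(kl_per_token)


# Random logits stand in for real student and teacher model outputs.
key_s, key_t = jax.random.split(jax.random.PRNGKey(0))
student_logits = jax.random.normal(key_s, (2, 4, 8))
teacher_logits = jax.random.normal(key_t, (2, 4, 8))

loss = logit_distillation_loss(student_logits, teacher_logits)
# jax.grad differentiates with respect to the first argument only, so the
# teacher logits are treated as fixed targets.
grads = jax.grad(logit_distillation_loss)(student_logits, teacher_logits)
```

The loss is zero when the student reproduces the teacher's distribution exactly and positive otherwise. In practice a term like this is often combined with the usual cross-entropy loss on ground-truth labels.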

Upcoming

  • Agentic RL Training:
    • Async Rollout
    • Multi-turn & multi-step support
    • Tool usage
  • Advanced Algorithms:
    • Additional state-of-the-art RL and distillation algorithms
  • Scalability:
    • Multi-host distributed training
    • Optimized rollout with vLLM
  • User Guides:
    • More advanced RL recipes

Installation

This page distributes Tunix on PyPI as google_tunix 0.0.1. To use the latest development version, install directly from GitHub:

pip install git+https://github.com/google/tunix

Getting Started

To get started, see the detailed examples and tutorials.

To set up a Jupyter notebook on a single-host GCP TPU VM, please refer to the setup script.

We plan to provide clear, concise documentation and more examples in the near future.

Contributing and Feedback

We welcome contributions! As Tunix is in early development, the contribution process is still being formalized. A rough draft of the contribution process is available here. In the meantime, you can make feature requests, report issues, and ask questions in our Tunix GitHub discussion forum.

Collaborations and Partnerships

GRL (Game Reinforcement Learning), developed by Hao AI Lab at UCSD, is an open-source framework for post-training large language models through multi-turn RL on challenging games. In collaboration with Tunix, GRL integrates seamless TPU support, letting users quickly run scalable, reproducible RL experiments (such as PPO rollouts on Qwen2.5-0.5B-Instruct) on TPU v4 meshes with minimal setup. This partnership helps the community push LLM capabilities further by combining Tunix’s optimized TPU runtime with GRL’s flexible game RL pipeline for cutting-edge research and easy reproducibility.

Stay Tuned!

Thank you for your interest in Tunix. We're working hard to bring you a powerful and efficient library for LLM post-training. Please follow our progress and check back for updates!

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

google_tunix-0.0.1.tar.gz (150.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

google_tunix-0.0.1-py3-none-any.whl (209.4 kB view details)

Uploaded Python 3

File details

Details for the file google_tunix-0.0.1.tar.gz.

File metadata

  • Download URL: google_tunix-0.0.1.tar.gz
  • Upload date:
  • Size: 150.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for google_tunix-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       1759ddd4ab46a3a823afaac6ebe36d52dd00f57cb281d9d56f950a19f0de6548
MD5          c5ee6fa6bff8b21673cd247dcb445cde
BLAKE2b-256  0ad43de4204e298db14660a8a4ada97f1ced3efde54110fd10f50f1b8b898360

See more details on using hashes here.
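
As a rough illustration of how a published digest can be checked, the sketch below computes the SHA256 of a downloaded file with Python's standard `hashlib` module and compares it with the value listed above. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the commented example is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a local file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 for google_tunix-0.0.1.tar.gz, copied from the hash listing above.
EXPECTED_SDIST_SHA256 = (
    "1759ddd4ab46a3a823afaac6ebe36d52dd00f57cb281d9d56f950a19f0de6548"
)

# Example (assumes the archive was saved to the current directory):
# matches_published_hash("google_tunix-0.0.1.tar.gz", EXPECTED_SDIST_SHA256)
```

A mismatch means the local file differs from the one that was uploaded, so it should not be installed.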

File details

Details for the file google_tunix-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: google_tunix-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 209.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for google_tunix-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       b2dbab62647231d431cdd014ade41153cfcb84f7d9a468ef596baf590a4bbddd
MD5          d397dc72f72b06f25b4b8b3bb84f2ab3
BLAKE2b-256  b08bf160953a9e7347013520df7d5c1f2cf493a7c3a2fbe6e2f8d42fc5537c77

See more details on using hashes here.
