A package for finetuning text models.

Langtune: Efficient LoRA Fine-Tuning for Text LLMs

Langtune is a Python package for fine-tuning large language models on text data using LoRA (Low-Rank Adaptation).
It provides modular components for adapting language models to various NLP tasks.

Features

LoRA adapters for efficient fine-tuning
Modular transformer backbone
Model zoo for language models
Configurable and extensible codebase
Checkpointing and resume
Mixed precision and distributed training
Metrics and visualization tools
CLI for training and evaluation
Callback support (early stopping, logging, etc.)

Showcase

Langtune is intended for building and fine-tuning large language models with LoRA. It can be used for text classification, summarization, question answering, and other NLP tasks.

Getting Started

Install:

pip install langtune

Example usage:

import torch
from langtune.models.llm import LanguageModel
from langtune.utils.config import default_config

# Dummy batch: 2 sequences of 128 random token ids in [0, 1000).
# This assumes default_config['vocab_size'] is at least 1000.
input_ids = torch.randint(0, 1000, (2, 128))

# Build the model from the packaged default configuration.
model = LanguageModel(
    vocab_size=default_config['vocab_size'],
    embed_dim=default_config['embed_dim'],
    num_layers=default_config['num_layers'],
    num_heads=default_config['num_heads'],
    mlp_ratio=default_config['mlp_ratio'],
    lora_config=default_config['lora'],
)

# Inference-only forward pass; no gradients are tracked.
with torch.no_grad():
    out = model(input_ids)
    print('Output shape:', out.shape)

See the Documentation and src/langtune/cli/finetune.py for more details.

Supported Python Versions

Python 3.8 or newer

Why langtune?

Fine-tuning with LoRA adapters
Modular transformer design
Unified interface for language models
Suitable for research and production
Efficient memory usage

Architecture Overview

Langtune uses a transformer backbone with LoRA adapters in attention and MLP layers. Because only the small adapter matrices are trained while the pre-trained weights stay frozen, a model can be adapted with far fewer trainable parameters.
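To make the adapter idea concrete, here is a minimal PyTorch sketch of a LoRA-style linear layer. It is an illustration only: the class name `LoRALinearSketch`, its arguments, and its initialization are assumptions and are not taken from langtune's own `LoRALinear` module.

```python
import math

import torch
import torch.nn as nn


class LoRALinearSketch(nn.Module):
    """Illustrative LoRA wrapper around a frozen nn.Linear (hypothetical, not langtune's LoRALinear)."""

    def __init__(self, in_features: int, out_features: int, rank: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # Freeze the base weights; only the low-rank factors are trained.
        for p in self.base.parameters():
            p.requires_grad = False
        # A gets a small random init and B starts at zero,
        # so the adapter contributes nothing before training.
        self.lora_a = nn.Parameter(torch.empty(rank, in_features))
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + b + (alpha / rank) * B A x
        update = (x @ self.lora_a.t()) @ self.lora_b.t()
        return self.base(x) + self.scaling * update


layer = LoRALinearSketch(768, 768, rank=16, alpha=32.0)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable {trainable} of {total} parameters")
```

With these assumed shapes, the two rank-16 factors are the only parameters that receive gradients, which is where the reduction in trainable parameters comes from.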

Model Data Flow

---
config:
  layout: dagre
---
flowchart TD
 subgraph LoRA_Adapters["LoRA Adapters in Attention and MLP"]
        LA1(["LoRA Adapter 1"])
        LA2(["LoRA Adapter 2"])
        LA3(["LoRA Adapter N"])
  end
    A(["Input Tokens"]) --> B(["Embedding Layer"])
    B --> C(["Positional Encoding"])
    C --> D1(["Encoder Layer 1"])
    D1 --> D2(["Encoder Layer 2"])
    D2 --> D3(["Encoder Layer N"])
    D3 --> E(["LayerNorm"])
    E --> F(["MLP Head"])
    F --> G(["Output Logits"])
    LA1 -.-> D1
    LA2 -.-> D2
    LA3 -.-> D3
     LA1:::loraStyle
     LA2:::loraStyle
     LA3:::loraStyle
    classDef loraStyle fill:#e1f5fe,stroke:#0277bd,stroke-width:2px

Core Modules

Module	Description	Key Features
Embedding	Token embedding and positional encoding	Configurable vocab size, position embeddings
TransformerEncoder	Multi-layer transformer backbone	Self-attention, LoRA integration, checkpointing
LoRALinear	Low-rank adaptation layers	Configurable rank, memory-efficient updates
MLPHead	Output projection layer	Classification, regression, dropout
Config System	Centralized configuration	YAML/JSON config, CLI overrides
Data Utils	Preprocessing and augmentation	Built-in tokenization, custom loaders

Performance & Efficiency

Metric	Full Fine-tuning	LoRA Fine-tuning	Improvement
Trainable Parameters	125M	3.2M	97% reduction
Memory Usage	16GB	5GB	69% reduction
Training Time	6h	2h	67% shorter
Storage per Task	500MB	12MB	98% smaller

Benchmark setup: Transformer-Base model, WikiText-103 dataset, RTX 3090 GPU.
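The Improvement column follows from the two measurement columns. The short sketch below recomputes each reduction from the figures in the table; the helper name `reduction_pct` is hypothetical and used only for illustration.

```python
def reduction_pct(full: float, lora: float) -> float:
    """Percentage by which the LoRA figure is smaller than the full fine-tuning figure."""
    return (1.0 - lora / full) * 100.0


# (full fine-tuning, LoRA fine-tuning) pairs copied from the table above.
rows = {
    "Trainable Parameters (M)": (125.0, 3.2),
    "Memory Usage (GB)": (16.0, 5.0),
    "Training Time (h)": (6.0, 2.0),
    "Storage per Task (MB)": (500.0, 12.0),
}

for name, (full, lora) in rows.items():
    print(f"{name}: {reduction_pct(full, lora):.1f}% reduction, {full / lora:.1f}x ratio")
```

Rounded to whole percentages, the results agree with the table. The training-time row is a 67% reduction in wall-clock time, which corresponds to a 3x speedup.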

Supported model sizes: Transformer-Tiny, Transformer-Small, Transformer-Base, Transformer-Large

Advanced Configuration

Example LoRA config:

lora_config = {
    "rank": 16,
    "alpha": 32,
    "dropout": 0.1,
    "target_modules": ["attention.qkv", "attention.proj", "mlp.fc1", "mlp.fc2"],
    "merge_weights": False
}
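As a rough guide to what `rank` and `alpha` control, the sketch below computes the LoRA scaling factor (`alpha / rank`) and the number of adapter parameters for one transformer layer. The layer shapes are assumptions (a fused QKV projection, an MLP expansion factor of 4, `embed_dim=768`, 12 layers), so the totals are illustrative and need not match the benchmark figures reported above.

```python
lora_config = {"rank": 16, "alpha": 32}

# Hypothetical (in_features, out_features) shapes for one transformer layer,
# assuming embed_dim=768, a fused QKV projection, and an MLP ratio of 4.
embed_dim = 768
shapes = {
    "attention.qkv": (embed_dim, 3 * embed_dim),
    "attention.proj": (embed_dim, embed_dim),
    "mlp.fc1": (embed_dim, 4 * embed_dim),
    "mlp.fc2": (4 * embed_dim, embed_dim),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Parameters in the two low-rank factors: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


scaling = lora_config["alpha"] / lora_config["rank"]
per_layer = sum(lora_params(i, o, lora_config["rank"]) for i, o in shapes.values())
num_layers = 12  # assumed depth

print("scaling:", scaling)
print("adapter parameters per layer:", per_layer)
print("adapter parameters for", num_layers, "layers:", per_layer * num_layers)
```

Doubling `rank` doubles the adapter parameter count, and the scaling factor halves unless `alpha` is raised with it.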

Example training config:

model:
  name: "transformer_base"
  vocab_size: 50257
  embed_dim: 768
  num_layers: 12
  num_heads: 12
training:
  epochs: 10
  batch_size: 32
  learning_rate: 1e-4
  weight_decay: 0.01
  warmup_steps: 1000
lora:
  rank: 16
  alpha: 32
  dropout: 0.1
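Before training, it can help to sanity-check a parsed config. The sketch below works on a plain dictionary that mirrors the YAML above; the function name `check_config` and the specific checks are assumptions, not part of langtune. One practical detail: some YAML 1.1 loaders (PyYAML is a known example) read `1e-4` as a string because it has no decimal point, so the sketch coerces numeric fields explicitly.

```python
def check_config(cfg: dict) -> dict:
    """Coerce numeric fields and run basic sanity checks on a parsed config (illustrative)."""
    model, training, lora = cfg["model"], cfg["training"], cfg["lora"]

    # float() accepts both a real float and a string such as "1e-4".
    training["learning_rate"] = float(training["learning_rate"])
    training["weight_decay"] = float(training["weight_decay"])

    if model["embed_dim"] % model["num_heads"] != 0:
        raise ValueError("embed_dim must be divisible by num_heads")
    if not 0 < lora["rank"] <= model["embed_dim"]:
        raise ValueError("lora rank must be between 1 and embed_dim")
    if not 0.0 <= lora["dropout"] < 1.0:
        raise ValueError("lora dropout must be in [0, 1)")
    return cfg


# Same values as the YAML example, with learning_rate left as a string
# to mimic a loader that did not parse it as a float.
cfg = check_config({
    "model": {"name": "transformer_base", "vocab_size": 50257,
              "embed_dim": 768, "num_layers": 12, "num_heads": 12},
    "training": {"epochs": 10, "batch_size": 32, "learning_rate": "1e-4",
                 "weight_decay": 0.01, "warmup_steps": 1000},
    "lora": {"rank": 16, "alpha": 32, "dropout": 0.1},
})
head_dim = cfg["model"]["embed_dim"] // cfg["model"]["num_heads"]
print("learning_rate:", cfg["training"]["learning_rate"], "head_dim:", head_dim)
```

Writing the value as `1.0e-4` in the YAML file avoids the string issue in such loaders.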

Testing & Quality

Run tests:

pytest tests/

Code quality tools:

flake8 src/
black src/ --check
mypy src/
bandit -r src/

Examples & Use Cases

Text classification:

from langtune import LanguageModel
from langtune.datasets import TextClassificationDataset

model = LanguageModel.from_pretrained("transformer_base")
dataset = TextClassificationDataset(train=True, tokenizer=model.tokenizer)
model.finetune(dataset, epochs=10, lora_rank=16)

Custom dataset:

from langtune.datasets import CustomTextDataset

# Reuses `model` from the text classification example above.
dataset = CustomTextDataset(
    file_path="/path/to/dataset.txt",
    split="train",
    tokenizer=model.tokenizer
)
model.finetune(dataset, config_path="configs/custom_config.yaml")

Extending the Framework

Add datasets in src/langtune/data/datasets.py
Add callbacks in src/langtune/callbacks/
Add models in src/langtune/models/
Add CLI tools in src/langtune/cli/
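As an example of the kind of logic a custom callback might hold, here is a standalone early-stopping helper. It is a sketch only: the class `EarlyStopping` and its `step` method are hypothetical and do not reflect langtune's actual callback interface, which should be checked in src/langtune/callbacks/.

```python
class EarlyStopping:
    """Hypothetical standalone early-stopping helper (not langtune's callback API)."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopping(patience=2)
losses = [1.00, 0.80, 0.79, 0.81, 0.82, 0.70]  # made-up validation losses
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch:", stopped_at, "best loss:", stopper.best)
```

In this run the loss stops improving after the third value, so the loop exits two epochs later and never reaches the final entry.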

Documentation

See code comments and docstrings for details.
For advanced usage, see src/langtune/cli/finetune.py.

Contributing

Contributions are welcome. See the Contributing Guide for details.

License

This project is licensed under the MIT License. See LICENSE for details.

Citation

If you use langtune in your research, please cite:

@software{langtune2025,
  author = {Pritesh Raj},
  title = {langtune: LLMs with Efficient LoRA Fine-Tuning},
  url = {https://github.com/langtrain-ai/langtune},
  year = {2025},
  version = {0.1.0}
}

Acknowledgements

Made in India 🇮🇳 with ❤️ by the langtune team
Star ⭐ this repo if you find it useful!
