A package for fine-tuning text models.

Project description

Langtune: Large Language Models (LLMs) with Efficient LoRA Fine-Tuning for Text

Langtune provides modular components for text models and LoRA-based fine-tuning.
Adapt and fine-tune language models for a range of NLP tasks.

Features

LoRA adapters for parameter-efficient fine-tuning of LLMs
Modular transformer backbone
Model zoo for open-source language models
Configurable and extensible codebase
Checkpointing and resume support
Mixed precision and distributed training
Built-in metrics and visualization tools
CLI for fine-tuning and evaluation
Extensible callbacks (early stopping, logging, etc.)

Showcase

Langtune is a framework for building and fine-tuning large language models with LoRA support. It is suitable for tasks such as text classification, summarization, question answering, and other NLP applications.

Getting Started

Install with pip:

pip install langtune

Minimal example:

import torch
from langtune.models.llm import LanguageModel
from langtune.utils.config import default_config

# Dummy batch: 2 sequences of 128 token IDs in [0, 1000).
# This assumes default_config['vocab_size'] is at least 1000.
input_ids = torch.randint(0, 1000, (2, 128))

# Build the model from the packaged default hyperparameters.
model = LanguageModel(
    vocab_size=default_config['vocab_size'],
    embed_dim=default_config['embed_dim'],
    num_layers=default_config['num_layers'],
    num_heads=default_config['num_heads'],
    mlp_ratio=default_config['mlp_ratio'],
    lora_config=default_config['lora'],
)

# Forward pass without gradient tracking (inference only).
with torch.no_grad():
    out = model(input_ids)
    print('Output shape:', out.shape)

For more details, see the Documentation and src/langtune/cli/finetune.py.

Supported Python Versions

Python 3.8+

Why langtune?

Parameter-efficient fine-tuning with LoRA adapters
Modular transformer backbone for flexible model design
Unified interface for open-source language models
Designed for both research and production
Efficient memory usage for large models

Architecture Overview

Langtune uses a modular transformer backbone with LoRA adapters in the attention and MLP layers. Only the adapter weights are trained, so a pre-trained model can be adapted with far fewer trainable parameters than full fine-tuning.
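
To make the idea concrete, here is a minimal sketch of a LoRA-adapted linear layer in plain PyTorch. It is not Langtune's own LoRALinear: the class name `LoRALinearSketch`, its constructor arguments, and the initialization choices are assumptions made for illustration. The layer keeps a frozen base weight and adds a trainable low-rank update scaled by alpha / rank.

```python
import math

import torch
import torch.nn as nn


class LoRALinearSketch(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative only)."""

    def __init__(self, in_features: int, out_features: int,
                 rank: int = 16, alpha: float = 32.0):
        super().__init__()
        # Stand-in for a pre-trained weight; frozen so it receives no gradients.
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():
            p.requires_grad = False
        # Low-rank factors: A projects down to `rank`, B projects back up.
        self.lora_a = nn.Parameter(torch.empty(rank, in_features))
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W0 x + (alpha / rank) * B A x
        update = (x @ self.lora_a.T) @ self.lora_b.T
        return self.base(x) + self.scaling * update


layer = LoRALinearSketch(768, 768)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable {trainable} of {total} parameters")
```

Because B starts at zero, the adapted layer initially produces the same output as the frozen base layer, and training only moves the two small factor matrices.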

Model Data Flow

---
config:
  layout: dagre
---
flowchart TD
 subgraph LoRA_Adapters["LoRA Adapters in Attention and MLP"]
        LA1(["LoRA Adapter 1"])
        LA2(["LoRA Adapter 2"])
        LA3(["LoRA Adapter N"])
  end
    A(["Input Tokens"]) --> B(["Embedding Layer"])
    B --> C(["Positional Encoding"])
    C --> D1(["Encoder Layer 1"])
    D1 --> D2(["Encoder Layer 2"])
    D2 --> D3(["Encoder Layer N"])
    D3 --> E(["LayerNorm"])
    E --> F(["MLP Head"])
    F --> G(["Output Logits"])
    LA1 -.-> D1
    LA2 -.-> D2
    LA3 -.-> D3
     LA1:::loraStyle
     LA2:::loraStyle
     LA3:::loraStyle
    classDef loraStyle fill:#e1f5fe,stroke:#0277bd,stroke-width:2px

Core Modules

Module	Description	Key Features
Embedding	Token embedding and positional encoding	Configurable vocab size, position embeddings
TransformerEncoder	Multi-layer transformer backbone	Self-attention, LoRA integration, checkpointing
LoRALinear	Low-rank adaptation layers	Configurable rank, memory-efficient updates
MLPHead	Output projection layer	Classification, regression, dropout
Config System	Centralized configuration	YAML/JSON config, CLI overrides
Data Utils	Preprocessing and augmentation	Built-in tokenization, custom loaders

Performance & Efficiency

Metric	Full Fine-tuning	LoRA Fine-tuning	Improvement
Trainable Parameters	125M	3.2M	97% reduction
Memory Usage	16GB	5GB	69% reduction
Training Time	6h	2h	67% less time (3x faster)
Storage per Task	500MB	12MB	98% smaller

Benchmark setup: Transformer-Base on WikiText-103, RTX 3090.
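
The Improvement column can be rechecked from the two measured columns. The short script below does that arithmetic; the helper name `reduction_pct` is hypothetical, and the numbers are copied from the table rather than measured again.

```python
def reduction_pct(full: float, lora: float) -> float:
    """Percentage by which `lora` is smaller than `full`."""
    return 100.0 * (1.0 - lora / full)


# (metric, full fine-tuning, LoRA fine-tuning), values copied from the table
rows = [
    ("Trainable parameters (M)", 125, 3.2),
    ("Memory usage (GB)", 16, 5),
    ("Training time (h)", 6, 2),
    ("Storage per task (MB)", 500, 12),
]

for name, full, lora in rows:
    print(f"{name}: {reduction_pct(full, lora):.1f}% lower, {full / lora:.1f}x ratio")
```

Note that a drop from 6h to 2h is a 67% reduction in wall-clock time, which corresponds to a 3x speedup.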

Supported model sizes: Transformer-Tiny, Transformer-Small, Transformer-Base, Transformer-Large

Advanced Configuration

Example LoRA config:

lora_config = {
    "rank": 16,
    "alpha": 32,
    "dropout": 0.1,
    "target_modules": ["attention.qkv", "attention.proj", "mlp.fc1", "mlp.fc2"],
    "merge_weights": False
}
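
As a rough guide to what these settings imply, the sketch below computes the LoRA scaling factor (alpha / rank) and the number of adapter weights the config would add. The layer shapes are assumptions (embed_dim of 768, an MLP ratio of 4, a fused QKV projection, 12 layers); Langtune's actual shapes may differ, so the result is not expected to reproduce the figures in the performance table.

```python
lora_config = {
    "rank": 16,
    "alpha": 32,
    "dropout": 0.1,
    "target_modules": ["attention.qkv", "attention.proj", "mlp.fc1", "mlp.fc2"],
    "merge_weights": False,
}

# Assumed (in_features, out_features) for each target module in one layer.
embed_dim, mlp_ratio, num_layers = 768, 4, 12
shapes = {
    "attention.qkv": (embed_dim, 3 * embed_dim),
    "attention.proj": (embed_dim, embed_dim),
    "mlp.fc1": (embed_dim, mlp_ratio * embed_dim),
    "mlp.fc2": (mlp_ratio * embed_dim, embed_dim),
}

rank = lora_config["rank"]
scaling = lora_config["alpha"] / rank  # multiplier applied to the low-rank update

# Each adapted linear layer adds rank * (in_features + out_features) weights.
per_layer = sum(rank * sum(shapes[name]) for name in lora_config["target_modules"])
total = per_layer * num_layers
print(f"scaling={scaling}, adapter weights per layer={per_layer}, total={total}")
```

Doubling the rank doubles the adapter size; keeping alpha / rank fixed keeps the magnitude of the update roughly comparable across ranks.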

Example training config:

model:
  name: "transformer_base"
  vocab_size: 50257
  embed_dim: 768
  num_layers: 12
  num_heads: 12
training:
  epochs: 10
  batch_size: 32
  learning_rate: 1e-4
  weight_decay: 0.01
  warmup_steps: 1000
lora:
  rank: 16
  alpha: 32
  dropout: 0.1
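
Two of these values are easy to sanity-check by hand. The sketch below verifies that embed_dim divides evenly across the attention heads and shows what warmup_steps could mean in practice. The schedule shape (linear warmup, then a constant rate) and the helper name `lr_at_step` are assumptions; the source does not say which scheduler Langtune uses.

```python
def lr_at_step(step: int, base_lr: float = 1e-4, warmup_steps: int = 1000) -> float:
    """Linear warmup from 0 to base_lr, then constant (assumed schedule)."""
    if warmup_steps <= 0:
        return base_lr
    return base_lr * min(1.0, step / warmup_steps)


# Values taken from the training config above.
embed_dim, num_heads = 768, 12
if embed_dim % num_heads != 0:
    raise ValueError("embed_dim must be divisible by num_heads")
head_dim = embed_dim // num_heads
print("head_dim:", head_dim)

for step in (0, 250, 500, 1000, 5000):
    print(step, lr_at_step(step))
```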

Testing & Quality

Run tests:

pytest tests/

Code quality tools:

flake8 src/
black src/ --check
mypy src/
bandit -r src/

Examples & Use Cases

Text classification:

from langtune import LanguageModel
from langtune.datasets import TextClassificationDataset

model = LanguageModel.from_pretrained("transformer_base")
dataset = TextClassificationDataset(train=True, tokenizer=model.tokenizer)
model.finetune(dataset, epochs=10, lora_rank=16)

Custom dataset:

from langtune.datasets import CustomTextDataset

# Reuses `model` from the text classification example above.
dataset = CustomTextDataset(
    file_path="/path/to/dataset.txt",
    split="train",
    tokenizer=model.tokenizer
)
model.finetune(dataset, config_path="configs/custom_config.yaml")

Extending the Framework

Add datasets in src/langtune/data/datasets.py
Add callbacks in src/langtune/callbacks/
Add models in src/langtune/models/
Add CLI tools in src/langtune/cli/
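
As an illustration of the kind of callback one might add, here is a minimal early-stopping sketch in plain Python. The class name, the `on_epoch_end` hook, and its signature are hypothetical; check the existing modules in src/langtune/callbacks/ for the interface Langtune actually expects.

```python
class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def on_epoch_end(self, val_loss: float) -> bool:
        """Record one epoch's loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Simulated validation losses standing in for a real training loop.
stopper = EarlyStopping(patience=2)
losses = [1.0, 0.8, 0.81, 0.79, 0.80, 0.795]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.on_epoch_end(loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at, "with best loss", stopper.best)
```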

Documentation

See code comments and docstrings for details.
For advanced usage, see src/langtune/cli/finetune.py.

Contributing

We welcome contributions. See the Contributing Guide for details.

License & Citation

This project is licensed under the MIT License. See LICENSE for details.

If you use langtune in your research, please cite:

@software{langtune2025,
  author = {Pritesh Raj},
  title = {langtune: LLMs with Efficient LoRA Fine-Tuning},
  url = {https://github.com/langtrain-ai/langtune},
  year = {2025},
  version = {0.1.0}
}

Acknowledgements

We thank the following projects and communities:

Made in India 🇮🇳 with ❤️ by the langtune team
Star ⭐ this repo if you find it useful!

Project details

Release history

Version	Release date
0.1.42	May 17, 2026
0.1.41	Feb 24, 2026
0.1.40	Feb 18, 2026
0.1.39	Feb 18, 2026
0.1.38	Feb 18, 2026
0.1.37	Feb 17, 2026
0.1.36	Feb 16, 2026
0.1.35	Jan 20, 2026
0.1.34	Jan 19, 2026
0.1.33	Jan 18, 2026
0.1.32	Jan 10, 2026
0.1.31	Jan 10, 2026
0.1.30	Jan 10, 2026
0.1.29	Jan 10, 2026
0.1.28	Jan 10, 2026
0.1.27	Jan 10, 2026
0.1.26	Jan 10, 2026
0.1.25	Jan 10, 2026
0.1.24	Jan 10, 2026
0.1.23	Jan 10, 2026
0.1.22	Jan 9, 2026
0.1.21	Jan 4, 2026
0.1.20	Jan 4, 2026
0.1.19	Jan 4, 2026
0.1.18	Jan 4, 2026
0.1.17	Jan 4, 2026
0.1.16	Jan 4, 2026
0.1.15	Jan 4, 2026
0.1.14	Jan 4, 2026
0.1.13	Jan 4, 2026
0.1.12	Jan 4, 2026
0.1.11	Jan 4, 2026
0.1.1	Jul 3, 2025
0.1.0	Jul 3, 2025
0.0.3	Jul 3, 2025
0.0.2	Jul 3, 2025 (this version)

Download files

Source Distribution

langtune-0.0.2.tar.gz (7.0 kB)

Uploaded Jul 3, 2025 Source

Built Distribution

langtune-0.0.2-py3-none-any.whl (6.7 kB)

Uploaded Jul 3, 2025 Python 3

File details

Details for the file langtune-0.0.2.tar.gz.

File metadata

Download URL: langtune-0.0.2.tar.gz
Upload date: Jul 3, 2025
Size: 7.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for langtune-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`27c004bb455c8a648776e7b0fa0035fd066d1c5308635499418ecd78e43c4646`
MD5	`a83c308056d3d92cf82b02349f893e0f`
BLAKE2b-256	`ce815778e6cf5f6b2edb53034f753d93463096c202a75ad934f623399bcafcfb`

File details

Details for the file langtune-0.0.2-py3-none-any.whl.

File metadata

Download URL: langtune-0.0.2-py3-none-any.whl
Upload date: Jul 3, 2025
Size: 6.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for langtune-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`15949b3748b6edbdff93b350096fc10da6e67c20c6b7329fdf575f2fcec5e1de`
MD5	`22c180bebedbecb2aae64cf90e6e4cf1`
BLAKE2b-256	`befab5662c01a1ffaf6a6c14f547c1c8c4c13be4761877ec2e2c810143e89646`
