OpenLanguageModel (OLM): a modular PyTorch LLM library for building, training, teaching, and researching transformer language models.
OpenLanguageModel (OLM)
OpenLanguageModel is a PyTorch-native library for building, training, teaching, and researching transformer language models. It is designed for people who want the model architecture to stay visible while the training stack stays manageable.
OLM gives you:
- readable transformer components in olm.nn
- implemented model families in olm.models
- local and Hugging Face dataset streams in olm.data
- single-device, single-node multi-GPU DDP/FSDP, AMP, checkpointing, callbacks, and automatic trainer selection in olm.train
Why OLM
Most language-model libraries either hide the architecture behind configuration or make you rebuild the whole training path from scratch. OLM sits in the middle: every block is an ordinary torch.nn.Module, but data loading, optimization, mixed precision, single-node multi-GPU training, checkpointing, and logging are already wired together.
That makes it useful for:
- students learning how language models are assembled and trained
- researchers running ablations on attention, norms, feed-forward layers, and residual structure
- practitioners who want existing PyTorch workflows without a hidden runtime
Llama 3 Block In OLM
Model code in OLM is meant to read like the architecture it represents. For example, the Llama 3 block is built from RMSNorm, grouped-query attention, SwiGLU, and explicit residual structure:
from olm.nn.structure import Block
from olm.nn.structure.combinators import Residual
from olm.nn.attention import GroupedQueryAttention
from olm.nn.feedforward import SwiGLUFFN
from olm.nn.norms import RMSNorm


class Llama3Block(Block):
    def __init__(
        self,
        embed_dim: int,
        intermediate_size: int,
        num_heads: int,
        num_kv_heads: int,
        max_seq_len: int,
        dropout: float,
        rope_theta: float,
    ):
        super().__init__([
            # Attention sublayer: normalize, attend, add the input back.
            Residual(Block([
                RMSNorm(embed_dim, eps=1e-5),
                GroupedQueryAttention(
                    embed_dim,
                    num_heads,
                    num_kv_heads,
                    max_seq_len,
                    dropout=dropout,
                    rope_theta=rope_theta,
                    use_bias=False,
                ),
            ])),
            # Feed-forward sublayer: normalize, apply SwiGLU, add the input back.
            Residual(Block([
                RMSNorm(embed_dim, eps=1e-5),
                SwiGLUFFN(
                    embed_dim,
                    hidden_dim=intermediate_size,
                    dropout=dropout,
                    bias=False,
                ),
            ])),
        ])
Source: src/olm/models/meta/llama3.py
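To make the pre-norm residual pattern concrete, here is a minimal, dependency-free sketch of what each Residual(Block([RMSNorm, sublayer])) pair computes on a single vector: x + sublayer(rms_norm(x)). The helper names (rms_norm, residual, toy_sublayer, pre_norm_block) are illustrative assumptions, not OLM APIs, and the toy sublayer only stands in for attention or the feed-forward network.

```python
import math
from typing import Callable, List

Vector = List[float]


def rms_norm(x: Vector, weight: Vector, eps: float = 1e-5) -> Vector:
    """Scale x by the reciprocal of its root mean square, then by a per-element gain."""
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def residual(fn: Callable[[Vector], Vector]) -> Callable[[Vector], Vector]:
    """Wrap fn so the result is x + fn(x), mirroring a residual combinator."""
    def wrapped(x: Vector) -> Vector:
        return [a + b for a, b in zip(x, fn(x))]
    return wrapped


def toy_sublayer(x: Vector) -> Vector:
    """Hypothetical stand-in for attention or SwiGLU: halves every element."""
    return [0.5 * v for v in x]


def pre_norm_block(x: Vector) -> Vector:
    """Normalize first, apply the sublayer, then add the untouched input back."""
    gain = [1.0] * len(x)  # identity gain, as at initialization
    return residual(lambda h: toy_sublayer(rms_norm(h, gain)))(x)


x = [3.0, 4.0]
normed = rms_norm(x, [1.0, 1.0])  # unit root mean square, up to eps
out = pre_norm_block(x)           # x plus half of the normalized x
```

Because normalization sits inside the residual branch, the skip path carries the input through unchanged, which is the property the explicit Residual wrapper makes visible in the model code.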
Train With The Stack Connected
You can keep the model and optimizer as normal PyTorch objects while OLM handles the training loop details:
import torch
from olm.data.datasets import DataLoader, FineWebEduDataset
from olm.data.tokenization import HFTokenizer
from olm.models.openai import GPT2Model
from olm.train import AutoTrainer
from olm.train.optim import AdamW

tokenizer = HFTokenizer("gpt2")
dataset = FineWebEduDataset(tokenizer, context_length=1024, streaming=True)
loader = DataLoader(dataset, batch_size=8, num_workers=4)

model = GPT2Model(
    vocab_size=tokenizer.vocab_size,
    embed_dim=768,
    num_layers=12,
    num_heads=12,
    max_seq_len=1024,
)

trainer = AutoTrainer(
    model,
    AdamW,
    loader,
    device="auto",
    context_length=1024,
    learning_rate=3e-4,
    grad_accum_steps=8,
)
trainer.train(epochs=1, max_steps=1000)
AutoTrainer chooses between CPU, single-GPU, and single-node multi-GPU DDP/FSDP paths based on the hardware and model. You can still use Trainer, DDPTrainer, or FSDPTrainer directly when you want explicit control.
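As a rough sizing aid for the configuration above, the sketch below works out how many tokens feed each optimizer update. It assumes batch_size is per device, that every sequence is exactly context_length tokens long, and that gradients are accumulated over grad_accum_steps micro-batches before each update. None of these assumptions is confirmed here, so check them against docs/datasets-and-training.md. tokens_per_update is an illustrative helper, not part of OLM.

```python
def tokens_per_update(
    batch_size: int,
    context_length: int,
    grad_accum_steps: int,
    num_devices: int = 1,
) -> int:
    """Tokens consumed per optimizer update under the assumptions stated above."""
    return batch_size * context_length * grad_accum_steps * num_devices


# Values taken from the training example: batch_size=8, context_length=1024,
# grad_accum_steps=8.
single_gpu = tokens_per_update(8, 1024, 8)                 # 65_536
four_gpus = tokens_per_update(8, 1024, 8, num_devices=4)   # 262_144
```

Under the same assumptions, moving from one GPU to four with data parallelism multiplies the tokens per update by four, so the learning rate or grad_accum_steps may need revisiting when the trainer path changes.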
Implemented Model Families
OLM includes named presets and configurable base classes for common transformer families:
| Family | Source |
|---|---|
| GPT-2 | src/olm/models/openai/gpt2.py |
| Llama 2 | src/olm/models/meta/llama2.py |
| Llama 3 / 3.1 / 3.2 | src/olm/models/meta/llama3.py |
| Qwen 2.5 | src/olm/models/alibaba/qwen2.py |
| Phi-3 / Phi-3.5 | src/olm/models/microsoft/phi3.py |
| Phi-4 | src/olm/models/microsoft/phi4.py |
| Gemma 2 | src/olm/models/google/gemma2.py |
| OLMo | src/olm/models/allenai/olmo.py |
| OPT | src/olm/models/facebook/opt.py |
See docs/api.md for the generated API reference and examples/ for training scripts.
Installation
Use Python 3.10, 3.11, or 3.12.
git clone https://github.com/openlanguagemodel/openlanguagemodel.git
cd openlanguagemodel
pip install -e .
For development:
pip install -e ".[dev]"
pytest tests
Optional extras:
pip install -e ".[wandb]" # Weights & Biases logging
pip install -e ".[docs]" # documentation tooling
See docs/installation.md for dependency and release-build details.
Documentation Flow
- Install from docs/installation.md
- Start with docs/getting-started.md
- Review architecture concepts in docs/architecture.md
- Run guided Colabs from docs/colab-notebooks.md
- Train from examples in examples/
- Use docs/datasets-and-training.md for data, trainer, AutoTrainer, callbacks, and checkpointing
- Use docs/api.md when you need exact signatures and source-defined methods
- Read docs/release-v2.2.0.md for the v2.2 release notes
Project Status
OLM v2.2 is the stabilization and release-readiness pass: tied output embeddings by default, model-family smoke coverage, AutoTrainer, streaming datasets, AMP, checkpointing, single-node DDP/FSDP paths, clearer installation docs, and a stronger generated API reference. Multi-node training remains a v4 roadmap item.
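To illustrate what tied output embeddings change in practice, the following sketch counts parameters for the standard GPT-2 small layout (learned positional embeddings, biased linear layers, 4x feed-forward width, vocabulary of 50257). It assumes that layout rather than reading it from OLM's GPT2Model, so treat the numbers as a reference point for the published architecture, not as a statement about OLM internals. gpt2_param_count is a hypothetical helper.

```python
def gpt2_param_count(
    vocab_size: int,
    embed_dim: int,
    num_layers: int,
    max_seq_len: int,
    tied: bool = True,
) -> int:
    """Count parameters of a standard GPT-2 style decoder."""
    token_emb = vocab_size * embed_dim
    pos_emb = max_seq_len * embed_dim
    layer_norm = 2 * embed_dim  # gain and bias
    # Fused QKV projection plus output projection, both with bias.
    attn = (embed_dim * 3 * embed_dim + 3 * embed_dim) + (embed_dim * embed_dim + embed_dim)
    hidden = 4 * embed_dim
    # Up and down projections of the feed-forward network, both with bias.
    mlp = (embed_dim * hidden + hidden) + (hidden * embed_dim + embed_dim)
    per_layer = 2 * layer_norm + attn + mlp
    total = token_emb + pos_emb + num_layers * per_layer + layer_norm  # final norm
    if not tied:
        total += vocab_size * embed_dim  # separate, bias-free output projection
    return total


tied = gpt2_param_count(50257, 768, 12, 1024)                 # 124_439_808
untied = gpt2_param_count(50257, 768, 12, 1024, tied=False)
saved = untied - tied                                         # 38_597_376
```

At this scale the output projection would be roughly 30% as large as the rest of the model, which is why sharing it with the token embedding is a common default for small models.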
Citation
@software{openlanguagemodel2026,
title = {OpenLanguageModel},
author = {Tavish Mankash and Vardhaman Kalloli and Keshava Prasad},
year = {2026},
url = {https://github.com/openlanguagemodel/openlanguagemodel}
}
License
MIT. See LICENSE.