
A PyTorch implementation of transformer-based language models including GPT architecture for pretraining and fine-tuning

This project has been archived by its maintainers. No new releases are expected.


Language Modeling using Transformers (LMT)


A PyTorch implementation of transformer-based language models including GPT architecture for pretraining and fine-tuning. This project is designed for educational and research purposes to help users understand how the attention mechanism and Transformer architecture work in Large Language Models (LLMs).

🚀 Features

  • GPT Architecture: Complete implementation of decoder-only transformer models
  • Attention Mechanisms: Multi-head self-attention with causal masking
  • Tokenization: Multiple tokenizer implementations (BPE, Naive)
  • Training Pipeline: Comprehensive trainer with pretraining and fine-tuning support
  • Educational Focus: Well-documented code for learning transformer internals
  • Modern Stack: Built with PyTorch 2.7+, Python 3.11+

📦 Installation

Prerequisites

  • Python 3.11 or 3.12
  • PyTorch 2.7+

Install from PyPI

pip install language-modeling-transformers

Install from GitHub

pip install git+https://github.com/michaelellis003/LMT.git

๐Ÿƒโ€โ™‚๏ธ Quick Start

Basic Model Usage

from lmt import GPT, ModelConfig
from lmt.models.config import ModelConfigPresets
import torch

# Create a small GPT model
config = ModelConfigPresets.small_gpt()
model = GPT(config)

# Generate some text
input_ids = torch.randint(0, config.vocab_size, (1, 10))
with torch.no_grad():
    logits = model(input_ids)
    print(f"Output shape: {logits.shape}")  # (1, 10, vocab_size)

Training a Model

from lmt import Trainer, GPT
from lmt.training import BaseTrainingConfig
from lmt.models.config import ModelConfigPresets

# Configure model and training
model_config = ModelConfigPresets.small_gpt()
training_config = BaseTrainingConfig(
    num_epochs=10,
    batch_size=4,
    learning_rate=1e-4
)

# Initialize model and trainer
model = GPT(model_config)
trainer = Trainer(
    model=model,
    train_loader=your_train_loader,
    val_loader=your_val_loader,
    config=training_config
)

# Start training
trainer.train()
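The `Trainer` above expects `your_train_loader` and `your_val_loader` to yield batches of token IDs. For pretraining, the usual way to build those is a sliding-window dataset: each window of tokens is paired with the same window shifted one position right, so the model learns next-token prediction. A minimal pure-Python sketch of that windowing logic (an illustrative helper, not part of lmt; in practice you would wrap the pairs in a PyTorch `Dataset` and `DataLoader`):

```python
def sliding_windows(token_ids, context_length, stride):
    """Split a token stream into (input, target) pairs for next-token
    prediction: each target is the input shifted one position right."""
    pairs = []
    for start in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[start : start + context_length]
        targets = token_ids[start + 1 : start + context_length + 1]
        pairs.append((inputs, targets))
    return pairs

tokens = list(range(10))  # stand-in for tokenized text
pairs = sliding_windows(tokens, context_length=4, stride=4)
print(pairs[0])  # ([0, 1, 2, 3], [1, 2, 3, 4])
```

Setting `stride` equal to `context_length` gives non-overlapping windows; a smaller stride produces overlapping windows and more training pairs at the cost of some redundancy.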

Using the Training Script

# Pretraining
python scripts/train.py --task pretraining --num_epochs 20 --batch_size 4

# Classification fine-tuning
python scripts/train.py --task classification --download_model --learning_rate 1e-5

📚 Documentation

Model Components

  • GPT: Main model class implementing decoder-only transformer
  • TransformerBlock: Individual transformer layer with attention and feed-forward
  • MultiHeadAttention: Multi-head self-attention mechanism
  • CausalAttention: Attention with causal masking for autoregressive generation
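The causal masking mentioned above is a lower-triangular constraint: query position i may only attend to key positions j ≤ i, so the model can never look at future tokens during autoregressive generation. A plain-Python illustration of the mask's shape (lmt's actual implementation operates on PyTorch tensors; this is just the concept):

```python
def causal_mask(seq_len):
    """True where attention is allowed: query i may look at keys j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

for row in causal_mask(4):
    print(["x" if ok else "." for ok in row])
# ['x', '.', '.', '.']
# ['x', 'x', '.', '.']
# ['x', 'x', 'x', '.']
# ['x', 'x', 'x', 'x']
```

In practice the disallowed positions are set to -inf before the softmax so their attention weights become zero.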

Tokenizers

  • BPETokenizer: Byte-Pair Encoding tokenizer
  • NaiveTokenizer: Simple character-level tokenizer
  • BaseTokenizer: Abstract base class for custom tokenizers
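To illustrate what the simplest of these does: a character-level tokenizer maps each distinct character to an integer ID and back, so encode followed by decode is a lossless round-trip. A toy sketch of the idea (the class name and methods here are illustrative, not lmt's API):

```python
class CharTokenizer:
    """Toy character-level tokenizer: one ID per distinct character."""

    def __init__(self, text):
        vocab = sorted(set(text))
        self.char_to_id = {ch: i for i, ch in enumerate(vocab)}
        self.id_to_char = {i: ch for i, ch in enumerate(vocab)}

    def encode(self, text):
        return [self.char_to_id[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.id_to_char[i] for i in ids)

tok = CharTokenizer("hello world")
ids = tok.encode("hello")
assert tok.decode(ids) == "hello"  # round-trip
```

BPE improves on this by iteratively merging the most frequent adjacent pairs into multi-character tokens, trading a larger vocabulary for shorter sequences.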

Training

  • Trainer: Main training orchestrator with support for pretraining and fine-tuning
  • BaseTrainingConfig: Configuration class for training parameters
  • Custom datasets and dataloaders: Support for various text datasets

๐Ÿ—‚๏ธ Project Structure

src/lmt/
├── __init__.py              # Main package exports
├── models/                  # Model architectures
│   ├── gpt/                # GPT implementation
│   ├── config.py           # Model configuration
│   └── utils.py            # Model utilities
├── layers/                  # Neural network layers
│   ├── attention/          # Attention mechanisms
│   └── transformers/       # Transformer blocks
├── tokenizer/              # Tokenization implementations
├── training/               # Training pipeline
└── generate.py             # Text generation utilities

scripts/
├── train.py                # Main training script
└── utils.py                # Training utilities

tests/                      # Comprehensive test suite
notebooks/                  # Educational Jupyter notebooks
docs/                       # Sphinx documentation
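generate.py is listed above as the text-generation entry point. Whatever the implementation details, the core of autoregressive generation is a loop that feeds the growing sequence back into the model and appends a next token; the simplest strategy is greedy decoding, which always takes the argmax. A sketch of that loop with a stand-in model (hypothetical names, not lmt's API):

```python
def greedy_generate(next_token_logits, input_ids, max_new_tokens):
    """Autoregressive greedy decoding: at each step, append the argmax
    of the model's logits for the next position."""
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # model forward pass
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
    return ids

def dummy_model(ids, vocab_size=5):
    """Stand-in 'model' that always favors token (last_id + 1) % vocab_size."""
    logits = [0.0] * vocab_size
    logits[(ids[-1] + 1) % vocab_size] = 1.0
    return logits

print(greedy_generate(dummy_model, [0], 4))  # [0, 1, 2, 3, 4]
```

Real generation code typically adds temperature scaling or top-k sampling in place of the argmax to produce more varied text.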

📊 Examples and Notebooks

Explore the interactive notebooks in the notebooks/ directory:

  • attention.ipynb: Understanding attention mechanisms
  • pretraining_gpt.ipynb: GPT pretraining walkthrough
  • tokenizer.ipynb: Tokenization techniques

🔧 Configuration

Model Configuration

from lmt.models.config import ModelConfig

config = ModelConfig(
    vocab_size=50257,
    embed_dim=768,
    context_length=1024,
    num_layers=12,
    num_heads=12,
    dropout=0.1
)

Training Configuration

from lmt.training.config import BaseTrainingConfig

training_config = BaseTrainingConfig(
    num_epochs=10,
    batch_size=8,
    learning_rate=3e-4,
    weight_decay=0.1,
    print_every=100,
    eval_every=500
)

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Download files

Source distribution: language_modeling_transformers-0.2.8.tar.gz (24.5 kB)
Built distribution: language_modeling_transformers-0.2.8-py3-none-any.whl (37.5 kB)
