

Language Modeling using Transformers (LMT)


A PyTorch implementation of transformer-based language models including GPT architecture for pretraining and fine-tuning. This project is designed for educational and research purposes to help users understand how the attention mechanism and Transformer architecture work in Large Language Models (LLMs).

🚀 Features

  • GPT Architecture: Complete implementation of decoder-only transformer models
  • Attention Mechanisms: Multi-head self-attention with causal masking
  • Tokenization: Multiple tokenizer implementations (BPE, Naive)
  • Training Pipeline: Comprehensive trainer with pretraining and fine-tuning support
  • Educational Focus: Well-documented code for learning transformer internals
  • Modern Stack: Built with PyTorch 2.7+, Python 3.11+

📦 Installation

Prerequisites

  • Python 3.11 or 3.12
  • PyTorch 2.7+

Install from PyPI

pip install language-modeling-transformers

Install from GitHub

pip install git+https://github.com/michaelellis003/LMT.git

๐Ÿƒโ€โ™‚๏ธ Quick Start

Basic Model Usage

from lmt import GPT, ModelConfig
from lmt.models.config import ModelConfigPresets
import torch

# Create a small GPT model
config = ModelConfigPresets.small_gpt()
model = GPT(config)

# Generate some text
input_ids = torch.randint(0, config.vocab_size, (1, 10))
with torch.no_grad():
    logits = model(input_ids)
    print(f"Output shape: {logits.shape}")  # (1, 10, vocab_size)

Training a Model

from lmt import Trainer, GPT
from lmt.training import BaseTrainingConfig
from lmt.models.config import ModelConfigPresets

# Configure model and training
model_config = ModelConfigPresets.small_gpt()
training_config = BaseTrainingConfig(
    num_epochs=10,
    batch_size=4,
    learning_rate=1e-4
)

# Initialize model and trainer
model = GPT(model_config)
trainer = Trainer(
    model=model,
    train_loader=your_train_loader,
    val_loader=your_val_loader,
    config=training_config
)

# Start training
trainer.train()
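The `your_train_loader` and `your_val_loader` placeholders above are ordinary PyTorch `DataLoader`s. A minimal sketch of building them, assuming the trainer consumes batches of `(input_ids, target_ids)` pairs shifted by one token; the `TextDataset` class here is illustrative, not part of the lmt package:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class TextDataset(Dataset):
    """Slices a token-id sequence into fixed-length (input, target) pairs,
    where each target sequence is the input shifted right by one token."""
    def __init__(self, token_ids, context_length):
        self.inputs, self.targets = [], []
        for i in range(0, len(token_ids) - context_length, context_length):
            self.inputs.append(torch.tensor(token_ids[i:i + context_length]))
            self.targets.append(torch.tensor(token_ids[i + 1:i + context_length + 1]))

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        return self.inputs[idx], self.targets[idx]

token_ids = list(range(1000))  # stand-in for real tokenizer output
dataset = TextDataset(token_ids, context_length=32)
your_train_loader = DataLoader(dataset, batch_size=4, shuffle=True)
your_val_loader = DataLoader(dataset, batch_size=4)
```

In practice the token ids would come from one of the tokenizers below, and train/val would be built from separate text splits.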

Using the Training Script

# Pretraining
python scripts/train.py --task pretraining --num_epochs 20 --batch_size 4

# Classification fine-tuning
python scripts/train.py --task classification --download_model --learning_rate 1e-5

📚 Documentation

Model Components

  • GPT: Main model class implementing decoder-only transformer
  • TransformerBlock: Individual transformer layer with attention and feed-forward
  • MultiHeadAttention: Multi-head self-attention mechanism
  • CausalAttention: Attention with causal masking for autoregressive generation
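To make the attention components concrete, here is a plain-PyTorch sketch of multi-head self-attention with a causal mask. It illustrates the mechanism only; the lmt classes may differ in naming and implementation details:

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask (illustrative sketch)."""
    def __init__(self, embed_dim, num_heads, context_length):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.proj = nn.Linear(embed_dim, embed_dim)
        # Upper-triangular mask blocks attention to future positions
        mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)
        self.register_buffer("mask", mask.bool())

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (b, t, d) -> (b, num_heads, t, head_dim)
        q, k, v = (z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
                   for z in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.head_dim**0.5
        scores = scores.masked_fill(self.mask[:t, :t], float("-inf"))
        out = torch.softmax(scores, dim=-1) @ v
        out = out.transpose(1, 2).contiguous().view(b, t, d)
        return self.proj(out)

attn = CausalSelfAttention(embed_dim=64, num_heads=4, context_length=128)
out = attn(torch.randn(2, 10, 64))  # -> shape (2, 10, 64)
```

The causal mask is what makes generation autoregressive: position *t* can attend only to positions 0..*t*.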

Tokenizers

  • BPETokenizer: Byte-Pair Encoding tokenizer
  • NaiveTokenizer: Simple character-level tokenizer
  • BaseTokenizer: Abstract base class for custom tokenizers
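As an illustration of what a character-level tokenizer does, here is a standalone sketch in the spirit of NaiveTokenizer (the lmt class's exact API may differ):

```python
class CharTokenizer:
    """A minimal character-level tokenizer: one id per distinct character
    seen in the training text."""
    def __init__(self, text):
        self.chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(self.chars)}
        self.itos = {i: ch for i, ch in enumerate(self.chars)}

    @property
    def vocab_size(self):
        return len(self.chars)

    def encode(self, text):
        return [self.stoi[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)

tok = CharTokenizer("hello world")
ids = tok.encode("hello")
assert tok.decode(ids) == "hello"
```

BPE improves on this by merging frequent character pairs into multi-character tokens, trading a larger vocabulary for shorter sequences.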

Training

  • Trainer: Main training orchestrator with support for pretraining and fine-tuning
  • BaseTrainingConfig: Configuration class for training parameters
  • Custom datasets and dataloaders: Support for various text datasets

๐Ÿ—‚๏ธ Project Structure

src/lmt/
├── __init__.py              # Main package exports
├── models/                  # Model architectures
│   ├── gpt/                # GPT implementation
│   ├── config.py           # Model configuration
│   └── utils.py            # Model utilities
├── layers/                  # Neural network layers
│   ├── attention/          # Attention mechanisms
│   └── transformers/       # Transformer blocks
├── tokenizer/              # Tokenization implementations
├── training/               # Training pipeline
└── generate.py             # Text generation utilities

scripts/
├── train.py                # Main training script
└── utils.py                # Training utilities

tests/                      # Comprehensive test suite
notebooks/                  # Educational Jupyter notebooks
docs/                       # Sphinx documentation

📊 Examples and Notebooks

Explore the interactive notebooks in the notebooks/ directory:

  • attention.ipynb: Understanding attention mechanisms
  • pretraining_gpt.ipynb: GPT pretraining walkthrough
  • tokenizer.ipynb: Tokenization techniques

🔧 Configuration

Model Configuration

from lmt.models.config import ModelConfig

config = ModelConfig(
    vocab_size=50257,
    embed_dim=768,
    context_length=1024,
    num_layers=12,
    num_heads=12,
    dropout=0.1
)

Training Configuration

from lmt.training.config import BaseTrainingConfig

training_config = BaseTrainingConfig(
    num_epochs=10,
    batch_size=8,
    learning_rate=3e-4,
    weight_decay=0.1,
    print_every=100,
    eval_every=500
)

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Download files

Download the file for your platform.

Source Distribution

pylmt-0.2.9.tar.gz (24.5 kB)


Built Distribution


pylmt-0.2.9-py3-none-any.whl (37.3 kB)


File details

Details for the file pylmt-0.2.9.tar.gz.

File metadata

  • Download URL: pylmt-0.2.9.tar.gz
  • Upload date:
  • Size: 24.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pylmt-0.2.9.tar.gz:

  • SHA256: ce773c52c3c33fb8966754d85ecfc4e56de3b94f8bf78bd9d155ecd8a691c41b
  • MD5: 9df893adbfb39858698ead6c581a758c
  • BLAKE2b-256: 98aeef3895d8887a63e6818e6942c9d06b9629baaca1cb383bd03e7931e6047d

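To verify a downloaded file against the SHA256 digest above, you can stream it through Python's hashlib; a minimal sketch (the file path is illustrative):

```python
import hashlib

def sha256_of(path, chunk_size=8192):
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# expected = "ce773c52c3c33fb8966754d85ecfc4e56de3b94f8bf78bd9d155ecd8a691c41b"
# assert sha256_of("pylmt-0.2.9.tar.gz") == expected
```

pip can also enforce digests automatically via a requirements file in hash-checking mode (`pip install --require-hashes -r requirements.txt`).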

File details

Details for the file pylmt-0.2.9-py3-none-any.whl.

File metadata

  • Download URL: pylmt-0.2.9-py3-none-any.whl
  • Upload date:
  • Size: 37.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pylmt-0.2.9-py3-none-any.whl:

  • SHA256: 31b8f1a086fb67c6ddd8032abf4f5b366287617f46551a5b558a71b0eff995a5
  • MD5: 84c794276a5eb0a42b15a298ec6fab54
  • BLAKE2b-256: 89c46674b570fa42699ed3b06121b8468a66a6e99b4714dbb5e8ebaeb9068f32

