Language Modeling using Transformers (LMT)
A PyTorch implementation of transformer-based language models including GPT architecture for pretraining and fine-tuning. This project is designed for educational and research purposes to help users understand how the attention mechanism and Transformer architecture work in Large Language Models (LLMs).
Features
- GPT Architecture: Complete implementation of decoder-only transformer models
- Attention Mechanisms: Multi-head self-attention with causal masking
- Tokenization: Multiple tokenizer implementations (BPE, Naive)
- Training Pipeline: Comprehensive trainer with pretraining and fine-tuning support
- Educational Focus: Well-documented code for learning transformer internals
- Modern Stack: Built with PyTorch 2.7+, Python 3.11+
Installation
Prerequisites
- Python 3.11 or 3.12
- PyTorch 2.7+
Install from PyPI

```bash
pip install language-modeling-transformers
```

Install from GitHub

```bash
pip install git+https://github.com/michaelellis003/LMT.git
```
Quick Start
Basic Model Usage
```python
from lmt import GPT, ModelConfig
from lmt.models.config import ModelConfigPresets
import torch

# Create a small GPT model
config = ModelConfigPresets.small_gpt()
model = GPT(config)

# Generate some text
input_ids = torch.randint(0, config.vocab_size, (1, 10))
with torch.no_grad():
    logits = model(input_ids)

print(f"Output shape: {logits.shape}")  # (1, 10, vocab_size)
```
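The snippet above stops at logits; turning them into text requires an autoregressive loop. Below is a minimal greedy-decoding sketch of that loop. It is a generic pattern, not lmt's API (the package ships its own utilities in `generate.py`, whose exact interface isn't shown here); it assumes only that the model maps `(batch, seq)` token IDs to `(batch, seq, vocab)` logits.

```python
import torch

@torch.no_grad()
def greedy_generate(model, input_ids, max_new_tokens, context_length):
    """Append one argmax token at a time (greedy decoding)."""
    for _ in range(max_new_tokens):
        # Crop the running sequence to the model's maximum context
        ids = input_ids[:, -context_length:]
        logits = model(ids)                                   # (batch, seq, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_id], dim=1)
    return input_ids
```

With the model above this would be called as `greedy_generate(model, input_ids, max_new_tokens=20, context_length=config.context_length)`.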
Training a Model
```python
from lmt import Trainer, GPT
from lmt.training import BaseTrainingConfig
from lmt.models.config import ModelConfigPresets

# Configure model and training
model_config = ModelConfigPresets.small_gpt()
training_config = BaseTrainingConfig(
    num_epochs=10,
    batch_size=4,
    learning_rate=1e-4,
)

# Initialize model and trainer
model = GPT(model_config)
trainer = Trainer(
    model=model,
    train_loader=your_train_loader,
    val_loader=your_val_loader,
    config=training_config,
)

# Start training
trainer.train()
```
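`your_train_loader` and `your_val_loader` are placeholders. One common way to build them for next-token-prediction pretraining is a sliding-window dataset over a sequence of token IDs; the sketch below uses plain `torch.utils.data` (a generic PyTorch pattern, not an lmt class).

```python
import torch
from torch.utils.data import Dataset, DataLoader

class NextTokenDataset(Dataset):
    """Slices a long token sequence into (input, target) windows,
    where the target is the input shifted one position left."""
    def __init__(self, token_ids, context_length, stride):
        self.samples = []
        for i in range(0, len(token_ids) - context_length, stride):
            x = token_ids[i : i + context_length]
            y = token_ids[i + 1 : i + context_length + 1]
            self.samples.append((torch.tensor(x), torch.tensor(y)))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

# In practice token_ids would come from a tokenizer's encode();
# a range stands in for real data here.
token_ids = list(range(100))
dataset = NextTokenDataset(token_ids, context_length=8, stride=8)
train_loader = DataLoader(dataset, batch_size=4, shuffle=True)
```

A validation loader is built the same way from held-out text.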
Using the Training Script
```bash
# Pretraining
python scripts/train.py --task pretraining --num_epochs 20 --batch_size 4

# Classification fine-tuning
python scripts/train.py --task classification --download_model --learning_rate 1e-5
```
Documentation
Model Components
- GPT: Main model class implementing decoder-only transformer
- TransformerBlock: Individual transformer layer with attention and feed-forward
- MultiHeadAttention: Multi-head self-attention mechanism
- CausalAttention: Attention with causal masking for autoregressive generation
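To make the attention components concrete, here is a compact sketch of what multi-head self-attention with causal masking computes: position i may only attend to positions at or before i. This is a generic implementation for illustration, not the package's exact code.

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with an upper-triangular causal mask."""
    def __init__(self, embed_dim, num_heads, context_length, dropout=0.0):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.proj = nn.Linear(embed_dim, embed_dim)
        self.dropout = nn.Dropout(dropout)
        # True above the diagonal = positions that must be hidden
        mask = torch.triu(
            torch.ones(context_length, context_length, dtype=torch.bool), diagonal=1
        )
        self.register_buffer("mask", mask)

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (b, t, d) -> (b, heads, t, head_dim)
        q, k, v = (
            z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
            for z in (q, k, v)
        )
        scores = q @ k.transpose(-2, -1) / self.head_dim**0.5
        scores = scores.masked_fill(self.mask[:t, :t], float("-inf"))
        weights = self.dropout(torch.softmax(scores, dim=-1))
        out = (weights @ v).transpose(1, 2).reshape(b, t, d)
        return self.proj(out)
```

Because of the mask, perturbing a later token never changes the outputs at earlier positions, which is what makes autoregressive generation valid.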
Tokenizers
- BPETokenizer: Byte-Pair Encoding tokenizer
- NaiveTokenizer: Simple character-level tokenizer
- BaseTokenizer: Abstract base class for custom tokenizers
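A character-level tokenizer (what the naive approach amounts to) can be sketched in a few lines; this is a minimal stand-alone example, not the package's `NaiveTokenizer` class.

```python
class CharTokenizer:
    """Minimal character-level tokenizer: each unique character
    in the training text gets one integer ID."""
    def __init__(self, text):
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}
        self.vocab_size = len(chars)

    def encode(self, text):
        return [self.stoi[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)

tok = CharTokenizer("hello world")
assert tok.decode(tok.encode("hello")) == "hello"  # lossless round trip
```

BPE improves on this by merging frequent character pairs into multi-character tokens, shrinking sequence lengths at the cost of a larger vocabulary.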
Training
- Trainer: Main training orchestrator with support for pretraining and fine-tuning
- BaseTrainingConfig: Configuration class for training parameters
- Custom datasets and dataloaders: Support for various text datasets
Project Structure
```
src/lmt/
├── __init__.py        # Main package exports
├── models/            # Model architectures
│   ├── gpt/           # GPT implementation
│   ├── config.py      # Model configuration
│   └── utils.py       # Model utilities
├── layers/            # Neural network layers
│   ├── attention/     # Attention mechanisms
│   └── transformers/  # Transformer blocks
├── tokenizer/         # Tokenization implementations
├── training/          # Training pipeline
└── generate.py        # Text generation utilities
scripts/
├── train.py           # Main training script
└── utils.py           # Training utilities
tests/                 # Comprehensive test suite
notebooks/             # Educational Jupyter notebooks
docs/                  # Sphinx documentation
```
Examples and Notebooks
Explore the interactive notebooks in the notebooks/ directory:
- attention.ipynb: Understanding attention mechanisms
- pretraining_gpt.ipynb: GPT pretraining walkthrough
- tokenizer.ipynb: Tokenization techniques
Configuration
Model Configuration
```python
from lmt.models.config import ModelConfig

config = ModelConfig(
    vocab_size=50257,
    embed_dim=768,
    context_length=1024,
    num_layers=12,
    num_heads=12,
    dropout=0.1,
)
```
Training Configuration
```python
from lmt.training.config import BaseTrainingConfig

training_config = BaseTrainingConfig(
    num_epochs=10,
    batch_size=8,
    learning_rate=3e-4,
    weight_decay=0.1,
    print_every=100,
    eval_every=500,
)
```
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
File details
Details for the file pylmt-0.2.10.tar.gz.
File metadata
- Download URL: pylmt-0.2.10.tar.gz
- Upload date:
- Size: 24.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 93b4a721d706ddf1a21ffa06a78e36a7a022f64f51eb0b7d4c9e6129d49f5816 |
| MD5 | 21972ffa65d27d1f5df32ac10e1fd60a |
| BLAKE2b-256 | eccb31338efc9ca87781f8cbaac136d3805fae5e669624ce77eab378dbc94c3f |
File details
Details for the file pylmt-0.2.10-py3-none-any.whl.
File metadata
- Download URL: pylmt-0.2.10-py3-none-any.whl
- Upload date:
- Size: 37.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a572b1f18c36e4ee1ad92809018be02bf6087d61b329bc5ede568a71275d573 |
| MD5 | 27c64a7470937b5b292928187bf09160 |
| BLAKE2b-256 | 9f675e0279e78e159457c4d2d76e3d3c13f7963f1b1966b622ee412a5e8fda21 |