Awesome kan_gpt created by AdityaNG
KAN-GPT
A PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling.
Install it from PyPI
pip install kan_gpt
Usage
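import torch
from transformers import GPT2Tokenizer

from kan_gpt.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = "gpt2"
model_config.vocab_size = 50257  # must cover every token id the GPT-2 tokenizer can emit
model_config.block_size = 10  # maximum context length, in tokens
model = GPT(model_config)
model.eval()

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

prompt = "Bangalore is often described as the "
prompt_encoded = tokenizer.encode(
    text=prompt, add_special_tokens=False
)

# Shape (1, T): a batch holding the single encoded prompt
x = torch.tensor(prompt_encoded).unsqueeze(0)

with torch.no_grad():
    for _ in range(50):  # generate 50 tokens
        # Feed the model at most block_size of the most recent tokens
        x_cond = x[:, -model_config.block_size:]
        logits, loss = model(x_cond)
        # Greedy decoding: pick the most likely token at the last position
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        # Append the new token id along the sequence dimension
        x = torch.cat((x, next_token), dim=1)

result = tokenizer.decode(x[0].tolist())
print(result)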
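The generation loop above can be hard to follow with a full model in the way. The sketch below isolates the same idea in plain Python: pick the next token from the logits at the last position, append it, and pass only the most recent `block_size` tokens back in. `toy_logits` and `greedy_generate` are hypothetical names used for illustration; they are not part of `kan_gpt`, and the toy scoring function merely stands in for a real model.

```python
from typing import Callable, List


def toy_logits(context: List[int], vocab_size: int) -> List[float]:
    """Hypothetical stand-in for a model: scores (last token + 1) highest."""
    target = (context[-1] + 1) % vocab_size
    return [1.0 if tok == target else 0.0 for tok in range(vocab_size)]


def greedy_generate(
    logits_fn: Callable[[List[int], int], List[float]],
    prompt: List[int],
    max_new_tokens: int,
    block_size: int,
    vocab_size: int,
) -> List[int]:
    """Greedy decoding with a sliding context window of block_size tokens."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # Only the most recent block_size tokens are visible to the model
        context = tokens[-block_size:]
        logits = logits_fn(context, vocab_size)
        # Greedy choice: the token id with the highest score
        next_token = max(range(vocab_size), key=lambda tok: logits[tok])
        tokens.append(next_token)
    return tokens


generated = greedy_generate(
    toy_logits, prompt=[0, 1, 2], max_new_tokens=4, block_size=2, vocab_size=5
)
print(generated)  # [0, 1, 2, 3, 4, 0, 1]
```

Greedy decoding is only one option. Sampling from the softmax of the logits is a common alternative when more varied text is wanted.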
Setup for Development
# Download Repo
git clone https://github.com/AdityaNG/kan-gpt
cd kan-gpt
git pull
# Download Dataset
./scripts/download_webtext.sh
# Install dependencies for development
pip install -r requirements.txt
pip install -e .
Train
Run the following dummy scripts to make sure everything is working as expected:
WANDB_MODE=offline CUDA_VISIBLE_DEVICES="" python3 -m kan_gpt.train --architecture MLP --batch_size 1 --dummy_dataset --device cpu
WANDB_MODE=offline CUDA_VISIBLE_DEVICES="" python3 -m kan_gpt.train --architecture KAN --batch_size 1 --dummy_dataset --device cpu
Then run the training script:
python -m kan_gpt.train
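The two dummy runs above differ only in the value passed to `--architecture`, so they are easy to drive from one small Python wrapper. The following is a minimal sketch, not a script shipped with the repository: the helper names (`smoke_test_command`, `smoke_test_env`, `run_smoke_tests`) are hypothetical, and it uses only the flags and environment variables shown in the commands above. It sets `CUDA_VISIBLE_DEVICES`, the variable the CUDA runtime reads, to an empty string to hide all GPUs.

```python
import os
import subprocess
import sys
from typing import Dict, List

# Architectures exercised by the dummy runs
ARCHITECTURES = ("MLP", "KAN")


def smoke_test_command(architecture: str) -> List[str]:
    """Build the argument list for one dummy training run on CPU."""
    if architecture not in ARCHITECTURES:
        raise ValueError(f"unknown architecture: {architecture!r}")
    return [
        sys.executable, "-m", "kan_gpt.train",
        "--architecture", architecture,
        "--batch_size", "1",
        "--dummy_dataset",
        "--device", "cpu",
    ]


def smoke_test_env() -> Dict[str, str]:
    """Copy the current environment, keep W&B offline and hide all GPUs."""
    env = dict(os.environ)
    env["WANDB_MODE"] = "offline"
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def run_smoke_tests() -> None:
    """Run both dummy trainings; raises CalledProcessError if one fails."""
    for architecture in ARCHITECTURES:
        subprocess.run(
            smoke_test_command(architecture), env=smoke_test_env(), check=True
        )


# Print the commands without executing them
for architecture in ARCHITECTURES:
    print(" ".join(smoke_test_command(architecture)))
```

Calling `run_smoke_tests()` from the repository root, with the package installed, would execute both runs in sequence and stop at the first failure.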
TODOs
- Integrate minGPT and pykan
- Dataset downloading script for WebText
- PyTorch Dataset parser for WebText
- Mini training POC for KAN-GPT
  - Integrate KAN training logic from KAN.train_kan
  - Train a dummy batch
- Mini training POC for MLP-GPT
- Train MLP-GPT on the webtext dataset as a baseline
- Auto Save checkpoints
- Auto Save checkpoints to W&B
- Script to load checkpoint in interactive mode
- Port the training script to PyTorch Lightning
- Test Cases
  - KAN: Forward-Backward test
  - GPT: Forward-Backward test
  - KAN_GPT: Forward-Backward test
Development
Read the CONTRIBUTING.md file.