
KAN-GPT

The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling.

Install it from PyPI

pip install kan_gpt
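
To confirm the install worked, a quick import check can be run (the kan_gpt.model.GPT import path is the one used in the usage example below):

python -c "from kan_gpt.model import GPT; print('kan_gpt OK')"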

Usage

Refer to the KAN_GPT.ipynb and kan_gpt/prompt.py for usage examples. The following is an outline of how to use the model:

import torch

from kan_gpt.model import GPT
from transformers import GPT2Tokenizer

model_config = GPT.get_default_config()
model_config.model_type = "gpt2"
model_config.vocab_size = 50257
model_config.block_size = 1024
model = GPT(model_config)

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

prompt = "Bangalore is often described as the "

prompt_encoded = tokenizer.encode(
  text=prompt, add_special_tokens=False
)

x = torch.tensor(prompt_encoded).unsqueeze(0)  # add a batch dimension: shape (1, seq_len)

model.eval()
y = model.generate(x, 50)  # sample 50 tokens

result = tokenizer.decode(y[0])  # y is batched; decode the first (and only) sequence

print(result)

# Bangalore is often described as the Silicon Valley of India.
# The city has witnessed rapid growth in the past two decades.....
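
GPT.generate here follows the minGPT-style sampling interface. Assuming that signature is unchanged in kan_gpt (worth verifying against kan_gpt/model.py), you can trade greedy decoding for more diverse samples:

# Assumed minGPT-style keyword arguments: temperature, do_sample, top_k.
y = model.generate(
  x,
  max_new_tokens=50,
  temperature=0.8,  # <1.0 sharpens the next-token distribution
  do_sample=True,   # sample instead of taking the greedy argmax
  top_k=40,         # consider only the 40 most likely tokens at each step
)
print(tokenizer.decode(y[0]))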

Setup for Development

# Download Repo
git clone https://github.com/AdityaNG/kan-gpt
cd kan-gpt
git pull

# Download Dataset
./scripts/download_webtext.sh

# Install dependencies for development
pip install -r requirements.txt
pip install -e .

Train

Use the following dummy training runs to make sure everything is working as expected:

WANDB_MODE=offline CUDA_VISIBLE_DEVICES="" python3 -m kan_gpt.train --architecture MLP --batch_size 1 --dummy_dataset --device cpu --max_iters 200
WANDB_MODE=offline CUDA_VISIBLE_DEVICES="" python3 -m kan_gpt.train --architecture KAN --batch_size 1 --dummy_dataset --device cpu --max_iters 200

Then make use of the training script:

python -m kan_gpt.train
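
For a real run, the flags shown in the dummy commands above can be combined; for example (assuming that omitting --dummy_dataset selects the WebText data downloaded earlier; the batch size and iteration count are illustrative values, not recommendations):

python -m kan_gpt.train --architecture KAN --batch_size 32 --device cuda --max_iters 10000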

Prompt

You can prompt the model to produce text as follows:

python -m kan_gpt.prompt --prompt "Bangalore is often described as the " --model_path (checkpoint)

TODOs

  • Integrate minGPT and pykan
  • Dataset downloading script for WebText
  • PyTorch Dataset parser for WebText
  • PyTorch Dataset parser for tinyshakespeare
  • Mini training POC for KAN-GPT
    • Integrate KAN training logic from KAN.train_kan
    • Train a dummy batch w/o any memory issues
  • Mini training POC for MLP-GPT
  • Train MLP-GPT on the webtext dataset as a baseline
  • Train KAN-GPT on the webtext dataset as a baseline
  • Metrics comparing KAN-GPT and MLP-GPT
  • Auto Save checkpoints
  • Auto Save checkpoints to W&B
  • Auto Download model weights from git / huggingface
  • W&B hyperparam sweep script
  • Script to load checkpoint in interactive mode
  • Training script to PyTorch Lightning
  • Integrate with efficient-kan
  • Test Cases
    • KAN: Forward-Backward test
    • GPT: Forward-Backward test
    • KAN_GPT: Forward-Backward test (see the sketch after this list)
    • EFFICIENT_KAN: Forward-Backward test
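
A minimal sketch of such a forward-backward test, assuming the minGPT-style interface where model(idx, targets) returns (logits, loss); check kan_gpt/model.py for the exact signature before relying on it:

import torch

from kan_gpt.model import GPT


def test_kan_gpt_forward_backward():
    config = GPT.get_default_config()
    config.model_type = "gpt2"
    config.vocab_size = 50257
    config.block_size = 1024
    model = GPT(config)

    # Tiny random batch of token ids: shape (batch, sequence)
    idx = torch.randint(0, config.vocab_size, (1, 8))
    targets = torch.randint(0, config.vocab_size, (1, 8))

    logits, loss = model(idx, targets)  # assumed (logits, loss) return
    assert logits.shape == (1, 8, config.vocab_size)

    loss.backward()  # gradients should flow through the KAN layers
    assert any(p.grad is not None for p in model.parameters())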

Development

Read the CONTRIBUTING.md file.
