# LightChat

Lightweight GPT-2 training and deployment toolkit.

LightChat is a lightweight GPT-2–based toolkit built on top of DistilGPT2. It enables anyone to train, deploy, and interact with a custom chatbot on low-end devices using simple CLI commands.
## 🌐 Links & Community
- 🔗 GitHub Repository: github.com/reprompts/lightchat
- 💼 LinkedIn Group: LightChat Dev Group
- 📰 Dev.to Profile: @repromptsquest
- 🐦 Twitter: @repromptsquest
## 🔧 Features
- Train your own language model on plain text files
- Chat interactively with your fine‑tuned model
- List & delete saved models
- Supports top‑k and top‑p (nucleus) sampling
## 📋 Dataset Preparation

- Provide a plain text file (`.txt`) with one sentence per line.
- Aim for at least 1,000–10,000 lines for reasonable results on CPU.
- Clean, focused content yields better chat relevance.
Example (`data.txt`):

```text
Hello, how can I help you today?
I love reading sci-fi novels.
What's the weather like?
```
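If your source text is raw prose rather than one sentence per line, a small preprocessing script can reshape it. This helper is not part of LightChat; it is a minimal sketch using a naive regex sentence splitter, which is usually good enough for rough cleanup:

```python
import re

def prepare_dataset(raw_text: str) -> list[str]:
    """Split raw prose into one trimmed sentence per line, dropping blanks."""
    # Naive split after ., !, or ? followed by whitespace; swap in a real
    # sentence tokenizer if your data has abbreviations or quotations.
    sentences = re.split(r"(?<=[.!?])\s+", raw_text)
    return [s.strip() for s in sentences if s.strip()]

raw = "Hello, how can I help you today? I love reading sci-fi novels. What's the weather like?"
with open("data.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(prepare_dataset(raw)))
```

The resulting `data.txt` has one sentence per line, matching the format above.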
## ⚙️ Installation

```shell
pip install lightchat
```

⚠️ CPU install note: downloading and installing Transformers and PyTorch may take several minutes on CPU-only machines.
## 🚀 Training

```shell
lightchat train <model_name> <data.txt> \
  --epochs 3 \
  --batch-size 8 \
  --learning-rate 5e-5
```

Example command:

```shell
lightchat train newmodel data.txt --epochs 1 --batch-size 8 --learning-rate 5e-5
```
> **⚠️ Data file path `<data.txt>`:** Provide the full path to the dataset, or keep the dataset in the root directory of the project where the library is installed.
- model_name: directory under `models/` to save to
- epochs: full passes over your data
- batch-size: number of samples per step
- learning-rate: step size for the optimizer
⚠️ CPU training note: Training on CPU is slow. More epochs/bigger batch sizes = longer time but better fit.
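Before launching a long CPU run, it can help to estimate how many optimizer steps the flags above imply. This back-of-the-envelope helper is not part of LightChat, just standard training arithmetic:

```python
import math

def training_steps(num_lines: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps: ceil(lines / batch_size) per epoch, times epochs."""
    return math.ceil(num_lines / batch_size) * epochs

# Hypothetical 5,000-line dataset with --epochs 3 --batch-size 8:
print(training_steps(5_000, batch_size=8, epochs=3))  # 625 steps/epoch * 3 = 1875
```

Doubling the batch size halves the step count per epoch but makes each step heavier, so CPU wall-clock time often changes less than the step count suggests.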
## 💬 Chatting

```shell
lightchat chat <model_name> \
  --max-length 100 \
  --top-k 50 \
  --top-p 0.9 \
  --temperature 1.0
```

Example command:

```shell
lightchat chat newmodel --max-length 100 --top-k 50 --temperature 0.9
```
- max-length: max generated tokens per reply
- top-k: sample from top k tokens
- top-p: sample from top cumulative probability p
- temperature: randomness control (higher = more creative)
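To make these knobs concrete, here is a self-contained sketch of how temperature, top-k, and top-p filtering combine to reshape a toy probability distribution. This is standard sampling logic, not LightChat's actual implementation:

```python
import math

def filter_logits(logits, top_k=0, top_p=1.0, temperature=1.0):
    """Return a renormalized distribution after temperature scaling,
    top-k truncation, and top-p (nucleus) truncation."""
    # Temperature: divide logits before softmax; <1 sharpens, >1 flattens.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    probs = [math.exp(l - m) for l in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]

    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # top-k: keep only the k most likely tokens (0 disables the filter).
    keep = set(order[:top_k] if top_k else order)

    # top-p: keep the smallest prefix whose cumulative mass reaches p.
    cum, nucleus = 0.0, set()
    for i in order:
        nucleus.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    keep &= nucleus

    kept = [(i, probs[i]) for i in keep]
    z = sum(p for _, p in kept)
    return {i: p / z for i, p in kept}

dist = filter_logits([2.0, 1.0, 0.1, -1.0], top_k=2, top_p=0.9)
print(dist)  # only the two highest-probability tokens survive
```

With both filters active, a token must pass top-k *and* fall inside the nucleus; the survivors are renormalized before sampling.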
Type "exit" at the prompt to end the conversation; you can reload any trained model at any time by following the instructions below.
Trained models live in `models/<model_name>/`.
## 📂 Model Management

- List saved models: `lightchat list-models`
- Delete a model: `lightchat delete-model <model_name>`
- Or manually remove the `models/<model_name>/` directory.
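Since models are plain directories under `models/`, the management commands amount to directory operations. This is a sketch of the equivalent logic, not LightChat's actual source; the `models/` layout is the one described above:

```python
import shutil
from pathlib import Path

MODELS_DIR = Path("models")  # assumed layout: one subdirectory per saved model

def list_models() -> list[str]:
    """Names of saved models (each is a subdirectory of models/)."""
    if not MODELS_DIR.is_dir():
        return []
    return sorted(p.name for p in MODELS_DIR.iterdir() if p.is_dir())

def delete_model(name: str) -> bool:
    """Remove models/<name>/; returns True if it existed."""
    target = MODELS_DIR / name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False

(MODELS_DIR / "newmodel").mkdir(parents=True, exist_ok=True)
print(list_models())   # e.g. ['newmodel'] in a fresh directory
delete_model("newmodel")
```

Manual deletion with `rm -rf models/<model_name>` does the same thing the sketch does.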
## 🙌 Contributions
Contributions are welcome! Please see CONTRIBUTING.md.
File details
Details for the file lightchat-0.1.2.tar.gz.
File metadata
- Download URL: lightchat-0.1.2.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `270bf0695e983c1aaacf8f474ea54c4561866fbe48747737f116550e0c5bbea1` |
| MD5 | `b242b0695baa070d938394baf0e21e47` |
| BLAKE2b-256 | `a337b101442822a32da1f38af0ecc87d892ec2be20d63e5919b7fcd45502e19f` |
File details
Details for the file lightchat-0.1.2-py3-none-any.whl.
File metadata
- Download URL: lightchat-0.1.2-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `917e23c46233e1c2bf0d4db5e2bfe154f50ce532dbf30048b6443037bdfc5f17` |
| MD5 | `9d57bdfb260329790c214ae00e35e051` |
| BLAKE2b-256 | `e780e3ee33437fb14e3de51503dbfb1d9240880faa0e79419a7899c48bbda4a1` |