LightGPT lightweight GPT

LightGPT – Simple Hugging Face Wrapper

🎉 LightGPT 1.0.0 – Celebration 🎉

We’re thrilled to announce the first stable release of LightGPT! This version marks the end of the beta phase and brings a polished, production‑ready package that:

  • Uses the lightweight EleutherAI/gpt‑neo‑125M model by default.
  • Provides holiday personas for fun themed interactions.
  • Includes a quick Wikipedia data collector for easy finetuning.
  • Offers a simple CLI, finetuning script, and ONNX export workflow.

Quickstart

# Install dependencies (including Wikipedia support)
pip install -r requirements.txt

Then, from Python:

from lightgpt.model import LightGPT

lgpt = LightGPT()  # loads EleutherAI/gpt-neo-125M
print(lgpt.generate("The future of AI is", max_new_tokens=30))

Command‑line interface

python -m lightgpt.cli \
    --model EleutherAI/gpt-neo-125M \
    --prompt "Once upon a time" \
    --max_new_tokens 40 \
    --temperature 0.9 \
    --do_sample
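
In Hugging Face transformers, generation is greedy unless sampling is enabled, and the temperature rescales the logits before a token is drawn. The sketch below illustrates that idea in plain Python. It assumes the CLI forwards --temperature and --do_sample to the model's generate call, which this README does not state. The names softmax_with_temperature and pick_next_token are hypothetical and are not part of LightGPT.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_next_token(logits, temperature=1.0, do_sample=False, rng=None):
    """Greedy argmax when do_sample is False, otherwise a temperature-scaled draw."""
    if not do_sample:
        # Without sampling the temperature has no effect on the choice.
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

A temperature below 1.0 (such as the 0.9 used above) concentrates probability on the highest-scoring tokens, while values above 1.0 flatten the distribution and give more varied output.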

Finetuning a model

A minimal finetuning script is provided in src/lightgpt/train.py. It uses the standard transformers training loop.

python -m lightgpt.train \
    --model EleutherAI/gpt-neo-125M \
    --train_file data/my_corpus.txt \
    --output_dir finetuned_gptneo \
    --epochs 3

The script writes the output directory (finetuned_gptneo in this example), which contains pytorch_model.bin and the tokenizer files. Load the result with LightGPT(model_name="finetuned_gptneo").
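
Causal language-model finetuning with transformers usually trains on fixed-length blocks of token ids, with the labels being a copy of the inputs (the model shifts them internally when computing the loss). The following is a minimal sketch of that grouping step under those assumptions. It is not the code in train.py, and group_into_blocks and block_size are hypothetical names.

```python
def group_into_blocks(token_ids, block_size):
    """Split a flat list of token ids into full blocks, dropping the remainder."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    usable = (len(token_ids) // block_size) * block_size
    examples = []
    for start in range(0, usable, block_size):
        block = token_ids[start:start + block_size]
        # Labels mirror the inputs; the causal LM head applies the one-token shift.
        examples.append({"input_ids": block, "labels": list(block)})
    return examples
```

For instance, ten token ids with block_size=4 yield two training examples, and the last two ids are discarded.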

Export to ONNX (for Hugging Face Hub)

python -m lightgpt.export_onnx \
    --model finetuned_gptneo \
    --output lightgpt_neo.onnx

The resulting lightgpt_neo.onnx can be uploaded to the Hugging Face Model Hub alongside the saved model folder.

Wikipedia data collection

Use the provided script to fetch articles for training:

python scripts/download_wiki.py \
    --topics "Artificial intelligence" "Machine learning" "Natural language processing" \
    --output wiki_corpus.txt
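
To give a sense of what such a collector involves, here is a minimal sketch that fetches plain-text extracts through the public MediaWiki API using only the standard library. It is an illustration and not the implementation of scripts/download_wiki.py, which may rely on a different library. The function names and the User-Agent string are assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title):
    """Build a MediaWiki query URL asking for a plain-text extract of one article."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": "1",
        "redirects": "1",
        "format": "json",
        "formatversion": "2",
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


def parse_extract(payload):
    """Pull the article text out of a decoded API response (empty if missing)."""
    pages = payload.get("query", {}).get("pages", [])
    return "\n\n".join(p["extract"].strip() for p in pages if p.get("extract"))


def fetch_article(title, timeout=10.0):
    """Download one article; requires network access."""
    request = Request(build_extract_url(title),
                      headers={"User-Agent": "lightgpt-readme-example/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return parse_extract(json.load(response))


def write_corpus(topics, output_path):
    """Write every non-empty article into a single UTF-8 text file."""
    with open(output_path, "w", encoding="utf-8") as handle:
        for topic in topics:
            text = fetch_article(topic)
            if text:
                handle.write(text + "\n\n")
```

Calling write_corpus(["Artificial intelligence", "Machine learning"], "wiki_corpus.txt") would produce a file suitable for the --train_file argument of the finetuning script.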

Holiday Personas

from lightgpt.holiday_personas import get_persona_prompt

prompt = get_persona_prompt("may_the_4th") + " What is the Force?"
print(LightGPT().generate(prompt))
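
A persona prompt is simply a text prefix placed before the user's question. The sketch below shows one way such a lookup could work. The persona texts, the PERSONA_PROMPTS table, and the helper names are made up for illustration and do not reflect the contents of lightgpt.holiday_personas.

```python
# Hypothetical persona table; the real package may define different names and texts.
PERSONA_PROMPTS = {
    "may_the_4th": "You are a wise mentor from a space opera. Answer in that voice.",
    "halloween": "You are a friendly ghost telling spooky but harmless stories.",
}


def persona_prompt(name):
    """Return the prefix for a persona, with a helpful error for unknown names."""
    try:
        return PERSONA_PROMPTS[name]
    except KeyError:
        known = ", ".join(sorted(PERSONA_PROMPTS))
        raise ValueError(f"Unknown persona {name!r}; choose one of: {known}") from None


def build_prompt(name, question):
    """Join the persona prefix and the question with a single space."""
    return f"{persona_prompt(name)} {question.strip()}"
```

Keeping the personas in a plain dictionary makes it easy to add a new holiday without touching the generation code.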

License

MIT – see LICENSE for details.
