
Tiny causal language model (context_len=3) with training and evaluation tools.

Project description

NanoTransformer

Stage-wise implementation of a tiny causal LM (context_len=3, ~2.1M params).

Install from PyPI:

pip install nanotransformer

CLI entry points (you supply a config JSON):

nanotransformer-train --config /path/to/config.json
nanotransformer-eval --config /path/to/config.json --checkpoint /path/to/checkpoint.pt
nanotransformer-baseline --config /path/to/config.json --k 0.1
nanotransformer-latency --config /path/to/config.json --checkpoint /path/to/checkpoint.pt
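Each entry point reads its hyperparameters from the config JSON. A minimal sketch of what such a file might look like — the field names and values here are illustrative assumptions, not the package's actual schema; consult configs/base.json for the real keys:

```json
{
  "context_len": 3,
  "vocab_size": 10000,
  "d_model": 128,
  "n_layers": 4,
  "n_heads": 4,
  "lr": 0.0003,
  "batch_size": 64,
  "max_steps": 10000,
  "tokenizer_path": "artifacts/tokenizer.json",
  "data_path": "artifacts/data_ids.pt",
  "checkpoint_dir": "checkpoints/"
}
```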

Stage 1: Environment

python -m venv venv
source venv/bin/activate
python scripts/install_deps.py

Stage 2: Tokenizer and data

  1. Download a 50 MB English corpus with the optional helper, or place your own UTF-8 corpus at data/corpus.txt:
python scripts/download_corpus.py --out data/corpus.txt --target-mb 50
  2. Train the BPE tokenizer:
python scripts/train_tokenizer.py --corpus data/corpus.txt --vocab-size 10000 --out artifacts/tokenizer.json
  3. Build token ids and splits:
python scripts/prepare_data.py --corpus data/corpus.txt --tokenizer artifacts/tokenizer.json --out artifacts/data_ids.pt --splits-out artifacts/splits.pt
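The splits step partitions the tokenized corpus into train/validation/test runs. A rough sketch of that logic, assuming contiguous splits — the ratios and key names are assumptions, not the script's actual behavior:

```python
def split_ids(ids, train_frac=0.9, val_frac=0.05):
    """Partition a flat list of token ids into contiguous train/val/test runs."""
    n = len(ids)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }

# e.g. 1000 tokens -> 900 / 50 / 50
splits = split_ids(list(range(1000)))
```

Contiguous (rather than shuffled) splits are the usual choice for language modeling, since shuffling token order would destroy the sequences the model is trained to predict.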

Stage 3: Train

python train.py --config configs/base.json

Stage 4: Evaluate

python eval.py --config configs/base.json --checkpoint checkpoints/best.pt
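For a causal LM, held-out quality is typically reported as perplexity, i.e. the exponential of the mean per-token negative log-likelihood. A minimal sketch of that conversion (the function name is illustrative, not the package's API):

```python
import math

def perplexity(total_nll, n_tokens):
    """Perplexity = exp(mean negative log-likelihood per token), in nats."""
    return math.exp(total_nll / n_tokens)

# e.g. a total NLL of 500 nats over 100 tokens -> exp(5.0)
ppl = perplexity(500.0, 100)
```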

Stage 5: Trigram baseline (add-k smoothing)

python baseline_trigram.py --config configs/base.json --k 0.1
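An add-k smoothed trigram model estimates P(w | u, v) = (count(u,v,w) + k) / (count(u,v) + k·V), where V is the vocabulary size. A self-contained sketch of the idea, assuming the baseline works on flat token-id lists (function names are illustrative):

```python
from collections import Counter

def train_trigram(ids):
    """Count trigrams (u, v, w) and their bigram contexts (u, v)."""
    tri = Counter(zip(ids, ids[1:], ids[2:]))
    bi = Counter(zip(ids, ids[1:]))
    return tri, bi

def prob(tri, bi, u, v, w, k, vocab_size):
    """Add-k smoothed conditional probability P(w | u, v)."""
    return (tri[(u, v, w)] + k) / (bi[(u, v)] + k * vocab_size)

ids = [1, 2, 3, 1, 2, 3, 1, 2, 4]
tri, bi = train_trigram(ids)
p = prob(tri, bi, 1, 2, 3, k=0.1, vocab_size=10)  # (2 + 0.1) / (3 + 1.0)
```

The k term guarantees every token gets nonzero probability, so unseen trigrams in the test set don't produce infinite perplexity.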

Stage 6: Latency

python measure_latency.py --config configs/base.json --checkpoint checkpoints/best.pt --iters 200 --warmup 10
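The --warmup/--iters pattern above (discard initial runs, then time many iterations) can be sketched with the standard library; here `fn` stands in for a model forward pass, and the statistics reported are an assumption:

```python
import time

def measure_latency(fn, iters=200, warmup=10):
    """Return (mean, p50) latency of fn() in milliseconds, after warmup calls."""
    for _ in range(warmup):          # untimed runs to warm caches / allocators
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return sum(samples) / iters, samples[iters // 2]

mean_ms, p50_ms = measure_latency(lambda: sum(range(1000)), iters=50, warmup=5)
```

Warmup matters because the first few calls often pay one-time costs (allocation, kernel compilation on GPU) that would otherwise skew the mean.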

Notes

  • Adjust hyperparameters in configs/base.json
  • Outputs: artifacts/ (tokenizer, data, metrics) and checkpoints/ (model)

Download files

Download the file for your platform.

Source Distribution

nanotransformer-0.1.0.tar.gz (7.6 kB)

Uploaded Source

Built Distribution


nanotransformer-0.1.0-py3-none-any.whl (9.3 kB)

Uploaded Python 3

File details

Details for the file nanotransformer-0.1.0.tar.gz.

File metadata

  • Download URL: nanotransformer-0.1.0.tar.gz
  • Upload date:
  • Size: 7.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for nanotransformer-0.1.0.tar.gz
Algorithm Hash digest
SHA256 7aa02c234aed6332120bd8c151385768357950a8082c52b01c7d708b405d6a38
MD5 db1064e2bc9d15282e808a5704a5afff
BLAKE2b-256 227322102dfcddaa5b3fa8455923bf85f8e2dcc3bab98d72393efd9f861c4f50
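The published digests let you verify a downloaded artifact before installing it. Checking a file's SHA256 with the standard library:

```python
import hashlib
import os
import tempfile

def sha256_hex(path):
    """Stream a file through SHA-256 in chunks and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# demo on a small temporary file
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    tmp = f.name
digest = sha256_hex(tmp)
os.remove(tmp)
```

Compare the result against the SHA256 value listed above; a mismatch means the download is corrupt or has been tampered with.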


File details

Details for the file nanotransformer-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for nanotransformer-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bdce5767181d952b69a938009080702de7ee8506f8f26c3696a82ab85a7a531f
MD5 44c91de5d51269483a093c0f8434149b
BLAKE2b-256 ad278ab255017306fce6b45af311f8d5e65e5bc3c7dcc75e51440905b37874c8

