strands-microgpt

Karpathy's pure-Python microGPT as a Strands Agents model provider — zero dependencies, pure autograd, trained from scratch.

Based on @karpathy's atomic GPT gist: "The most atomic way to train and run inference for a GPT in pure, dependency-free Python. This file is the complete algorithm. Everything else is just efficiency."

What is this?

A complete GPT implementation — autograd engine, transformer architecture, tokenizer, Adam optimizer, training loop, and inference — in pure Python with zero dependencies. No PyTorch, no NumPy, no CUDA. Just math, random, and the algorithm.

Packaged as a proper Strands model provider so you can:

Use it as a Model — drop-in Strands Model interface for character-level generation
Use it as a Tool — train and generate from any Strands agent via tool calls
Learn from it — the entire algorithm is readable, hackable, and documented

Install

pip install strands-microgpt

Requirements: Python ≥3.10, strands-agents. That's it. No GPU needed.

Quick Start

As a standalone engine (zero deps)

from strands_microgpt import MicroGPT

# Load dataset, build tokenizer, create model
model, tokenizer, docs = MicroGPT.from_dataset()

# Train (1000 steps on names.txt)
model.train_on_docs(docs, tokenizer, num_steps=1000)

# Generate new names
for name in model.generate(tokenizer, num_samples=10):
    print(name)

As a Strands Model provider

from strands import Agent
from strands_microgpt import MicroGPTModel

model = MicroGPTModel(num_steps=1000, temperature=0.5)
agent = Agent(model=model)
agent("Generate some names")

As a Tool (in any agent)

from strands import Agent
from strands_microgpt import microgpt_train, microgpt_generate

# Use with Bedrock, OpenAI, or any model
agent = Agent(tools=[microgpt_train, microgpt_generate])
agent("Train a GPT on the names dataset for 500 steps, then generate 10 names")

Architecture

The complete algorithm in ~300 lines:

strands_microgpt/
├── engine.py           # Value (autograd), Tokenizer, MicroGPT (transformer)
├── microgpt_model.py   # Strands Model interface
└── tools/
    ├── microgpt_train.py     # Training tool
    └── microgpt_generate.py  # Generation tool

The Autograd Engine

from strands_microgpt import Value

a = Value(2.0)
b = Value(3.0)
c = a * b + a  # builds computation graph
c.backward()   # backpropagate gradients
print(a.grad)  # 4.0 (dc/da = b + 1)

Supports: +, *, -, /, **, relu(), exp(), log(), backward()
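To make the mechanics concrete, here is a minimal sketch of how a scalar autograd node can work. The `Scalar` class below is a hypothetical stand-in written for illustration; it is not the package's `Value` implementation and only covers `+`, `*`, and `exp()`. Each node remembers its parents and the local derivative with respect to each one, and `backward()` applies the chain rule in reverse topological order.

```python
import math


class Scalar:
    """Hypothetical stand-in for an autograd node (illustration only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __add__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        return Scalar(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        return Scalar(self.data * other.data, (self, other), (other.data, self.data))

    def exp(self):
        e = math.exp(self.data)
        return Scalar(e, (self,), (e,))

    def backward(self):
        # Build a topological order so every node is handled before its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad  # chain rule, accumulated


a = Scalar(2.0)
b = Scalar(3.0)
c = a * b + a
c.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

Gradients are accumulated with `+=` because `a` feeds into `c` along two paths (through the product and through the sum), and both contributions have to be added.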

The Transformer

A GPT-2-style architecture with a few simplifications:

RMSNorm (instead of LayerNorm)
No biases
ReLU (instead of GELU)
Multi-head causal attention

Training uses the Adam optimizer with linear learning-rate decay.
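Two of these pieces are small enough to sketch on plain Python floats. The functions below are illustrative assumptions, not the package's code: `rmsnorm` rescales a vector by its root-mean-square without a learned gain, and `causal_attention` is a single-head version in which position `t` attends only to positions `0..t`. The epsilon value and the function names are chosen for this example.

```python
import math


def rmsnorm(x, eps=1e-5):
    """Rescale a vector so its root-mean-square is close to 1 (no mean subtraction)."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = (mean_sq + eps) ** -0.5
    return [v * scale for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Single-head attention where position t only sees positions 0..t."""
    head_dim = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Scaled dot-product scores against the current and earlier keys only.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(head_dim)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
attended = causal_attention(queries, keys, values)
print(attended[0])  # [1.0, 2.0]: the first position can only attend to itself
normed = rmsnorm([3.0, 4.0])
```

A multi-head version would split each vector into `n_head` slices, run this routine per slice, and concatenate the results.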

Parameters

Config	Default	Description
`n_layer`	1	Transformer depth
`n_embd`	16	Embedding dimension
`block_size`	16	Context window
`n_head`	4	Attention heads
`num_steps`	1000	Training steps
`learning_rate`	0.01	Initial LR (linear decay)
`temperature`	0.5	Generation temperature
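The last two rows are easiest to understand in code. The sketch below shows one common reading of them, under stated assumptions: a learning rate that falls linearly from its initial value toward zero over `num_steps`, and sampling where the logits are divided by the temperature before the softmax. The function names are hypothetical and the package's exact schedule may differ.

```python
import math
import random


def linear_decay_lr(initial_lr, step, num_steps):
    """Learning rate that falls linearly from initial_lr at step 0 toward 0 at num_steps."""
    return initial_lr * (1.0 - step / num_steps)


def sample_with_temperature(logits, temperature, rng):
    """Divide logits by the temperature, apply softmax, then draw one index."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
logits = [0.0, 5.0, 0.0]
cold_draws = [sample_with_temperature(logits, 0.01, rng) for _ in range(50)]
halfway_lr = linear_decay_lr(0.01, step=500, num_steps=1000)
```

A temperature below 1 sharpens the distribution toward the most likely token, while a temperature above 1 flattens it and produces more varied samples.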

Custom Datasets

Train on anything — names, poems, molecules, DNA, code:

from strands_microgpt import MicroGPT, Tokenizer

docs = ["the cat sat on the mat", "the dog sat on the log"] * 100
tokenizer = Tokenizer.from_docs(docs)
model = MicroGPT(vocab_size=tokenizer.vocab_size, n_embd=32, block_size=32)

model.train_on_docs(docs, tokenizer, num_steps=2000)
samples = model.generate(tokenizer, num_samples=10, temperature=0.7)
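Character-level training works on any text because the vocabulary is just the set of characters that appear in the documents. The class below is a hypothetical sketch of that idea, not the package's `Tokenizer`. The single extra boundary token and the `encode`/`decode` method names are assumptions made for this example.

```python
class CharTokenizer:
    """Hypothetical character-level tokenizer (illustration only)."""

    def __init__(self, chars):
        self.chars = sorted(set(chars))
        self.bos = len(self.chars)             # one extra id marks document boundaries
        self.vocab_size = len(self.chars) + 1
        self._index = {ch: i for i, ch in enumerate(self.chars)}

    @classmethod
    def from_docs(cls, docs):
        return cls("".join(docs))

    def encode(self, text):
        # Wrap the character ids in boundary tokens so the model can learn start and end.
        return [self.bos] + [self._index[ch] for ch in text] + [self.bos]

    def decode(self, ids):
        return "".join(self.chars[i] for i in ids if i != self.bos)


docs = ["the cat sat on the mat", "the dog sat on the log"]
tok = CharTokenizer.from_docs(docs)
ids = tok.encode("the cat")
print(tok.decode(ids))  # the cat
```

Text containing a character that never appeared in the training documents would raise a `KeyError` here, which is why the vocabulary is built from the same corpus the model trains on.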

Checkpoints

# Save
model.save_checkpoint("model.json", tokenizer)

# Load
model, tokenizer, metadata = MicroGPT.load_checkpoint("model.json")
samples = model.generate(tokenizer, num_samples=10)
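The `.json` extension suggests a plain-text checkpoint. The sketch below shows one way such a round trip could look, assuming the weights are nested lists of floats stored next to the tokenizer's character set and the model config. The file layout, key names, and helper functions are all hypothetical; the package's actual checkpoint format is not documented here.

```python
import json
import os
import tempfile


def save_state(path, weights, chars, config, metadata=None):
    """Write weights, vocabulary, and config to a single JSON file."""
    payload = {
        "config": config,
        "chars": chars,
        "weights": weights,
        "metadata": metadata or {},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f)


def load_state(path):
    """Read back everything save_state wrote."""
    with open(path, "r", encoding="utf-8") as f:
        payload = json.load(f)
    return payload["weights"], payload["chars"], payload["config"], payload["metadata"]


weights = {"wte": [[0.1, -0.2], [0.3, 0.4]]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    save_state(path, weights, ["a", "b"], {"n_embd": 2}, {"steps": 10})
    restored, chars, config, metadata = load_state(path)
print(restored == weights)  # True
```

Python's `json` module writes floats using their shortest round-tripping representation, so the restored weights compare equal to the originals.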

Examples

Example	Description
01_basic_training.py	Train on names, generate new ones
02_strands_agent.py	Use as a Strands Model provider
03_tool_usage.py	Train/generate via tool calls
04_custom_dataset.py	Train on custom text data
05_autograd_exploration.py	Explore the autograd engine

Why?

"Everything else is just efficiency." — @karpathy

This package exists to show that:

A GPT is just math. No magic, no black boxes. The entire algorithm fits in your head.
The Strands Model interface is universal. If it can generate tokens, it can be a Strands model.
Understanding > Using. Train a transformer from scratch to truly grok what LLMs do.

The model is tiny and slow (pure Python, scalar autograd, no vectorization). For production, use Bedrock, OpenAI, or any real provider. For learning, the code is short enough to read end to end.

API Reference

`MicroGPT`

MicroGPT(vocab_size, n_layer=1, n_embd=16, block_size=16, n_head=4, seed=42)

.train_on_docs(docs, tokenizer, num_steps, learning_rate, log_every, callback) → List[float]
.generate(tokenizer, num_samples, temperature, max_length) → List[str]
.save_checkpoint(path, tokenizer, metadata) → None
.load_checkpoint(path) → (MicroGPT, Tokenizer, Dict) (classmethod)
.from_dataset(dataset_url, dataset_path, **kwargs) → (MicroGPT, Tokenizer, docs) (classmethod)
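To get a feel for how small the default configuration is, the sketch below counts parameters for a bias-free GPT-2-style layout. It assumes token and position embeddings, a separate output head, four attention projections per layer, and an MLP with a 4x hidden width. That layout and the vocabulary size of 27 (26 letters plus one boundary token) are assumptions for illustration, so the package's real count may differ.

```python
def count_params(vocab_size, n_layer=1, n_embd=16, block_size=16):
    """Parameter count for an assumed bias-free GPT-2-style layout."""
    token_emb = vocab_size * n_embd
    pos_emb = block_size * n_embd
    lm_head = vocab_size * n_embd
    attn = 4 * n_embd * n_embd        # query, key, value, and output projections
    mlp = 2 * (4 * n_embd) * n_embd   # expand to 4x width, then project back
    return token_emb + pos_emb + lm_head + n_layer * (attn + mlp)


print(count_params(vocab_size=27))  # 4192
```

`n_head` does not appear because splitting the embedding across heads changes how the projections are used, not how many weights they hold.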

`MicroGPTModel` (Strands Model)

MicroGPTModel(dataset_url, num_steps=1000, temperature=0.5, ...)

Drop-in replacement for any Strands model. Trains on first use, then generates.

`Value` (Autograd)

Value(data)  # scalar autograd node

Supports: +, *, -, /, **, .relu(), .exp(), .log(), .backward()

Tools

microgpt_train(dataset_url, num_steps, n_layer, ...) — Train a model
microgpt_generate(checkpoint_path, num_samples, temperature) — Generate from checkpoint

Resources

Karpathy's GPT gist — The original
micrograd — Karpathy's autograd engine
makemore — Character-level language modeling
Strands Agents — The agent framework
strands-cosmos — NVIDIA Cosmos VLM provider

License

MIT | Based on @karpathy's work | Built with Strands Agents

Project details

Version 0.1.0, released Mar 22, 2026.

Download files

Source Distribution

strands_microgpt-0.1.0.tar.gz (18.1 kB view details)

Uploaded Mar 22, 2026 Source

Built Distribution

strands_microgpt-0.1.0-py3-none-any.whl (16.9 kB view details)

Uploaded Mar 22, 2026 Python 3

File details

Details for the file strands_microgpt-0.1.0.tar.gz.

File metadata

Download URL: strands_microgpt-0.1.0.tar.gz
Upload date: Mar 22, 2026
Size: 18.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for strands_microgpt-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`73f9f5879ad925f313bb3cce2bdeb278ba225419d6399a793990db23aeaf83b6`
MD5	`1dce1fa7504a53e091a8b03b086425ae`
BLAKE2b-256	`892c083ed4630a7f44efefd8bb09d2b5026ed21de4010c70c58636f9a20f4197`

File details

Details for the file strands_microgpt-0.1.0-py3-none-any.whl.

File metadata

Download URL: strands_microgpt-0.1.0-py3-none-any.whl
Upload date: Mar 22, 2026
Size: 16.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for strands_microgpt-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`65abb9ab2a09f8aaaa5c619e076e92ffd86eef14fd014efce5cf633c978be699`
MD5	`82743e6ccb61a73ca0d8abdcc5e76b20`
BLAKE2b-256	`f5810457337dabf78003dee00261471c16e14a732a4a7cf41e01d6ec0a46e516`
