Transparent, NumPy-only deep learning framework for teaching, small-scale projects, prototyping, and reproducible experiments.
PureML — a tiny, transparent deep-learning framework in NumPy
PureML is a learning-friendly deep-learning framework built entirely on top of NumPy. It aims to be small, readable, and hackable while still being practical for real experiments and teaching.
- No hidden magic: a Tensor class + autodiff engine with dynamic computation graph and efficient VJPs for backward passes
- Batteries included: core layers (Affine, Dropout, BatchNorm1d), common losses, common optimizers, and a DataLoader
- Self-contained dataset demo: a ready-to-use MNIST reader and an end-to-end “MNIST Beater” model
- Portable persistence: zarr-backed ArrayStorage with zip compression for saving/loading model state
If you like scikit-learn’s simplicity and wish deep learning felt the same way for small/medium projects, PureML is for you.
Install
PureML targets Python 3.11+ and NumPy 2.x.
pip install ym-pure-ml
The only runtime deps are: numpy, zarr
Quickstart: Fit MNIST in a few lines
from pureml.models.neural_networks import MNIST_BEATER
from pureml.datasets import MnistDataset
# 1) Load data (train uses one-hot labels; test gives class indices)
with MnistDataset("train") as train, MnistDataset("test") as test:
    # 2) Build the tiny network: Affine(784→256) → ReLU → Affine(256→10)
    model = MNIST_BEATER().train()
    # 3) Fit on the training set
    model.fit(train, batch_size=128, num_epochs=5)
    # 4) Switch to eval: model.predict returns class indices
    model.eval()
    # Example: run on one batch from the test set
    X_test, y_test = test[:128]
    preds = model(X_test)
    print(preds.data[:10])  # class ids
What you get out of the box:
- A tiny network that learns MNIST
- Clean logging of epoch loss
- An inference mode (.eval()) that returns class indices directly
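Since eval mode returns class indices and the test split also provides class indices, accuracy reduces to an elementwise comparison. The helper below is a NumPy-only sketch; `accuracy` is a hypothetical name and not part of PureML, and the arrays are stand-ins for real predictions and labels.

```python
import numpy as np

def accuracy(pred_ids, true_ids):
    """Fraction of positions where predicted and true class ids agree."""
    pred_ids = np.asarray(pred_ids).reshape(-1)
    true_ids = np.asarray(true_ids).reshape(-1)
    if pred_ids.shape != true_ids.shape:
        raise ValueError("pred_ids and true_ids must have the same length")
    return float(np.mean(pred_ids == true_ids))

# Stand-in arrays (assumption): with PureML you would pass the underlying
# NumPy arrays of the predictions and labels instead.
example_preds = np.array([7, 2, 1, 0, 4])
example_truth = np.array([7, 2, 1, 0, 9])
example_acc = accuracy(example_preds, example_truth)
print(f"accuracy = {example_acc:.2f}")
```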
Core concepts
1) Tensors & Autodiff
PureML wraps NumPy arrays in Tensor objects that record operations and expose .backward() for gradient calculation. The Tensor supports:
- Elementwise + matmul ops (+ - * / **, @, .T)
- Reshaping helpers like .reshape(...) and .flatten(...)
- Non-grad ops like .argmax(...)
- A no_grad context manager for inference/metrics
The goal is clarity: gradients are implemented as explicit vector-Jacobian products (VJPs) you can read in one file.
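To make the VJP idea concrete, here is a plain NumPy sketch for a single matmul, checked against a finite difference. It does not use or reproduce PureML's Tensor class; the function name `matmul_vjp` is an assumption made for illustration.

```python
import numpy as np

def matmul_vjp(X, W, upstream):
    """For Y = X @ W and upstream = dL/dY, return (dL/dX, dL/dW)."""
    grad_X = upstream @ W.T   # same shape as X
    grad_W = X.T @ upstream   # same shape as W
    return grad_X, grad_W

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
upstream = rng.normal(size=(4, 2))  # pretend gradient arriving from later ops

grad_X, grad_W = matmul_vjp(X, W, upstream)

# Finite-difference check on one entry of W, using L = sum(upstream * (X @ W)).
eps = 1e-6
W_plus = W.copy()
W_plus[1, 0] += eps
W_minus = W.copy()
W_minus[1, 0] -= eps
numeric = (np.sum(upstream * (X @ W_plus)) - np.sum(upstream * (X @ W_minus))) / (2 * eps)
print(abs(numeric - grad_W[1, 0]))  # should be close to zero
```

The backward rule never builds the full Jacobian of the matmul; it only multiplies the upstream gradient by the transposed operand, which is what keeps the backward pass cheap.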
2) Layers
- Affine (Linear): Y = X @ W + b (with sensible init)
- Dropout
- BatchNorm1d: with running mean/variance buffers and momentum
Layers expose:
- .parameters (trainables)
- .named_buffers() (non-trainable state)
- .train() / .eval() modes
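The split between trainables, buffers, and train/eval modes is easiest to see on batch normalization. The class below is a forward-only NumPy sketch, not PureML's BatchNorm1d: the name `TinyBatchNorm1d`, the default values, and the momentum convention (new = (1 − m)·old + m·batch) are all assumptions.

```python
import numpy as np

class TinyBatchNorm1d:
    """Forward-only batch norm with running statistics (illustrative sketch)."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        # Trainable parameters
        self.gamma = np.ones(num_features)
        self.beta = np.zeros(num_features)
        # Non-trainable buffers
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, X):
        if self.training:
            # Use batch statistics and fold them into the running buffers.
            mean = X.mean(axis=0)
            var = X.var(axis=0)
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Use the stored buffers; nothing is updated in eval mode.
            mean, var = self.running_mean, self.running_var
        X_hat = (X - mean) / np.sqrt(var + self.eps)
        return self.gamma * X_hat + self.beta

rng = np.random.default_rng(0)
bn = TinyBatchNorm1d(3).train()
batch = rng.normal(loc=5.0, scale=2.0, size=(64, 3))
out_train = bn(batch)         # normalized with batch statistics
out_eval = bn.eval()(batch)   # normalized with running buffers
```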
3) Losses
- MSE
- BCE (probabilities) and Sigmoid+BCE (logits)
- CCE (categorical cross-entropy; supports from_logits=True)
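As an illustration of what a from-logits cross-entropy usually computes, the sketch below applies a numerically stable log-softmax before averaging over the batch. It is plain NumPy, not PureML's CCE; `cce_from_logits` is a hypothetical name, and mean reduction over the batch is an assumption.

```python
import numpy as np

def cce_from_logits(y_onehot, logits):
    """Mean categorical cross-entropy from raw logits (stable log-softmax)."""
    # Subtracting the row max leaves softmax unchanged but avoids overflow in exp.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(y_onehot * log_probs).sum(axis=1).mean())

logits = np.array([[2.0, 0.0, -1.0],
                   [0.0, 0.0, 0.0]])
targets = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
loss = cce_from_logits(targets, logits)
print(loss)
```

Working from logits avoids taking the log of a probability that has already underflowed to zero, which is the usual reason to prefer this path over applying softmax first.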
4) Optimizers & Schedulers
PureML ships with four optimizers and three lightweight LR schedulers. All optimizers share the same interface:
- Construct with a flat list of model params (model.parameters) and a base learning rate.
- Call optim.zero_grad() → backprop → optim.step() each iteration.
- Optional weight decay is supported in both classic (coupled L2) and AdamW-style decoupled forms via decoupled_wd (defaults to True).
- All have robust checkpointing: save_state("path") writes a single .pureml.zip; load_state("path") restores hyperparameters, per-parameter slots (e.g., momentums), and even current parameter values for deterministic resume.
Available optimizers
- SGD: stochastic gradient descent with optional momentum.
  - Args: lr, beta=0.0 (momentum), weight_decay=0.0, decoupled_wd=True
  - Update (with momentum): v ← β·v + (1−β)·g, then (AdamW-style if decoupled) w ← w − lr·(wd·w) − lr·v
- AdaGrad: per-parameter adaptive rates via accumulated squared grads.
  - Args: lr, weight_decay=0.0, delta=1e-7, decoupled_wd=True
  - Accumulator: r ← r + g⊙g; update: w ← w − lr·g / (sqrt(r)+δ)
- RMSProp: EMA of squared grads.
  - Args: lr, weight_decay=0.0, beta=0.9, delta=1e-6, decoupled_wd=True
  - Accumulator: r ← EMA_β(g⊙g); update: w ← w − lr·g / (sqrt(r)+δ)
- Adam / AdamW: first & second moments with bias correction.
  - Args: lr, weight_decay=0.0, beta1=0.9, beta2=0.999, delta=1e-8, decoupled_wd=True
  - Moments: v ← EMA_{β1}(g), r ← EMA_{β2}(g⊙g)
  - Bias-correct: v̂ = v/(1−β1^t), r̂ = r/(1−β2^t)
  - Update (AdamW if decoupled): w ← w − lr·(wd·w) − lr·v̂/(sqrt(r̂)+δ)
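The Adam update can be traced by hand in a few lines of NumPy. The sketch below follows the moment, bias-correction, and update formulas listed for Adam, with weight decay left out for brevity. It is not PureML's optimizer class; `adam_step` and the toy quadratic objective are assumptions made for illustration.

```python
import numpy as np

def adam_step(w, g, v, r, t, lr=1e-3, beta1=0.9, beta2=0.999, delta=1e-8):
    """One Adam update without weight decay; t is the 1-based step count."""
    v = beta1 * v + (1 - beta1) * g        # first moment (EMA of g)
    r = beta2 * r + (1 - beta2) * g * g    # second moment (EMA of g*g)
    v_hat = v / (1 - beta1 ** t)           # bias correction
    r_hat = r / (1 - beta2 ** t)
    w = w - lr * v_hat / (np.sqrt(r_hat) + delta)
    return w, v, r

# Toy problem: minimize ||w - target||^2, whose gradient is 2 * (w - target).
target = np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
v = np.zeros(3)
r = np.zeros(3)
for t in range(1, 1001):
    g = 2.0 * (w - target)
    w, v, r = adam_step(w, g, v, r, t, lr=0.05)
print(w)  # should end up close to target
```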
Coupled vs decoupled weight decay:
Set decoupled_wd=False to apply classic L2 regularization through the gradient (g ← g + wd·w).
Leave it as True (default) for AdamW-style parameter decay (w ← w − lr·wd·w) applied separately from the gradient step.
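The difference between the two modes shows up after a single step. The NumPy sketch below applies the SGD-with-momentum formula from the optimizer list in both modes; `sgd_momentum_step` is a hypothetical helper, not PureML's SGD class, and the exact order of operations inside the library may differ.

```python
import numpy as np

def sgd_momentum_step(w, g, v, lr, beta=0.0, weight_decay=0.0, decoupled_wd=True):
    """One SGD step with EMA-style momentum and either weight-decay mode."""
    if weight_decay and not decoupled_wd:
        g = g + weight_decay * w             # coupled: decay enters the gradient
    v = beta * v + (1 - beta) * g
    w_new = w - lr * v
    if weight_decay and decoupled_wd:
        w_new = w_new - lr * weight_decay * w  # decoupled: decay bypasses momentum
    return w_new, v

w0 = np.array([1.0, -2.0])
g0 = np.array([0.5, 0.5])
v0 = np.zeros(2)

w_decoupled, _ = sgd_momentum_step(w0, g0, v0, lr=0.1, beta=0.9,
                                   weight_decay=0.1, decoupled_wd=True)
w_coupled, _ = sgd_momentum_step(w0, g0, v0, lr=0.1, beta=0.9,
                                 weight_decay=0.1, decoupled_wd=False)
print(w_decoupled, w_coupled)
```

In the coupled form the decay term is smoothed by the momentum buffer (and, for adaptive optimizers, rescaled by the second-moment estimate); in the decoupled form it shrinks the weights by a fixed fraction per step regardless of gradient history.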
LR schedulers
Schedulers wrap an optimizer and update optim.lr when you call sched.step():
- StepLR(optim, step_size, gamma=0.1) → piecewise constant: multiply by gamma every step_size steps.
- ExponentialLR(optim, gamma) → smooth exponential decay each step.
- CosineAnnealingLR(optim, T_max, eta_min=0.0) → half-cosine from base_lr to eta_min over T_max steps.
All schedulers expose save_state(...) / load_state(...) and step(n=1) -> new_lr.
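The three schedules can be written as closed-form functions of the step count, which is handy for plotting a schedule before training. These are standalone sketches of the formulas as described, not PureML's scheduler classes; the function names are hypothetical, and holding the cosine schedule at eta_min after T_max steps is an assumption.

```python
import math

def step_lr(base_lr, step, step_size, gamma=0.1):
    """Multiply by gamma once for every completed block of step_size steps."""
    return base_lr * gamma ** (step // step_size)

def exponential_lr(base_lr, step, gamma):
    """Multiply by gamma on every step."""
    return base_lr * gamma ** step

def cosine_annealing_lr(base_lr, step, T_max, eta_min=0.0):
    """Half-cosine from base_lr down to eta_min over T_max steps."""
    step = min(step, T_max)  # assumption: stay at eta_min once T_max is reached
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * step / T_max))

for s in (0, 500, 1000, 2000):
    print(s, step_lr(1e-3, s, step_size=1000, gamma=0.5),
          cosine_annealing_lr(1e-3, s, T_max=2000))
```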
Usage
from pureml.optimizers import Adam, StepLR # also: SGD, AdaGrad, RMSProp; ExponentialLR, CosineAnnealingLR
from pureml.losses import CCE
from pureml.training_utils import DataLoader
from pureml.models.neural_networks import MNIST_BEATER
from pureml.datasets.MNIST import MnistDataset
model = MNIST_BEATER().train()
optim = Adam(model.parameters, lr=1e-3, weight_decay=1e-2) # AdamW by default (decoupled_wd=True)
sched = StepLR(optim, step_size=1000, gamma=0.5) # optional
for epoch in range(5):
    for X, Y in DataLoader(MnistDataset('train'), batch_size=128, shuffle=True):
        optim.zero_grad()
        logits = model(X)
        loss = CCE(Y, logits, from_logits=True)
        loss.backward()
        optim.step()
        sched.step()  # call per-batch or per-epoch as you prefer
5) Data utilities
- Minimal Dataset protocol (__len__, __getitem__)
- DataLoader with batching, shuffling, slice fast-paths, and an optional seeded RNG
- Helpers like one_hot(...) and multi_hot(...)
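For readers who want to see what these utilities amount to, here is a NumPy-only sketch of one-hot encoding and seeded mini-batch iteration. The names `one_hot_sketch` and `iter_batches` are hypothetical; PureML's own one_hot and DataLoader may take different arguments and behave differently.

```python
import numpy as np

def one_hot_sketch(labels, num_classes):
    """Turn integer class ids of shape (N,) into an (N, num_classes) 0/1 matrix."""
    labels = np.asarray(labels, dtype=np.int64)
    out = np.zeros((labels.shape[0], num_classes), dtype=np.float64)
    out[np.arange(labels.shape[0]), labels] = 1.0
    return out

def iter_batches(X, Y, batch_size, shuffle=False, seed=None):
    """Yield (X_batch, Y_batch) pairs; the last batch may be smaller."""
    n = len(X)
    order = np.arange(n)
    if shuffle:
        # A seeded generator makes the visiting order reproducible.
        np.random.default_rng(seed).shuffle(order)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], Y[idx]

X = np.arange(10, dtype=np.float64).reshape(10, 1)
Y = one_hot_sketch(np.arange(10) % 3, num_classes=3)
batches = list(iter_batches(X, Y, batch_size=4, shuffle=True, seed=0))
print([xb.shape[0] for xb, _ in batches])
```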
Saving & Loading
PureML provides two levels of persistence:
- Parameters only: compact save/load of learnable weights
- Full state: parameters + buffers + top-level literals (versioned), using zarr with Blosc(zstd) compression inside a .zip
# Save only trainable parameters
model.save("mnist_params")
# Save full state (params + buffers + literals) to .pureml.zip
model.save_state("mnist_full_state")
# Load later
model = MNIST_BEATER().eval().load_state("mnist_full_state.pureml.zip")
MNIST dataset included
The repo ships a compressed zarr archive of MNIST (uint8, 28×28). The MnistDataset:
- Normalizes images to [0,1] float64
- Uses one-hot labels for training mode
- Supports slicing and context-manager cleanup
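The normalization step is small enough to show in full. The sketch below assumes the scaling is a division by 255 and that images are flattened before the first Affine layer; neither detail is confirmed by the text above, and `normalize_images` is a hypothetical helper rather than part of MnistDataset.

```python
import numpy as np

def normalize_images(raw_uint8):
    """Map uint8 pixel values in 0..255 to float64 values in [0, 1]."""
    return raw_uint8.astype(np.float64) / 255.0

# Stand-in for a batch of images: one 2x2 "image" instead of 28x28.
fake_images = np.array([[[0, 128], [255, 64]]], dtype=np.uint8)
scaled = normalize_images(fake_images)
flat = scaled.reshape(scaled.shape[0], -1)  # (N, H*W), e.g. (N, 784) for MNIST
print(scaled.dtype, flat.shape)
```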
Why PureML?
- Read the source, learn the math. Every gradient is explicit and local.
- Great for teaching & research notes. Small enough to copy into slides or notebooks.
- Fast enough for classic datasets. Vectorized NumPy code + light I/O.
If you need GPUs, distributed training, or huge model zoos, you should use PyTorch/JAX. PureML is intentionally light.
Ongoing development (planned additions)
- Dedicated webpage with detailed and complete documentation
- Convolutional layers and pooling
- Recurrent layers
- Extra evaluation metrics (Precision, Recall, F1-Score)
- Training visualisation utilities
Contributing
PRs, issues, and discussion are welcome! Please include:
- A small, focused change
- Clear rationale (what/why)
- Tests where appropriate
- Tell your friends!
License
Apache-2.0 — see LICENSE in this repo.