An implementation of multilayer perceptrons in NumPy.
nerva-numpy
nerva-numpy is a minimal, transparent implementation of multilayer perceptrons using NumPy tensors.
It is part of the Nerva project — a suite of Python and C++ libraries that provide well-specified, inspectable implementations of neural networks.
➡️ All equations in this repository are written in batch (minibatch) matrix form, meaning feedforward, backpropagation, and loss functions operate on minibatches of inputs rather than single examples.
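To make the batch convention concrete, here is a minimal NumPy sketch of a linear layer applied to a whole minibatch at once. The shapes and the row-wise convention (one example per row) are assumptions made for this sketch, not code taken from the library.

```python
import numpy as np

# Illustrative only: a linear layer in batch matrix form.
# X holds N examples as rows (N x D), W is K x D, b is a length-K bias vector.
N, D, K = 100, 784, 10           # minibatch size, input features, outputs
X = np.random.randn(N, D)        # one minibatch of inputs
W = np.random.randn(K, D)        # layer weights
b = np.zeros(K)                  # layer bias

Z = X @ W.T + b                  # feedforward for the whole minibatch at once
assert Z.shape == (N, K)
```

Written this way, a single matrix product handles all N examples of the batch.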
🗺️ Overview
The nerva libraries aim to demystify neural networks by:
- Providing precise mathematical specifications.
- Implementing core concepts like backpropagation from scratch.
- Avoiding automatic differentiation to foster understanding.
Currently supported: Multilayer Perceptrons (MLPs).
Future extensions to convolutional or recurrent networks are possible.
❓ Why Use nerva
If you're learning or teaching how neural networks work, most modern frameworks (e.g., PyTorch, TensorFlow) can be too opaque. nerva is different:
- Every function has a clear mathematical interpretation.
- Gradient computations are written by hand — no autograd.
- Includes symbolic validation to ensure correctness.
- Modular and backend-agnostic: choose between JAX, NumPy, PyTorch, or TensorFlow.
- Used as a reference implementation for research and education.
- Modularity: the core operations rely on a small set of primitive matrix operations, making the logic easy to inspect, test, and validate.
📦 Available Python Packages
Each backend has a dedicated PyPI package and GitHub repository:
| Package | Backend | PyPI | GitHub |
|---|---|---|---|
| nerva-jax | JAX | nerva-jax | repo |
| nerva-numpy | NumPy | nerva-numpy | repo |
| nerva-tensorflow | TensorFlow | nerva-tensorflow | repo |
| nerva-torch | PyTorch | nerva-torch | repo |
| nerva-sympy | SymPy | nerva-sympy | repo |
📝 nerva-sympy is intended for validation and testing; it depends on the other four.
See the nerva meta-repo for an overview of all Python and C++ variants.
🚀 Quick Start
Installation
The library can be installed in two ways: from the source repository or from the Python Package Index (PyPI).
```bash
# Install from the local repository
pip install .

# Install directly from PyPI
pip install nerva-numpy
```
Example: Define and Train an MLP
```python
# Create a new MLP model
M = MultilayerPerceptron()
M.layers = [
    ActivationLayer(784, 1024, ReLUActivation()),
    ActivationLayer(1024, 512, ReLUActivation()),
    LinearLayer(512, 10)
]
for layer in M.layers:
    layer.set_optimizer("Momentum(0.9)")
    layer.set_weights("XavierNormal")

loss = StableSoftmaxCrossEntropyLossFunction()
learning_rate = ConstantScheduler(0.01)
epochs = 10

# Load data
train_loader, test_loader = create_npz_dataloaders("../data/mnist-flattened.npz", batch_size=100)

# Train the network
stochastic_gradient_descent(M, epochs, loss, learning_rate, train_loader, test_loader)
```
🧱 Architecture
Each major concept is implemented through clear interface classes. Implementations are modular and easy to replace:
| Concept | Interface Class | Example Implementations |
|---|---|---|
| Layer | Layer | ActivationLayer, LinearLayer |
| Activation Function | ActivationFunction | ReLUActivation, SigmoidActivation |
| Loss Function | LossFunction | SoftmaxCrossEntropyLossFunction |
| Optimizer | Optimizer | GradientDescentOptimizer, MomentumOptimizer |
| Learning Rate Schedule | LearningRateScheduler | ConstantScheduler, ExponentialScheduler |
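To make the scheduler interface concrete: in the training loop shown further below, the scheduler is simply called with the epoch number (`lr = learning_rate(epoch)`). The following is a hypothetical sketch of what an exponential scheduler could look like under that assumption; the class name and constructor arguments are illustrative, not the library's actual API.

```python
class ExponentialSchedulerSketch:
    """Hypothetical scheduler: maps an epoch index to a learning rate."""

    def __init__(self, lr: float, decay: float):
        self.lr = lr
        self.decay = decay

    def __call__(self, epoch: int) -> float:
        # decay the initial learning rate exponentially with the epoch number
        return self.lr * (self.decay ** epoch)

learning_rate = ExponentialSchedulerSketch(lr=0.01, decay=0.9)
print(learning_rate(0), learning_rate(5))
```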
🛠 Features
- Feedforward and backpropagation logic match documented equations exactly.
- Formulas use batch matrix form, enabling efficient computation over minibatches.
- Customizable optimizers per parameter group using a composite pattern (see the sketch after this list).
- Symbolic gradient validation using nerva-sympy.
- Lightweight command-line interface for experiments.
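The composite-optimizer feature above can be pictured as follows. This is a hypothetical sketch, assuming each parameter array keeps a reference to its gradient array and that optimizers expose an `update(lr)` step; the class names and signatures are illustrative, not the library's API.

```python
import numpy as np

class MomentumOptimizerSketch:
    """Momentum update for one parameter array x and its gradient array Dx."""

    def __init__(self, x: np.ndarray, Dx: np.ndarray, mu: float):
        self.x, self.Dx, self.mu = x, Dx, mu
        self.delta = np.zeros_like(x)

    def update(self, lr: float) -> None:
        # delta <- mu * delta - lr * Dx;  x <- x + delta  (in place)
        self.delta = self.mu * self.delta - lr * self.Dx
        self.x += self.delta

class CompositeOptimizerSketch:
    """Applies one optimizer per parameter group in a single call."""

    def __init__(self, optimizers):
        self.optimizers = optimizers

    def update(self, lr: float) -> None:
        for optimizer in self.optimizers:
            optimizer.update(lr)
```

With this structure, the weights and bias of a layer can each be paired with their own optimizer while the training loop only ever issues a single update call.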
📚 Documentation
The full documentation is hosted on GitHub Pages.
Relevant papers:
- Nerva: a Truly Sparse Implementation of Neural Networks (arXiv:2407.17437). Introduces the library and reports sparse training experiments.
- Batch Matrix-form Equations and Implementation of Multilayer Perceptrons (arXiv:2511.11918). Includes mathematical specifications and derivations.
🧪 Training Loop Internals
A mini-batch gradient descent loop with forward, backward, and optimizer steps can be implemented in just a few lines of code:
```python
def stochastic_gradient_descent(M: MultilayerPerceptron,
                                epochs: int,
                                loss: LossFunction,
                                learning_rate: LearningRateScheduler,
                                train_loader: DataLoader):
    for epoch in range(epochs):
        lr = learning_rate(epoch)
        # Iterate over mini-batches X with targets T
        for (X, T) in train_loader:
            Y = M.feedforward(X)
            DY = loss.gradient(Y, T) / Y.shape[0]
            M.backpropagate(Y, DY)
            M.optimize(lr)
```
✅ Symbolic Validation (Softmax Layer Example)
We validate the manually written backpropagation code using symbolic differentiation via SymPy.
This example validates the gradient of the softmax layer. It also illustrates how the gradients DZ, DW, Db, and DX of the intermediate variable Z, the weights W, the bias b, and the input X are computed from the output Y and its gradient DY.
```python
# Backpropagation gradients
DZ = hadamard(Y, DY - row_repeat(diag(Y.T * DY).T, K))
DW = DZ * X.T
Db = rows_sum(DZ)
DX = W.T * DZ

# Symbolic comparison
DW1 = gradient(loss(Y), w)
assert equal_matrices(DW, DW1)
```
🔢 Implementation via Matrix Operations
The validated backpropagation formulae are implemented directly using the library's core set of primitive matrix operations. This approach provides a significant advantage in clarity and maintainability by expressing all computations, from loss functions and activation layers to gradient calculations, through a single, shared vocabulary of operations.
This stands in contrast to implementations that use hundreds of lines of scattered, special-case logic for the same mathematical result. By reducing complex formulae to a concise sequence of well-defined primitives, the implementation becomes both more readable and far easier to verify and debug.
For a complete reference of all available operations, see the Table of Matrix Operations.
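As an illustration of what such primitives can look like, here are possible NumPy definitions of three operations used in the softmax example above (hadamard, rows_sum, row_repeat). These sketches convey the intent of the vocabulary; the library's own definitions may differ in details such as naming and shape conventions.

```python
import numpy as np

def hadamard(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    # elementwise (Hadamard) product
    return A * B

def rows_sum(A: np.ndarray) -> np.ndarray:
    # sum the entries of each row, yielding a column vector
    return A.sum(axis=1, keepdims=True)

def row_repeat(x: np.ndarray, m: int) -> np.ndarray:
    # stack the row vector x m times
    return np.repeat(x.reshape(1, -1), m, axis=0)
```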
📜 License
Distributed under the Boost Software License 1.0.
License file
🙋 Contributing
Bug reports and contributions are welcome via the GitHub issue tracker.