
An implementation of multilayer perceptrons in NumPy.

Project description

nerva-numpy


nerva-numpy is a minimal, transparent implementation of multilayer perceptrons using NumPy tensors.
It is part of the Nerva project — a suite of Python and C++ libraries that provide well-specified, inspectable implementations of neural networks.

➡️ All equations in this repository are written in batch (minibatch) matrix form, meaning feedforward, backpropagation, and loss functions operate on minibatches of inputs rather than single examples.
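
For intuition, here is what one dense layer looks like when applied to a whole minibatch at once. This is a plain NumPy sketch for illustration only (it is not the library's own code) and assumes inputs are stored one example per row:

import numpy as np

# Minibatch of N examples with D input features, stored row-wise: X has shape (N, D)
N, D, K = 100, 784, 1024
X = np.random.randn(N, D)

# Layer parameters: weight matrix W of shape (K, D) and bias vector b of shape (K,)
W = np.random.randn(K, D) * np.sqrt(2.0 / D)
b = np.zeros(K)

# Batch feedforward: every row of Z is the pre-activation of one example
Z = X @ W.T + b          # shape (N, K)
Y = np.maximum(Z, 0.0)   # ReLU applied elementwise to the whole minibatch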

🗺️ Overview

The nerva libraries aim to demystify neural networks by:

  • Providing precise mathematical specifications.
  • Implementing core concepts like backpropagation from scratch.
  • Avoiding automatic differentiation to foster understanding.

Currently supported: Multilayer Perceptrons (MLPs).
Future extensions to convolutional or recurrent networks are possible.


❓ Why Use nerva?

If you're learning or teaching how neural networks work, most modern frameworks (e.g., PyTorch, TensorFlow) can be too opaque. nerva is different:

  • Every function has a clear mathematical interpretation.
  • Gradient computations are written by hand — no autograd.
  • Includes symbolic validation to ensure correctness.
  • Modular and backend-agnostic: choose between JAX, NumPy, PyTorch, or TensorFlow.
  • Used as a reference implementation for research and education.
  • Built on a small set of primitive matrix operations, making the logic easy to inspect, test, and validate.

📦 Available Python Packages

Each backend has a dedicated PyPI package and GitHub repository:

Package | Backend | PyPI | GitHub
nerva-jax | JAX | nerva-jax | repo
nerva-numpy | NumPy | nerva-numpy | repo
nerva-tensorflow | TensorFlow | nerva-tensorflow | repo
nerva-torch | PyTorch | nerva-torch | repo
nerva-sympy | SymPy | nerva-sympy | repo

📝 nerva-sympy is intended for validation and testing — it depends on the other four.

See the nerva meta-repo for an overview of all Python and C++ variants.


🚀 Quick Start

Installation

The library can be installed in two ways: from the source repository or from the Python Package Index (PyPI).

# Install from the local repository
pip install .
# Install directly from PyPI
pip install nerva-numpy
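
A quick smoke test after installation (assuming the top-level package shares the distribution name nerva_numpy):

python -c "import nerva_numpy; print(nerva_numpy.__name__)"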

Example: Define and Train an MLP

# Create a new MLP model
M = MultilayerPerceptron()
M.layers = [
    ActivationLayer(784, 1024, ReLUActivation()),
    ActivationLayer(1024, 512, ReLUActivation()),
    LinearLayer(512, 10)
]
for layer in M.layers:
    layer.set_optimizer("Momentum(0.9)")
    layer.set_weights("XavierNormal")

loss = StableSoftmaxCrossEntropyLossFunction()
learning_rate = ConstantScheduler(0.01)
epochs = 10

# Load data
train_loader, test_loader = create_npz_dataloaders("../data/mnist-flattened.npz", batch_size=100)

# Train the network
stochastic_gradient_descent(M, epochs, loss, learning_rate, train_loader, test_loader)
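
After training, the test accuracy can be estimated by running the trained network over the test batches. The snippet below is a sketch that reuses M and test_loader from the example above; it assumes row-wise batches of class scores and one-hot targets T, so adapt it if your data layout differs:

import numpy as np

# Sketch: classification accuracy over the test set (assumes one example per row
# and one-hot targets)
correct, total = 0, 0
for (X, T) in test_loader:
    Y = M.feedforward(X)                 # class scores, one row per example
    predicted = np.argmax(Y, axis=1)     # predicted class per example
    actual = np.argmax(T, axis=1)        # true class from the one-hot target
    correct += int(np.sum(predicted == actual))
    total += X.shape[0]
print(f"test accuracy: {correct / total:.4f}")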

🧱 Architecture

Each major concept is implemented through clear interface classes. Implementations are modular and easy to replace:

Concept | Interface Class | Example Implementations
Layer | Layer | ActivationLayer, LinearLayer
Activation Function | ActivationFunction | ReLUActivation, SigmoidActivation
Loss Function | LossFunction | SoftmaxCrossEntropyLossFunction
Optimizer | Optimizer | GradientDescentOptimizer, MomentumOptimizer
Learning Rate Schedule | LearningRateScheduler | ConstantScheduler, ExponentialScheduler
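
Because each concept sits behind a small interface, swapping in a new implementation mostly amounts to writing one class. The sketch below shows the general shape of a custom activation function; the method names __call__ and gradient are illustrative assumptions, so consult the ActivationFunction base class for the actual interface:

import numpy as np

# Hypothetical sketch of a custom activation; the method names are assumptions,
# not the library's documented interface.
class LeakyReLUActivation:
    def __init__(self, alpha: float = 0.01):
        self.alpha = alpha

    def __call__(self, Z: np.ndarray) -> np.ndarray:
        # Applied elementwise to the whole minibatch Z
        return np.where(Z > 0, Z, self.alpha * Z)

    def gradient(self, Z: np.ndarray) -> np.ndarray:
        # Elementwise derivative, used during backpropagation
        return np.where(Z > 0, 1.0, self.alpha)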

🛠 Features

  • Feedforward and backpropagation logic match documented equations exactly.
  • Formulas use batch matrix form, enabling efficient computation over minibatches.
  • Customizable optimizers per parameter group using a composite pattern.
  • Symbolic gradient validation using nerva-sympy.
  • Lightweight command-line interface for experiments.

📚 Documentation

The full documentation is hosted on GitHub Pages.

Relevant papers:

  1. Nerva: a Truly Sparse Implementation of Neural Networks (arXiv:2407.17437). Introduces the library and reports sparse training experiments.

  2. Batch Matrix-form Equations and Implementation of Multilayer Perceptrons (arXiv:2511.11918). Includes the mathematical specifications and derivations.


🧪 Training Loop Internals

A mini-batch gradient descent loop with forward, backward, and optimizer steps can be implemented in just a few lines of code:

def stochastic_gradient_descent(M: MultilayerPerceptron,
                                epochs: int,
                                loss: LossFunction,
                                learning_rate: LearningRateScheduler,
                                train_loader: DataLoader):

    for epoch in range(epochs):
        lr = learning_rate(epoch)

        # Iterate over mini-batches X with target T
        for (X, T) in train_loader:
            Y = M.feedforward(X)
            DY = loss.gradient(Y, T) / Y.shape[0]
            M.backpropagate(Y, DY)
            M.optimize(lr)
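
The call M.optimize(lr) applies each layer's optimizer to its parameters. Conceptually, the momentum update configured in the quick-start example behaves like the plain NumPy sketch below (an illustration of the update rule, not the library's internal code):

import numpy as np

# Conceptual momentum update for one weight matrix W with gradient DW
def momentum_step(W, DW, velocity, lr, mu=0.9):
    velocity *= mu          # decay the accumulated velocity
    velocity -= lr * DW     # add the current, scaled gradient
    W += velocity           # apply the update in place
    return W, velocity

W = np.zeros((10, 512))
velocity = np.zeros_like(W)
DW = np.random.randn(10, 512)
W, velocity = momentum_step(W, DW, velocity, lr=0.01)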

✅ Symbolic Validation (Softmax Layer Example)

We validate the manually written backpropagation code using symbolic differentiation via SymPy.

This example validates the gradient of the softmax layer. It also illustrates how the gradients DZ, DW, Db, and DX of the intermediate variable Z, the weights W, the bias b, and the input X are computed from the output Y and its gradient DY.

# Backpropagation gradients
DZ = hadamard(Y, DY - row_repeat(diag(Y.T * DY).T, K))
DW = DZ * X.T
Db = rows_sum(DZ)
DX = W.T * DZ

# Symbolic comparison
DW1 = gradient(loss(Y), w)
assert equal_matrices(DW, DW1)

🔢 Implementation via Matrix Operations

The validated backpropagation formulae are implemented directly using the library's core set of primitive matrix operations. This approach provides a significant advantage in clarity and maintainability by expressing all computations, from loss functions and activation layers to gradient calculations, through a single, shared vocabulary of operations.

This stands in contrast to implementations that use hundreds of lines of scattered, special-case logic for the same mathematical result. By reducing complex formulae to a concise sequence of well-defined primitives, the implementation becomes both more readable and far easier to verify and debug.

For a complete reference of all available operations, see the Table of Matrix Operations.
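
To make these primitives concrete, here are plain NumPy versions of a few of the operations used in the softmax example above. These definitions are illustrative; the library's own implementations may differ in details such as the orientation convention for vectors:

import numpy as np

def hadamard(A, B):
    # Elementwise (Hadamard) product of two equally shaped matrices
    return A * B

def row_repeat(x, m):
    # Stack the row vector x (shape (1, n)) m times into an (m, n) matrix
    return np.repeat(x, m, axis=0)

def rows_sum(A):
    # Sum each row of A, producing a column vector of row sums
    return A.sum(axis=1, keepdims=True)

def diag(A):
    # Diagonal of a square matrix, returned as a column vector
    return np.diag(A).reshape(-1, 1)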


📜 License

Distributed under the Boost Software License 1.0.
License file


🙋 Contributing

Bug reports and contributions are welcome via the GitHub issue tracker.



Download files

Download the file for your platform.

Source Distribution

nerva_numpy-1.0.0.tar.gz (34.3 kB)

Uploaded Source

Built Distribution


nerva_numpy-1.0.0-py3-none-any.whl (27.1 kB)

Uploaded Python 3

File details

Details for the file nerva_numpy-1.0.0.tar.gz.

File metadata

  • Download URL: nerva_numpy-1.0.0.tar.gz
  • Upload date:
  • Size: 34.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for nerva_numpy-1.0.0.tar.gz
Algorithm | Hash digest
SHA256 | 0a3a2891dcd555d331154ed3433dbfae04145c0f41954563cea8c30830f051b4
MD5 | a45bdb7a4ff4a1982fa1750512077741
BLAKE2b-256 | 2dfecdaae3d6034c052aa7e3d205021fe612882603d67e9d4e2c4e24cd5fdf6c


File details

Details for the file nerva_numpy-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: nerva_numpy-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 27.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for nerva_numpy-1.0.0-py3-none-any.whl
Algorithm | Hash digest
SHA256 | b33462197d25f0b354b8483f89ca82df7ff1393f63d8701b1aa37c88fec0233e
MD5 | fabe63a79fb88b00662d02d5d201a45c
BLAKE2b-256 | f5cd7394bd3a5ca8f4e42dfe1d09aca142814840f43d48bf982a5a2d336403cc

