
An implementation of multilayer perceptrons in TensorFlow.

Project description

nerva-tensorflow


nerva-tensorflow is a minimal, transparent implementation of multilayer perceptrons using TensorFlow tensors.
It is part of the Nerva project — a suite of Python and C++ libraries that provide well-specified, inspectable implementations of neural networks.

➡️ All equations in this repository are written in batch (minibatch) matrix form, meaning feedforward, backpropagation, and loss functions operate on minibatches of inputs rather than single examples.
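
For intuition, the following TensorFlow sketch shows what batch form means for a single ReLU layer. It is purely illustrative and assumes a rows-are-examples convention; it is not the library's own code.

import tensorflow as tf

N, D, K = 100, 784, 1024        # minibatch size, input features, output features
X = tf.random.normal((N, D))    # one minibatch: each row is an example
W = tf.random.normal((K, D))    # layer weights
b = tf.zeros((K,))              # layer bias
Z = X @ tf.transpose(W) + b     # pre-activations for the whole minibatch at once
Y = tf.nn.relu(Z)               # activations, shape (N, K)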

🗺️ Overview

The nerva libraries aim to demystify neural networks by:

  • Providing precise mathematical specifications.
  • Implementing core concepts like backpropagation from scratch.
  • Avoiding automatic differentiation to foster understanding.

Currently supported: Multilayer Perceptrons (MLPs).
Future extensions to convolutional or recurrent networks are possible.


❓ Why Use nerva

If you're learning or teaching how neural networks work, most modern frameworks (e.g., PyTorch, TensorFlow) can be too opaque. nerva is different:

  • Every function has a clear mathematical interpretation.
  • Gradient computations are written by hand — no autograd.
  • Includes symbolic validation to ensure correctness.
  • Modular and backend-agnostic: choose between JAX, NumPy, PyTorch, or TensorFlow.
  • Used as a reference implementation for research and education.
  • The core operations rely on a small set of primitive matrix operations, making the logic easy to inspect, test, and validate.

📦 Available Python Packages

Each backend has a dedicated PyPI package and GitHub repository:

Package           Backend     PyPI              GitHub
nerva-jax         JAX         nerva-jax         repo
nerva-numpy       NumPy       nerva-numpy       repo
nerva-tensorflow  TensorFlow  nerva-tensorflow  repo
nerva-torch       PyTorch     nerva-torch       repo
nerva-sympy       SymPy       nerva-sympy       repo

📝 nerva-sympy is intended for validation and testing — it depends on the other four.

See the nerva meta-repo for an overview of all Python and C++ variants.


🚀 Quick Start

Installation

The library can be installed in two ways: from the source repository or from the Python Package Index (PyPI).

# Install from the local repository
pip install .
# Install directly from PyPI
pip install nerva-tensorflow

Example: Define and Train an MLP

# Create a new MLP model
M = MultilayerPerceptron()
M.layers = [
    ActivationLayer(784, 1024, ReLUActivation()),
    ActivationLayer(1024, 512, ReLUActivation()),
    LinearLayer(512, 10)
]
for layer in M.layers:
    layer.set_optimizer("Momentum(0.9)")
    layer.set_weights("XavierNormal")

loss = StableSoftmaxCrossEntropyLossFunction()
learning_rate = ConstantScheduler(0.01)
epochs = 10

# Load data
train_loader, test_loader = create_npz_dataloaders("../data/mnist-flattened.npz", batch_size=100)

# Train the network
stochastic_gradient_descent(M, epochs, loss, learning_rate, train_loader, test_loader)

🧱 Architecture

Each major concept is implemented through clear interface classes. Implementations are modular and easy to replace:

Concept                 Interface Class        Example Implementations
Layer                   Layer                  ActivationLayer, LinearLayer
Activation Function     ActivationFunction     ReLUActivation, SigmoidActivation
Loss Function           LossFunction           SoftmaxCrossEntropyLossFunction
Optimizer               Optimizer              GradientDescentOptimizer, MomentumOptimizer
Learning Rate Schedule  LearningRateScheduler  ConstantScheduler, ExponentialScheduler
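
As an illustration of how an implementation can be swapped in, the sketch below defines a hypothetical leaky-ReLU activation against the ActivationFunction concept. The method names used here (__call__ and gradient) are assumptions made for this sketch, not necessarily the library's exact interface.

import tensorflow as tf

class LeakyReLUActivation:
    # Hypothetical implementation of the ActivationFunction concept.
    def __init__(self, alpha=0.01):
        self.alpha = alpha

    def __call__(self, Z):
        # Forward pass: element-wise leaky ReLU over the whole minibatch.
        return tf.where(Z > 0, Z, self.alpha * Z)

    def gradient(self, Z):
        # Element-wise derivative of the activation, used during backpropagation.
        return tf.where(Z > 0, tf.ones_like(Z), self.alpha * tf.ones_like(Z))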

🛠 Features

  • Feedforward and backpropagation logic match documented equations exactly.
  • Formulas use batch matrix form, enabling efficient computation over minibatches.
  • Customizable optimizers per parameter group using a composite pattern (see the sketch after this list).
  • Symbolic gradient validation using nerva-sympy.
  • Lightweight command-line interface for experiments.
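
To make the composite-optimizer idea concrete, here is a minimal sketch. The class names mirror those listed under Architecture, but the internals and the update method are assumptions for illustration only, not the library's actual code.

import tensorflow as tf

class MomentumOptimizer:
    # Momentum update for a single parameter; x is a tf.Variable and grad_fn
    # returns the gradient currently stored for that parameter.
    def __init__(self, x, grad_fn, mu=0.9):
        self.x = x
        self.grad_fn = grad_fn
        self.mu = mu
        self.delta = tf.zeros_like(x)

    def update(self, eta):
        self.delta = self.mu * self.delta - eta * self.grad_fn()
        self.x.assign_add(self.delta)

class CompositeOptimizer:
    # Composite pattern: one optimizer per parameter group, driven by a single call.
    def __init__(self, optimizers):
        self.optimizers = optimizers

    def update(self, eta):
        for optimizer in self.optimizers:
            optimizer.update(eta)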

📚 Documentation

The full documentation is hosted on GitHub Pages.

Relevant papers:

  1. Nerva: a Truly Sparse Implementation of Neural Networks
     arXiv:2407.17437. Introduces the library and reports sparse training experiments.

  2. Batch Matrix-form Equations and Implementation of Multilayer Perceptrons
     arXiv:2511.11918. Includes mathematical specifications and derivations.


🧪 Training Loop Internals

A mini-batch gradient descent loop with forward, backward, and optimizer steps can be implemented in just a few lines of code:

def stochastic_gradient_descent(M: MultilayerPerceptron,
                                epochs: int,
                                loss: LossFunction,
                                learning_rate: LearningRateScheduler,
                                train_loader: DataLoader):

    for epoch in range(epochs):
        lr = learning_rate(epoch)

        # Iterate over mini-batches X with target T
        for (X, T) in train_loader:
            Y = M.feedforward(X)
            DY = loss.gradient(Y, T) / Y.shape[0]   # average the loss gradient over the minibatch
            M.backpropagate(Y, DY)
            M.optimize(lr)

✅ Symbolic Validation (Softmax Layer Example)

We validate the manually written backpropagation code using symbolic differentiation via SymPy.

This example validates the gradient of the softmax layer. It also illustrates how the gradients DZ, DW, Db, and DX of the intermediate variable Z, the weights W, the bias b, and the input X are computed from the output Y and its gradient DY.

# Backpropagation gradients
DZ = hadamard(Y, DY - row_repeat(diag(Y.T * DY).T, K))
DW = DZ * X.T
Db = rows_sum(DZ)
DX = W.T * DZ

# Symbolic comparison
DW1 = gradient(loss(Y), w)
assert equal_matrices(DW, DW1)
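
To show the validation idea in a self-contained way, the following sketch uses plain SymPy and the single-example (vector) form of the softmax rule, dz = y ∘ (dy - (y·dy)·1), rather than the batch form above. It is an illustration of the technique, not the library's own test code.

import sympy as sp

K = 3
z = sp.Matrix(sp.symbols('z1:4'))      # pre-activations of one example
dy = sp.Matrix(sp.symbols('dy1:4'))    # an arbitrary upstream gradient dL/dy

e = z.applyfunc(sp.exp)
y = e / sum(e)                         # softmax(z)

# Hand-written backpropagation rule: dz = y ∘ (dy - (y·dy)·1)
dz_manual = y.multiply_elementwise(dy - y.dot(dy) * sp.ones(K, 1))

# The same quantity from first principles: dz_j = sum_i dy_i * d y_i / d z_j
dz_symbolic = sp.Matrix([sum(dy[i] * sp.diff(y[i], z[j]) for i in range(K))
                         for j in range(K)])

assert all(sp.simplify(a - b) == 0 for a, b in zip(dz_manual, dz_symbolic))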

🔢 Implementation via Matrix Operations

The validated backpropagation formulae are implemented directly using the library's core set of primitive matrix operations. This approach provides a significant advantage in clarity and maintainability by expressing all computations, from loss functions and activation layers to gradient calculations, through a single, shared vocabulary of operations.

This stands in contrast to implementations that use hundreds of lines of scattered, special-case logic for the same mathematical result. By reducing complex formulae to a concise sequence of well-defined primitives, the implementation becomes both more readable and far easier to verify and debug.

For a complete reference of all available operations, see the Table of Matrix Operations.
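
As a flavour of what these primitives look like, here are rough TensorFlow sketches of a few operations used in the snippet above. The names match the snippet, but the library's own definitions may differ in detail.

import tensorflow as tf

def hadamard(X, Y):
    # Element-wise (Hadamard) product.
    return X * Y

def diag(X):
    # Main diagonal of a square matrix, as a vector.
    return tf.linalg.diag_part(X)

def row_repeat(x, m):
    # Stack m copies of the row vector x.
    return tf.tile(tf.reshape(x, (1, -1)), (m, 1))

def rows_sum(X):
    # Sum the entries of each row.
    return tf.reduce_sum(X, axis=1)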


📜 License

Distributed under the Boost Software License 1.0 (see the license file).


🙋 Contributing

Bug reports and contributions are welcome via the GitHub issue tracker.
