A simple neural network library written in Python

Enilnets Library Documentation

A pure NumPy-based deep learning library with support for dense, convolutional, pooling, batch normalization, dropout, and sparse layers. Includes multiple optimizers, loss functions, activation functions, and weight initialization methods.

Table of Contents

Quick Start
Core Architecture
Model Configuration
- NeuralNet Constructor
- Summary
Layer Types
Forward Pass
Backward Pass
Optimizers
Loss Functions
Training
Activation Functions
Weight Initialization
Reinforcement Learning
Model I/O
Utility Functions

Quick Start

from Enilnets import NeuralNet
import numpy as np

# Create a simple classifier
model = NeuralNet(learning_rate=0.001, optimizer="adam", l2_lambda=0.01)

# Build architecture
model.add_dense(784, 256, activation="relu")
model.add_dropout(0.3)
model.add_dense(256, 10, activation="softmax")

# Train
X_train = np.random.randn(1000, 784)
Y_train = np.eye(10)[np.random.randint(0, 10, 1000)]

history = model.Train(X_train, Y_train, epochs=10, batch_size=32)

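# Predict (X_test is placeholder data with the same feature width as X_train)
X_test = np.random.randn(200, 784)
predictions = model.Forward(X_test)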

Core Architecture

The library is built around the NeuralNet class in base.py, which maintains:

Attribute	Type	Description
`layers`	`list`	Layer definitions with weights, biases, and hyperparameters
`learning_rate`	`float`	Global learning rate
`optimizer_type`	`str`	Optimizer name: `"sgd"`, `"rmsprop"`, `"adagrad"`, `"adam"`
`l2_lambda`	`float`	L2 regularization coefficient
`momentum`	`float`	Momentum coefficient for SGD
`outputs`	`list`	Cached layer outputs during forward pass
`pre_activations`	`list`	Cached pre-activation values (z)
`batchnorm_cache`	`list`	BatchNorm statistics cache
`deltas`	`list`	Gradient errors per layer
`opt_state`	`list`	Optimizer state (momentum, velocity)
`t`	`int`	Global timestep for bias correction (Adam)

Model Configuration

NeuralNet Constructor

NeuralNet(learning_rate=0.001, optimizer="adam", l2_lambda=0.01, momentum=0.9)

Parameters:

Parameter	Default	Description
`learning_rate`	`0.001`	Step size for parameter updates
`optimizer`	`"adam"`	Optimization algorithm. Options: `"sgd"`, `"rmsprop"`, `"adagrad"`, `"adam"`
`l2_lambda`	`0.01`	L2 regularization strength applied to weights
`momentum`	`0.9`	Momentum factor for SGD optimizer

Example:

# Adam with low regularization
model = NeuralNet(learning_rate=0.001, optimizer="adam", l2_lambda=0.001)

# SGD with high momentum
model = NeuralNet(learning_rate=0.01, optimizer="sgd", momentum=0.95, l2_lambda=0.0)

Summary

model.summary()

Prints a model architecture overview including layer types, dimensions, and total parameter count. No parameters, returns None.

Output Example:

Model Summary
============================================================
Optimizer: ADAM | LR: 0.001 | L2: 0.01
============================================================
Layer 0: DENSE - Input: 784, Output: 256, Params: 200960
Layer 1: DROPOUT
Layer 2: DENSE - Input: 256, Output: 10, Params: 2570
Total Parameters: 203530
============================================================
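The parameter counts in the summary follow from the layer shapes: a dense layer with weights of shape (n_out, n_in) and a bias of shape (n_out,) has n_in * n_out + n_out trainable values. The helper below is a hypothetical illustration of that arithmetic, not part of Enilnets.

```python
def dense_param_count(n_in: int, n_out: int) -> int:
    """Weights (n_out x n_in) plus one bias per output unit."""
    return n_in * n_out + n_out


# Layers from the summary above; dropout has no trainable parameters.
layer_params = [dense_param_count(784, 256), 0, dense_param_count(256, 10)]
total_params = sum(layer_params)
print(layer_params, total_params)
```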

Layer Types

All layer methods are bound to the NeuralNet class and return None (they mutate self.layers).

Dense Layer

model.add_dense(n_in, n_out, activation="relu", init_method="xavier_uniform")

Fully connected (linear) layer: output = activation(x @ W^T + b)

Parameters:

Parameter	Default	Description
`n_in`	required	Number of input features
`n_out`	required	Number of output features (neurons)
`activation`	`"relu"`	Activation function name (see Activation Functions)
`init_method`	`"xavier_uniform"`	Weight initialization method (see Weight Initialization)

Layer Dictionary Keys:

type: "dense"
weights: (n_out, n_in) ndarray
bias: (n_out,) ndarray
activation: activation string

Example:

model.add_dense(784, 256, activation="relu", init_method="he_normal")
model.add_dense(256, 10, activation="softmax")
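As a rough illustration of the formula above, the following NumPy sketch computes a dense forward pass with the documented weight layout (n_out, n_in). It does not call Enilnets; the variable names are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, batch = 4, 3, 5

W = rng.standard_normal((n_out, n_in))  # same layout as layer["weights"]
b = np.zeros(n_out)                     # same layout as layer["bias"]
x = rng.standard_normal((batch, n_in))

z = x @ W.T + b            # pre-activation, shape (batch, n_out)
out = np.maximum(0.0, z)   # ReLU activation
print(out.shape)
```

Because the weights are stored as (n_out, n_in), the transpose is needed so that each row of x is mapped to n_out values.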

Sparse Layer

model.add_sparse(n_in, n_out, connectivity=0.5, activation="relu", init_method="xavier_uniform")

Dense layer with random connectivity masking. Only a fraction of weights are non-zero.

Parameters:

Parameter	Default	Description
`n_in`	required	Number of input features
`n_out`	required	Number of output features
`connectivity`	`0.5`	Fraction of weights to keep (0.0 to 1.0)
`activation`	`"relu"`	Activation function name
`init_method`	`"xavier_uniform"`	Weight initialization method

Layer Dictionary Keys:

type: "sparse"
weights: (n_out, n_in) ndarray (masked)
bias: (n_out,) ndarray
mask: (n_out, n_in) boolean ndarray
activation: activation string

Note: During backpropagation, gradients are masked by layer["mask"] to maintain sparsity.

Example:

# 30% connectivity sparse layer
model.add_sparse(784, 256, connectivity=0.3, activation="relu")
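The masking idea can be sketched in plain NumPy. This is an assumption-laden illustration: it keeps each weight independently with probability connectivity, which may differ from how Enilnets actually draws its mask.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, connectivity = 100, 50, 0.3

# Assumption: each connection is kept independently with probability `connectivity`.
mask = rng.random((n_out, n_in)) < connectivity
W = rng.standard_normal((n_out, n_in)) * mask   # pruned weights start at zero

# A plain gradient step with a masked gradient leaves pruned weights at zero.
grad_W = rng.standard_normal((n_out, n_in))
W_new = W - 0.01 * (grad_W * mask)

kept_fraction = mask.mean()
print(round(float(kept_fraction), 3))
```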

Conv2D Layer

model.add_conv2d(in_ch, out_ch, k, activation="relu", init_method="he_normal")

2D convolutional layer. Uses im2col for efficient convolution computation.

Parameters:

Parameter	Default	Description
`in_ch`	required	Number of input channels
`out_ch`	required	Number of output channels (filters)
`k`	required	Kernel size (square kernel: k x k)
`activation`	`"relu"`	Activation function name
`init_method`	`"he_normal"`	Weight initialization method

Layer Dictionary Keys:

type: "conv2d"
weights: (out_ch, in_ch, k, k) ndarray
bias: (out_ch,) ndarray
in_ch, out_ch, k: integers
activation: activation string

Notes:

Stride is fixed at 1, padding is fixed at 0.
Output spatial dimensions: (H - k + 1, W - k + 1)
Input must be 4D: (batch, channels, height, width)

Example:

# 3x3 conv, 1 input channel -> 32 output channels
model.add_conv2d(1, 32, k=3, activation="relu")
# 3x3 conv, 32 input channels -> 64 output channels
model.add_conv2d(32, 64, k=3, activation="relu")
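Since stride and padding are fixed, output shapes can be worked out by hand before stacking layers. The helper below is a hypothetical utility (not an Enilnets function) that applies the (H - k + 1, W - k + 1) rule to an assumed batch of 32x32 images.

```python
def conv2d_output_shape(batch, height, width, out_ch, k):
    """Output shape of a stride-1, zero-padding convolution with a k x k kernel."""
    if k > height or k > width:
        raise ValueError("kernel is larger than the input")
    return (batch, out_ch, height - k + 1, width - k + 1)


# Two stacked 3x3 convolutions on a batch of 8 single-channel 32x32 images.
shape1 = conv2d_output_shape(8, 32, 32, out_ch=32, k=3)
shape2 = conv2d_output_shape(shape1[0], shape1[2], shape1[3], out_ch=64, k=3)
print(shape1, shape2)
```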

Flatten Layer

model.add_flatten()

Flattens multi-dimensional input to 2D: (batch, ...) -> (batch, -1).

Parameters: None

Layer Dictionary Keys:

type: "flatten"

Example:

# Assuming 28x28 single-channel input: conv (k=3) -> 26x26, max pool (2) -> 13x13
model.add_conv2d(1, 32, k=3)
model.add_maxpool2d(2)
model.add_flatten()  # Flattens (B, 32, 13, 13) to (B, 32*13*13)
model.add_dense(32*13*13, 128)

MaxPool2D Layer

model.add_maxpool2d(pool_size=2)

Max pooling with a square kernel. Reduces each spatial dimension by a factor of pool_size.

Parameters:

Parameter	Default	Description
`pool_size`	`2`	Size of pooling window (p x p)

Layer Dictionary Keys:

type: "maxpool2d"
p: pool size integer

Notes:

Input is trimmed to multiples of pool_size before pooling.
Backpropagation routes gradient only to the max-valued position(s) in each window.

Example:

model.add_conv2d(1, 32, k=3)
model.add_maxpool2d(pool_size=2)  # Halves spatial dimensions
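The trimming behaviour noted above can be illustrated with a small NumPy sketch. It assumes non-overlapping windows (stride equal to pool_size), and the function name is hypothetical rather than the library's own.

```python
import numpy as np


def maxpool2d_forward_sketch(x, p=2):
    """Non-overlapping p x p max pooling on a (N, C, H, W) array."""
    n, c, h, w = x.shape
    h_out, w_out = h // p, w // p
    x = x[:, :, :h_out * p, :w_out * p]            # trim to a multiple of p
    windows = x.reshape(n, c, h_out, p, w_out, p)  # split each axis into windows
    return windows.max(axis=(3, 5))


x = np.arange(2 * 1 * 5 * 5, dtype=float).reshape(2, 1, 5, 5)
pooled = maxpool2d_forward_sketch(x, p=2)  # 5x5 is trimmed to 4x4, then pooled to 2x2
print(pooled[0, 0])
```

With a 5x5 input and p=2, the last row and column are dropped, so they never contribute to the output.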

AvgPool2D Layer

model.add_avgpool2d(pool_size=2)

Average pooling with square kernel.

Parameters:

Parameter	Default	Description
`pool_size`	`2`	Size of pooling window (p x p)

Layer Dictionary Keys:

type: "avgpool2d"
p: pool size integer

Notes:

Backpropagation distributes gradient equally across all positions in each window.

Example:

model.add_conv2d(1, 32, k=3)
model.add_avgpool2d(pool_size=2)
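To make the equal-distribution rule concrete, here is a minimal NumPy sketch of an average-pooling backward pass. It assumes non-overlapping windows and that trimmed border positions receive zero gradient; the function name and signature are illustrative and differ from the library's avgpool2d_backward.

```python
import numpy as np


def avgpool2d_backward_sketch(d_out, input_shape, p=2):
    """Spread each output gradient equally over its p x p input window."""
    n, c, h, w = input_shape
    h_out, w_out = h // p, w // p
    d_in = np.zeros(input_shape, dtype=float)
    # Every position in a window receives 1 / (p * p) of that window's gradient.
    spread = np.repeat(np.repeat(d_out, p, axis=2), p, axis=3) / (p * p)
    d_in[:, :, :h_out * p, :w_out * p] = spread
    return d_in


d_out = np.ones((1, 1, 2, 2))
d_in = avgpool2d_backward_sketch(d_out, (1, 1, 5, 5), p=2)
print(d_in[0, 0])
```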

BatchNorm Layer

model.add_batchnorm(num_features, epsilon=1e-5, momentum=0.1)

Batch normalization layer. Normalizes across the batch dimension.

Parameters:

Parameter	Default	Description
`num_features`	required	Number of features (must match flattened input dimension)
`epsilon`	`1e-5`	Small constant for numerical stability
`momentum`	`0.1`	Running statistics update momentum (0.1 means 90% old, 10% new)

Layer Dictionary Keys:

type: "batchnorm"
num_features: integer
epsilon, momentum: floats
running_mean, running_var: running statistics
gamma: scale parameter, initialized to 1
beta: shift parameter, initialized to 0

Notes:

Input is flattened to 2D, normalized, then reshaped back.
During training, uses batch statistics and updates running statistics.
During inference (training=False), uses running statistics.

Example:

model.add_dense(256, 128)
model.add_batchnorm(128)  # Must match output of previous layer
model.add_dense(128, 10, activation="softmax")
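The training and inference behaviour described in the notes can be sketched as follows. This is a simplified stand-in for the library's batch norm, assuming a 2D input and the biased batch variance; the function name is hypothetical.

```python
import numpy as np


def batchnorm_forward_sketch(x, gamma, beta, running_mean, running_var,
                             training=True, epsilon=1e-5, momentum=0.1):
    if training:
        mean, var = x.mean(axis=0), x.var(axis=0)
        # momentum=0.1 keeps 90% of the old statistic and blends in 10% of the new one
        running_mean = (1 - momentum) * running_mean + momentum * mean
        running_var = (1 - momentum) * running_var + momentum * var
    else:
        mean, var = running_mean, running_var
    x_hat = (x - mean) / np.sqrt(var + epsilon)
    return gamma * x_hat + beta, running_mean, running_var


rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=(64, 8))
out, run_mean, run_var = batchnorm_forward_sketch(
    x, np.ones(8), np.zeros(8), np.zeros(8), np.ones(8), training=True)
print(out.mean(axis=0).round(6))
```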

Dropout Layer

model.add_dropout(rate=0.5)

Randomly zeroes elements during training with probability rate.

Parameters:

Parameter	Default	Description
`rate`	`0.5`	Dropout probability (0.0 = no dropout, 1.0 = drop everything)

Layer Dictionary Keys:

type: "dropout"
rate: float
mask: cached mask during training (set during forward pass)

Notes:

Active only when training=True in Forward() or TrainBatch().
Scales surviving activations by 1/(1-rate) (inverted dropout).

Example:

model.add_dense(256, 128, activation="relu")
model.add_dropout(0.3)  # 30% dropout
model.add_dense(128, 10, activation="softmax")
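Inverted dropout, as described in the notes, can be sketched in a few lines of NumPy. The function below is a hypothetical illustration; it restricts rate to [0, 1) to avoid dividing by zero, which is a choice made for this sketch only.

```python
import numpy as np


def dropout_forward_sketch(x, rate, training, rng):
    if not 0.0 <= rate < 1.0:
        raise ValueError("this sketch requires 0 <= rate < 1")
    if not training or rate == 0.0:
        return x, None                        # inference: identity
    keep = rng.random(x.shape) >= rate        # True where the activation survives
    return x * keep / (1.0 - rate), keep      # scale survivors by 1 / (1 - rate)


rng = np.random.default_rng(0)
x = np.ones((1000, 100))
train_out, keep = dropout_forward_sketch(x, rate=0.3, training=True, rng=rng)
eval_out, _ = dropout_forward_sketch(x, rate=0.3, training=False, rng=rng)
print(round(float(train_out.mean()), 3))
```

The rescaling keeps the expected activation unchanged, which is why no extra scaling is needed at inference time.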

Forward Pass

Forward / Predict

output = model.Forward(inputs, training=False, dropout_rate=0.0)
output = model.predict(inputs)  # Alias for Forward

Runs the forward pass through all layers.

Parameters:

Parameter	Default	Description
`inputs`	required	Input array. Can be 1D `(features,)`, 2D `(batch, features)`, 3D `(channels, height, width)`, or 4D `(batch, channels, height, width)`
`training`	`False`	Whether to enable training-specific behavior (dropout, batch norm updates)
`dropout_rate`	`0.0`	Fallback dropout rate if layer doesn't specify one

Returns:

output: ndarray of shape matching the last layer's output

Side Effects:

Populates self.outputs (layer-by-layer activations)
Populates self.pre_activations (pre-activation values for dense/conv layers)
Populates self.batchnorm_cache (batch norm statistics)
Sets layer["mask"] for dropout layers during training

Example:

# Inference
pred = model.Forward(X_test, training=False)

# Training (enables dropout and batch norm updates)
pred = model.Forward(X_batch, training=True)

Backward Pass

Backward

model.Backward(targets)
model.Backward(None, output_delta=output_delta)  # For custom gradients (e.g. REINFORCE)

Computes gradients via backpropagation and stores them in self.deltas.

Parameters:

Parameter	Type	Description
`targets`	ndarray	Target values. Shape: `(batch, output_dim)` or `(output_dim,)` for single sample
`output_delta`	ndarray	Optional. Custom gradient w.r.t. output layer. If provided, `targets` is ignored.

Side Effects:

Populates self.deltas with gradients for each layer
For batch norm layers, stores d_gamma and d_beta in the layer dict

Gradient Computation:

Output layer: If softmax activation, uses delta = (output - targets) / batch_size
Otherwise: delta = (output - targets) * activation_derivative(pre_activation) / batch_size
Hidden layers: Propagates error backward through weights, then multiplies by activation derivative

Example:

model.Forward(X_batch, training=True)
model.Backward(Y_batch)
model.update()  # Apply gradients
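The softmax output rule above can be checked independently of the library. The sketch below, written in plain NumPy with names chosen for this example, compares delta = (output - targets) / batch_size against a finite-difference gradient of the mean cross-entropy with respect to the logits.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def mean_cross_entropy(z, targets):
    p = np.clip(softmax(z), 1e-12, 1.0)
    return -np.sum(targets * np.log(p)) / z.shape[0]


rng = np.random.default_rng(0)
z = rng.standard_normal((4, 3))              # logits for a batch of 4
targets = np.eye(3)[[0, 2, 1, 2]]            # one-hot labels

delta = (softmax(z) - targets) / z.shape[0]  # analytic gradient w.r.t. the logits

# Central finite differences as an independent check
numeric = np.zeros_like(z)
h = 1e-6
for idx in np.ndindex(*z.shape):
    zp, zm = z.copy(), z.copy()
    zp[idx] += h
    zm[idx] -= h
    numeric[idx] = (mean_cross_entropy(zp, targets)
                    - mean_cross_entropy(zm, targets)) / (2 * h)

max_error = np.abs(delta - numeric).max()
print(max_error)
```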

Optimizers

Update

model.update()

Applies computed gradients to update all layer parameters. No parameters, no return value.

Supported Optimizers:

Optimizer	Description	Hyperparameters Used
`"sgd"`	Stochastic Gradient Descent with momentum	`learning_rate`, `momentum`
`"rmsprop"`	RMSProp adaptive learning	`learning_rate`
`"adagrad"`	AdaGrad adaptive learning	`learning_rate`
`"adam"`	Adam (default)	`learning_rate`, `t` (timestep)

Adam Configuration (fixed):

beta1 = 0.9 (first moment decay)
beta2 = 0.999 (second moment decay)
epsilon = 1e-8 (numerical stability)
Bias correction applied: m / (1 - beta1^t), v / (1 - beta2^t)

L2 Regularization:

Applied to all dense, sparse, and conv2d weights: grad_w += l2_lambda * weights
Not applied to biases or batch norm parameters

Sparse Layer Handling:

Gradients are masked by layer["mask"] before update

Example:

model = NeuralNet(optimizer="adam", learning_rate=0.001)
# ... forward and backward ...
model.update()
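For reference, a single Adam step with the fixed constants listed above might look like the sketch below. It is not the library's code: the function name is hypothetical, and the placement of epsilon outside the square root is an assumption.

```python
import numpy as np


def adam_step_sketch(w, grad, m, v, t, lr=0.001, l2_lambda=0.0,
                     beta1=0.9, beta2=0.999, epsilon=1e-8):
    grad = grad + l2_lambda * w                 # L2 term added to the weight gradient
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + epsilon)
    return w, m, v


w = np.array([1.0, -2.0])
grad = np.array([0.5, -0.25])
w1, m1, v1 = adam_step_sketch(w, grad, m=np.zeros(2), v=np.zeros(2), t=1)
print(w1)
```

At t = 1 the bias-corrected moments equal the gradient and its square, so the first step has magnitude close to the learning rate in every coordinate regardless of the gradient scale.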

Loss Functions

ComputeLoss

loss = model.ComputeLoss(output, target, function="mse", reduction="mean", **kwargs)

Computes the loss between predictions and targets.

Parameters:

Parameter	Default	Description
`output`	required	Model predictions
`target`	required	Ground truth values
`function`	`"mse"`	Loss function name (see below)
`reduction`	`"mean"`	`"mean"`, `"sum"`, or `"none"` (returns per-element loss)
`**kwargs`		Additional arguments for specific loss functions

Available Loss Functions:

Function	Description	Extra Args	Formula
`"mse"`	Mean Squared Error	none	`(o - t)^2`
`"mae"`	Mean Absolute Error	none	`|o - t|`
`"huber"`	Huber Loss	`delta=1.0`	`0.5*diff^2` if `diff < delta`, else `delta*(diff - 0.5*delta)`
`"smooth_l1"`	Smooth L1 Loss	none	`0.5*diff^2` if `diff < 1`, else `diff - 0.5`
`"binary_cross_entropy"`	Binary Cross-Entropy	none	`-t*log(o) - (1-t)*log(1-o)`
`"cross_entropy"` / `"categorical_cross_entropy"`	Categorical Cross-Entropy	none	`-t*log(o)`
`"focal"`	Focal Loss	`alpha=0.25`, `gamma=2.0`	Down-weights easy examples
`"hinge"`	Hinge Loss (SVM)	none	`max(0, 1 - t*o)`

Notes:

For categorical cross-entropy, outputs are clipped to [1e-12, 1.0] to prevent log(0).
For binary cross-entropy, outputs are clipped to [1e-12, 1 - 1e-12].

Example:

# MSE loss
loss = model.ComputeLoss(pred, target, "mse", "mean")

# Focal loss for imbalanced classification
loss = model.ComputeLoss(pred, target, "focal", alpha=0.25, gamma=2.0)

# Per-sample losses (no reduction)
losses = model.ComputeLoss(pred, target, "cross_entropy", "none")
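To show how a per-element formula and the reduction argument interact, here is a standalone Huber loss in NumPy. It is a sketch with a hypothetical function name and assumes diff is the absolute error; it does not reproduce ComputeLoss itself.

```python
import numpy as np


def huber_loss_sketch(output, target, delta=1.0, reduction="mean"):
    diff = np.abs(output - target)
    per_element = np.where(diff < delta,
                           0.5 * diff ** 2,                # quadratic near zero
                           delta * (diff - 0.5 * delta))   # linear in the tails
    if reduction == "mean":
        return per_element.mean()
    if reduction == "sum":
        return per_element.sum()
    if reduction == "none":
        return per_element
    raise ValueError(f"unknown reduction: {reduction!r}")


pred = np.array([0.5, 3.0])
target = np.array([0.0, 1.0])
per_element = huber_loss_sketch(pred, target, reduction="none")
mean_loss = huber_loss_sketch(pred, target, reduction="mean")
print(per_element, mean_loss)
```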

Training

TrainBatch

loss = model.TrainBatch(xs, ys, loss_function=None, **loss_kwargs)

Trains on a single batch (forward + backward + update).

Parameters:

Parameter	Default	Description
`xs`	required	Input batch
`ys`	required	Target batch
`loss_function`	`None`	Loss function name. Auto-detected: `"cross_entropy"` if last layer uses `"softmax"`, else `"mse"`
`**loss_kwargs`		Extra arguments passed to `ComputeLoss`

Returns:

loss: float, the computed loss value

Example:

loss = model.TrainBatch(X_batch, Y_batch, loss_function="cross_entropy")

Train

history = model.Train(X_train, Y_train, epochs=10, batch_size=32, 
                      X_val=None, Y_val=None, loss_function=None, 
                      verbose=True, **loss_kwargs)

Full training loop with batching, optional validation, and history tracking.

Parameters:

Parameter	Default	Description
`X_train`	required	Training inputs
`Y_train`	required	Training targets
`epochs`	`10`	Number of training epochs
`batch_size`	`32`	Batch size
`X_val`	`None`	Validation inputs (optional)
`Y_val`	`None`	Validation targets (optional)
`loss_function`	`None`	Loss function name (auto-detected if None)
`verbose`	`True`	Print progress per epoch
`**loss_kwargs`		Extra arguments for `ComputeLoss`

Returns:

history: dict with keys "loss", "accuracy", "val_loss", "val_accuracy" (lists of floats)

Behavior:

Shuffles data each epoch
Computes average loss and accuracy per epoch
If validation data provided, computes validation metrics after each epoch
Accuracy: multi-class uses argmax, binary uses > 0.5 threshold

Example:

history = model.Train(
    X_train, Y_train,
    epochs=20, batch_size=64,
    X_val=X_val, Y_val=Y_val,
    loss_function="cross_entropy",
    verbose=True
)

# Plot training history
import matplotlib.pyplot as plt
plt.plot(history["loss"], label="train")
plt.plot(history["val_loss"], label="val")
plt.legend()
plt.show()

ComputeAccuracy

acc = model.compute_accuracy(predictions, targets)

Standalone accuracy computation.

Parameters:

Parameter	Description
`predictions`	Model output array
`targets`	Ground truth array

Returns:

acc: float, proportion of correct predictions (0.0 to 1.0)

Logic:

If last dimension > 1: multi-class, uses argmax
If last dimension == 1: binary, uses > 0.5 threshold

Example:

acc = model.compute_accuracy(model.Forward(X_test), Y_test)
print(f"Accuracy: {acc:.4f}")
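The two branches of the accuracy logic can be sketched as below. This assumes one-hot targets in the multi-class case and 0/1 targets in the binary case; the function name is illustrative and the library's handling of other target encodings is not covered.

```python
import numpy as np


def accuracy_sketch(predictions, targets):
    predictions = np.asarray(predictions)
    targets = np.asarray(targets)
    if predictions.shape[-1] > 1:
        # multi-class: compare predicted and true class indices
        correct = predictions.argmax(axis=-1) == targets.argmax(axis=-1)
    else:
        # binary: threshold the single output at 0.5
        correct = (predictions > 0.5) == (targets > 0.5)
    return float(correct.mean())


multi_acc = accuracy_sketch(
    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.4, 0.5, 0.1]],
    [[1, 0, 0], [0, 0, 1], [1, 0, 0]])
binary_acc = accuracy_sketch([[0.9], [0.2], [0.6]], [[1], [0], [0]])
print(multi_acc, binary_acc)
```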

Activation Functions

Activation functions are defined in Enilnets.activations. Both activate() and derivative() are available as standalone functions, but are typically used internally by the library.

Standalone Functions

from Enilnets.activations import activate, derivative

out = activate("relu", x)
grad = derivative("relu", x)

Available Activations:

Name	Function `activate(x)`	Derivative `derivative(x)`	Notes
`"relu"`	`max(0, x)`	`1` if `x > 0`, else `0`	Default for hidden layers
`"leakyrelu"`	`x` if `x > 0`, else `0.01*x`	`1` if `x > 0`, else `0.01`	Negative slope 0.01
`"elu"`	`x` if `x > 0`, else `exp(x) - 1`	`1` if `x > 0`, else `exp(x)`
`"selu"`	`scale * (x if x>0 else alpha*(exp(x)-1))`	`scale * (1 if x>0 else alpha*exp(x))`	alpha=1.673, scale=1.051
`"gelu"`	`0.5*x*(1+tanh(sqrt(2/pi)*(x+0.044715*x^3)))`	`cdf + x*pdf`	Gaussian Error Linear Unit
`"swish"`	`x * sigmoid(x)`	`s + x*s*(1-s)`, with `s = sigmoid(x)`	Self-gated
`"sigmoid"`	`1/(1+exp(-x))`	`sigmoid(x)*(1-sigmoid(x))`	Clipped to [-500, 500]
`"tanh"`	`tanh(x)`	`1 - tanh(x)^2`
`"softmax"`	`exp(x - max(x)) / sum(exp(x - max(x)))`	N/A (handled specially in backprop)	Output layer only
`"linear"`	`x`	`1`	Default if no activation specified

Usage in Layers:

model.add_dense(256, 128, activation="gelu")
model.add_dense(128, 10, activation="softmax")
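A derivative formula from the table can be sanity-checked numerically. The sketch below does this for swish using only the standard library; the helper names are made up for this example and are unrelated to Enilnets.activations.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    return x * sigmoid(x)


def swish_derivative(x):
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


def numeric_derivative(f, x, h=1e-6):
    """Central finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


points = [-3.0, -0.5, 0.0, 1.0, 4.0]
errors = [abs(swish_derivative(x) - numeric_derivative(swish, x)) for x in points]
print(max(errors))
```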

Weight Initialization

Weight initialization functions are defined in Enilnets.weight_init. Used internally by layer constructors.

Standalone Functions

from Enilnets.weight_init import init_weights, init_conv_weights

w, b = init_weights(n_in, n_out, method="xavier_uniform")
w, b = init_conv_weights(in_ch, out_ch, k, method="he_normal")

Available Methods:

Method	Dense Formula	Conv Formula	Best For
`"xavier_uniform"`	`U(-sqrt(6/(n_in+n_out)), sqrt(6/(n_in+n_out)))`	`U(-sqrt(6/(in_ch*k*k+out_ch)), ...)`	Sigmoid/tanh
`"xavier_normal"`	`N(0, sqrt(2/(n_in+n_out)))`	`N(0, sqrt(2/(in_ch*k*k+out_ch)))`	Sigmoid/tanh
`"he_uniform"`	`U(-sqrt(6/n_in), sqrt(6/n_in))`	`U(-sqrt(6/(in_ch*k*k)), ...)`	ReLU variants
`"he_normal"`	`N(0, sqrt(2/n_in))`	`N(0, sqrt(2/(in_ch*k*k)))`	ReLU variants (default for conv)
`"normal"`	`N(0, 0.1)`	`N(0, 0.1)`	General purpose
`"orthogonal"`	SVD-based orthogonal init	SVD-based, reshaped	RNNs, deep nets

Biases:

All methods initialize biases to zeros.
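Two of the dense formulas can be written out directly in NumPy. The functions below are hypothetical stand-ins for init_weights, assuming the second argument of N(0, ...) is a standard deviation and that weights use the (n_out, n_in) layout.

```python
import numpy as np


def xavier_uniform_sketch(n_in, n_out, rng):
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_out, n_in)), np.zeros(n_out)


def he_normal_sketch(n_in, n_out, rng):
    std = np.sqrt(2.0 / n_in)
    return rng.normal(0.0, std, size=(n_out, n_in)), np.zeros(n_out)


rng = np.random.default_rng(0)
w_x, b_x = xavier_uniform_sketch(784, 256, rng)
w_h, b_h = he_normal_sketch(784, 256, rng)
print(w_x.shape, round(float(w_h.std()), 4))
```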

Reinforcement Learning

Reinforce (Policy Gradient)

avg_return = model.Reinforce(states, actions, returns, action_type="discrete", std=1.0, normalize_returns=True)

Implements the REINFORCE (Monte Carlo policy gradient) algorithm. It updates the policy network along the gradient of the expected return, using the model's configured optimizer (Adam, SGD, etc.).

Parameters:

Parameter	Default	Description
`states`	required	Array of observed states, shape `(N, features)` or `(N, C, H, W)`
`actions`	required	Discrete: int array `(N,)`. Continuous: array `(N, action_dim)`
`returns`	required	Discounted returns (rewards-to-go), shape `(N,)` or `(N, 1)`
`action_type`	`"discrete"`	`"discrete"` (softmax policy) or `"continuous"` (Gaussian policy)
`std`	`1.0`	Fixed standard deviation for continuous Gaussian policy
`normalize_returns`	`True`	Normalize returns to zero mean and unit variance to reduce gradient variance

Returns:

avg_return: float, average raw return for the batch

Notes:

For discrete actions, the final layer should use "softmax" activation.
For continuous actions, the final layer should use "linear" activation.
Gradients are computed via backpropagation and applied with the model's configured optimizer.
Use compute_returns() to compute discounted returns from raw rewards.

Example:

from Enilnets import NeuralNet
from Enilnets.reinforce import compute_returns
import numpy as np

# Discrete action space (e.g., CartPole)
model = NeuralNet(learning_rate=0.001, optimizer="adam")
model.add_dense(4, 128, activation="relu")
model.add_dense(128, 2, activation="softmax")

# Dummy episode data
states = np.random.randn(100, 4)
actions = np.random.randint(0, 2, 100)
rewards = np.random.randn(100)

returns = compute_returns(rewards, gamma=0.99)
avg_ret = model.Reinforce(states, actions, returns)
print(f"Avg return: {avg_ret:.4f}")
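For a softmax policy, the gradient that REINFORCE backpropagates has a simple closed form. The sketch below computes the gradient of -(1/N) * sum(G_i * log pi(a_i | s_i)) with respect to the logits; the function name, the 1/N scaling, and the normalization constant are assumptions and may not match what Reinforce does internally.

```python
import numpy as np


def reinforce_output_delta_sketch(probs, actions, returns, normalize_returns=True):
    n, n_actions = probs.shape
    g = np.asarray(returns, dtype=float).reshape(-1)
    if normalize_returns:
        g = (g - g.mean()) / (g.std() + 1e-8)
    one_hot = np.eye(n_actions)[actions]
    # d/dz of -G * log softmax(z)[a] is G * (p - one_hot(a))
    return (probs - one_hot) * g[:, None] / n


probs = np.array([[0.7, 0.3], [0.4, 0.6], [0.5, 0.5]])
actions = np.array([0, 1, 1])
returns = np.array([1.0, 2.0, 3.0])

delta = reinforce_output_delta_sketch(probs, actions, returns)
raw_delta = reinforce_output_delta_sketch(probs, actions, returns,
                                          normalize_returns=False)
print(raw_delta)
```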

compute_returns

from Enilnets.reinforce import compute_returns
returns = compute_returns(rewards, gamma=0.99)

Computes discounted returns for a single episode.

Parameters:

Parameter	Default	Description
`rewards`	required	1-D array of step rewards
`gamma`	`0.99`	Discount factor

Returns:

returns: ndarray of discounted returns (same length as rewards)

Example:

rewards = [1.0, 0.0, 1.0, 1.0]
returns = compute_returns(rewards, gamma=0.95)
# returns: [2.759875, 1.8525, 1.95, 1.0]
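The recurrence behind discounted returns is G_t = r_t + gamma * G_{t+1}, evaluated from the last step backwards. A minimal pure-Python sketch (a hypothetical function, not the library's compute_returns) looks like this:

```python
def discounted_returns_sketch(rewards, gamma=0.99):
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each step reuses the return of the next one.
    for i in range(len(rewards) - 1, -1, -1):
        running = rewards[i] + gamma * running
        returns[i] = running
    return returns


example = discounted_returns_sketch([1.0, 1.0, 1.0], gamma=0.5)
print(example)  # [1.75, 1.5, 1.0]
```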

Evolve

best_score = model.Evolve(inputs, score_fn, noise=0.05, tries=10, sigma=1.0)

Evolutionary Strategy (ES). Perturbs network weights with Gaussian noise and keeps the best performing variant. This method was previously misnamed Reinforce.

Parameters:

Parameter	Default	Description
`inputs`	required	Input to evaluate on
`score_fn`	required	Callable that takes model output and returns a scalar score (higher = better)
`noise`	`0.05`	Standard deviation multiplier for weight perturbations
`tries`	`10`	Number of candidate networks to try
`sigma`	`1.0`	Base standard deviation for perturbations

Returns:

best_score: float, the highest score achieved

Side Effects:

Mutates self.layers to the best performing configuration
Original weights are lost unless saved beforehand

Example:

def score_fn(output):
    return np.mean(output[:, 0])  # maximize first output dimension

model.Evolve(state_input, score_fn, noise=0.1, tries=20)
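The perturb-and-keep-best idea can be illustrated on a bare weight vector. In this sketch the perturbation scale is noise * sigma and candidates are drawn around the starting weights; both are assumptions, and the function is a hypothetical stand-in rather than the Evolve implementation.

```python
import numpy as np


def evolve_sketch(weights, score_fn, noise=0.05, tries=10, sigma=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    best_w, best_score = weights, score_fn(weights)
    for _ in range(tries):
        candidate = weights + noise * sigma * rng.standard_normal(weights.shape)
        score = score_fn(candidate)
        if score > best_score:               # higher score is better
            best_w, best_score = candidate, score
    return best_w, best_score


target = np.array([1.0, -2.0, 0.5])


def score_fn(w):
    return -float(np.sum((w - target) ** 2))


rng = np.random.default_rng(0)
w = np.zeros(3)
initial_score = score_fn(w)
for _ in range(50):                          # 50 generations of 20 candidates each
    w, best_score = evolve_sketch(w, score_fn, noise=0.1, tries=20, rng=rng)
print(initial_score, best_score)
```

Because the unperturbed weights are always among the options, the score never decreases from one call to the next.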

Model I/O

Save

model.Save(file)

Saves model architecture, weights, and training state to disk.

Parameters:

Parameter	Description
`file`	File path. Extension determines format: `.pkl` for pickle, anything else for JSON

Saved Data:

Layer configurations and weights
Optimizer type and hyperparameters
Training timestep t

Example:

model.Save("model.pkl")      # Binary pickle format
model.Save("model.json")     # Human-readable JSON format

Load

model.Load(file)

Restores model from saved file.

Parameters:

Parameter	Description
`file`	File path (`.pkl` or `.json`)

Restored Data:

All layer weights and configurations
Optimizer settings
Resets opt_state (optimizer momentum buffers are cleared)

Example:

model = NeuralNet()
model.Load("model.pkl")
predictions = model.Forward(X_test)
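NumPy arrays are not JSON-serializable as-is, so a JSON save format needs a conversion step in each direction. The round trip below is only a sketch of that idea with a made-up file layout; the actual structure written by Save may differ.

```python
import json
import os
import tempfile

import numpy as np

layers = [{"type": "dense",
           "weights": np.arange(6, dtype=np.float64).reshape(2, 3),
           "bias": np.zeros(2),
           "activation": "relu"}]


def to_jsonable(layer):
    # ndarrays become nested lists; everything else is kept as-is
    return {k: (v.tolist() if isinstance(v, np.ndarray) else v)
            for k, v in layer.items()}


def from_jsonable(layer):
    # nested lists are turned back into float64 arrays
    return {k: (np.array(v, dtype=np.float64) if isinstance(v, list) else v)
            for k, v in layer.items()}


path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(path, "w") as f:
    json.dump([to_jsonable(layer) for layer in layers], f)
with open(path) as f:
    restored = [from_jsonable(layer) for layer in json.load(f)]
print(restored[0]["weights"].shape)
```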

Utility Functions

im2col

from Enilnets.forward import im2col

col = im2col(input_data, filter_h, filter_w, stride=1, pad=0)

Converts image batches to column format for efficient convolution.

Parameters:

Parameter	Default	Description
`input_data`	required	4D array `(N, C, H, W)`
`filter_h`	required	Filter height
`filter_w`	required	Filter width
`stride`	`1`	Stride
`pad`	`0`	Zero padding

Returns:

col: 2D array (N * out_h * out_w, C * filter_h * filter_w)

Example:

col = im2col(images, 3, 3, stride=1, pad=1)
# Now matrix multiplication can replace convolution
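To make the column layout concrete, here is a loop-based im2col sketch that produces the documented output shape and is then used to evaluate a convolution as one matrix product. It is written for readability rather than speed, and the function name is hypothetical.

```python
import numpy as np


def im2col_sketch(x, filter_h, filter_w, stride=1, pad=0):
    n, c, h, w = x.shape
    out_h = (h + 2 * pad - filter_h) // stride + 1
    out_w = (w + 2 * pad - filter_w) // stride + 1
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))  # zero padding
    col = np.empty((n, out_h, out_w, c, filter_h, filter_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            hs, ws = i * stride, j * stride
            col[:, i, j] = xp[:, :, hs:hs + filter_h, ws:ws + filter_w]
    return col.reshape(n * out_h * out_w, c * filter_h * filter_w)


rng = np.random.default_rng(0)
images = rng.standard_normal((2, 3, 5, 5))
kernels = rng.standard_normal((4, 3, 3, 3))     # (out_ch, in_ch, k, k)

col = im2col_sketch(images, 3, 3)               # shape (2*3*3, 3*3*3) = (18, 27)
out = (col @ kernels.reshape(4, -1).T).reshape(2, 3, 3, 4).transpose(0, 3, 1, 2)

# Direct evaluation of one output element for comparison
direct = np.sum(images[1, :, 1:4, 2:5] * kernels[2])
print(col.shape, out.shape)
```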

Complete Example: MNIST Classifier

from Enilnets import NeuralNet
import numpy as np

# Load data (pseudo-code)
# X_train: (60000, 1, 28, 28), Y_train: (60000, 10) one-hot
# X_test: (10000, 1, 28, 28), Y_test: (10000, 10)

# Build model
model = NeuralNet(learning_rate=0.001, optimizer="adam", l2_lambda=0.0001)

# Conv block 1
model.add_conv2d(1, 32, k=3, activation="relu", init_method="he_normal")
model.add_maxpool2d(2)

# Conv block 2
model.add_conv2d(32, 64, k=3, activation="relu", init_method="he_normal")
model.add_maxpool2d(2)

# Classifier
model.add_flatten()
model.add_dense(64 * 5 * 5, 256, activation="relu")  # spatial size: 28 -> 26 -> 13 -> 11 -> 5
model.add_dropout(0.5)
model.add_dense(256, 10, activation="softmax")

# Print summary
model.summary()

# Train
history = model.Train(
    X_train, Y_train,
    epochs=10, batch_size=128,
    X_val=X_test, Y_val=Y_test,
    loss_function="cross_entropy",
    verbose=True
)

# Save
model.Save("mnist_model.pkl")

# Load and predict later
model2 = NeuralNet()
model2.Load("mnist_model.pkl")
predictions = model2.Forward(X_test)

Architecture Notes

Data Format

The library uses channels-first format for convolutions:

4D input: (batch, channels, height, width)
This matches PyTorch convention, not TensorFlow's channels-last.

Backward Pass Details

The backward pass handles layer transitions automatically:

From Layer	To Layer	Error Propagation
Dense/Sparse	Dense/Sparse	`np.dot(delta, W)`, with `W` of shape `(n_out, n_in)`
Conv2D	Conv2D	`conv2d_backward_input` (full convolution with flipped weights)
Flatten	Dense	`reshape` to match output shape
MaxPool2D	Any	Routes to max positions only
AvgPool2D	Any	Distributes equally
Dropout	Any	Scales by `mask / (1 - rate)`
BatchNorm	Any	Backprop through normalization + gamma/beta gradients

Numerical Stability

Sigmoid clips inputs to [-500, 500] to prevent overflow
Cross-entropy clips probabilities to [1e-12, 1.0]
BatchNorm uses epsilon=1e-5 for division stability
All computations use float64 dtype
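Two of these safeguards are easy to demonstrate in isolation. The sketch below uses hypothetical helper names: a max-shifted softmax and a sigmoid with inputs clipped to [-500, 500], both in float64.

```python
import numpy as np


def stable_softmax(x):
    shifted = x - x.max(axis=-1, keepdims=True)   # largest logit becomes 0
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def clipped_sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -500, 500)))


logits = np.array([[1000.0, 1001.0, 1002.0]])     # a naive exp() would overflow here
probs = stable_softmax(logits)
extremes = clipped_sigmoid(np.array([-1e6, 0.0, 1e6]))
print(probs, extremes)
```

Subtracting the row maximum does not change the softmax result, since the common factor cancels between numerator and denominator.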

API Reference Summary

NeuralNet Methods

Method	Description
`__init__(learning_rate, optimizer, l2_lambda, momentum)`	Constructor
`summary()`	Print architecture
`add_dense(n_in, n_out, ...)`	Add dense layer
`add_sparse(n_in, n_out, ...)`	Add sparse layer
`add_conv2d(in_ch, out_ch, k, ...)`	Add conv layer
`add_flatten()`	Add flatten layer
`add_maxpool2d(pool_size)`	Add max pool
`add_avgpool2d(pool_size)`	Add avg pool
`add_batchnorm(num_features, ...)`	Add batch norm
`add_dropout(rate)`	Add dropout
`Forward(inputs, training, dropout_rate)`	Forward pass
`predict(inputs)`	Alias for Forward
`Backward(targets, output_delta)`	Backpropagation
`update()`	Apply gradients
`TrainBatch(xs, ys, ...)`	Train one batch
`Train(X_train, Y_train, epochs, ...)`	Full training loop
`ComputeLoss(output, target, ...)`	Compute loss
`compute_accuracy(predictions, targets)`	Compute accuracy
`Reinforce(states, actions, returns, ...)`	Policy gradient (REINFORCE)
`Evolve(inputs, score_fn, ...)`	Evolutionary strategy
`Save(file)`	Save model
`Load(file)`	Load model

Standalone Functions

Function	Module	Description
`activate(name, x)`	`Enilnets.activations`	Apply activation
`derivative(name, x)`	`Enilnets.activations`	Activation derivative
`init_weights(n_in, n_out, method)`	`Enilnets.weight_init`	Init dense weights
`init_conv_weights(in_ch, out_ch, k, method)`	`Enilnets.weight_init`	Init conv weights
`im2col(input_data, filter_h, filter_w, stride, pad)`	`Enilnets.forward`	Image to columns
`batchnorm_forward(x, layer, training)`	`Enilnets.forward`	Batch norm forward
`maxpool2d_backward(d, x, p)`	`Enilnets.backward`	Max pool backprop
`avgpool2d_backward(d, x, p)`	`Enilnets.backward`	Avg pool backprop
`batchnorm_backward(dout, cache)`	`Enilnets.backward`	Batch norm backprop
`conv2d_backward_input(d, w, shape)`	`Enilnets.backward`	Conv input gradient
`compute_returns(rewards, gamma)`	`Enilnets.reinforce`	Discounted returns for an episode

Enilnets 1.1.2
