
ForgeNN


Modern neural network framework built from scratch with NumPy

I got tired of bloated ML frameworks that hide everything behind abstractions, so I built this. It's a fully functional deep learning library that implements modern architectures (Transformers, ResNet, attention mechanisms) using just NumPy.

Why? Because sometimes you need to actually understand what's happening under the hood. And because I can.


What's in here?

The good stuff

  • Transformer encoder - yeah, the attention mechanism everyone's talking about
  • ResNet blocks - because deep networks are cool
  • Modern activations - GELU (GPT uses this), Swish (EfficientNet), Mish, and the classics
  • Smart initialization - Xavier, He, LeCun, Orthogonal (actually matters; sketched below)
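
Initialization matters because it sets the variance of activations at the start of training. Here's a minimal NumPy sketch of He and Xavier using the standard formulas - illustrative, not necessarily ForgeNN's exact code:

import numpy as np

def he_init(fan_in, fan_out):
    # He: variance 2/fan_in, keeps activations healthy with ReLU-family nonlinearities
    return np.random.randn(fan_in, fan_out) * np.sqrt(2.0 / fan_in)

def xavier_init(fan_in, fan_out):
    # Xavier/Glorot: variance 2/(fan_in + fan_out), the classic choice for tanh/sigmoid
    return np.random.randn(fan_in, fan_out) * np.sqrt(2.0 / (fan_in + fan_out))

W = he_init(784, 256)  # e.g. the first layer of an MNIST MLP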

Features I actually use

  • Multi-head self-attention
  • Layer normalization
  • Dropout (because overfitting is real)
  • Early stopping (saves time)
  • Adam optimizer (because it just works - the update rule is sketched after this list)
  • Model save/load (obviously)
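
For the curious, the Adam update itself is only a few lines. This is the textbook rule (Kingma & Ba), not necessarily ForgeNN's exact implementation:

import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its elementwise square
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    # Bias correction: m and v start at zero, so early estimates are scaled up
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    # Per-parameter step size, normalized by the gradient's running magnitude
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v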

What makes this different

No TensorFlow. No PyTorch. Just NumPy and math.

You can actually read the code and understand what's happening. Try doing that with PyTorch's C++ backend.

Setup

Install from PyPI (recommended)

pip install forgenn

Or install from source

git clone https://github.com/Cobkgukgg/forgenn.git
cd forgenn
pip install -e .

That's it. Seriously, the only runtime dependency is NumPy:

numpy>=1.19.0

Quick example

Build a network in like 10 lines:

from forgenn import NeuralNetwork, Dense, TrainingConfig
import numpy as np

# some random data
X = np.random.randn(1000, 10)
y = np.random.randint(0, 2, (1000, 1))

# build it
model = NeuralNetwork("MyFirstModel")
model.add(Dense(10, 64, activation="relu"))
model.add(Dense(64, 32, activation="gelu"))  # gelu because why not
model.add(Dense(32, 1, activation="sigmoid"))

# train it
model.compile(loss="binary_crossentropy", optimizer="adam")
model.fit(X, y, TrainingConfig(epochs=100, batch_size=32))

# use it
predictions = model.predict(X[:5])

Or use pre-built stuff

I already made some common architectures:

from forgenn import Architectures

# ResNet for when you need to go deep
model = Architectures.resnet(
    input_dim=784,
    num_blocks=3,
    hidden_dim=128,
    output_dim=10
)

# Transformer because transformers are everywhere now
model = Architectures.transformer_encoder(
    input_dim=512,
    num_heads=8,
    ff_dim=2048,
    num_layers=6
)

Config stuff

You can tweak things:

from forgenn import TrainingConfig

config = TrainingConfig(
    learning_rate=0.001,      # standard
    batch_size=64,            # bigger = faster but needs more RAM
    epochs=200,               # or until early stopping kicks in
    dropout_rate=0.3,         # helps with overfitting
    early_stopping=True,      # stop when val loss stops improving
    patience=15,              # how long to wait
    validation_split=0.2      # use 20% for validation
)

history = model.fit(X_train, y_train, config)
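
What early_stopping and patience do under the hood is the usual rule: track the best validation loss and stop once it hasn't improved for `patience` epochs. A minimal sketch of that logic, with a hypothetical train_one_epoch helper - illustrative, not the library's actual loop:

best_loss = float("inf")
stale_epochs = 0

for epoch in range(config.epochs):
    val_loss = train_one_epoch()  # hypothetical: one pass over the data, returns validation loss
    if val_loss < best_loss:
        best_loss = val_loss      # new best: reset the counter
        stale_epochs = 0
    else:
        stale_epochs += 1
        if stale_epochs >= config.patience:
            break                 # no improvement for `patience` epochs: stop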

How it works

Layers you can use

Dense(input_size, output_size, 
      activation="relu",
      dropout_rate=0.0)

MultiHeadAttention(embed_dim, num_heads, dropout=0.1)

LayerNormalization(normalized_shape)

Conv2D(in_channels, out_channels, 
       kernel_size=3, stride=1, padding=0)

ResidualBlock(dim, activation="relu")
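
Under the hood, MultiHeadAttention comes down to scaled dot-product attention. Here's a single-head NumPy sketch of the core computation - illustrative only; the real layer adds the Q/K/V projections, head splitting, and dropout:

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (seq_len, d_k), V: (seq_len, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarity, scaled
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # attention-weighted sum of values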

Activations

Activation   When to use
relu         Default choice; works most of the time
gelu         Transformers (GPT and BERT use it)
swish        Good for mobile/efficient networks
mish         Newer; often a bit better than ReLU in practice
leaky_relu   When you get dead neurons

Loss functions

  • mse - regression
  • mae - regression (robust to outliers)
  • binary_crossentropy - binary classification
  • categorical_crossentropy - multi-class
  • huber - regression with outliers
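
These are the standard definitions. For instance, binary cross-entropy and Huber look roughly like this in NumPy - a sketch, not necessarily the library's exact code:

import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1 - eps)  # keep log() away from 0
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def huber(y_true, y_pred, delta=1.0):
    err = y_true - y_pred
    # Quadratic near zero, linear in the tails -> less sensitive to outliers than MSE
    quad = 0.5 * err**2
    lin = delta * (np.abs(err) - 0.5 * delta)
    return np.mean(np.where(np.abs(err) <= delta, quad, lin))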

Model methods

# Add layer
model.add(layer)

# Compile
model.compile(loss="mse", optimizer="adam")

# Train
model.fit(X_train, y_train, config=config)

# Predict
predictions = model.predict(X_test)

# Evaluate
results = model.evaluate(X_test, y_test)

# Save/Load
model.save("model.pkl")
model.load("model.pkl")

# Summary
model.summary()

Examples

MNIST-style classification

import numpy as np
from forgenn import Architectures, TrainingConfig

# flatten those images (assumes train_images / train_labels are already loaded)
X_train = train_images.reshape(-1, 784) / 255.0
y_train = np.eye(10)[train_labels]

# build something that works
model = Architectures.mlp(
    input_dim=784,
    hidden_dims=[256, 128],
    output_dim=10,
    activation="gelu"
)

model.compile(loss="categorical_crossentropy", optimizer="adam")

config = TrainingConfig(
    learning_rate=0.001,
    batch_size=128,
    epochs=50,
    early_stopping=True
)

model.fit(X_train, y_train, config)

# check how we did (on a held-out X_test / y_test)
results = model.evaluate(X_test, y_test)
print(f"Accuracy: {results['accuracy']:.4f}")

Custom architecture

Mix and match whatever you want:

from forgenn import NeuralNetwork, Dense, ResidualBlock, LayerNormalization

model = NeuralNetwork("MyCustomNet")

model.add(Dense(100, 256, activation="gelu", dropout_rate=0.3))
model.add(LayerNormalization(256))

# throw in some residual blocks
for _ in range(3):
    model.add(ResidualBlock(256, activation="mish"))

model.add(Dense(256, 10, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="adam")
model.summary()

Some notes

Why GELU?

Used in GPT and BERT. Smoother than ReLU, works better for NLP stuff. The math is kinda cool:

GELU(x) = 0.5 * x * (1 + tanh(sqrt(2/π) * (x + 0.044715 * x³)))
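
As NumPy code, that approximation is a one-liner:

import numpy as np

def gelu(x):
    # tanh approximation of GELU - the same formula as above
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))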

Residual connections

The thing that made deep networks actually work:

output = F(x) + x

The identity path lets gradients flow backward more easily, which avoids vanishing gradients. Without it, training very deep networks is a pain.
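
In code, that's just adding the block's input back to its output. A minimal NumPy sketch - illustrative; the shapes of F(x) and x have to match for the addition to work:

import numpy as np

def residual_block(x, W1, b1, W2, b2):
    # F(x): two dense layers with a ReLU in between
    h = np.maximum(0, x @ W1 + b1)
    return (h @ W2 + b2) + x  # the skip connection: an identity path for gradients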

Performance

Tested on my laptop (i7, 16GB RAM, no GPU):

Dataset     Model    Accuracy   Time
MNIST       MLP      ~98%       2 min
MNIST       ResNet   ~99%       4 min
CIFAR-10    ResNet   ~75%       15 min

Not bad for pure Python/NumPy.

Todo

Things I might add:

  • Batch normalization
  • LSTM/GRU layers
  • Better conv layers
  • GPU support (CuPy?)
  • Model visualization
  • Data loaders
  • More optimizers

Pull requests welcome.

Contributing

Found a bug? Want to add something? PRs are open.

Just keep it clean and add tests.

License

MIT - do whatever you want with it


Made this because I was bored and wanted to actually understand how transformers work. Turned out pretty decent.

If you use this for something cool, let me know!
