Neural-Matter Network (NMN) - Advanced neural network layers with attention mechanisms
⚛️ NMN – Neural Matter Networks
Not the neurons we want, but the neurons we need
Activation-free neural layers that learn non-linearity through geometric operations
📖 Documentation · 📄 Read the Paper · 📝 Read the Blog · 🐛 Report Bug · 🌐 Azetta.ai
🎯 TL;DR
NMN replaces traditional Linear + ReLU with a single geometric operation that learns non-linearity without activation functions:
```python
# Traditional approach
y = relu(linear(x))  # dot product → activation

# NMN approach
y = yat(x)  # geometric operation with built-in non-linearity
```
The Yat-Product (ⵟ) balances similarity and distance to create inherently non-linear transformations; no activations needed.
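In PyTorch terms, a minimal side-by-side sketch (it reuses the same `YatNMN` layer shown in Quick Start below):

```python
import torch
import torch.nn as nn
from nmn.torch import YatNMN

x = torch.randn(32, 128)

# Traditional: linear map, then an external activation
y_old = torch.relu(nn.Linear(128, 64)(x))

# NMN: one geometric operation, non-linearity built in
y_new = YatNMN(in_features=128, out_features=64)(x)

print(y_old.shape, y_new.shape)  # both (32, 64)
```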
✨ Key Features
| Feature | Description |
|---|---|
| 🔥 Activation-Free | Learn complex non-linear relationships without ReLU, sigmoid, or tanh |
| 🔄 Multi-Framework | PyTorch, TensorFlow, Keras, Flax (Linen & NNX) |
| 🧮 Geometric Foundation | Based on a distance-similarity tradeoff, not just correlations |
| ✅ Full Framework Parity | Dense, Conv, ConvTranspose, Attention, Embedding, and Squashers across all 5 frameworks |
| 🧠 Complete Layer Suite | Dense, Conv1D/2D/3D, ConvTranspose1D/2D/3D, Multi-Head Attention, Embeddings |
| ⚡ Production Ready | Comprehensive tests, CI/CD, high code coverage |
📐 The Mathematics
Yat-Product (ⵟ)
The core operation that powers NMN:
$$ ⵟ(\mathbf{w}, \mathbf{x}) = \frac{\langle \mathbf{w}, \mathbf{x} \rangle^2}{\|\mathbf{w} - \mathbf{x}\|^2 + \epsilon} $$
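As a sanity check, the formula translates directly into NumPy (a reference sketch of the math only, not the library's implementation):

```python
import numpy as np

def yat_product(w, x, eps=1e-5):
    """Yat-Product: squared dot product over squared Euclidean distance."""
    return np.dot(w, x) ** 2 / (np.sum((w - x) ** 2) + eps)

w = np.array([1.0, 2.0, 3.0])
print(yat_product(w, np.array([1.1, 1.9, 3.2])))    # aligned AND close: large response
print(yat_product(w, np.array([-1.0, -2.0, -3.0]))) # anti-aligned but cos^2 high: distance tames it
```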
Geometric Interpretation
Rewriting in terms of norms and angles:
$$ ⵟ(\mathbf{w}, \mathbf{x}) = \frac{\|\mathbf{w}\|^2 \|\mathbf{x}\|^2 \cos^2\theta}{\|\mathbf{w}\|^2 - 2\langle\mathbf{w}, \mathbf{x}\rangle + \|\mathbf{x}\|^2 + \epsilon} $$
Output is maximized when:
- ✅ Vectors are aligned (small θ → large cos²θ)
- ✅ Vectors are close (small Euclidean distance)
- ✅ Vectors have large magnitude (amplifies the signal)
This creates a fundamentally different learning dynamic:
| Traditional Neuron | Yat Neuron |
|---|---|
| Measures correlation only | Balances similarity AND proximity |
| Requires activation for non-linearity | Non-linearity is intrinsic |
| Can fire for distant but aligned vectors | Penalizes distance between w and x |
Yat-Convolution (ⵟ*)
The same principle applied to local patches:
$$ ⵟ^*(\mathbf{W}, \mathbf{X}) = \frac{\left(\sum_{i,j} w_{ij} \cdot x_{ij}\right)^2}{\sum_{i,j}(w_{ij} - x_{ij})^2 + \epsilon} $$
Where W is the kernel and X is the input patch.
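The per-patch response translates just as directly (same caveat: a NumPy sketch of the math, not the library's convolution):

```python
import numpy as np

def yat_conv_patch(W, X, eps=1e-5):
    """ⵟ* for one kernel/patch pair: squared patch dot product over squared distance."""
    s = np.sum(W * X)          # sum_ij w_ij * x_ij
    d = np.sum((W - X) ** 2)   # sum_ij (w_ij - x_ij)^2
    return s ** 2 / (d + eps)

W = np.random.randn(3, 3)      # 3x3 kernel
X = np.random.randn(3, 3)      # matching input patch
print(yat_conv_patch(W, X))
```

A full convolution slides this computation across every patch and output channel.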
🚀 Quick Start
Installation
```bash
pip install nmn

# Framework-specific installations
pip install "nmn[torch]"   # PyTorch
pip install "nmn[keras]"   # Keras/TensorFlow
pip install "nmn[nnx]"     # Flax NNX (JAX)
pip install "nmn[linen]"   # Flax Linen (JAX)
pip install "nmn[all]"     # Everything
```
Basic Usage
**PyTorch**

```python
import torch
from nmn.torch import YatNMN

layer = YatNMN(
    in_features=128,
    out_features=64,
    epsilon=1e-5,
)
x = torch.randn(32, 128)
y = layer(x)  # (32, 64), non-linear output!
```

**Keras**

```python
import keras
from nmn.keras import YatNMN

layer = YatNMN(
    features=64,
    epsilon=1e-5,
)
x = keras.ops.zeros((32, 128))
y = layer(x)  # (32, 64)
```

**Flax NNX**

```python
import jax.numpy as jnp
from flax import nnx
from nmn.nnx import YatNMN

layer = YatNMN(
    in_features=128,
    out_features=64,
    rngs=nnx.Rngs(0),
)
x = jnp.zeros((32, 128))
y = layer(x)  # (32, 64)
```

**TensorFlow**

```python
import tensorflow as tf
from nmn.tf import YatNMN

layer = YatNMN(features=64)
x = tf.zeros((32, 128))
y = layer(x)  # (32, 64)
```
📦 Layer Support Matrix
All layers are available across all 5 frameworks with verified numerical equivalence.
| Layer | PyTorch | TensorFlow | Keras | Flax NNX | Flax Linen |
|---|---|---|---|---|---|
| YatNMN (Dense) | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConv1D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConv2D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConv3D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConvTranspose1D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConvTranspose2D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConvTranspose3D | ✅ | ✅ | ✅ | ✅ | ✅ |
| MultiHeadAttention | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatEmbed | ✅ | ✅ | ✅ | ✅ | ✅ |
| Squashers | ✅ | ✅ | ✅ | ✅ | ✅ |
Advanced Attention Variants (Flax NNX)
| Variant | Description | Complexity |
|---|---|---|
| RotaryYatAttention | YAT + Rotary Position Embeddings (RoPE) | O(n²) |
| Spherical YAT-Performer | YAT + FAVOR+ random features | O(n) |
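For orientation, the Performer trick in its generic form: a kernel feature map φ lets attention factorize as φ(Q)(φ(K)ᵀV), avoiding the n×n score matrix entirely. A minimal sketch using standard FAVOR+-style positive random features (the generic mechanism only, not NMN's spherical-YAT kernel):

```python
import numpy as np

def positive_random_features(x, W):
    # FAVOR+ positive features: phi(x) = exp(W x - |x|^2 / 2) / sqrt(m)
    proj = x @ W.T
    return np.exp(proj - (x ** 2).sum(-1, keepdims=True) / 2) / np.sqrt(W.shape[0])

def linear_attention(Q, K, V, num_features=256, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((num_features, Q.shape[-1]))
    Qf = positive_random_features(Q, W)  # (n, m)
    Kf = positive_random_features(K, W)  # (n, m)
    kv = Kf.T @ V                        # (m, d_v): keys/values aggregated once
    z = Qf @ Kf.sum(axis=0)              # (n,): softmax normalizer estimate
    return (Qf @ kv) / z[:, None]        # O(n*m*d) instead of O(n^2 * d)

n, d = 1024, 64
rng = np.random.default_rng(1)
Q, K, V = (rng.standard_normal((n, d)) / d ** 0.25 for _ in range(3))
out = linear_attention(Q, K, V)          # (1024, 64), approximates softmax attention
```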
🔬 Cross-Framework Consistency
All implementations are verified to produce numerically equivalent outputs given identical inputs and weights:
| Framework Pair | Max Error | Status |
|---|---|---|
| PyTorch ↔ TensorFlow | < 1e-6 | ✅ PASS |
| PyTorch ↔ Keras | < 1e-6 | ✅ PASS |
| PyTorch ↔ Flax NNX | < 1e-6 | ✅ PASS |
| PyTorch ↔ Flax Linen | < 1e-6 | ✅ PASS |
| TensorFlow ↔ Keras | < 1e-7 | ✅ PASS |
| Flax NNX ↔ Flax Linen | < 1e-7 | ✅ PASS |
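The check itself amounts to evaluating the same formula with identical weights in each backend. A minimal sketch of the idea, comparing a NumPy reference against PyTorch tensor ops (the real harness is tests/integration/test_cross_framework_consistency.py):

```python
import numpy as np
import torch

def yat_np(w, x, eps=1e-5):
    dots = x @ w.T                                           # (batch, out) dot products
    dists = ((x[:, None, :] - w[None, :, :]) ** 2).sum(-1)   # squared distances
    return dots ** 2 / (dists + eps)

def yat_torch(w, x, eps=1e-5):
    return (x @ w.T) ** 2 / (torch.cdist(x, w) ** 2 + eps)

w = np.random.randn(64, 128).astype(np.float32)   # shared "weights"
x = np.random.randn(32, 128).astype(np.float32)   # shared inputs

diff = yat_np(w, x) - yat_torch(torch.from_numpy(w), torch.from_numpy(x)).numpy()
print(np.abs(diff).max())   # tiny: float32 round-off only
```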
⚙️ Advanced Features
Attention Mechanisms
```python
# PyTorch
from nmn.torch import MultiHeadYatAttention

attn = MultiHeadYatAttention(embed_dim=512, num_heads=8)
output = attn(query, key, value)
```

```python
# Flax NNX: with Rotary Position Embeddings
from flax import nnx
from nmn.nnx import RotaryYatAttention

attn = RotaryYatAttention(
    num_heads=8,
    in_features=512,
    rngs=nnx.Rngs(0),
)
output = attn(x)

# Flax NNX: Spherical YAT-Performer (O(n) linear complexity)
from nmn.nnx import MultiHeadAttention

attn = MultiHeadAttention(
    num_heads=8,
    in_features=512,
    use_performer=True,
    rngs=nnx.Rngs(0),
)
output = attn(x)
```
Embeddings
```python
# PyTorch
from nmn.torch import YatEmbed

embed = YatEmbed(num_embeddings=10000, embedding_dim=128)
output = embed(token_ids)
```

```python
# Flax NNX
from flax import nnx
from nmn.nnx import Embed

embed = Embed(
    num_embeddings=10000,
    features=128,
    constant_alpha=True,
    rngs=nnx.Rngs(0),
)
output = embed(token_ids)

# YAT attend for attention-based retrieval
scores = embed.attend(query)
```
Squashing Functions
Alternatives to standard activation functions, available in all frameworks:
```python
from nmn.nnx import softermax, softer_sigmoid, soft_tanh

y1 = softermax(x, n=2)               # smoother softmax with power n
y2 = softer_sigmoid(x, sharpness=1)  # smooth sigmoid variant
y3 = soft_tanh(x)                    # smooth tanh variant
```
See EXAMPLES.md for comprehensive usage guides including:
- Framework-specific quick starts (PyTorch, Keras, TensorFlow, Flax)
- Architecture examples (CNN, Transformer)
- Advanced features (custom squashers, attention)
Quick run:
```bash
# PyTorch examples
python src/nmn/torch/examples/quick_example.py             # quick demo
python src/nmn/torch/examples/vision/resnet_training.py    # ResNet training

# Flax NNX examples
python src/nmn/nnx/examples/vision/aether_resnet50_tpu.py  # ResNet50 on TPU
python src/nmn/nnx/examples/language/m3za.py               # MiniBERT pre-training
python src/nmn/nnx/examples/language/m3za_perf.py          # performance evaluation
```
🧪 Testing
Comprehensive test suite with cross-framework validation:
```bash
# Install test dependencies
pip install "nmn[test]"

# Run all tests
pytest tests/ -v

# Run specific framework tests
pytest tests/test_torch/ -v   # PyTorch
pytest tests/test_keras/ -v   # Keras
pytest tests/test_nnx/ -v     # Flax NNX

# Cross-framework consistency validation
pytest tests/integration/test_cross_framework_consistency.py -v

# With coverage report
pytest tests/ --cov=nmn --cov-report=html
```
📚 Theoretical Foundation
Based on the research papers:
- Deep Learning 2.0: Artificial Neurons that Matter – Reject Correlation, Embrace Orthogonality
- Deep Learning 2.1: Mind and Cosmos – Towards Cosmos-Inspired Interpretable Neural Networks
Why Yat-Product?
Traditional neurons compute: $y = \sigma(\mathbf{w}^\top \mathbf{x} + b)$
This has limitations:
- Correlation-based: Only measures alignment, ignores proximity
- Requires activation: Non-linearity is external
- Spurious activations: Can fire strongly for distant but aligned vectors
The Yat-Product addresses these by combining:
- Squared dot product (similarity) in the numerator
- Squared distance (proximity) in the denominator
- Epsilon for numerical stability
The result is a neuron that responds geometrically: it is activated when inputs are both similar AND close to the weights.
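A quick numeric illustration of the "spurious activation" point (plain NumPy, not library code): the linear score prefers the distant-but-aligned vector 100×, while the Yat-Product prefers the nearby one by several orders of magnitude.

```python
import numpy as np

def yat_product(w, x, eps=1e-5):
    return np.dot(w, x) ** 2 / (np.sum((w - x) ** 2) + eps)

w    = np.array([1.0, 1.0])
near = np.array([1.0, 1.0])      # aligned and close to w
far  = np.array([100.0, 100.0])  # aligned but far from w

print(np.dot(w, near), np.dot(w, far))            # 2.0 vs 200.0: linear neuron prefers `far`
print(yat_product(w, near), yat_product(w, far))  # ~4e5 vs ~2.04: Yat neuron prefers `near`
```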
🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
```bash
# Development setup
git clone https://github.com/azettaai/nmn.git
cd nmn
pip install -e ".[dev,test]"

# Run tests
pytest tests/ -v

# Format code
black src/ tests/
isort src/ tests/
```
Areas for contribution:
- 🐛 Bug fixes (open issues)
- ✨ New layer types (normalization, graph, etc.)
- 📚 Documentation and tutorials
- ⚡ Performance optimizations
- 🎨 Example applications
📋 Quick API Reference
Common Parameters
| Parameter | Type | Description |
|---|---|---|
| `in_features` | int | Input dimension (Dense) or channels (Conv) |
| `out_features` | int | Output dimension or filters |
| `kernel_size` | int \| tuple | Convolution kernel size |
| `epsilon` | float | Numerical stability constant (default: 1e-5) |
| `use_bias` | bool | Include bias term (default: True) |
| `constant_alpha` | bool | Use fixed √2 scaling (default: varies) |
| `spherical` | bool | Enable spherical mode (default: False) |
Framework Imports
```python
# PyTorch
from nmn.torch import YatNMN, YatConv2D, MultiHeadYatAttention, YatEmbed
from nmn.torch import softermax, softer_sigmoid, soft_tanh

# Keras
from nmn.keras import YatNMN, YatConv2D, MultiHeadYatAttention, YatEmbed
from nmn.keras import softermax, softer_sigmoid, soft_tanh

# TensorFlow
from nmn.tf import YatNMN, YatConv2D, MultiHeadYatAttention, YatEmbed
from nmn.tf import softermax, softer_sigmoid, soft_tanh

# Flax NNX (includes advanced attention variants)
from nmn.nnx import YatNMN, YatConv, MultiHeadAttention, Embed
from nmn.nnx import RotaryYatAttention, softermax

# Flax Linen
from nmn.linen import YatNMN, YatConv2D, MultiHeadAttention, YatEmbed
from nmn.linen import softermax, softer_sigmoid, soft_tanh
```
📖 Full reference → EXAMPLES.md
📄 Citation
If you use NMN in your research, please cite:
```bibtex
@software{nmn2024,
  author = {Bouhsine, Taha},
  title  = {NMN: Neural Matter Networks},
  year   = {2024},
  url    = {https://github.com/azettaai/nmn}
}

@article{bouhsine2024dl2,
  author = {Bouhsine, Taha},
  title  = {Deep Learning 2.0: Artificial Neurons that Matter --- Reject Correlation, Embrace Orthogonality},
  year   = {2024}
}
```
💬 Support & Community
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
- 🌐 Company: azetta.ai
- 📧 Contact: taha@azetta.ai
📜 License
AGPL-3.0: free for personal, academic, and commercial use with attribution.
If you modify and deploy on a network, you must share the source code.
For alternative licensing, contact us at taha@azetta.ai.
🙏 Acknowledgments
This project was originally developed under the mlnomadpy organization and is now maintained by Azetta.ai.
The foundations of NMN were established through extensive research and community contributions. We're grateful to everyone who has contributed code, feedback, and ideas to make this project better.
Built with ❤️ by Azetta.ai · Originally created by ML Nomad