
Neural-Matter Network (NMN) - Advanced neural network layers with attention mechanisms

Project description

NMN Logo

โš›๏ธ NMN โ€” Neural Matter Networks

Not the neurons we want, but the neurons we need

Activation-free neural layers that learn non-linearity through geometric operations


📚 Documentation · 📄 Read the Paper · 📝 Read the Blog · 🐛 Report Bug · 🌐 Azetta.ai


🎯 TL;DR

NMN replaces traditional Linear + ReLU with a single geometric operation that learns non-linearity without activation functions:

# Traditional approach
y = relu(linear(x))  # dot product → activation

# NMN approach
y = yat(x)  # geometric operation with built-in non-linearity

The Yat-Product (ⵟ) balances similarity and distance to create inherently non-linear transformations; no activations needed.
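As a plain-Python sketch (a reference illustration, not the library's optimized implementation), the Yat-Product of a weight vector and an input vector looks like this:

```python
def yat(w, x, eps=1e-5):
    """Yat-Product: squared dot product over squared Euclidean distance."""
    dot = sum(wi * xi for wi, xi in zip(w, x))              # similarity term
    dist_sq = sum((wi - xi) ** 2 for wi, xi in zip(w, x))   # proximity term
    return dot * dot / (dist_sq + eps)

print(yat([1.0, 0.0], [1.0, 0.0]))  # identical vectors: distance 0, output ~1/eps
print(yat([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors: output 0.0
```

The output is large only when the input is both aligned with and close to the weights, which is where the built-in non-linearity comes from.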


✨ Key Features

| Feature | Description |
| --- | --- |
| 🔥 Activation-Free | Learn complex non-linear relationships without ReLU, sigmoid, or tanh |
| 🌐 Multi-Framework | PyTorch, TensorFlow, Keras, Flax (Linen & NNX) |
| 🧮 Geometric Foundation | Based on a distance-similarity tradeoff, not just correlations |
| ✅ Full Framework Parity | Dense, Conv, ConvTranspose, Attention, Embedding, and Squashers across all 5 frameworks |
| 🧠 Complete Layer Suite | Dense, Conv1D/2D/3D, ConvTranspose1D/2D/3D, Multi-Head Attention, Embeddings |
| ⚡ Production Ready | Comprehensive tests, CI/CD, high code coverage |

๐Ÿ“ The Mathematics

Yat-Product (โตŸ)

The core operation that powers NMN:

$$ โตŸ(\mathbf{w}, \mathbf{x}) = \frac{\langle \mathbf{w}, \mathbf{x} \rangle^2}{|\mathbf{w} - \mathbf{x}|^2 + \epsilon} $$

๐Ÿ” Geometric Interpretation (click to expand)

Rewriting in terms of norms and angles:

$$ ⵟ(\mathbf{w}, \mathbf{x}) = \frac{\|\mathbf{w}\|^2 \|\mathbf{x}\|^2 \cos^2\theta}{\|\mathbf{w}\|^2 - 2\langle\mathbf{w}, \mathbf{x}\rangle + \|\mathbf{x}\|^2 + \epsilon} $$

Output is maximized when:

  • ✅ Vectors are aligned (small θ → large cos²θ)
  • ✅ Vectors are close (small Euclidean distance)
  • ✅ Vectors have large magnitude (amplifies the signal)

This creates a fundamentally different learning dynamic:

| Traditional Neuron | Yat Neuron |
| --- | --- |
| Measures correlation only | Balances similarity AND proximity |
| Requires activation for non-linearity | Non-linearity is intrinsic |
| Can fire for distant but aligned vectors | Penalizes distance between w and x |
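The last row is easy to check numerically (an illustrative pure-Python sketch, not the library code):

```python
def yat(w, x, eps=1e-5):
    dot = sum(wi * xi for wi, xi in zip(w, x))
    dist_sq = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return dot * dot / (dist_sq + eps)

w = [1.0, 0.0]
near = [1.0, 0.0]    # aligned and close to w
far  = [10.0, 0.0]   # perfectly aligned but far from w

dot_near = sum(a * b for a, b in zip(w, near))  # 1.0
dot_far  = sum(a * b for a, b in zip(w, far))   # 10.0 -> a linear neuron fires *harder*
print(yat(w, near))  # ~1e5 (distance ~0 dominates)
print(yat(w, far))   # 100 / (81 + eps) ~ 1.23 (distance penalized)
```

A correlation-based neuron rewards the distant vector; the Yat neuron suppresses it.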

Yat-Convolution (ⵟ*)

The same principle applied to local patches:

$$ ⵟ^*(\mathbf{W}, \mathbf{X}) = \frac{\left(\sum_{i,j} w_{ij} \cdot x_{ij}\right)^2}{\sum_{i,j}(w_{ij} - x_{ij})^2 + \epsilon} $$

Where W is the kernel and X is the input patch.
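For a single patch, the formula reduces to scalar arithmetic over the kernel entries (a minimal sketch, ignoring striding and padding):

```python
def yat_conv_patch(W, X, eps=1e-5):
    """Yat-Convolution response of kernel W on one same-sized patch X."""
    s = sum(w * x for wrow, xrow in zip(W, X) for w, x in zip(wrow, xrow))
    d = sum((w - x) ** 2 for wrow, xrow in zip(W, X) for w, x in zip(wrow, xrow))
    return s * s / (d + eps)

kernel = [[1.0, 0.0], [0.0, 1.0]]
match  = [[1.0, 0.0], [0.0, 1.0]]   # patch identical to the kernel
other  = [[0.0, 1.0], [1.0, 0.0]]   # non-overlapping pattern

print(yat_conv_patch(kernel, match))  # large: exact template match
print(yat_conv_patch(kernel, other))  # 0.0: zero overlap in the numerator
```

Sliding this over an image yields a feature map that peaks where the local patch is both correlated with and close to the kernel.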


🚀 Quick Start

Installation

pip install nmn

# Framework-specific installations
pip install "nmn[torch]"    # PyTorch
pip install "nmn[keras]"    # Keras/TensorFlow
pip install "nmn[nnx]"      # Flax NNX (JAX)
pip install "nmn[linen]"    # Flax Linen (JAX)
pip install "nmn[all]"      # Everything

Basic Usage

PyTorch

import torch
from nmn.torch import YatNMN

layer = YatNMN(
    in_features=128,
    out_features=64,
    epsilon=1e-5
)

x = torch.randn(32, 128)
y = layer(x)  # (32, 64), non-linear output!

Keras

import keras
from nmn.keras import YatNMN

layer = YatNMN(
    features=64,
    epsilon=1e-5
)

x = keras.ops.zeros((32, 128))
y = layer(x)  # (32, 64)

Flax NNX

import jax.numpy as jnp
from flax import nnx
from nmn.nnx import YatNMN

layer = YatNMN(
    in_features=128,
    out_features=64,
    rngs=nnx.Rngs(0)
)

x = jnp.zeros((32, 128))
y = layer(x)  # (32, 64)

TensorFlow

import tensorflow as tf
from nmn.tf import YatNMN

layer = YatNMN(features=64)

x = tf.zeros((32, 128))
y = layer(x)  # (32, 64)

📦 Layer Support Matrix

All layers are available across all 5 frameworks with verified numerical equivalence.

| Layer | PyTorch | TensorFlow | Keras | Flax NNX | Flax Linen |
| --- | --- | --- | --- | --- | --- |
| YatNMN (Dense) | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConv1D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConv2D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConv3D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConvTranspose1D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConvTranspose2D | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatConvTranspose3D | ✅ | ✅ | ✅ | ✅ | ✅ |
| MultiHeadAttention | ✅ | ✅ | ✅ | ✅ | ✅ |
| YatEmbed | ✅ | ✅ | ✅ | ✅ | ✅ |
| Squashers | ✅ | ✅ | ✅ | ✅ | ✅ |

Advanced Attention Variants (Flax NNX)

| Variant | Description | Complexity |
| --- | --- | --- |
| RotaryYatAttention | YAT + Rotary Position Embeddings (RoPE) | O(n²) |
| Spherical YAT-Performer | YAT + FAVOR+ random features | O(n) |

🔬 Cross-Framework Consistency

All implementations are verified to produce numerically equivalent outputs given identical inputs and weights:

| Framework Pair | Max Error | Status |
| --- | --- | --- |
| PyTorch ↔ TensorFlow | < 1e-6 | ✅ PASS |
| PyTorch ↔ Keras | < 1e-6 | ✅ PASS |
| PyTorch ↔ Flax NNX | < 1e-6 | ✅ PASS |
| PyTorch ↔ Flax Linen | < 1e-6 | ✅ PASS |
| TensorFlow ↔ Keras | < 1e-7 | ✅ PASS |
| Flax NNX ↔ Flax Linen | < 1e-7 | ✅ PASS |

โš™๏ธ Advanced Features

Attention Mechanisms

# PyTorch
from nmn.torch import MultiHeadYatAttention

attn = MultiHeadYatAttention(embed_dim=512, num_heads=8)
output = attn(query, key, value)

# Flax NNX: with Rotary Position Embeddings
from nmn.nnx import RotaryYatAttention
from flax import nnx

attn = RotaryYatAttention(
    num_heads=8,
    in_features=512,
    rngs=nnx.Rngs(0)
)
output = attn(x)

# Flax NNX: Spherical YAT-Performer (O(n) linear complexity)
from nmn.nnx import MultiHeadAttention

attn = MultiHeadAttention(
    num_heads=8,
    in_features=512,
    use_performer=True,
    rngs=nnx.Rngs(0)
)
output = attn(x)

Embeddings

# PyTorch
from nmn.torch import YatEmbed

embed = YatEmbed(num_embeddings=10000, embedding_dim=128)
output = embed(token_ids)

# Flax NNX
from nmn.nnx import Embed
from flax import nnx

embed = Embed(
    num_embeddings=10000,
    features=128,
    constant_alpha=True,
    rngs=nnx.Rngs(0)
)
output = embed(token_ids)
# YAT attend for attention-based retrieval
scores = embed.attend(query)
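Conceptually (an illustrative sketch, not the library's implementation), attend scores each embedding row against the query with the Yat-Product, so retrieval favors rows that are both aligned with and close to the query:

```python
def yat(w, x, eps=1e-5):
    dot = sum(wi * xi for wi, xi in zip(w, x))
    dist_sq = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return dot * dot / (dist_sq + eps)

# Toy embedding table: 3 embeddings of dimension 2
table = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

def attend_sketch(query):
    # One Yat score per embedding row (higher = more similar AND closer)
    return [yat(row, query) for row in table]

print(attend_sketch([1.0, 0.0]))  # first row dominates
```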

Squashing Functions

Alternatives to standard activation functions, available in all frameworks:

from nmn.nnx import softermax, softer_sigmoid, soft_tanh

y1 = softermax(x, n=2)              # Smoother softmax with power n
y2 = softer_sigmoid(x, sharpness=1) # Smooth sigmoid variant
y3 = soft_tanh(x)                   # Smooth tanh variant
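The exact squasher definitions live in the library source; as a rough, hypothetical illustration of the idea only (a power-based softmax alternative, NOT necessarily nmn's actual softermax formula):

```python
def softermax_sketch(xs, n=2, eps=1e-9):
    # Hypothetical power-based alternative to softmax: no exponentials,
    # so the tails are heavier for small n. This is an assumption for
    # illustration and may differ from nmn's softermax.
    powered = [max(x, 0.0) ** n for x in xs]
    total = sum(powered) + eps
    return [p / total for p in powered]

print(softermax_sketch([1.0, 2.0, 3.0], n=2))  # weights proportional to 1:4:9
```

Like softmax, the output is a probability-like vector, but the sharpness is controlled polynomially by n rather than exponentially by a temperature.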

See EXAMPLES.md for comprehensive usage guides including:

  • Framework-specific quick starts (PyTorch, Keras, TensorFlow, Flax)
  • Architecture examples (CNN, Transformer)
  • Advanced features (custom squashers, attention)

Quick run:

# PyTorch Examples
python src/nmn/torch/examples/quick_example.py         # Quick demo
python src/nmn/torch/examples/vision/resnet_training.py # ResNet training

# Flax NNX Examples
python src/nmn/nnx/examples/vision/aether_resnet50_tpu.py  # ResNet50 on TPU
python src/nmn/nnx/examples/language/m3za.py                # MiniBERT pre-training
python src/nmn/nnx/examples/language/m3za_perf.py           # Performance evaluation

🧪 Testing

Comprehensive test suite with cross-framework validation:

# Install test dependencies
pip install "nmn[test]"

# Run all tests
pytest tests/ -v

# Run specific framework tests
pytest tests/test_torch/ -v      # PyTorch
pytest tests/test_keras/ -v      # Keras
pytest tests/test_nnx/ -v        # Flax NNX

# Cross-framework consistency validation
pytest tests/integration/test_cross_framework_consistency.py -v

# With coverage report
pytest tests/ --cov=nmn --cov-report=html

📚 Theoretical Foundation

Based on the research papers:

Deep Learning 2.0: Artificial Neurons that Matter — Reject Correlation, Embrace Orthogonality

Deep Learning 2.1: Mind and Cosmos — Towards Cosmos-Inspired Interpretable Neural Networks

Why Yat-Product?

Traditional neurons compute: $y = \sigma(\mathbf{w}^\top \mathbf{x} + b)$

This has limitations:

  • Correlation-based: Only measures alignment, ignores proximity
  • Requires activation: Non-linearity is external
  • Spurious activations: Can fire strongly for distant but aligned vectors

The Yat-Product addresses these by combining:

  1. Squared dot product (similarity) in the numerator
  2. Squared distance (proximity) in the denominator
  3. Epsilon for numerical stability

The result is a neuron that responds geometrically: it activates when inputs are both similar AND close to the weights.
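The intrinsic non-linearity is also easy to verify numerically: scaling the input does not scale the output proportionally, so no external activation is needed to break linearity (pure-Python sketch):

```python
def yat(w, x, eps=1e-5):
    dot = sum(wi * xi for wi, xi in zip(w, x))
    dist_sq = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return dot * dot / (dist_sq + eps)

w = [1.0, 2.0]
x = [0.5, 0.5]
y1 = yat(w, x)
y2 = yat(w, [2 * v for v in x])  # double the input
print(y1, y2, 2 * y1)            # y2 != 2 * y1: no homogeneity, hence non-linear
```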


๐Ÿค Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

# Development setup
git clone https://github.com/azettaai/nmn.git
cd nmn
pip install -e ".[dev,test]"

# Run tests
pytest tests/ -v

# Format code
black src/ tests/
isort src/ tests/

Areas for contribution:

  • ๐Ÿ› Bug fixes (open issues)
  • โœจ New layer types (normalization, graph, etc.)
  • ๐Ÿ“š Documentation and tutorials
  • โšก Performance optimizations
  • ๐ŸŽจ Example applications

📖 Quick API Reference

Common Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| in_features | int | Input dimension (Dense) or channels (Conv) |
| out_features | int | Output dimension or filters |
| kernel_size | int \| tuple | Convolution kernel size |
| epsilon | float | Numerical stability constant (default: 1e-5) |
| use_bias | bool | Include bias term (default: True) |
| constant_alpha | bool | Use fixed √2 scaling (default: varies) |
| spherical | bool | Enable spherical mode (default: False) |

Framework Imports

# PyTorch
from nmn.torch import YatNMN, YatConv2D, MultiHeadYatAttention, YatEmbed
from nmn.torch import softermax, softer_sigmoid, soft_tanh

# Keras
from nmn.keras import YatNMN, YatConv2D, MultiHeadYatAttention, YatEmbed
from nmn.keras import softermax, softer_sigmoid, soft_tanh

# TensorFlow
from nmn.tf import YatNMN, YatConv2D, MultiHeadYatAttention, YatEmbed
from nmn.tf import softermax, softer_sigmoid, soft_tanh

# Flax NNX (includes advanced attention variants)
from nmn.nnx import YatNMN, YatConv, MultiHeadAttention, Embed
from nmn.nnx import RotaryYatAttention, softermax

# Flax Linen
from nmn.linen import YatNMN, YatConv2D, MultiHeadAttention, YatEmbed
from nmn.linen import softermax, softer_sigmoid, soft_tanh

📋 Full reference → EXAMPLES.md


📄 Citation

If you use NMN in your research, please cite:

@software{nmn2024,
  author = {Bouhsine, Taha},
  title = {NMN: Neural Matter Networks},
  year = {2024},
  url = {https://github.com/azettaai/nmn}
}

@article{bouhsine2024dl2,
  author = {Bouhsine, Taha},
  title = {Deep Learning 2.0: Artificial Neurons that Matter --- Reject Correlation, Embrace Orthogonality},
  year = {2024}
}

📬 Support & Community


📜 License

AGPL-3.0: Free for personal, academic, and commercial use with attribution.

If you modify and deploy on a network, you must share the source code.

For alternative licensing, contact us at taha@azetta.ai.


๐Ÿ™ Acknowledgments

This project was originally developed under the mlnomadpy organization and is now maintained by Azetta.ai.

The foundations of NMN were established through extensive research and community contributions. We're grateful to everyone who has contributed code, feedback, and ideas to make this project better.


Built with ❤️ by Azetta.ai · Originally created by ML Nomad

Project details


Download files

Download the file for your platform.

Source Distribution

nmn-0.2.24.tar.gz (543.0 kB)

Uploaded Source

Built Distribution


nmn-0.2.24-py3-none-any.whl (192.1 kB)

Uploaded Python 3

File details

Details for the file nmn-0.2.24.tar.gz.

File metadata

  • Download URL: nmn-0.2.24.tar.gz
  • Upload date:
  • Size: 543.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nmn-0.2.24.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | a22fb27ff1efa0b189606add787a083fa2e20780dadf7c9284f7f7d6ab2823db |
| MD5 | cb02780ba9336ff2b66dd548e0d1ed7d |
| BLAKE2b-256 | 353ac89075f5d17b9c23185368a7807f99077c67af84695a4d8966c3538a8084 |


Provenance

The following attestation bundles were made for nmn-0.2.24.tar.gz:

Publisher: publish.yml on azettaai/nmn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file nmn-0.2.24-py3-none-any.whl.

File metadata

  • Download URL: nmn-0.2.24-py3-none-any.whl
  • Upload date:
  • Size: 192.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nmn-0.2.24-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | f72a2deb670acedf0995003220ac62ae5306171ad8d4a4b5852df8424e771c18 |
| MD5 | 2ff072c17502e3016729b4a699b6468d |
| BLAKE2b-256 | a548924158164a5fbaa01cd0cfa76de54fbed2dd4949da451d2d59a1fce2fcea |


Provenance

The following attestation bundles were made for nmn-0.2.24-py3-none-any.whl:

Publisher: publish.yml on azettaai/nmn

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
