Algebraic embeddings for exact neural arithmetic

FluxEM

A deterministic, continuous number embedding with exact algebraic closure (within IEEE-754), usable as a baseline encoding for LLMs and transformers.

Why FluxEM?

Numbers are hard for neural networks. Learned approaches like NALU struggle with extrapolation; tokenization schemes like Abacus require training. FluxEM takes a different approach: pure algebraic structure, zero training.

Target use-cases:

  • Number tokenization / continuous embeddings for LLM inputs — Drop-in numeric representation that doesn't fragment digits
  • Deterministic arithmetic module for neural nets — A differentiable primitive where embed(a) + embed(b) = embed(a+b) by construction
  • Baseline for learned arithmetic units — Compare NALU, xVal, or custom modules against a training-free reference

The core insight: the embeddings are group homomorphisms, so arithmetic on numbers becomes arithmetic on vectors. Addition is vector addition. Multiplication becomes addition in log-space. This is the same trick NALU uses, but FluxEM ships it as deterministic structure rather than learned gates.
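
The homomorphism property can be illustrated without FluxEM itself. The following is a minimal sketch in plain Python, not the library's implementation: the width `DIM`, the fixed `DIRECTION` vector, and all function names are assumptions made for illustration, and the log embedding here handles positive inputs only.

```python
import math

DIM = 8  # hypothetical embedding width, chosen only for illustration

# A fixed unit direction (an assumption; any nonzero vector would work).
DIRECTION = [1.0 / math.sqrt(DIM)] * DIM


def embed_linear(x: float) -> list[float]:
    """Map x to x * DIRECTION, so vector addition mirrors numeric addition."""
    return [x * d for d in DIRECTION]


def decode_linear(v: list[float]) -> float:
    """Recover x by projecting onto the unit DIRECTION."""
    return sum(vi * di for vi, di in zip(v, DIRECTION))


def embed_log(x: float) -> list[float]:
    """Map a positive x to log(x) * DIRECTION, so vector addition mirrors multiplication."""
    if x <= 0:
        raise ValueError("this sketch handles positive inputs only")
    return embed_linear(math.log(x))


def decode_log(v: list[float]) -> float:
    """Invert embed_log by exponentiating the projected coordinate."""
    return math.exp(decode_linear(v))


def vec_add(u: list[float], v: list[float]) -> list[float]:
    """Elementwise vector addition."""
    return [a + b for a, b in zip(u, v)]


# Addition of linear embeddings decodes to (approximately) 1234 + 5678.
total = decode_linear(vec_add(embed_linear(1234.0), embed_linear(5678.0)))

# Addition of log embeddings decodes to (approximately) 6 * 7.
product = decode_log(vec_add(embed_log(6.0), embed_log(7.0)))

print(total, product)
```

Both results agree with ordinary arithmetic up to floating-point rounding, with no trained parameters involved.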

Supported Operations

| Operation | Syntax | Embedding | Algebraic Property |
|---|---|---|---|
| Addition | a + b | Linear | embed(a) + embed(b) = embed(a+b) |
| Subtraction | a - b | Linear | embed(a) - embed(b) = embed(a-b) |
| Multiplication | a * b | Logarithmic | log(a) + log(b) = log(a*b) |
| Division | a / b | Logarithmic | log(a) - log(b) = log(a/b) |
| Powers | a ** b | Logarithmic | b * log(a) = log(a^b) |
| Roots | sqrt(a) | Logarithmic | 0.5 * log(a) = log(sqrt(a)) |

All operations generalize out-of-distribution within IEEE-754 floating-point tolerance. See ERROR_MODEL.md for precision bounds.
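
To give a feel for what "within IEEE-754 tolerance" means for the logarithmic rows, the sketch below evaluates each log-space identity with Python's standard `math` module and measures the relative error against direct arithmetic. It uses float64 and does not call FluxEM; the helper `rel_err` and the sample inputs are illustrative assumptions, and the authoritative bounds are those in ERROR_MODEL.md.

```python
import math


def rel_err(approx: float, exact: float) -> float:
    """Relative error of approx with respect to a nonzero exact value."""
    return abs(approx - exact) / abs(exact)


a, b = 1234.0, 5678.0  # arbitrary positive sample inputs

# Each entry pairs the log-space evaluation with the direct computation.
via_log = {
    "multiply": (math.exp(math.log(a) + math.log(b)), a * b),
    "divide": (math.exp(math.log(a) - math.log(b)), a / b),
    "power": (math.exp(3.0 * math.log(a)), a ** 3.0),
    "sqrt": (math.exp(0.5 * math.log(a)), math.sqrt(a)),
}

errors = {name: rel_err(approx, exact) for name, (approx, exact) in via_log.items()}

for name, err in errors.items():
    print(f"{name:8s} relative error = {err:.2e}")
```

The errors come only from rounding in `log` and `exp`, so they stay small regardless of whether the inputs resemble any training data.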

Installation

Requires Python 3.10+.

pip install fluxem

Or from source (latest):

git clone https://github.com/Hmbown/FluxEM.git
cd FluxEM && pip install -e .

Quick Start

from fluxem import create_unified_model

model = create_unified_model()

model.compute("1234 + 5678")  # -> 6912.0
model.compute("250 * 4")      # -> 1000.0
model.compute("1000 / 8")     # -> 125.0
model.compute("3 ** 4")       # -> 81.0

Integration: Embedding-Level API

For integration with neural networks, use the encode/decode interface directly:

from fluxem import create_unified_model

model = create_unified_model(dim=256)

# Encode numbers to embeddings
emb_a = model.linear_encoder.encode_number(42.0)
emb_b = model.linear_encoder.encode_number(58.0)

# Arithmetic in embedding space
emb_sum = emb_a + emb_b  # Vector addition = numeric addition

# Decode back to number
result = model.linear_encoder.decode(emb_sum)  # -> 100.0

# For multiplication, use log embeddings
log_a = model.log_encoder.encode_number(6.0)
log_b = model.log_encoder.encode_number(7.0)
log_product = model.log_encoder.multiply(log_a, log_b)
product = model.log_encoder.decode(log_product)  # -> 42.0

Extended Operations

from fluxem import create_extended_ops

ops = create_extended_ops()
ops.power(2, 16)   # -> 65536.0
ops.sqrt(256)      # -> 16.0
ops.exp(1.0)       # -> 2.718...
ops.ln(2.718)      # -> ~0.9999 (2.718 is slightly below e)

How It Works

| Embedding | Operations | Property | Identity |
|---|---|---|---|
| Linear | + - | Vector addition = arithmetic addition | embed(3) + embed(5) = embed(8) |
| Logarithmic | * / ** | Log-space addition = multiplication | log(3) + log(4) = log(12) |

This is the same mathematical structure that shows up in NALU's log-space branch for multiplication/division — but FluxEM provides it as a fixed algebraic encoding rather than learned gates.

Note on dimensionality: The embeddings are low-rank — linear lives on a 1D line, logarithmic on a 2D plane. The d=256 default is a compatibility wrapper for integration with neural network pipelines, not additional capacity.
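
A small sketch of the low-rank point, assuming a hypothetical layout in which the value sits in coordinate 0 and the sign in coordinate 1 of a zero-padded 256-wide vector. The real layout is specified in FORMAL_DEFINITION.md and may differ; zero inputs are left out of this sketch.

```python
import math

D = 256  # wrapper width, matching the default mentioned above


def embed_linear(x: float) -> list[float]:
    """Hypothetical layout: the value in coordinate 0, zeros elsewhere."""
    v = [0.0] * D
    v[0] = x
    return v


def embed_log(x: float) -> list[float]:
    """Hypothetical layout: log-magnitude in coordinate 0, sign in coordinate 1."""
    if x == 0:
        raise ValueError("zero needs a separate flag; not covered in this sketch")
    v = [0.0] * D
    v[0] = math.log(abs(x))
    v[1] = 1.0 if x > 0 else -1.0
    return v


def active_dims(vectors: list[list[float]]) -> list[int]:
    """Indices that are nonzero in at least one of the given vectors."""
    return sorted({i for v in vectors for i, c in enumerate(v) if c != 0.0})


linear_dims = active_dims([embed_linear(x) for x in (3.0, -8.0, 0.5)])
log_dims = active_dims([embed_log(x) for x in (2.0, 3.0, -5.0)])

print(linear_dims)  # coordinates used by the linear embedding
print(log_dims)     # coordinates used by the logarithmic embedding
```

Under these assumptions only one coordinate carries information for the linear embedding and two for the logarithmic one; the remaining coordinates exist so the vectors match the width a network expects.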

See FORMAL_DEFINITION.md for mathematical details.

The Insight

This approach comes from music theory. Lewin's Generalized Interval Systems (1987) formalized how musical intervals form a group structure. FluxEM applies the same framework:

| GIS Component | Music Theory | FluxEM |
|---|---|---|
| S (space) | Pitches | R (numbers) |
| IVLS (intervals) | Z12 (semitones) | R under + |
| int(a,b) | Pitch distance | Embedding distance |
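
The defining property of an interval function in a GIS is that intervals compose: int(a,b) combined with int(b,c) equals int(a,c). The sketch below checks this for both columns of the table, using pitch classes mod 12 and real numbers under addition. The function names are illustrative and are not part of FluxEM.

```python
def int_pitch(a: int, b: int) -> int:
    """Interval from pitch class a to pitch class b, in semitones mod 12."""
    return (b - a) % 12


def int_real(a: float, b: float) -> float:
    """Interval from a to b on the real line: the signed difference."""
    return b - a


# Pitch classes: C (0) up to G (7), then G (7) up to D (2).
pitch_composed = (int_pitch(0, 7) + int_pitch(7, 2)) % 12
pitch_direct = int_pitch(0, 2)

# Real numbers: 3 to 8, then 8 to 10.5.
real_composed = int_real(3.0, 8.0) + int_real(8.0, 10.5)
real_direct = int_real(3.0, 10.5)

print(pitch_composed == pitch_direct, real_composed == real_direct)
```

In the numeric case the interval group is R under addition, which is why a linear embedding that preserves differences also preserves sums.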

Prior Work & Positioning

| Approach | Method | FluxEM Difference |
|---|---|---|
| NALU (Trask et al., 2018) | Learned log/exp gates | No learned parameters; same log-space trick, but deterministic |
| xVal (Golkar et al., 2023) | Learned scaling direction | Fixed algebraic structure; no training distribution drift |
| Abacus (McLeish et al., 2024) | Positional digit encoding | Continuous embeddings; not tokenized digits |

FluxEM is not claiming to outperform learned approaches on their benchmarks. It's a reference implementation for "how far can you get with pure structure, zero training?" — useful as a baseline, a drop-in numeric primitive, or a pedagogical tool.

Limitations & Edge Cases

| Constraint | Behavior |
|---|---|
| Zero handling | Explicit flag; log(0) undefined, so zero is masked separately |
| Sign tracking | Sign stored in x[1]; log-space is magnitude-only |
| Negative base + fractional exponent | Unsupported; returns real-valued magnitude surrogate (no complex) |
| Precision | < 1e-6 relative error (float32); from log/exp rounding only |
| Arithmetic only | Not a general reasoning system |

See FORMAL_DEFINITION.md for how zero is encoded (zero vector + flag) and ERROR_MODEL.md for precision details.
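
The zero and sign rows can be made concrete with a small sketch. The `LogEmb` record and its field names are assumptions for illustration, not FluxEM's data layout: it keeps the log-magnitude, the sign, and a zero flag side by side, and shows how multiplication has to treat each of them.

```python
import math
from typing import NamedTuple


class LogEmb(NamedTuple):
    """Hypothetical log-space record: magnitude, sign, and an explicit zero flag."""
    log_mag: float
    sign: float
    is_zero: bool


ZERO = LogEmb(0.0, 0.0, True)  # zero vector plus flag, since log(0) is undefined


def encode(x: float) -> LogEmb:
    """Encode x, routing zero through the flag instead of log()."""
    if x == 0:
        return ZERO
    return LogEmb(math.log(abs(x)), math.copysign(1.0, x), False)


def multiply(a: LogEmb, b: LogEmb) -> LogEmb:
    """Add log-magnitudes, multiply signs, and propagate the zero flag."""
    if a.is_zero or b.is_zero:
        return ZERO
    return LogEmb(a.log_mag + b.log_mag, a.sign * b.sign, False)


def decode(e: LogEmb) -> float:
    """Invert encode(): apply the sign to exp(log_mag), or return 0.0 if flagged."""
    if e.is_zero:
        return 0.0
    return e.sign * math.exp(e.log_mag)


neg_times_pos = decode(multiply(encode(-6.0), encode(7.0)))
neg_times_neg = decode(multiply(encode(-3.0), encode(-4.0)))
zero_times_any = decode(multiply(encode(0.0), encode(5.0)))

print(neg_times_pos, neg_times_neg, zero_times_any)
```

Keeping sign and zero outside the logarithm is what lets the magnitude channel stay a pure homomorphism while still covering the whole real line for multiplication.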

Citation

@software{fluxem2025,
  title={FluxEM: Algebraic Embeddings for Neural Arithmetic},
  author={Bown, Hunter},
  year={2025},
  url={https://github.com/Hmbown/FluxEM}
}

License

MIT
