Symbolic regression with a single EML binary operator.

MLeml

MLeml is a small Python package built around the EML operator introduced in the paper "All elementary functions from a single binary operator".

The operator is:

eml(x, y) = exp(x) - log(y)

The package exposes two public entry points:

from mleml import eml, predict

eml evaluates the primitive operator directly. predict fits a shallow EML tree to numerical data and returns the discovered symbolic expression in textual form, for example:

eml(1, eml(x, 1))
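To read such an output, expand it from the inside out: eml(x, 1) = exp(x) - log(1) = exp(x), so eml(1, eml(x, 1)) = e - log(exp(x)) = e - x for real x. The check below is a minimal sketch that uses a plain-Python stand-in for the operator; `eml_ref` and `nested` are hypothetical helpers for illustration, not part of the package.

```python
import math

def eml_ref(x, y):
    """Real-valued stand-in for eml(x, y) = exp(x) - log(y); requires y > 0."""
    return math.exp(x) - math.log(y)

def nested(x):
    # eml(1, eml(x, 1)): the inner call reduces to exp(x) because log(1) = 0.
    return eml_ref(1.0, eml_ref(x, 1.0))

for x in (0.5, 1.0, 2.0):
    # The two values should agree up to floating-point rounding.
    print(x, nested(x), math.e - x)
```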

Why this package exists

The paper argues that elementary-function expressions can be represented as trees built from one binary operator plus the constant 1. That gives a uniform grammar:

S -> 1 | x | x1 | x2 | eml(S, S)
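As an illustration of how quickly this grammar grows, the sketch below enumerates expression strings for the univariate case (leaves 1 and x). The function name `enumerate_eml` is hypothetical, and this brute-force listing is not how the package searches; it only shows the shape of the expression space.

```python
from itertools import product

def enumerate_eml(depth, leaves=("1", "x")):
    """Return all expression strings with nesting depth at most `depth`."""
    exprs = list(leaves)
    for _ in range(depth):
        # Combine every pair of expressions built so far under one eml node.
        new = [f"eml({a}, {b})" for a, b in product(exprs, repeat=2)]
        exprs = list(dict.fromkeys(exprs + new))  # de-duplicate, keep order
    return exprs

depth1 = enumerate_eml(1)
print(depth1)                  # 2 leaves plus 4 one-node trees
print(len(enumerate_eml(2)))   # 38 distinct expressions
```

The count roughly squares with each extra level, which is one reason an exhaustive search stops being practical beyond shallow trees.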

This package implements a practical subset of that idea for symbolic regression:

  • exact eml evaluation for scalars and arrays
  • a trainable EML tree based on PyTorch
  • deterministic multi-restart optimization with Adam
  • hardening and snapping from soft gates to a discrete formula
  • readable string output through str(result)

Installation

From PyPI

pip install mleml

From source

git clone https://github.com/<your-user>/MLeml.git
cd MLeml
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

If you specifically want a CPU-only local PyTorch install:

pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install -e ".[dev]"

Quick start

Evaluate the EML primitive

from mleml import eml

print(eml(2.0, 3.0))   # exp(2) - log(3), about 6.2904

eml accepts scalars and array-like inputs. Internally it evaluates in complex128 and returns real values when the imaginary part is numerically negligible.
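The behaviour described above can be sketched for scalars with the standard library alone. This is an illustrative stand-in, not the package's implementation: the name `eml_ref` and the tolerance `tol` are assumptions, and the real function also handles arrays.

```python
import cmath

def eml_ref(x, y, tol=1e-12):
    """Scalar sketch: evaluate exp(x) - log(y) in complex arithmetic."""
    z = cmath.exp(complex(x)) - cmath.log(complex(y))
    # Drop the imaginary part only when it is negligible (tol is an assumed threshold).
    if abs(z.imag) <= tol * max(1.0, abs(z.real)):
        return z.real
    return z

print(eml_ref(2.0, 3.0))   # real result: exp(2) - log(3)
print(eml_ref(0.0, -1.0))  # complex result: log(-1) is i*pi on the principal branch
```

Evaluating in the complex plane keeps the operator defined for negative y, where a real-valued logarithm would fail.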

Recover a univariate formula from points

import numpy as np
from mleml import predict

x = np.linspace(0.8, 2.0, 64)
y = np.exp(1.0) - np.log(x)

result = predict(x, y, max_depth=1)

print(result)          # eml(1, x)
print(result.mse)      # close to zero
print(result(x[:5]))   # evaluate the snapped expression

Recover a bivariate formula from points

import numpy as np
from mleml import predict

x1 = np.linspace(0.7, 1.6, 16)
x2 = np.linspace(1.1, 2.0, 16)
y = np.exp(x1) - np.log(x2)

result = predict((x1, x2), y, max_depth=1)

print(result)          # eml(x1, x2)
print(result(x1, x2))

API overview

eml(x, y)

  • evaluates exp(x) - log(y)
  • accepts Python scalars, NumPy arrays, and array-like objects
  • uses complex arithmetic internally to preserve the EML semantics

predict(X, Y, max_depth)

  • supports one feature: predict(X, Y, max_depth=...)
  • supports two features: predict((X1, X2), Y, max_depth=...)
  • returns a PredictResult

PredictResult

  • str(result) returns the snapped EML expression
  • result(x) evaluates a 1D expression
  • result(x1, x2) evaluates a 2D expression
  • result.mse is the training MSE of the snapped tree
  • result.depth is the effective symbolic depth after snapping
  • result.n_features is 1 or 2

How predict works

The model is a full binary EML tree of depth max_depth.

  • leaves choose among 1, x, x1, and x2 through softmax logits
  • each internal node chooses, independently for left and right inputs, whether to pass through the child value or replace it with the constant 1
  • after gating, the node always applies eml(left, right)
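The three points above can be sketched as a single soft node in plain Python. This is a simplified, real-valued illustration under stated assumptions: the sigmoid gate parameterization, the helper names (`soft_leaf`, `soft_node`), and the univariate leaf set (1, x) are hypothetical, and the package itself builds the tree in PyTorch.

```python
import math

def softmax(logits, temperature=1.0):
    m = max(logits)
    exps = [math.exp((v - m) / temperature) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def soft_leaf(logits, x):
    """Softmax-weighted mix of the leaf candidates (1, x)."""
    w_one, w_x = softmax(logits)
    return w_one * 1.0 + w_x * x

def soft_node(left, right, gate_left, gate_right):
    """Blend each child with the constant 1, then apply eml."""
    g_l, g_r = sigmoid(gate_left), sigmoid(gate_right)
    a = g_l * left + (1.0 - g_l) * 1.0    # gate near 1 passes the child through
    b = g_r * right + (1.0 - g_r) * 1.0   # gate near 0 substitutes the constant 1
    return math.exp(a) - math.log(b)      # real-valued here; needs b > 0

x = 1.5
left = soft_leaf([8.0, -8.0], x)    # almost exactly the constant 1
right = soft_leaf([-8.0, 8.0], x)   # almost exactly x
print(soft_node(left, right, 10.0, 10.0), math.e - math.log(x))
```

With sharply peaked logits and open gates, the soft node is numerically close to the discrete expression eml(1, x).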

Optimization uses:

  • deterministic multi-restart initialization
  • full-batch Adam
  • temperature annealing during hardening
  • penalties for diffuse leaves and non-binary gates
  • snapping to a discrete symbolic tree after optimization
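Temperature annealing and snapping can be illustrated on a single leaf. In the sketch below, lowering the softmax temperature sharpens the leaf distribution, an entropy term stands in for the "diffuse leaf" penalty, and snapping takes the argmax. The entropy form, the temperature schedule, and the helper names are assumptions for illustration; the package's actual penalties and schedule may differ.

```python
import math

def softmax(logits, temperature):
    m = max(logits)
    exps = [math.exp((v - m) / temperature) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy; one possible 'diffuse leaf' penalty (an assumption)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def snap(logits, choices):
    """Pick the single highest-scoring choice."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return choices[best]

logits = [0.2, 1.0]
choices = ["1", "x"]
for t in (1.0, 0.3, 0.05):
    probs = softmax(logits, t)
    # Entropy shrinks as the temperature drops and the distribution hardens.
    print(t, [round(p, 4) for p in probs], round(entropy(probs), 4))
print(snap(logits, choices))
```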

predict always returns the best snapped candidate it found, even when that candidate does not reproduce the data exactly. Check result.mse to judge the quality of the fit.

Limitations

  • This is a shallow-tree symbolic regression package, not a full theorem prover.
  • Exact recovery is realistic mainly for shallow expressions that are naturally expressible as small EML trees.
  • For noisy data or functions such as sin(x) or x**8, the package will usually return a best-fit EML expression rather than an algebraically exact identity.
  • The repository examples include both a recoverable EML target and deliberate stress tests, so a poor fit in an example plot may be an intended stress case rather than a failure.
  • Internal complex arithmetic and repeated exponentials can cause difficult optimization landscapes for larger depths.
  • Runtime increases quickly with depth.

Repository layout

src/mleml/       package source
tests/           tests and example plots
docs/            API, method, examples, release notes
2603.21852v2.pdf local copy of the reference paper

Development

Create a local environment and install development dependencies:

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Run the fast test suite:

pytest -s -m "not slow"

Run the example tests that also save plots:

pytest -s -m slow

Build the package:

python -m build
python -m twine check dist/*

Publishing to PyPI

Short version:

python -m build
python -m twine check dist/*
python -m twine upload dist/*

For a complete release checklist, see docs/release.md.

GitHub Actions publishing

The repository also supports credential-free publishing from GitHub Actions through PyPI Trusted Publishing.

  • regular CI runs on the main and release branches
  • PyPI publishing runs only on pushes to the release branch
  • the publishing workflow file is .github/workflows/publish.yml
  • the recommended GitHub environment name is pypi

See docs/trusted-publisher.md for the exact PyPI pending publisher values.

Reference

The package is inspired by:

  • Andrzej Odrzywolek, All elementary functions from a single binary operator, arXiv:2603.21852v2

A local copy of the paper is kept in the repository root as 2603.21852v2.pdf.
