eml-graph
Equivalence-class graph over corpora of SymPy expressions. Cluster a codebase's mathematical content by Pfaffian cost class; optionally add cost-monotone rewrite edges; render to Graphviz DOT for visualization.
Quick start
pip install eml-graph
from eml_graph import build_graph, to_dot
import sympy as sp
x, y = sp.symbols("x y")
corpus = [
    sp.sin(x),
    sp.sin(y),                     # collapses with sin(x) by axes
    sp.exp(x),
    sp.cos(y),                     # collapses with sin(*) by axes
    sp.exp(x) / (1 + sp.exp(x)),   # textbook sigmoid
    1 / (1 + sp.exp(-x)),          # canonical sigmoid
]
g = build_graph(corpus, label_with_discover=True)
print(g.num_nodes(), "nodes across", g.num_classes(), "Pfaffian-cost classes")
# Save as DOT, then render with graphviz:
with open("corpus.dot", "w") as f:
    f.write(to_dot(g, include_edges=True))
# $ dot -Tsvg corpus.dot -o corpus.svg
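The comments in the corpus above say that sin(x), sin(y), and cos(y) collapse into one class. As a rough mental model only, here is a minimal sketch of that kind of grouping. It assumes expressions are plain nested tuples and uses a made-up toy_profile (trig count, exp count, tree depth) in place of the real Pfaffian profile; neither toy_profile nor cluster_by_profile is part of eml-graph.

```python
from collections import defaultdict

# Hypothetical stand-in for the real profile; NOT the eml-graph cost model.
TRIG = {"sin", "cos"}
EXP = {"exp"}

def toy_profile(expr):
    """Return (trig_count, exp_count, depth) for a nested-tuple expression."""
    if not isinstance(expr, tuple):          # leaf: a symbol name or a number
        return (0, 0, 0)
    head, *args = expr
    kids = [toy_profile(a) for a in args]
    n_trig = sum(k[0] for k in kids) + int(head in TRIG)
    n_exp = sum(k[1] for k in kids) + int(head in EXP)
    depth = 1 + max((k[2] for k in kids), default=0)
    return (n_trig, n_exp, depth)

def cluster_by_profile(corpus, profile=toy_profile):
    """Group expressions whose profiles are equal; variable names are ignored."""
    clusters = defaultdict(list)
    for expr in corpus:
        clusters[profile(expr)].append(expr)
    return dict(clusters)

corpus = [
    ("sin", "x"),
    ("sin", "y"),
    ("cos", "y"),
    ("exp", "x"),
    ("div", ("exp", "x"), ("add", 1, ("exp", "x"))),   # textbook sigmoid
]
clusters = cluster_by_profile(corpus)
for profile, members in clusters.items():
    print(profile, len(members))
```

Because the toy profile never looks at symbol names, sin(x) and sin(y) land in the same bucket, which is the "by axes" collapse in spirit. The real profile fields may group things differently.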
What does this do that nothing else does?
Software architects have call graphs. Mathematicians have
equivalence classes. eml-graph is the first tool to give you
both at once for a real corpus of code.
Open numpy/, scipy/, torch/, or your own pipeline; scrape every
SymPy expression (or build them from your symbolic layer); feed the
list to build_graph(). You get back:
- Cost-class clusters. All expressions sharing the same Pfaffian profile (pfaffian_r, max_path_r, eml_depth, correction sum) collapse into one cluster. You see immediately which mathematical shapes the codebase actually uses.
- Rewrite edges (optional). Within a cluster, two expressions are connected if the eml-rewrite library can walk one into the other under monotone-decreasing cost. Edge weights are the rewrite step counts.
- Named labels (optional). When eml-discover is installed and label_with_discover=True, nodes that match a registry formula (sigmoid, GELU, Pythagorean identity, ...) get the canonical name.
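To make the rewrite-edge idea concrete, here is a minimal sketch of a step-counting search that only follows rewrites whose cost does not go up. Everything in it is an assumption for illustration: the function monotone_steps, the neighbor table, the node names, and the cost numbers are hypothetical, and "monotone-decreasing" is read here as "never increasing". It does not call eml-rewrite.

```python
from collections import deque

def monotone_steps(start, goal, neighbors, cost):
    """Fewest rewrite steps from start to goal such that cost never increases.

    Returns None when no such path exists. Breadth-first search guarantees
    the first time we reach the goal is with the minimum step count.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for nxt in neighbors.get(node, ()):
            # Skip rewrites that would raise the cost.
            if nxt not in seen and cost[nxt] <= cost[node]:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None

# Hypothetical one-step rewrites between forms of the same function.
neighbors = {
    "textbook": ["intermediate", "expanded"],
    "intermediate": ["canonical"],
    "expanded": ["canonical"],
}
# Hypothetical costs; "expanded" is a detour that costs more than the start.
cost = {"textbook": 3, "intermediate": 3, "canonical": 2, "expanded": 5}

weight = monotone_steps("textbook", "canonical", neighbors, cost)
print(weight)  # 2: textbook -> intermediate -> canonical
```

In this picture the returned step count plays the role of the edge weight, and a None result means no edge is drawn.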
Output
to_dot(graph) returns a Graphviz DOT-format string. The package itself has no
binary deps; pipe the string through dot, neato, or sfdp, or paste it into a DOT web viewer.
For SVG: dot -Tsvg input.dot -o output.svg.
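For readers new to DOT, the sketch below shows how clusters and weighted edges can be written out as a DOT string by hand. It is not the output format of to_dot, which is not specified here; clusters_to_dot, the cluster labels, and the choice of a directed graph are assumptions, and node names are assumed to contain no double quotes.

```python
def clusters_to_dot(clusters, edges=()):
    """Build a DOT digraph with one 'subgraph cluster_N' box per class.

    clusters: mapping of class label -> list of node names
    edges:    iterable of (source, target, weight) triples
    """
    lines = ["digraph corpus {"]
    for i, (label, members) in enumerate(clusters.items()):
        # Graphviz draws a box around subgraphs whose name starts with "cluster".
        lines.append(f"  subgraph cluster_{i} {{")
        lines.append(f'    label="{label}";')
        for name in members:
            lines.append(f'    "{name}";')
        lines.append("  }")
    for src, dst, weight in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{weight}"];')
    lines.append("}")
    return "\n".join(lines)

dot_text = clusters_to_dot(
    {
        "trig": ["sin(x)", "sin(y)", "cos(y)"],
        "sigmoid": ["exp(x)/(1 + exp(x))", "1/(1 + exp(-x))"],
    },
    edges=[("exp(x)/(1 + exp(x))", "1/(1 + exp(-x))", 2)],
)
print(dot_text)
```

Saving dot_text to a file and running the dot command shown above renders it the same way as any other DOT input.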
Status
Beta. v0.1.0 is the MVP: clustering, DOT rendering, and lazy path-based edges. Roadmap:
- v0.2: matplotlib renderer (no graphviz binary needed)
- v0.3: streaming graph for corpora that don't fit in memory
- v0.4: per-class summary statistics (entropy, modularity, ...)
- v1.0: stable API + first public release with paper
License
PROPRIETARY-PRE-RELEASE. Patent pending. Do not redistribute.
File details
Details for the file eml_graph-0.1.0.tar.gz.
File metadata
- Download URL: eml_graph-0.1.0.tar.gz
- Upload date:
- Size: 10.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 44f120c7cf1815e42a26d5984a12804c97b2f3ed54920d479b79e2ff5a6e7ea4 |
| MD5 | d471ce479df7edf0010f0e3283ba9d0e |
| BLAKE2b-256 | 2398daea659f4782e6238cf3a4e08d648d583647a3465f42afc8cb7c9feb534d |
File details
Details for the file eml_graph-0.1.0-py3-none-any.whl.
File metadata
- Download URL: eml_graph-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b66ad86ca3110840e5f94554167dc71ebfa59357ac061664db27362948804289 |
| MD5 | eefd937cca3fc01dffaccd2ddf34cefd |
| BLAKE2b-256 | 830b98dcd94132752bda16865c6a6a7bd4478920206a303b832e23bc8d036847 |