A compiler that understands mathematics: it optimizes expressions by reasoning about algebra, using equality saturation over an e-graph.

These details have not been verified by PyPI

Project links

Project description

Logic-Loom

A compiler that understands mathematics, not just instructions.

Most optimizers shuffle instructions. Logic-Loom reasons about algebra. It discovers that a*b + a*c is a*(b + c), finds Horner's scheme on its own, cancels a*(b+c) - a*b down to a*c, optimizes for a chosen hardware target, tracks the domain assumptions it relies on, refuses to disturb side effects, and emits the result as C, Rust, JavaScript or LLVM IR.

Quick start · Showcase · How it works · Cost models · Domain safety · Side effects · Code generation · Correctness · Limitations · Roadmap

The idea

A traditional compiler optimizes by pattern-matching instructions: replace a multiply-by-two with a shift, fold two constants, peephole away a redundant move. It reads code like a clerk with a checklist.

Logic-Loom reads code like a mathematician. Given

a*b + a*c

it does not ask "which instruction is cheaper?" It recognizes the distributive law and rewrites the algorithm:

a*(b + c)        # one multiply instead of two

It does this not with a hand-written branch for every case, but by knowing a handful of algebraic laws and exploring all of their consequences at once, then picking the cheapest equivalent form under a configurable cost model. The same engine that factors a sum also discovers Horner's scheme for polynomials and cancels terms that destroy each other; none of it was special-cased.

The technique: equality saturation over an e-graph, the same idea behind egg and the Herbie floating-point optimizer. See How it works.

Quick start

No dependencies. Pure Python (3.9+).

pip install logic-loom

Then use it from the command line:

# optimize an expression
logic-loom "a*b + a*c"
#  a * b + a * c  =>  a * (b + c)

# the module form works too
python -m logic_loom "a*b + a*c"

# optimize for a hardware target and emit LLVM IR
python -m logic_loom --profile gpu --target llvm "a/b + c/b"

# track the domain assumptions a simplification relies on
python -m logic_loom --extended --explain "sqrt(x)*sqrt(x)"
#  sqrt(x) * sqrt(x)  =>  x
#    assumes (for soundness): ?x >= 0

# respect side effects: do not collapse two random draws into zero
python -m logic_loom --impure rand "rand(s) - rand(s)"
#  rand(s) - rand(s)  ==  rand(s) - rand(s)

From Python:

from logic_loom import optimize, to_code

r = optimize("a*x*x + b*x + c")
print(r.optimized)              # x * (a * x + b) + c     <- Horner's scheme
print(r.speedup)               # 1.32
print(to_code(r.optimized, "c"))

Or run from a source checkout (and try the full showcase):

git clone https://github.com/elianalfonsolopezpreciado/Logic-Loom.git
cd Logic-Loom
pip install -e .
python examples/demo.py

Showcase

Every row below is produced by the same engine and the same rule set; nothing is special-cased. cost is the weighted operation count under the default model.

Input	Logic-Loom output	What it figured out	Cost
`ab + ac`	`a * (b + c)`	distributive law / factoring	5.4 -> 3.3
`pq + pr + p*s`	`p * (q + (r + s))`	factor a term shared by three products	8.6 -> 4.4
`axx + b*x + c`	`x * (a*x + b) + c`	Horner's scheme, discovered	8.6 -> 6.5
`a(b + c) - ab`	`a * c`	expand, then cancel `a*b`	6.5 -> 2.2
`23 + 4x0 + a1`	`6 + a`	constant folding + identities	10.7 -> 1.2
`2x + 3x`	`x * 5`	combine like terms	5.4 -> 2.2
`x + 0 - x + 5`	`5`	self-inverse vanishes	3.4 -> 0.1
`x/x + y - y`	`1`	division and subtraction cancel	6.4 -> 0.1
`(a+b)/c + (a-b)/c`	`(a + a) / c`	combine over a denominator	11.6 -> 5.3

Run python examples/demo.py to reproduce all of these with live statistics.

How it works

The key insight is that Logic-Loom never commits to a single rewrite. A greedy compiler that applies factor too early can miss a better form that needed distribute first. Logic-Loom sidesteps this phase-ordering problem entirely by keeping every equivalent form alive simultaneously.

flowchart LR
    A["a*b + a*c<br/>(text)"] --> B[Parse]
    B --> P[Static analysis:<br/>prune rules, size limits]
    P --> C["e-graph<br/>(one term)"]
    C --> D{Equality<br/>saturation}
    D -->|apply every law,<br/>both directions| C
    D -->|nothing new learned| E[Extract cheapest<br/>under cost model]
    E --> F["a*(b + c)<br/>(optimal form)"]

1. The e-graph. An e-graph stores a large set of equivalent expressions compactly. Terms known to be equal are grouped into an e-class; an e-node is an operator applied to e-classes rather than to concrete terms. So a single + node over the classes {a*b} and {a*c} already represents every term those classes contain.

2. Equality saturation. Algebraic laws are applied as rewrites that add equalities instead of replacing terms:

distribute :  ?a * (?b + ?c)  ==  ?a*?b + ?a*?c
factor     :  ?a*?b + ?a*?c   ==  ?a * (?b + ?c)
comm-add   :  ?a + ?b         ==  ?b + ?a
assoc-mul  :  (?a*?b)*?c      ==  ?a*(?b*?c)
self-mul   :  ?a * ?a         ==  ?a ^ 2
...        (see logic_loom/rules.py)

Because rewrites only add information, contradictory-looking rules (distribute and factor) coexist without looping, and the result is independent of the order rules fire in. The engine runs until the laws teach it nothing new (the graph is saturated) or a resource limit is reached.

3. Extraction. The saturated graph now contains all discovered forms. A cost model assigns each operator a weight, and a fixed-point picks the cheapest representative of each e-class. Extraction is deterministic: the same input yields the same output regardless of hash seed.

4. Smarter limits. Two mechanisms keep the inherently explosive search under control:

Static pruning and auto-sizing (logic_loom/analysis.py) runs before saturation. A fixed-point computes which rules can ever fire given the operators in the input, and drops the rest; it also sizes the resource limits to the input's complexity. On a polynomial, every transcendental rule is pruned automatically. This alone cut the test-suite runtime by roughly 6x.
A backoff scheduler (after egg's BackoffScheduler) throttles rules that match explosively at runtime: when a rule blows past its budget it is briefly banned, and its budget then doubles, so a productive rule is delayed but never silenced. A hard node cap remains as the final guarantee of termination.

Hardware-aware cost models

"Optimal" is not absolute; it depends on what you are optimizing for. On a GPU a division can cost as much as eight multiplies and a transcendental call far more; on a modern x86 core a fused multiply-add makes * and + nearly free. Logic-Loom makes the cost model a first-class input:

python -m logic_loom --profile gpu  "a/b + c/b"     # strongly prefers one divide
python -m logic_loom --profile x86  "..."
python -m logic_loom --profile arm  "..."

Built-in profiles (default, x86, arm, gpu) encode relative latencies in the spirit of published instruction tables. The model genuinely changes the chosen form, not just the reported number:

from logic_loom import optimize, CostModel

cheap_pow = CostModel("cheap-pow", {"+": 1, "-": 1, "*": 2, "/": 4, "^": 1})
optimize("x*x*x").optimized                 # x * (x * x)   (default: powers are dear)
optimize("x*x*x", model=cheap_pow).optimized  # x^3         (powers are cheap here)

You can load a profile measured with your own micro-benchmarks:

CostModel.from_json("examples/profile.skylake.json")

Domain safety and assumptions

Some algebraic laws are only valid on part of the real line: x/x = 1 assumes x != 0, sqrt(x)*sqrt(x) = x assumes x >= 0, log(a*b) = log a + log b assumes both arguments are positive. Logic-Loom records the preconditions of every rule, and reports exactly which assumptions a given result relied on:

python -m logic_loom --extended --explain "sqrt(x)*sqrt(x)"
#  sqrt(x) * sqrt(x)  =>  x
#    assumes (for soundness): ?x >= 0

r = optimize("x / x")
r.optimized        # 1
r.assumptions      # ['?a != 0']

The set is a conservative over-approximation (it lists the assumptions of every rule that fired on the way to the result), which is the safe direction: it never hides a precondition. Surfacing these is the foundation for emitting guarded code or refusing an unsafe rewrite, rather than silently assuming validity.

Side-effect awareness

By default Logic-Loom assumes expressions are pure, so terms may be freely duplicated, dropped or reordered. That is wrong when a subexpression has side effects (reading input, drawing a random number, mutating state). Name the impure functions and the engine taints every term that can contain such a call, then refuses any rewrite that would change how often it runs or in what order:

optimize("rand(s) - rand(s)")                  # 0            (pure assumption)
optimize("rand(s) - rand(s)", impure={"rand"}) # rand(s) - rand(s)   (preserved)
optimize("rand(s) + rand(s)", impure={"rand"}) # rand(s) + rand(s)   (not 2*rand)

A rewrite is rejected if it would duplicate, eliminate, or reorder a tainted term. The check is deliberately conservative -- when in doubt it keeps the effect -- which is exactly what soundness in the presence of side effects requires. Pure factors around impure calls (where each call still runs exactly once) are still optimized.

Code generation

Optimizing an expression is only useful if you can run it. Logic-Loom emits the optimized form as real source code in C, Rust, JavaScript, or LLVM IR so it can plug into an existing toolchain:

python -m logic_loom --target c    "a*x*x + b*x + c"   # x * (a * x + b) + c
python -m logic_loom --target rust "x ^ 3"             # (x).powf(3)
python -m logic_loom --target llvm "a*x + b"

define double @f(double %a, double %b, double %x) {
entry:
  %t1 = fmul double %a, %x
  %t2 = fadd double %t1, %b
  ret double %t2
}

The LLVM backend emits SSA IR with the right intrinsics (@llvm.pow.f64, @llvm.sqrt.f64, ...) and their declarations, ready for opt/clang to inline and vectorize. Power and function calls in the textual targets are rendered idiomatically per language (pow / Math.pow / .powf).

Extended domain: transcendental functions

By default Logic-Loom reasons over polynomial/rational arithmetic. The --extended flag (or rules=ALL_RULES) adds identities for exponentials, logarithms, square roots and trigonometry, each validated numerically in the test-suite:

Input	Output	Identity used
`exp(a) * exp(b)`	`exp(a + b)`	product of exponentials
`log(exp(x))`	`x`	log and exp are inverse
`sin(x)^2 + cos(x)^2`	`1`	Pythagorean identity
`sqrt(x) * sqrt(x)`	`x`	square root squared

These assume the usual real domains, which is why they are opt-in and why their assumptions are tracked.

Visualize the e-graph

To see what saturation actually explores, export the e-graph to Graphviz:

python -m logic_loom --dot "a*b + a*c" > egraph.dot
dot -Tsvg egraph.dot -o egraph.svg

Each dashed box is an e-class (a set of forms proven equal); nodes inside it are the different ways to build a value in that class; edges point from an operator to the classes of its operands.

Teach it new mathematics

Rules are one-liners, optionally annotated with domain assumptions:

from logic_loom import optimize, rule, DEFAULT_RULES

power_of_two = rule("pow2", "?x ^ 2", "?x * ?x")

r = optimize("(a + b) ^ 2", rules=DEFAULT_RULES + [power_of_two])
print(r.optimized)

A rule is rule(name, left_pattern, right_pattern, assumes=(...)), where ?name marks a pattern variable. You describe a theorem, not a procedure; Logic-Loom decides when and where it pays off.

Correctness

A clever optimizer is worthless if it is ever wrong. Logic-Loom is backed by differential testing: for each example the original and optimized expressions are evaluated on hundreds of random inputs and asserted to agree to floating-point tolerance.

pip install pytest
pytest -q          # 53 passed

The suite covers parsing, code generation (including LLVM IR), cost models, static pruning, side-effect safety, every class of optimization, and -- most importantly -- that no rewrite ever changes what an expression computes.

Architecture

A compact, readable codebase.

File	Responsibility
`logic_loom/expr.py`	expression AST, pretty-printer, numeric evaluator
`logic_loom/parser.py`	Pratt parser (precedence, unary minus, calls, `?patvars`)
`logic_loom/egraph.py`	the e-graph: union-find, hashcons, congruence `rebuild`
`logic_loom/rules.py`	rewrite rules (default + extended), assumptions, e-matching
`logic_loom/analysis.py`	static rule pruning and automatic limit sizing
`logic_loom/saturate.py`	saturation loop, constant folding, backoff scheduler
`logic_loom/effects.py`	side-effect taint analysis and rewrite safety
`logic_loom/cost.py`	cost models, hardware profiles, cheapest-term extraction
`logic_loom/codegen.py`	emit C / Rust / JavaScript / LLVM IR
`logic_loom/viz.py`	Graphviz DOT export of the e-graph
`logic_loom/compiler.py`	the high-level `optimize()` API
`logic_loom/cli.py`	the `python -m logic_loom` command line

Limitations

This is a focused, working demonstration of a powerful idea, not a production-grade computer algebra system. The current boundaries are explicit:

Cost models are configurable but still illustrative. Profiles for x86, ARM and GPU exist and can be loaded from JSON, and the model demonstrably changes the extracted form. The shipped numbers are relative estimates, however, not values measured on a specific chip; calibrate with micro-benchmarks before drawing performance conclusions.
Domain safety is tracked, not enforced. Assumptions like x != 0 and x >= 0 are recorded and reported per result, which is the right foundation. The engine does not yet prove those conditions hold, emit runtime guards automatically, or model floating-point rounding and NaN propagation; a sound numerical pipeline would add those checks.
Combinatorial explosion is bounded, not eliminated. Static pruning, a backoff scheduler and a node cap keep the search tractable and guarantee termination, but over many associative/commutative operators the graph can still hit a resource limit. In that case the globally optimal form is not guaranteed; Logic-Loom returns the best form found within the budget.
Toolchain integration is via code emission, not a compiler pass. It emits LLVM IR text that an existing toolchain can consume, but it is not yet an in-tree LLVM optimization pass or a registered backend for a language front end. Wiring it directly into a compiler's pass pipeline remains future work.
Side-effect handling is sound but coarse. Impure calls are never duplicated, dropped or reordered, which is safe, but the analysis is whole-term tainting rather than a precise effect system; it does not model distinct effect kinds, aliasing, or ordering constraints between different impure calls beyond forbidding their reordering.
Not a full CAS. This is deliberately a demonstrator of the idea, not a complete symbolic-mathematics system; it does not solve equations, integrate, or simplify across the full breadth a CAS would.

Roadmap

Directions for turning the demonstrator into something broader. Items marked (done) ship today; (partial) are started with a clear next step.

Realistic cost model (partial). Hardware profiles and JSON loading exist; next is calibrating them from real micro-benchmarks per architecture and modeling fused operations (FMA) and vector throughput.
Domain guards and numerical safety (partial). Assumptions are tracked and reported; next is proving or emitting runtime guards for them and modeling floating-point behavior (rounding, NaN/Inf) so transcendental rewrites are sound under IEEE-754.
Scalability heuristics (partial). Static pruning and a backoff scheduler are in place; next is cost-aware scheduling and detecting AC subgraphs that cannot improve, to prune the search further before saturating.
Code generation (done, expanding). C, Rust, JavaScript and LLVM IR are supported; more targets and full function/statement emission are natural extensions.
e-graph visualization (done, expanding). DOT export exists; an interactive viewer that animates saturation round by round would help.
Integration with toolchains. Build a real LLVM optimization pass (or an IR transpiler wired into the pass pipeline) so C++/Rust builds can call Logic-Loom directly, rather than pasting emitted IR.
Richer side-effect model. Move from whole-term tainting to a precise effect system that tracks distinct effects and ordering constraints, enabling optimization that reorders independent effects safely.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3.0

Jun 27, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

logic_loom-0.3.0.tar.gz (41.7 kB view details)

Uploaded Jun 27, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

logic_loom-0.3.0-py3-none-any.whl (34.5 kB view details)

Uploaded Jun 27, 2026 Python 3

File details

Details for the file logic_loom-0.3.0.tar.gz.

File metadata

Download URL: logic_loom-0.3.0.tar.gz
Upload date: Jun 27, 2026
Size: 41.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for logic_loom-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`e7dafde073d7cb076f10e0abb3f2190f04fe4ba1b7c1c6363de0c2ba87629a8c`
MD5	`bdf7100170ebb0c9ce1a790606f4715d`
BLAKE2b-256	`4616ef29f72c79837e00b30aad99b776f82f99bafbe7d7d1d70005ca4127b1b0`

See more details on using hashes here.

File details

Details for the file logic_loom-0.3.0-py3-none-any.whl.

File metadata

Download URL: logic_loom-0.3.0-py3-none-any.whl
Upload date: Jun 27, 2026
Size: 34.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for logic_loom-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a940e47ef3e1d79e6817537416d4a358392facf691228aab4aa8da01bf4c82ce`
MD5	`42a2583f70b813fea00e8925405cb94a`
BLAKE2b-256	`3e1802cf4ea50d84de6c6334590b457133c316292f653abdfae12a781f3569cb`

See more details on using hashes here.

logic-loom 0.3.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Logic-Loom

A compiler that understands mathematics, not just instructions.

The idea

Quick start

Showcase

How it works

Hardware-aware cost models

Domain safety and assumptions

Side-effect awareness

Code generation

Extended domain: transcendental functions

Visualize the e-graph

Teach it new mathematics

Correctness

Architecture

Limitations

Roadmap

Further reading

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes