Symbolic regression by genetic programming (C++ engine, Python bindings)

eqhunt

pip install eqhunt

Symbolic regression by genetic programming. C++ engine, Python bindings via nanobind.

Give it a table of (inputs, target) pairs; it returns a human-readable formula that approximates the relationship. No neural network, no black box — just an algebraic expression you can read, paste into a calculator, or hand-tune.

import eqhunt

X = [[1, 1], [2, 3], [4, 5], [7, 2], [9, 9]]
y = [2, 5, 9, 9, 18]

model = eqhunt.fit(X, y)
print(model.formula)          # e.g.  f(x,y) = (x+y)
print(model.error)            # e.g.  0.0
print(model.predict([6, 7]))  # -> 13.0

Install

pip install eqhunt

Prebuilt wheels are published for Linux, macOS and Windows on common Python versions. If pip falls back to building from source you'll need a C++17 compiler.

Two ways to use it

Ultra-simple

import eqhunt

model = eqhunt.fit(X, y, generations=5000)
print(model.formula)
model.predict([1, 2])         # single row
model.predict([[1, 2], [3, 4]])  # batch

fit() accepts any Config field as a keyword argument:

eqhunt.fit(X, y, pop=800, trig_penalty=2.0, bloat_penalty=0.3)
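
As a rough illustration of how keyword arguments can be forwarded onto a config object, here is a minimal sketch. `SketchConfig` and `apply_overrides` are hypothetical stand-ins written for this example, not eqhunt's actual classes; only the field names and defaults are taken from the Config reference.

```python
from dataclasses import dataclass, fields


@dataclass
class SketchConfig:
    """Hypothetical stand-in for eqhunt.Config (a few fields only)."""
    pop: int = 400
    gen: int = 15000
    bloat_penalty: float = 0.1
    trig_penalty: float = 0.5


def apply_overrides(cfg, **overrides):
    """Copy keyword arguments onto matching config fields, rejecting typos."""
    known = {f.name for f in fields(cfg)}
    for name, value in overrides.items():
        if name not in known:
            raise TypeError(f"unknown config field: {name!r}")
        setattr(cfg, name, value)
    return cfg


cfg = apply_overrides(SketchConfig(), pop=800, trig_penalty=2.0, bloat_penalty=0.3)
print(cfg)
```

Fields that are not overridden keep their defaults, and a misspelled name fails loudly instead of being silently ignored.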

Fully configurable

import eqhunt

cfg = eqhunt.Config()
cfg.pop               = 800
cfg.gen               = 50000
cfg.tournament_size   = 5
cfg.initial_depth     = 5
cfg.bloat_penalty     = 0.3
cfg.trig_penalty      = 1.5
cfg.accepted_error    = 0.01

# Re-weight individual operators (higher = more likely to appear)
cfg.op_weights.sin = 1.0      # boost sine
cfg.op_weights.cos = 1.0
cfg.op_weights.exp = 0.0      # disable exp entirely
cfg.pi_prob = 0.10            # 'pi' more frequent in terminals

model = eqhunt.Model(cfg).fit(X, y)
print(model.formula)

You can also train from a CSV file (one row per sample, last column = target, lines starting with # are comments):

eqhunt.Model().fit_csv("nivel_embase.csv")
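
To make the file layout concrete, the sketch below parses the same shape of data in plain Python. `read_samples` is a hypothetical helper, not part of eqhunt, and it assumes comma-separated numeric columns with no header row (the README does not state the delimiter).

```python
import csv
import io


def read_samples(lines):
    """Parse rows of floats; the last column is the target, '#' lines are skipped."""
    X, y = [], []
    kept = (ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#"))
    for row in csv.reader(kept):
        values = [float(cell) for cell in row]
        X.append(values[:-1])   # every column but the last is an input
        y.append(values[-1])    # last column is the target
    return X, y


sample = io.StringIO("# x, y, target\n1,1,2\n2,3,5\n4,5,9\n")
X, y = read_samples(sample)
print(X, y)
```

The resulting `X` and `y` have the same shape as the lists passed to `fit()` in the examples above.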

Operators available

Category	Operators
Arithmetic	`+ - * / -x`
Powers	`sqrt **`
Conditional	`if(cond, then, else)` (the `then` branch is taken when cond > 0)
Trig	`sin cos tan`
Exp / log	`exp log`
Constants	numeric literals, `pi`

Trigonometric, log and exp nodes have low default weights, so they only appear after enough mutation pressure. That helps on cyclic or physical data and leaves other problems largely unaffected. Adjust the weights via Config.op_weights.
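
The effect of operator weights can be sketched with weighted random sampling. The weight values below are illustrative only (eqhunt's real defaults are not listed here), and `pick_operator` is a hypothetical helper, not the engine's code.

```python
import random

# Illustrative weights only; a weight of 0.0 removes the operator from the draw.
weights = {"+": 1.0, "-": 1.0, "*": 1.0, "/": 1.0, "exp": 0.0, "sin": 0.1, "cos": 0.1}


def pick_operator(weights, rng):
    """Draw one operator name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)
draws = [pick_operator(weights, rng) for _ in range(10_000)]
counts = {name: draws.count(name) for name in weights}
print(counts)
```

With these numbers, `sin` and `cos` show up roughly ten times less often than the arithmetic operators, and `exp` never appears.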

How error and validity are handled

Per-sample error is |prediction - target|; total error is the sum.
Invalid evaluations (/0, sqrt(<0), log(<=0), exp(huge)) get a soft per-sample penalty rather than killing the whole formula, so a single out-of-domain sample does not disqualify an otherwise good candidate. If more than 25% of samples fail, the formula is rejected.
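
The two rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the engine's C++ code: the penalty value `SOFT_PENALTY` is an assumption (the README does not give the real number), and `total_error` is a hypothetical helper.

```python
import math

SOFT_PENALTY = 1000.0          # assumed value, chosen only for illustration
MAX_INVALID_FRACTION = 0.25    # from the text: more than 25% invalid means rejection


def total_error(formula, X, y):
    """Sum of |prediction - target|; returns None when the formula is rejected."""
    total, invalid = 0.0, 0
    for row, target in zip(X, y):
        try:
            pred = formula(*row)
            if not math.isfinite(pred):
                raise ValueError("non-finite result")
            total += abs(pred - target)
        except (ZeroDivisionError, ValueError, OverflowError):
            invalid += 1
            total += SOFT_PENALTY   # soft penalty instead of discarding the formula
    if invalid > MAX_INVALID_FRACTION * len(y):
        return None
    return total


X = [[1, 1], [2, 3], [4, 5], [7, 2], [9, 9]]
y = [2, 5, 9, 9, 18]
exact = total_error(lambda a, b: a + b, X, y)                    # no invalid samples
one_bad = total_error(lambda a, b: a + b + 1 / (a - 7), X, y)    # 1 of 5 divides by zero
rejected = total_error(lambda a, b: math.sqrt(a - 5), X, y)      # 3 of 5 out of domain
print(exact, one_bad, rejected)
```

The second candidate survives with a penalized score because only one sample in five is invalid; the third exceeds the 25% limit and is rejected.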

Stopping early

Config.accepted_error stops the search as soon as total error drops below the threshold. You can also call model.stop() from another thread (or a signal handler) to ask the loop to wrap up after the current generation.
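
Both stopping conditions follow a common cooperative pattern: the loop checks a threshold and a shared flag once per generation. The sketch below shows that pattern with `threading.Event`; `run_search` and the pre-computed error trace are hypothetical and stand in for the real search loop.

```python
import threading


def run_search(errors_by_generation, accepted_error, stop_event):
    """Walk a pre-computed error trace, stopping on threshold or external request."""
    best = float("inf")
    generations_run = 0
    for err in errors_by_generation:
        generations_run += 1
        best = min(best, err)
        if best < accepted_error:   # threshold reached
            break
        if stop_event.is_set():     # checked once per generation
            break
    return best, generations_run


stop = threading.Event()
trace = [9.0, 4.0, 0.4, 0.1]
by_threshold = run_search(trace, accepted_error=0.5, stop_event=stop)
stop.set()   # in real use this would be called from another thread or a signal handler
by_request = run_search(trace, accepted_error=0.5, stop_event=stop)
print(by_threshold, by_request)   # (0.4, 3) (9.0, 1)
```

Because the flag is only read between generations, a stop request never interrupts a generation halfway through.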

Saving and reloading a formula

A trained model is just a string — you can persist it, ship it, paste it, diff it. To reuse a formula in a new process without retraining, parse it back into a Model:

import eqhunt

# train and save
m = eqhunt.fit(X, y)
print(m.formula)          # e.g.  f(x,y) = ((x*x) - (y*y))
m.save("model.txt")       # persists the formula as a one-liner

# later, in a fresh process — no training needed
m2 = eqhunt.Model.load_file("model.txt")
m2.predict([6, 7])        # -13.0
m2.predict([[1, 2], [3, 4]])

You can also go through strings directly:

formula_str = m.formula                   # or any equivalent expression
m3 = eqhunt.Model.from_formula(formula_str)
m3.predict([12, 5])

Or mutate an existing model in place:

m.load_formula("(x*x + y*y)")             # replaces the current tree

Accepted syntax is anything the engine itself emits via get_formula(): arithmetic (+ - * / **), unary minus, sqrt sin cos tan log exp if, variables x y z w v u x6 x7 …, numeric literals (int / float / 1e5), and pi. Both the bare expression ("(x+y)") and the full prefixed form ("f(x,y) = (x+y)") are accepted, because the parser strips everything up to and including the first =. Parse errors raise RuntimeError.
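
Because the emitted syntax overlaps with Python expression syntax, a formula string can also be checked outside eqhunt. The sketch below is an independent evaluator built on Python's `ast` module, not the library's parser. It assumes `**` means exponentiation as in Python, and it does not handle `if(...)`, which is not valid Python call syntax. `evaluate` is a hypothetical helper.

```python
import ast
import math
import operator

BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
          ast.Div: operator.truediv, ast.Pow: operator.pow}
FUNCTIONS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos,
             "tan": math.tan, "log": math.log, "exp": math.exp}


def evaluate(formula, **variables):
    """Evaluate an emitted formula string; `if(...)` is not handled here."""
    expr = formula.split("=", 1)[-1].strip()   # drop an optional "f(x,y) =" prefix
    names = {"pi": math.pi, **variables}

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY:
            return BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FUNCTIONS and not node.keywords):
            return FUNCTIONS[node.func.id](*(walk(a) for a in node.args))
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return walk(ast.parse(expr, mode="eval").body)


print(evaluate("f(x,y) = ((x*x) - (y*y))", x=6, y=7))   # -13
print(evaluate("(x+y)", x=6, y=7))                       # 13
```

Only whitelisted node types are evaluated, so an arbitrary string cannot run code the way a bare `eval` would.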

The number of input variables is inferred from the highest variable index in the formula, so m2.num_vars is set correctly without needing to know it in advance.
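
The inference rule can be sketched as follows. This is an illustration, not eqhunt's parser: it assumes the letters x y z w v u map to zero-based indices 0 to 5 and that `xN` means index N, which matches the variable sequence listed above. `infer_num_vars` is a hypothetical helper.

```python
import re

LETTERS = ["x", "y", "z", "w", "v", "u"]   # assumed to be indices 0..5
RESERVED = {"sqrt", "sin", "cos", "tan", "log", "exp", "if", "pi"}


def infer_num_vars(formula):
    """Return highest variable index + 1, or 0 if the expression has no variables."""
    expr = formula.split("=", 1)[-1]       # ignore an optional "f(x,y) =" prefix
    highest = -1
    for name in re.findall(r"\b[A-Za-z_]\w*\b", expr):
        if name in RESERVED:
            continue
        if name in LETTERS:
            index = LETTERS.index(name)
        elif re.fullmatch(r"x\d+", name):
            index = int(name[1:])
        else:
            raise ValueError(f"unknown identifier: {name!r}")
        highest = max(highest, index)
    return highest + 1


print(infer_num_vars("f(x,y) = ((x*x) - (y*y))"))   # 2
print(infer_num_vars("sin(x6) + 1e5"))              # 7
```

Note that a formula using only `x6` still reports seven inputs under this rule, because the count comes from the highest index, not from how many distinct variables appear.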

Config reference

Field	Default	Meaning
`pop`	400	Population size
`gen`	15000	Max generations
`tournament_size`	4	Tournament selection pool
`crossover_prob`	0.7	Crossover probability per pair
`mutation_prob`	0.25	Mutation probability per offspring
`initial_depth`	4	Depth used to seed the initial population
`mutation_depth`	3	Depth for mutation-generated subtrees
`const_min/max`	-9, 9	Range for random numeric terminals
`pi_prob`	0.01	Probability a terminal is `pi`
`bloat_penalty`	0.1	Per-node penalty (favours smaller trees)
`trig_penalty`	0.5	Extra penalty per `sin/cos/tan/log/exp` node
`immigrant_rate`	0.05	Fraction of the population replaced by random individuals each generation
`weak_parent_rate`	0.2	Probability that the second parent is random (not tournament-selected)
`accepted_error`	0.5	Stop training once total error < this value
`verbose`	False*	Print the best-so-far formula on each improvement
`simplify`	True	Run algebraic simplification on the final tree
`simplify_interval`	500	Periodically simplify top-N members during training
`simplify_top_n`	10	How many to simplify periodically

*C++ default is True; the Python fit() helper defaults to False.
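
To show how the penalty and selection fields might interact, here is a minimal sketch. The additive scoring formula is an assumption (the README only says the penalties are per node), the candidate numbers are made up, and `penalized_score` and `tournament` are hypothetical helpers, not eqhunt functions.

```python
import random


def penalized_score(error, n_nodes, n_transcendental, bloat_penalty=0.1, trig_penalty=0.5):
    """Raw error plus size and sin/cos/tan/log/exp penalties (lower is better)."""
    return error + bloat_penalty * n_nodes + trig_penalty * n_transcendental


def tournament(population, scores, size, rng):
    """Pick `size` random candidates and return the one with the lowest score."""
    contenders = rng.sample(range(len(population)), size)
    return population[min(contenders, key=scores.__getitem__)]


# (formula, raw error, node count, sin/cos/tan/log/exp node count): made-up candidates
candidates = [("(x+y)", 0.0, 3, 0), ("(x+sin(y))", 4.0, 4, 1), ("((x+y)+(x-x))", 0.0, 7, 0)]
population = [c[0] for c in candidates]
scores = [penalized_score(*c[1:]) for c in candidates]
rng = random.Random(1)
winner = tournament(population, scores, size=3, rng=rng)   # 3 of 3 drawn: best always wins
print(scores, winner)
```

The first and third candidates fit the data equally well, but the bloat penalty ranks the shorter tree first. A smaller tournament size gives weaker candidates more chances to be selected.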

Building from source

git clone https://github.com/sha0coder/eqhunt
cd eqhunt
pip install -e .
pytest

Requires Python 3.8+, a C++17 compiler, and CMake 3.15+.

License

MIT.

Release history

Version	Released
0.0.5	May 19, 2026 (this version)
0.0.4	May 19, 2026
0.0.3	May 19, 2026
0.0.2	May 19, 2026
0.0.1	May 19, 2026

Download files

Source distribution: eqhunt-0.0.4.tar.gz (19.6 kB)

File metadata

Upload date: May 19, 2026
Size: 19.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

Hashes for eqhunt-0.0.4.tar.gz

Algorithm	Hash digest
SHA256	`34fe69a0e1b98eee05d8309e15d93081977a0240f99f2f3d734c04918d46aa4b`
MD5	`71d98a0376a481bec71e82958ea5887f`
BLAKE2b-256	`ace6f96802ce29537aa5db14b29d3018bf9629a15fdafb0cd2164c3041d586c8`
