Symbolic regression by genetic programming (C++ engine, Python bindings)
eqhunt
pip install eqhunt
Symbolic regression by genetic programming. C++ engine, Python bindings via nanobind.
Give it a table of (inputs, target) pairs; it returns a human-readable formula
that approximates the relationship. No neural network, no black box — just an
algebraic expression you can read, paste into a calculator, or hand-tune.
import eqhunt
X = [[1, 1], [2, 3], [4, 5], [7, 2], [9, 9]]
y = [2, 5, 9, 9, 18]
model = eqhunt.fit(X, y)
print(model.formula) # e.g. f(x,y) = (x+y)
print(model.predict([6, 7])) # -> 13.0
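The result can be checked by hand. The sketch below does not call eqhunt at all: `candidate` is a hypothetical stand-in for the formula `(x+y)`, scored with the sum of absolute per-sample errors.

```python
# Sample data from the example above.
X = [[1, 1], [2, 3], [4, 5], [7, 2], [9, 9]]
y = [2, 5, 9, 9, 18]

def candidate(row):
    """Hypothetical stand-in for the formula f(x,y) = (x+y)."""
    a, b = row
    return a + b

# Sum of absolute per-sample errors over the training table.
total_error = sum(abs(candidate(row) - target) for row, target in zip(X, y))
print(total_error)        # 0
print(candidate([6, 7]))  # 13
```

A total error of 0 means the candidate reproduces every target in the table exactly.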
Install
pip install eqhunt
Prebuilt wheels are published for Linux, macOS and Windows on common Python versions. If pip falls back to building from source you'll need a C++17 compiler.
Two ways to use it
Ultra-simple
import eqhunt
model = eqhunt.fit(X, y, generations=5000)
print(model.formula)
model.predict([1, 2]) # single row
model.predict([[1, 2], [3, 4]]) # batch
fit() accepts any Config field as a keyword argument:
eqhunt.fit(X, y, pop=800, trig_penalty=2.0, bloat_penalty=0.3)
Fully configurable
import eqhunt
cfg = eqhunt.Config()
cfg.pop = 800
cfg.gen = 50000
cfg.tournament_size = 5
cfg.initial_depth = 5
cfg.bloat_penalty = 0.3
cfg.trig_penalty = 1.5
cfg.accepted_error = 0.01
# Re-weight individual operators (higher = more likely to appear)
cfg.op_weights.sin = 1.0 # boost sine
cfg.op_weights.cos = 1.0
cfg.op_weights.exp = 0.0 # disable exp entirely
cfg.pi_prob = 0.10 # 'pi' more frequent in terminals
model = eqhunt.Model(cfg).fit(X, y)
print(model.formula)
You can also train from a CSV file (one row per sample, last column = target,
lines starting with # are comments):
eqhunt.Model().fit_csv("nivel_embase.csv")
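To make the file layout concrete, here is a minimal sketch of a reader for that format, using only the standard library. It is not eqhunt's own parser: `load_rows` is a hypothetical helper, and the comma delimiter and the skipping of blank lines are assumptions.

```python
import csv
import io

def load_rows(text):
    """Parse CSV text: skip '#' comment lines and blank lines; last column is the target."""
    X, y = [], []
    lines = (
        ln for ln in io.StringIO(text)
        if ln.strip() and not ln.lstrip().startswith("#")
    )
    for fields in csv.reader(lines):  # assumes comma-separated values
        values = [float(v) for v in fields]
        X.append(values[:-1])   # every column but the last is an input
        y.append(values[-1])    # the last column is the target
    return X, y

sample = """# x, y, target
1,1,2
2,3,5
4,5,9
"""
X, y = load_rows(sample)
print(X)  # [[1.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
print(y)  # [2.0, 5.0, 9.0]
```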
Operators available
| Category | Operators |
|---|---|
| Arithmetic | + - * / -x |
| Powers | sqrt ** |
| Conditional | if(cond, then, else), taking the then branch when cond > 0 |
| Trig | sin cos tan |
| Exp / log | exp log |
| Constants | numeric literals, pi |
Trigonometric, log and exp nodes have low default weights, so they appear
only after enough mutation pressure. That helps on cyclic / physical data
and keeps them out of the way otherwise. Adjust via Config.op_weights.
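The effect of an operator weight can be sketched as a weighted random draw. This is an illustration only, not eqhunt's internal code: the weights below are hypothetical values, and `pick_operator` is an assumed helper name.

```python
import random

# Hypothetical weights for illustration; eqhunt's real defaults are not listed here.
op_weights = {"+": 1.0, "-": 1.0, "*": 1.0, "/": 1.0, "exp": 0.0, "sin": 0.1, "cos": 0.1}

def pick_operator(weights, rng):
    """Draw one operator with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

rng = random.Random(0)
draws = [pick_operator(op_weights, rng) for _ in range(10_000)]
print("exp" in draws)                         # False: weight 0.0 is never drawn
print(draws.count("sin") < draws.count("+"))  # True: low weight, rare operator
```

Under this scheme a weight of 0.0 disables an operator, and raising a weight makes that operator proportionally more common in new subtrees.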
How error and validity are handled
- Per-sample error is |prediction - target|; total error is the sum.
- Invalid evaluations (/0, sqrt(<0), log(<=0), exp(huge)) get a soft per-sample penalty rather than killing the whole formula, so a single out-of-domain sample no longer disqualifies an otherwise good candidate. If more than 25% of samples fail, the formula is rejected.
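The two rules above can be sketched in plain Python. This is a rough model, not the C++ engine's code: `INVALID_PENALTY` is an assumed value (the real penalty is not documented here), and returning `None` for a rejected formula is a convention chosen for this sketch.

```python
import math

INVALID_PENALTY = 100.0      # assumed value; the real penalty is not documented here
MAX_INVALID_FRACTION = 0.25  # more than 25% invalid samples rejects the formula

def total_error(formula, X, y):
    """Sum |prediction - target|; penalize invalid samples; None means rejected."""
    error, invalid = 0.0, 0
    for row, target in zip(X, y):
        try:
            pred = formula(*row)
            if not math.isfinite(pred):
                raise ValueError("non-finite result")
            error += abs(pred - target)
        except (ZeroDivisionError, ValueError, OverflowError):
            invalid += 1
            error += INVALID_PENALTY  # soft penalty instead of outright rejection
    if invalid > MAX_INVALID_FRACTION * len(y):
        return None
    return error

X = [[1], [4], [9], [16], [-1]]
y = [1, 2, 3, 4, 0]
print(total_error(lambda x: math.sqrt(x), X, y))     # 100.0 (1 of 5 samples invalid)
print(total_error(lambda x: math.log(x - 5), X, y))  # None (3 of 5 samples invalid)
```

Here `sqrt(x)` fits four samples exactly and pays one penalty for the negative input, so it stays in the running, while `log(x - 5)` fails on 60% of the samples and is rejected.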
Stopping early
Config.accepted_error stops the search as soon as total error drops below
the threshold. You can also call model.stop() from another thread (or a
signal handler) to ask the loop to wrap up after the current generation.
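The "wrap up after the current generation" behaviour is a cooperative stop: the loop checks a flag between generations rather than being interrupted mid-generation. A minimal sketch of that pattern, assuming a `threading.Event` as the flag (the names `search` and `step` are hypothetical, and this is not eqhunt's implementation):

```python
import threading

def search(max_generations, stop_event, step):
    """Run `step` once per generation; check the stop flag between generations."""
    completed = 0
    for _ in range(max_generations):
        if stop_event.is_set():
            break        # wrap up: no new generation is started
        step()           # a generation always runs to completion
        completed += 1
    return completed

stop_event = threading.Event()
counter = {"n": 0}

def step():
    counter["n"] += 1
    if counter["n"] == 3:
        stop_event.set()  # stands in for model.stop() being called from elsewhere

print(search(1000, stop_event, step))  # 3
```

The flag is set from inside `step` only to keep the sketch deterministic; `threading.Event` is thread-safe, so the same `set()` call could come from another thread.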
Config reference
| Field | Default | Meaning |
|---|---|---|
| pop | 400 | Population size |
| gen | 15000 | Max generations |
| tournament_size | 4 | Tournament selection pool |
| crossover_prob | 0.7 | Crossover probability per pair |
| mutation_prob | 0.25 | Mutation probability per offspring |
| initial_depth | 4 | Depth used to seed the initial population |
| mutation_depth | 3 | Depth for mutation-generated subtrees |
| const_min/max | -9, 9 | Range for random numeric terminals |
| pi_prob | 0.01 | Probability a terminal is pi |
| bloat_penalty | 0.1 | Per-node penalty (favours smaller trees) |
| trig_penalty | 0.5 | Extra penalty per sin/cos/tan/log/exp node |
| immigrant_rate | 0.05 | Fraction of population replaced by random each gen |
| weak_parent_rate | 0.2 | Prob. 2nd parent is random (not tournament) |
| accepted_error | 0.5 | Stop training once total error < this value |
| verbose | False* | Print best-so-far per improvement |
| simplify | True | Run algebraic simplification on the final tree |
| simplify_interval | 500 | Periodically simplify top-N members during training |
| simplify_top_n | 10 | How many to simplify periodically |
*C++ default is True; the Python fit() helper defaults to False.
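As an illustration of how `tournament_size` and `weak_parent_rate` interact, here is a sketch of textbook tournament selection with an occasional random second parent. It follows the descriptions in the table, not the engine's source; `tournament` and `pick_parents` are hypothetical names, and the population is represented as plain strings for readability.

```python
import random

def tournament(population, errors, size, rng):
    """Pick `size` random contestants and return the one with the lowest error."""
    contestants = rng.sample(range(len(population)), size)
    return population[min(contestants, key=lambda i: errors[i])]

def pick_parents(population, errors, rng, tournament_size=4, weak_parent_rate=0.2):
    """First parent by tournament; second is purely random with prob. weak_parent_rate."""
    first = tournament(population, errors, tournament_size, rng)
    if rng.random() < weak_parent_rate:
        second = rng.choice(population)  # "weak" parent: no selection pressure
    else:
        second = tournament(population, errors, tournament_size, rng)
    return first, second

population = ["x+y", "x*y", "x-y", "sin(x)", "x/y", "x+1"]
errors = [0.0, 40.0, 12.0, 30.0, 25.0, 8.0]
rng = random.Random(1)
print(pick_parents(population, errors, rng))
```

A larger tournament raises selection pressure because the best of more contestants wins. The random second parent pulls the other way and keeps some diversity in the gene pool.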
Building from source
git clone https://github.com/uid0/eqhunt
cd eqhunt
pip install -e .
pytest
Requires Python 3.8+, a C++17 compiler, CMake 3.15+.
License
MIT.
Download files
Source Distribution
File details
Details for the file eqhunt-0.0.3.tar.gz.
File metadata
- Download URL: eqhunt-0.0.3.tar.gz
- Upload date:
- Size: 19.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 767907f65a5982e5d3bc5bfb0f2dbcc72e3d3495e3ce8c63840ce448eceb6d19 |
| MD5 | 625467da794cbbeb6ca7914a3adff8da |
| BLAKE2b-256 | b1e872a5bb183a94643983c8d71412c84cdd22b12a0f299375147490e2cbda9e |