Symbolic Regression/Equation Discovery Toolkit

SRToolkit: Symbolic Regression / Equation Discovery Benchmark Toolkit

Documentation: https://smeznar.github.io/SymbolicRegressionToolkit

What is SRToolkit?

SRToolkit is a comprehensive Python toolkit designed to accelerate research and development in Symbolic Regression (SR) and Equation Discovery (ED). It provides a robust, easy-to-use framework for benchmarking, rapid prototyping, and mathematical expression manipulation.

Core Features

SRToolkit provides a straightforward interface for:

  • Implementing new Symbolic Regression approaches and evaluating their performance against other approaches.

  • Benchmarking Symbolic Regression approaches using built-in benchmarks (currently Feynman and Nguyen) or custom data.

  • Experiment organization and postprocessing of results.

  • Converting expressions into expression trees or fast, callable NumPy functions.

  • Generating random expressions by defining the symbol space or a grammar.

  • Estimating constant parameters of expressions against real-world data.

  • Comparing and measuring the distance between expressions.
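Several of these utilities, notably constant estimation, reduce to standard numerical fitting. As a conceptual illustration only (plain NumPy, not the toolkit's own API): for an expression such as X_0 + X_1 * C, which is linear in its constant, the best-fitting C against observed targets is an ordinary least-squares problem with a closed-form solution.

```python
import numpy as np

def model(X, C):
    # Same calling shape as the callables produced by
    # expr_to_executable_function for ["X_0", "+", "X_1", "*", "C"].
    return X[:, 0] + X[:, 1] * C[0]

# Synthetic observations generated with the true constant C = 3.
X = np.array([[1.0, 2.0], [2.0, 5.0], [3.0, 1.0]])
y = model(X, [3.0])

# Minimize ||(y - X_0) - C * X_1||^2, which gives the closed form
# C = <X_1, y - X_0> / <X_1, X_1>.
residual = y - X[:, 0]
C_hat = residual @ X[:, 1] / (X[:, 1] @ X[:, 1])
print(C_hat)  # 3.0
```

In general (non-linear constants, multiple constants), an iterative optimizer would replace the closed form, which is the kind of work the toolkit's estimation utilities handle.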

Installation

To install the latest stable release of the package, run the following command in your terminal:

pip install symbolic-regression-toolkit

Alternatively, and recommended, you can install the latest development build directly from the repository:

pip install git+https://github.com/smeznar/SymbolicRegressionToolkit

Examples

1. Expression Manipulation (The Toolkit Core)

SRToolkit offers fundamental utilities for working with mathematical expressions as tokens, trees, and executable code—the building blocks for any SR approach.

import numpy as np
from SRToolkit.utils import expr_to_executable_function, tokens_to_tree, SymbolLibrary, expr_to_latex

# Create an executable function from the expression
expr = expr_to_executable_function(["X_0", "+", "X_1", "*", "C"])

# Calculate the output at two points (1, 2) and (2, 5) with C=3
data_points = np.array([[1, 2], [2, 5]])
constants = [3]
output = expr(data_points, constants)
# Variable "output" should now contain np.array([7, 17])

# Create a SymbolLibrary defining the symbol space for 2 variables
sl = SymbolLibrary.default_symbols(num_variables=2)

# Create an expression tree from the token list
expr_tree = tokens_to_tree(["X_0", "+", "X_1", "*", "C"], sl)

# Transform the expression into a list of symbols in postfix notation
postfix_expr = expr_tree.to_list(notation="postfix")

# Create a LaTeX string of the expression for clear presentation
expr_latex = expr_to_latex(expr_tree, sl)
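The compiled function and the postfix listing above can be understood with a few lines of plain Python, independent of the toolkit. The sketch below assumes the conventional postfix order ["X_0", "X_1", "C", "*", "+"] for X_0 + X_1 * C (the exact output of to_list may differ) and evaluates it with a small stack machine:

```python
import numpy as np

def eval_postfix(tokens, X, constants):
    """Stack-based evaluation of a postfix token list.

    "X_i" tokens index columns of X; each "C" consumes the next constant.
    """
    stack, consts = [], iter(constants)
    ops = {"+": np.add, "-": np.subtract, "*": np.multiply, "/": np.divide}
    for tok in tokens:
        if tok in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[tok](a, b))
        elif tok == "C":
            stack.append(next(consts))
        else:  # variable token such as "X_0"
            stack.append(X[:, int(tok.split("_")[1])])
    return stack.pop()

X = np.array([[1, 2], [2, 5]])
out = eval_postfix(["X_0", "X_1", "C", "*", "+"], X, [3])
print(out)  # [ 7 17], matching the executable-function example above
```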

2. Benchmarking and Evaluation (The Main Use Case)

The primary advantage of SRToolkit is its robust benchmarking framework, allowing you to quickly evaluate and compare different Symbolic Regression approaches.

from SRToolkit.approaches import EDHiE, ProGED
from SRToolkit.dataset import Feynman
from SRToolkit.evaluation import LoggingCallback
from SRToolkit.experiments import ExperimentGrid

# Load the Feynman benchmark and pick two 2-variable datasets to run on.
bm = Feynman()
ds_names = bm.list_datasets(num_variables=2, verbose=False)
dataset1 = bm.create_dataset(ds_names[0])
dataset2 = bm.create_dataset(ds_names[1])

# Define the SR approaches to benchmark.
# EDHiE requires a pre-trained/adapted model state; ProGED needs no time-consuming adaptation.
edhie = EDHiE()
proged = ProGED()

# Map each (approach, dataset) pair to a file where the adapted model state will
# be saved. Both datasets reuse the same file here because they share the same
# number of variables, so one adapted state covers both.
adapted_states = {edhie.name: {ds_names[0]: "adapted_state_2_vars.pt", ds_names[1]: "adapted_state_2_vars.pt"}}

# Build the experiment grid: every combination of approach × dataset will be run
# num_experiments times (with different random seeds). Results are written under
# results_dir, specifically to "results_dir/{dataset}/{approach}/exp_{seed}.json".
eg = ExperimentGrid(
    approaches=[proged, edhie],
    datasets=[dataset1, dataset2],
    num_experiments=2,
    results_dir="../results/",
    adapted_states=adapted_states,
)

# Write a shell script of CLI commands that can be executed in parallel, e.g.:
#   cat commands.sh | parallel -j 4
eg.save_commands("commands.sh")

# Run adaptation for any approach/dataset pair whose adapted state file is missing.
# This is a no-op if all state files already exist.
eg.adapt_if_missing()

# Collect all pending jobs (skip any whose result file already exists on disk).
jobs = eg.create_jobs(skip_completed=True)

# Run each job sequentially in this process. To parallelize, use the generated
# commands.sh instead.
for job in jobs:
    job.run()

# See how many jobs are completed 
eg.progress()

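After the jobs finish, the per-run JSON files can be gathered from the directory layout noted above. The sketch below is a hypothetical aggregation helper, assuming only the path pattern "results_dir/{dataset}/{approach}/exp_{seed}.json"; the dataset name "I.6.2" and the "error" field are placeholders, not the toolkit's actual schema:

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def collect_results(results_dir):
    """Gather every exp_*.json under results_dir/{dataset}/{approach}/."""
    results = []
    for path in sorted(Path(results_dir).glob("*/*/exp_*.json")):
        record = json.loads(path.read_text())
        # Recover dataset/approach names from the directory structure.
        record["dataset"] = path.parts[-3]
        record["approach"] = path.parts[-2]
        results.append(record)
    return results

# Demonstrate on a throwaway directory mimicking the layout.
with TemporaryDirectory() as tmp:
    run = Path(tmp, "I.6.2", "ProGED")
    run.mkdir(parents=True)
    (run / "exp_0.json").write_text(json.dumps({"error": 0.01}))
    collected = collect_results(tmp)
    print(len(collected), collected[0]["approach"])  # 1 ProGED
```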
Additional examples can be found in the examples folder or in the official documentation.

Roadmap 🗺️

In future releases, our primary focus will be:

  • Expanded Library of Approaches: Add more Symbolic Regression approaches to the toolkit.

  • Result Visualization: Implement a robust visualization and result aggregation framework for SR results.

  • Simplification: Implement a better (more accurate, efficient, and stable) simplification system for expressions.

  • Constraints: Implement more robust expression generation constraints using techniques like attribute grammars.

  • Improved Benchmarking: Improve the robustness and efficiency of the benchmarking framework (ongoing).

  • Advanced Expressions (Distant Plan): Implement support for different types of expressions, such as ODEs and PDEs.

Contributing 🤝

We welcome contributions! Whether you're adding a new benchmark, implementing an SR approach, fixing a bug, or improving the documentation, please feel free to open an issue on the GitHub page or submit a Pull Request (PR) with a clear description of your changes.

We are especially looking for contributions of:

  • New Benchmarks and Datasets.

  • Implementations of additional Symbolic Regression Approaches (once the core framework for comparison is finalized).

Instructions on how to contribute can be found in the Contribution Guide.
