Automatic parametric modeling with symbolic regression
An API to automate parametric modeling with symbolic regression, originally developed for data analysis in the experimental high-energy physics community, but also applicable beyond.
Symbolfit takes binned data with measurement/systematic uncertainties as input, uses PySR to run a machine search for batches of functional forms that model the data, parameterizes these functions, and uses LMFIT to re-optimize them and estimate uncertainties, all in one go. It is designed to maximize automation with minimal human input. Each run produces a batch of functions with uncertainty estimates, which are evaluated, saved, and plotted automatically into readable output files, ready for downstream tasks.
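To make the "parameterizes these functions" step concrete, here is a minimal sketch of the general idea: numeric constants in a candidate expression are swapped for named parameters, whose original values can then seed a re-fit. The helper `parameterize`, the `a1, a2, ...` naming, and the choice to leave integer literals untouched are assumptions for illustration only, not Symbolfit's actual implementation.

```python
import re

# Matches unsigned decimal literals such as 0.5, 12.0 or 3.1e-4.
# Integer literals (e.g. the 2 in x**2) are deliberately left alone.
_FLOAT = re.compile(r"(?<![\w.])\d+\.\d*(?:[eE][+-]?\d+)?")


def parameterize(expr):
    """Replace each float literal in expr with a named parameter a1, a2, ...

    Returns the parameterized expression and a dict mapping each
    parameter name to the constant it replaced (hypothetical helper).
    """
    values = {}

    def _substitute(match):
        name = f"a{len(values) + 1}"
        values[name] = float(match.group(0))
        return name

    return _FLOAT.sub(_substitute, expr), values


template, initial_values = parameterize("0.5*exp(-2.3*x) + 1.7")
print(template)        # a1*exp(-a2*x) + a3
print(initial_values)  # {'a1': 0.5, 'a2': 2.3, 'a3': 1.7}
```

The returned values would be natural starting points for a minimizer, which can then vary the parameters and report their uncertainties.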
Installation
Prerequisite
Install Julia (backend for PySR)
curl -fsSL https://install.julialang.org | sh
Then check that it is installed properly:
julia --version
Installation via PyPI
With Python>=3.9
pip install symbolfit
Upon first installation, run
python3 -m pysr install
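The installation steps above can also be sanity-checked from Python. The sketch below only tests whether the `julia` executable is on `PATH` and whether the `pysr` and `symbolfit` packages can be found; it does not verify that the Julia backend itself is functional. The helper name `check_environment` is hypothetical.

```python
import importlib.util
import shutil


def check_environment(executables=("julia",), modules=("pysr", "symbolfit")):
    """Report which executables are on PATH and which top-level modules are importable."""
    report = {}
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        report[exe] = shutil.which(exe) is not None
    for mod in modules:
        # find_spec returns None when a top-level module is not installed.
        report[mod] = importlib.util.find_spec(mod) is not None
    return report


for name, found in check_environment().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```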
Getting Started
To run an example fit (equivalently, run python fit_example.py):
import importlib
from symbolfit.symbolfit import *
# Load the example dataset and the PySR configuration as modules.
dataset = importlib.import_module('examples.datasets.toy_dataset_1.dataset')
pysr_config = importlib.import_module('examples.pysr_configs.pysr_config_1')
model = SymbolFit(
    x = dataset.x,
    y = dataset.y,
    y_up = dataset.y_up,
    y_down = dataset.y_down,
    pysr_config = pysr_config,
    max_complexity = 60,
    input_rescale = True,
    scale_y_by = 'mean',
    max_stderr = 40,
    fit_y_unc = True,
    random_seed = None,
    loss_weights = None
)
model.fit()
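The example above reads `x`, `y`, `y_up`, `y_down` (and later `bin_widths_1d`) from a dataset module. As a rough sketch of what such binned inputs could look like, the snippet below builds a toy falling spectrum with symmetric uncertainties. The function `make_toy_dataset`, the exponential shape, the sqrt(N) uncertainties, and the use of plain Python lists are all assumptions for illustration; consult the documentation for the exact input format Symbolfit expects.

```python
import math


def make_toy_dataset(n_bins=10, x_min=0.0, x_max=5.0, norm=100.0):
    """Build a falling toy spectrum with symmetric sqrt(N) uncertainties (illustrative only)."""
    width = (x_max - x_min) / n_bins
    edges = [x_min + i * width for i in range(n_bins + 1)]
    # Bin centers serve as the independent variable.
    x = [0.5 * (lo + hi) for lo, hi in zip(edges[:-1], edges[1:])]
    # Bin contents follow an exponential fall-off.
    y = [norm * math.exp(-xi) for xi in x]
    # Upward and downward uncertainties are equal here, but could differ.
    y_up = [math.sqrt(yi) for yi in y]
    y_down = list(y_up)
    bin_widths_1d = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    return {
        "x": x,
        "y": y,
        "y_up": y_up,
        "y_down": y_down,
        "bin_widths_1d": bin_widths_1d,
    }


toy = make_toy_dataset()
for xi, yi, up, down in zip(toy["x"], toy["y"], toy["y_up"], toy["y_down"]):
    print(f"x={xi:.2f}  y={yi:.2f}  +{up:.2f}  -{down:.2f}")
```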
After the fit, save results to csv:
model.save_to_csv(output_dir = 'output_dir/')
and plot results to pdf:
model.plot_to_pdf(
    output_dir = 'output_dir/',
    bin_widths_1d = dataset.bin_widths_1d,
    #bin_edges_2d = dataset.bin_edges_2d,
    plot_logy = False,
    plot_logx = False
)
Candidate functions with full substitutions can be printed to the terminal:
model.print_candidate(candidate_number = 10)
Each run will produce a batch of candidate functions and will automatically save all results to five output files:
- candidates.csv: saves all candidate functions and evaluations in a csv table.
- candidates_reduced.csv: saves a reduced version for essential information without intermediate results.
- candidates.pdf: plots all candidate functions with associated uncertainties one by one for fit quality evaluation.
- candidates_gof.pdf: plots the goodness-of-fit scores.
- candidates_correlation.pdf: plots the correlation matrices for the parameters of each candidate function.
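For downstream tasks, the csv outputs can be read back with the standard library. The sketch below makes no assumption about the column names in the tables; it simply loads whatever header the file contains. The helper `load_candidates` is hypothetical, and the path assumes the `output_dir/` used in the example above.

```python
import csv
from pathlib import Path


def load_candidates(csv_path):
    """Return the rows of a results table as a list of dicts keyed by column name."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


path = Path("output_dir") / "candidates_reduced.csv"
if path.exists():  # only present after model.save_to_csv(...) has run
    rows = load_candidates(path)
    columns = list(rows[0]) if rows else []
    print(f"{len(rows)} candidates, columns: {columns}")
else:
    print(f"{path} not found; run the fit and save_to_csv first")
```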
Documentation
The documentation can be found here for more info and demonstrations.
Citation
If you find this useful in your research, please consider citing Symbolfit:
Coming soon!
and PySR:
@misc{cranmerInterpretableMachineLearning2023,
title = {Interpretable {Machine} {Learning} for {Science} with {PySR} and {SymbolicRegression}.jl},
url = {http://arxiv.org/abs/2305.01582},
doi = {10.48550/arXiv.2305.01582},
urldate = {2023-07-17},
publisher = {arXiv},
author = {Cranmer, Miles},
month = may,
year = {2023},
note = {arXiv:2305.01582 [astro-ph, physics:physics]},
keywords = {Astrophysics - Instrumentation and Methods for Astrophysics, Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing, Computer Science - Symbolic Computation, Physics - Data Analysis, Statistics and Probability},
}