
Convert volatile trained machine-learning algorithms to preservable formats.

Project description

"They took all the trees, and put em in a tree museum... And they charged the people a dollar and a half to see them" — Joni Mitchell, "Big Yellow Taxi"

Boosted decision trees (BDTs) are widely used in high-energy physics (HEP), particularly in data analyses, where they make complex, multivariate nested cuts to separate signal events from background ones.

While powerful, the complexity of their training makes BDT (and therefore analysis) preservation troublesome: BDTs get stored in different formats, which may not be forwards-compatible with future versions of their framework libraries. So now we start talking about dragging around Docker containers just to make sure the right version of the right framework is used. Plus those libraries have to be included in any user code, adding unwelcome dependencies and complexity, and perhaps even being incompatible with the target language (e.g. applying a BDT from a Python framework in a C++ application).

This is ridiculous, because BDTs are actually absurdly simple objects. The framework complexity is needed for training, but not for execution. This package provides a set of utilities for converting sklearn and TMVA boosted decision trees, for either classification or regression, from their custom formats to vanilla C++ and Python code. The generated code has no dependencies and can be used indefinitely without risk of breaking changes to formats or frameworks. Being static code, it can also execute more quickly and with less memory overhead than the original form. Support for lightweightNNs, TMVA multilayer perceptrons, and MVAUtils lgbm and xgboost has recently been added.
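To illustrate why a trained BDT needs no framework at execution time, here is a minimal hand-written sketch of what dependency-free static code for a tiny classifier could look like. It is not the actual output of this package: the two trees, their thresholds and leaf values, the function names, and the choice of a logistic function to turn the summed score into a probability are all assumptions made for illustration.

```python
import math

# Hypothetical two-tree BDT. Feature indices, thresholds and leaf values
# are invented for illustration, not taken from any trained model.

def tree_0(x):
    # Each tree is just a set of nested cuts ending in a constant leaf value.
    if x[0] < 0.5:
        if x[1] < 1.2:
            return -0.8
        return 0.1
    return 0.6

def tree_1(x):
    if x[1] < 2.0:
        return -0.3
    if x[0] < 0.9:
        return 0.2
    return 0.7

def bdt_score(x):
    """Sum the leaf values of all trees and map the sum into (0, 1)."""
    raw = tree_0(x) + tree_1(x)
    # Assumed link function: a logistic sigmoid on the summed score.
    return 1.0 / (1.0 + math.exp(-raw))

score = bdt_score([0.2, 1.0])
print(score)
```

Nothing here beyond the standard `math` module is imported, which is the point: evaluating the model is a handful of comparisons and one sum.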

In summary, this package contains several scripts written to convert BDTs and Neural Nets from various formats common in HEP to long-lived formats (either plain-text code or ONNX files). The individual scripts are described in a detailed readme.
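The idea of converting a stored tree into plain-text code can be sketched in a few lines. The example below is an assumption-laden toy, not one of the package's scripts: the tuple-based tree layout, the `emit` and `tree_to_source` helpers, and the example tree are all hypothetical, and real framework formats carry more information than this.

```python
# Hypothetical tree layout: an internal node is a tuple
# (feature_index, threshold, left_subtree, right_subtree); a leaf is a float.
# This is NOT the storage format of sklearn, TMVA, or any other framework.
TREE = (0, 0.5, (1, 1.2, -0.8, 0.1), 0.6)

def emit(node, indent=1):
    """Return Python source lines implementing `node` as nested if/else."""
    pad = "    " * indent
    if not isinstance(node, tuple):
        # Leaf: return its constant value.
        return [f"{pad}return {node!r}"]
    feature, threshold, left, right = node
    lines = [f"{pad}if x[{feature}] < {threshold!r}:"]
    lines += emit(left, indent + 1)
    lines += [f"{pad}else:"]
    lines += emit(right, indent + 1)
    return lines

def tree_to_source(tree, name="tree"):
    """Wrap the emitted body in a function definition."""
    return "\n".join([f"def {name}(x):"] + emit(tree)) + "\n"

source = tree_to_source(TREE)
print(source)

# The generated text is ordinary Python, so it can be saved to a file
# or, as here, executed directly to check that it behaves as expected.
namespace = {}
exec(source, namespace)
print(namespace["tree"]([0.2, 1.0]))
```

Once the source text has been written out, the original model file and the library that produced it are no longer needed to evaluate the tree.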

Download files


Source Distribution

petrifyml-2.0.0.tar.gz (43.6 kB view details)

Uploaded Source

Built Distribution


petrifyml-2.0.0-py3-none-any.whl (45.2 kB view details)

Uploaded Python 3

File details

Details for the file petrifyml-2.0.0.tar.gz.

File metadata

  • Download URL: petrifyml-2.0.0.tar.gz
  • Upload date:
  • Size: 43.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for petrifyml-2.0.0.tar.gz
Algorithm Hash digest
SHA256 94b2df4103de9af307d65558fb43eb9a531018cdef2ccb64d496243475c24e18
MD5 1e9f4940e67bd60856bb6f0770e2538a
BLAKE2b-256 321b7845f296bee9187f6f95d6ea94aabc401ede8e3ef767e8c0f53f3571e041


File details

Details for the file petrifyml-2.0.0-py3-none-any.whl.

File metadata

  • Download URL: petrifyml-2.0.0-py3-none-any.whl
  • Upload date:
  • Size: 45.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for petrifyml-2.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 491626cb1ed7f860290b26b414e0a3a4e0725b4cdbf1d271c04e25f7b5233b7f
MD5 b3e5b53c2c49bcfc96b0838bf9d9bcba
BLAKE2b-256 8fe92fe1a074eb672aab11767a2ef47588f46a8fcfddf9783d59d891307a0c14

