A library to parse PMML models into Scikit-learn estimators.
sklearn-pmml-model
A library to effortlessly import models trained on different platforms and in different programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.
Installation
The easiest way is to use pip:
$ pip install sklearn-pmml-model
Status
The library currently supports the following models:
Model | Classification | Regression | Categorical features |
---|---|---|---|
Decision Trees | ✅ | ✅ | ✅1 |
Random Forests | ✅ | ✅ | ✅1 |
Gradient Boosting | ✅ | ✅ | ✅1 |
Linear Regression | ✅ | ✅ | ✅3 |
Ridge | ✅2 | ✅ | ✅3 |
Lasso | ✅2 | ✅ | ✅3 |
ElasticNet | ✅2 | ✅ | ✅3 |
Gaussian Naive Bayes | ✅ | | ✅3 |
Support Vector Machines | ✅ | ✅ | ✅3 |
Nearest Neighbors | ✅ | ✅ | |
Neural Networks | ✅ | ✅ | |
1 Categorical feature support using slightly modified internals, based on scikit-learn#12866.
2 These models differ only in training characteristics; the resulting model is of the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.
3 By one-hot encoding categorical features automatically.
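To make footnote 3 concrete, the following is a minimal sketch of what one-hot encoding does to a categorical column. It is written in plain Python for illustration only and is not the library's internal implementation; the `one_hot_encode` helper, the column names and the sample rows are all hypothetical.

```python
def one_hot_encode(rows, column, categories):
    """Replace one categorical column with one 0/1 column per category."""
    encoded = []
    for row in rows:
        # Copy every feature except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per known category.
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical data: one numeric feature and one categorical feature.
rows = [
    {"age": 31, "color": "red"},
    {"age": 45, "color": "blue"},
]
encoded = one_hot_encode(rows, column="color", categories=["blue", "red"])
for row in encoded:
    print(row)
```

Each category becomes its own numeric column, which is why models that only accept numeric input can still consume categorical features after this step.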
Example
A minimal working example (using this PMML file) is shown below:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier
from sklearn_pmml_model.auto_detect import auto_detect_estimator
# Prepare the data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)
# Specify the model type for the least overhead...
#clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
# ...or simply let the library auto-detect the model type
clf = auto_detect_estimator(pmml="models/randomForest.pmml")
# Use the model as any other scikit-learn model
clf.predict(Xte)
clf.score(Xte, yte)
More examples can be found in the following subpackages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.
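If you are unsure which estimator class fits a given file, it can help to look at the PMML document itself. The sketch below uses only the standard library to find the top-level model element of a PMML document. It is an illustration of the idea, not how `auto_detect_estimator` is implemented; the `find_model_element` helper and the truncated PMML snippet are hypothetical, while the element names come from the PMML standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily truncated PMML document (not a usable model).
PMML_SNIPPET = """
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/>
  <DataDictionary/>
  <TreeModel functionName="classification"/>
</PMML>
"""

# Model elements defined by the PMML standard that correspond to the
# model families listed in the Status table.
MODEL_TAGS = {
    "TreeModel",
    "MiningModel",
    "RegressionModel",
    "GeneralRegressionModel",
    "NaiveBayesModel",
    "SupportVectorMachineModel",
    "NearestNeighborModel",
    "NeuralNetwork",
}


def find_model_element(pmml_text):
    """Return (element name, functionName) of the first model element found."""
    root = ET.fromstring(pmml_text.strip())
    for child in root:
        tag = child.tag.split("}")[-1]  # drop the XML namespace prefix
        if tag in MODEL_TAGS:
            return tag, child.get("functionName")
    return None, None


model_tag, function_name = find_model_element(PMML_SNIPPET)
print(model_tag, function_name)
```

A `TreeModel` with `functionName="classification"` suggests a decision tree classifier, whereas ensembles such as random forests are typically stored as a `MiningModel`.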
Benchmark
Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, by leveraging the optimization and industry-tested robustness of sklearn. Source code for this benchmark can be found in the corresponding Jupyter notebook.
Running times (load + predict, in seconds)
Data set | Library | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | PyPMML | 0.773291 | 0.77384 | 0.777425 | 0.895204 | 0.902355 |
Wine | sklearn-pmml-model | 0.005813 | 0.006357 | 0.002693 | 0.108882 | 0.121823 |
Breast cancer | PyPMML | 3.849855 | 3.878448 | 3.83623 | 4.16358 | 4.13766 |
Breast cancer | sklearn-pmml-model | 0.015723 | 0.011278 | 0.002807 | 0.146234 | 0.044016 |
Improvement
Data set | | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | Improvement | 133× | 122× | 289× | 8× | 7× |
Breast cancer | Improvement | 245× | 344× | 1,367× | 28× | 94× |
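The improvement factors appear to be the PyPMML running time divided by the sklearn-pmml-model running time, rounded to the nearest integer. The short sketch below recomputes them from the running times reported above; the variable and function names are chosen for illustration only.

```python
MODELS = ["Linear model", "Naive Bayes", "Decision tree",
          "Random Forest", "Gradient boosting"]

# Running times (load + predict, in seconds) copied from the benchmark table.
timings = {
    "Wine": {
        "PyPMML": [0.773291, 0.77384, 0.777425, 0.895204, 0.902355],
        "sklearn-pmml-model": [0.005813, 0.006357, 0.002693, 0.108882, 0.121823],
    },
    "Breast cancer": {
        "PyPMML": [3.849855, 3.878448, 3.83623, 4.16358, 4.13766],
        "sklearn-pmml-model": [0.015723, 0.011278, 0.002807, 0.146234, 0.044016],
    },
}


def improvement(baseline, candidate):
    """Speed-up of candidate over baseline, rounded to the nearest integer."""
    return [round(b / c) for b, c in zip(baseline, candidate)]


factors = {
    dataset: improvement(times["PyPMML"], times["sklearn-pmml-model"])
    for dataset, times in timings.items()
}

for dataset, values in factors.items():
    for model, factor in zip(MODELS, values):
        print(f"{dataset:14s} {model:18s} {factor}x")
```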
Development
Prerequisites
Tests can be run using Py.test. Grab a local copy of the source:
$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model
Create a virtual environment and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
Then install the dependencies:
$ pip install -r requirements.txt
The final step is to build the Cython extensions:
$ python setup.py build_ext --inplace
Testing
You can execute tests with py.test by running:
$ python setup.py pytest
Contributing
Feel free to make a contribution. Please read CONTRIBUTING.md for more details.
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.