A library to parse PMML models into Scikit-learn estimators.
sklearn-pmml-model
A library to effortlessly import models trained on other platforms and in other programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.
Installation
The easiest way is to use pip:
$ pip install sklearn-pmml-model
Status
The library currently supports the following models:
Model | Classification | Regression | Categorical features |
---|---|---|---|
Decision Trees | ✅ | ✅ | ✅¹ |
Random Forests | ✅ | ✅ | ✅¹ |
Gradient Boosting | ✅ | ✅ | ✅¹ |
Linear Regression | ✅ | ✅ | ✅³ |
Ridge | ✅² | ✅ | ✅³ |
Lasso | ✅² | ✅ | ✅³ |
ElasticNet | ✅² | ✅ | ✅³ |
Gaussian Naive Bayes | ✅ | | ✅³ |
Support Vector Machines | ✅ | ✅ | ✅³ |
Nearest Neighbors | ✅ | ✅ | |
Neural Networks | ✅ | ✅ | |
¹ Categorical feature support uses slightly modified internals, based on scikit-learn#12866.
² These models differ only in training characteristics; the resulting model has the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.
³ Categorical features are one-hot encoded automatically.
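To make the third footnote concrete, here is a minimal sketch of what one-hot encoding does to a categorical column. It is a plain-Python illustration only; the helper name `one_hot` is hypothetical, and this is not the code the library runs internally.

```python
def one_hot(values):
    """Return (categories, rows) with one 0/1 indicator column per category.

    Illustrative helper only; not part of sklearn-pmml-model.
    """
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    # Each input value becomes a row with a single 1 in its category's column.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


columns, rows = one_hot(["red", "green", "red", "blue"])
print(columns)  # ['blue', 'green', 'red']
print(rows)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each category becomes its own numeric column, which is why models that only accept numeric input can still consume categorical features.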
Example
A minimal working example (using this PMML file) is shown below:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier
from sklearn_pmml_model.auto_detect import auto_detect_estimator
# Prepare the data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)
# Specify the model type for the least overhead...
# clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
# ...or simply let the library auto-detect the model type
clf = auto_detect_estimator(pmml="models/randomForest.pmml")
# Use the model as any other scikit-learn model
clf.predict(Xte)
clf.score(Xte, yte)
More examples can be found in the following subpackages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.
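For intuition about what detecting the model type involves, the sketch below inspects a PMML document with the standard library and reports which model element it contains. The PMML fragment is hand-written and heavily abbreviated, and `describe_pmml` is a hypothetical helper; this is an assumption about how such detection could work, not the implementation behind `auto_detect_estimator`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML fragment; real exports contain far more detail.
PMML_SNIPPET = """<?xml version="1.0"?>
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary>
    <DataField name="Class" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel functionName="classification"/>
</PMML>"""

# Model element names from the PMML standard that this sketch recognizes.
MODEL_TAGS = {
    "TreeModel", "MiningModel", "RegressionModel", "GeneralRegressionModel",
    "NaiveBayesModel", "SupportVectorMachineModel", "NearestNeighborModel",
    "NeuralNetwork",
}


def describe_pmml(text):
    """Return (model element name, functionName) for the first known model, or None."""
    root = ET.fromstring(text)
    for child in root:
        # ElementTree prefixes tags with "{namespace}"; keep only the local name.
        tag = child.tag.split("}")[-1]
        if tag in MODEL_TAGS:
            return tag, child.get("functionName")
    return None


print(describe_pmml(PMML_SNIPPET))  # ('TreeModel', 'classification')
```

Specifying the estimator class directly skips this kind of inspection step, which is consistent with the "least overhead" remark in the example above.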
Benchmark
Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, because it leverages the optimization and industry-tested robustness of sklearn. The source code for this benchmark can be found in the corresponding Jupyter notebook.
Running times (load + predict, in seconds)
Dataset | Library | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | PyPMML | 0.773291 | 0.77384 | 0.777425 | 0.895204 | 0.902355 |
Wine | sklearn-pmml-model | 0.005813 | 0.006357 | 0.002693 | 0.108882 | 0.121823 |
Breast cancer | PyPMML | 3.849855 | 3.878448 | 3.83623 | 4.16358 | 4.13766 |
Breast cancer | sklearn-pmml-model | 0.015723 | 0.011278 | 0.002807 | 0.146234 | 0.044016 |
Improvement
Dataset | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|
Wine | 133× | 122× | 289× | 8× | 7× |
Breast cancer | 245× | 344× | 1,367× | 28× | 94× |
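The improvement factors appear to be the PyPMML running time divided by the sklearn-pmml-model running time, rounded to the nearest integer. The short sketch below reproduces the Wine row under that assumption, using the timings reported above; the variable names are illustrative.

```python
# Running times in seconds (load + predict) for the Wine data set,
# copied from the running-times table above.
pypmml_seconds = {
    "Linear model": 0.773291,
    "Naive Bayes": 0.77384,
    "Decision tree": 0.777425,
    "Random Forest": 0.895204,
    "Gradient boosting": 0.902355,
}
sklearn_pmml_model_seconds = {
    "Linear model": 0.005813,
    "Naive Bayes": 0.006357,
    "Decision tree": 0.002693,
    "Random Forest": 0.108882,
    "Gradient boosting": 0.121823,
}

# Speed-up factor = slower time / faster time, rounded to the nearest integer.
speedup = {
    model: round(pypmml_seconds[model] / sklearn_pmml_model_seconds[model])
    for model in pypmml_seconds
}

for model, factor in speedup.items():
    print(f"{model}: {factor}x")
```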
Development
Prerequisites
Tests are run with pytest. First, grab a local copy of the source:
$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model
Create a virtual environment and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
Then install the dependencies:
$ pip install -r requirements.txt
The final step is to build the Cython extensions:
$ python setup.py build_ext --inplace
Testing
You can execute the tests with pytest by running:
$ python setup.py pytest
Contributing
Feel free to make a contribution. Please read CONTRIBUTING.md for more details.
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.