A library to parse PMML models into Scikit-learn estimators.
sklearn-pmml-model
A library to effortlessly import models trained on different platforms and in different programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.
Installation
The easiest way is to use pip:
$ pip install sklearn-pmml-model
Status
The library currently supports the following models:
Model | Classification | Regression | Categorical features |
---|---|---|---|
Decision Trees | ✅ | ✅ | ✅1 |
Random Forests | ✅ | ✅ | ✅1 |
Gradient Boosting | ✅ | ✅ | ✅1 |
Linear Regression | ✅ | ✅ | ✅3 |
Ridge | ✅2 | ✅ | ✅3 |
Lasso | ✅2 | ✅ | ✅3 |
ElasticNet | ✅2 | ✅ | ✅3 |
Gaussian Naive Bayes | ✅ | | ✅3 |
Support Vector Machines | ✅ | ✅ | ✅3 |
Nearest Neighbors | ✅ | ✅ | |
Neural Networks | ✅ | ✅ | |
1 Categorical feature support using slightly modified internals, based on scikit-learn#12866.
2 These models differ only in training characteristics; the resulting model is of the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.
3 By one-hot encoding categorical features automatically.
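To make footnote 3 concrete, the sketch below shows what one-hot encoding does to a single categorical column. It is a plain-Python illustration of the general technique, not the library's internal code; the function name `one_hot_encode` and the sample values are assumptions made for this example.

```python
def one_hot_encode(values, categories=None):
    """Turn a list of category labels into 0/1 indicator rows.

    Hypothetical helper for illustration only; it is not part of
    sklearn-pmml-model.
    """
    if categories is None:
        # Fix a deterministic column order: one column per distinct label.
        categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["red", "blue", "red"]
categories, encoded = one_hot_encode(colors)
print(categories)  # column order of the indicator features
print(encoded)     # one row per sample, one column per category
```

Each sample gets exactly one `1`, in the column of its category, so a model that only accepts numeric inputs can still consume the feature.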
Example
A minimal working example (using this PMML file) is shown below:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier
from sklearn_pmml_model.auto_detect import auto_detect_estimator
# Prepare the data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)
# Specify the model type for the least overhead...
#clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
# ...or simply let the library auto-detect the model type
clf = auto_detect_estimator(pmml="models/randomForest.pmml")
# Use the model as any other scikit-learn model
clf.predict(Xte)
clf.score(Xte, yte)
More examples can be found in the following packages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.
Benchmark
Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, by leveraging the optimization and industry-tested robustness of sklearn. Source code for this benchmark can be found in the corresponding Jupyter notebook.
Running times (load + predict, in seconds)
Data set | Library | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | PyPMML | 0.773291 | 0.77384 | 0.777425 | 0.895204 | 0.902355 |
Wine | sklearn-pmml-model | 0.005813 | 0.006357 | 0.002693 | 0.108882 | 0.121823 |
Breast cancer | PyPMML | 3.849855 | 3.878448 | 3.83623 | 4.16358 | 4.13766 |
Breast cancer | sklearn-pmml-model | 0.015723 | 0.011278 | 0.002807 | 0.146234 | 0.044016 |
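The benchmark notebook itself is not reproduced here, so the following is only a rough sketch of how a combined "load + predict" time could be measured. The helper name `time_load_and_predict` and the stand-in `load` and `predict` callables are assumptions for illustration; in practice they would wrap the model loading and prediction calls of the library being timed.

```python
import time


def time_load_and_predict(load, predict, repeats=5):
    """Return the best wall-clock time (in seconds) of load() followed by predict(model).

    Hypothetical helper: `load` builds a model, `predict` runs it once.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model = load()
        predict(model)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in callables so the sketch runs without any PMML file.
def load():
    return {"weights": [0.5, -1.2, 3.0]}


def predict(model):
    return sum(w * x for w, x in zip(model["weights"], [1.0, 2.0, 3.0]))


elapsed = time_load_and_predict(load, predict)
print(f"load + predict: {elapsed:.6f} s")
```

Taking the minimum over several repeats reduces the influence of unrelated background activity on the measurement.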
Improvement
Data set | Metric | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | Improvement | 133× | 122× | 289× | 8× | 7× |
Breast cancer | Improvement | 245× | 344× | 1,367× | 28× | 94× |
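The improvement factors follow directly from the running times: each is the PyPMML time divided by the sklearn-pmml-model time, rounded to the nearest integer. The short script below reproduces that arithmetic from the figures reported above; the variable and function names are chosen for this example only.

```python
# Running times (load + predict, in seconds), copied from the benchmark figures above.
MODELS = ["Linear model", "Naive Bayes", "Decision tree",
          "Random Forest", "Gradient boosting"]

TIMINGS = {
    "Wine": {
        "PyPMML": [0.773291, 0.77384, 0.777425, 0.895204, 0.902355],
        "sklearn-pmml-model": [0.005813, 0.006357, 0.002693, 0.108882, 0.121823],
    },
    "Breast cancer": {
        "PyPMML": [3.849855, 3.878448, 3.83623, 4.16358, 4.13766],
        "sklearn-pmml-model": [0.015723, 0.011278, 0.002807, 0.146234, 0.044016],
    },
}


def speedups(baseline, candidate):
    """Ratio of baseline time to candidate time, rounded to the nearest integer."""
    return [round(b / c) for b, c in zip(baseline, candidate)]


improvement = {
    dataset: speedups(times["PyPMML"], times["sklearn-pmml-model"])
    for dataset, times in TIMINGS.items()
}

for dataset, factors in improvement.items():
    row = ", ".join(f"{model}: {factor}×" for model, factor in zip(MODELS, factors))
    print(f"{dataset}: {row}")
```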
Development
Prerequisites
Tests can be run using Py.test. Grab a local copy of the source:
$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model
create a virtual environment and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
and install the dependencies:
$ pip install -r requirements.txt
The final step is to build the Cython extensions:
$ python setup.py build_ext --inplace
Testing
You can execute tests with py.test by running:
$ python setup.py pytest
Contributing
Feel free to make a contribution. Please read CONTRIBUTING.md for more details.
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.