A library to parse PMML models into Scikit-learn estimators.
sklearn-pmml-model
A library to effortlessly import models trained on different platforms and in different programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.
Installation
The easiest way is to use pip:
$ pip install sklearn-pmml-model
Status
The library currently supports the following models:
Model | Classification | Regression | Categorical features |
---|---|---|---|
Decision Trees | ✅ | ✅ | ✅1 |
Random Forests | ✅ | ✅ | ✅1 |
Gradient Boosting | ✅ | ✅ | ✅1 |
Linear Regression | ✅ | ✅ | ✅3 |
Ridge | ✅2 | ✅ | ✅3 |
Lasso | ✅2 | ✅ | ✅3 |
ElasticNet | ✅2 | ✅ | ✅3 |
Gaussian Naive Bayes | ✅ | | ✅3 |
Support Vector Machines | ✅ | ✅ | ✅3 |
Nearest Neighbors | ✅ | ✅ | |
Neural Networks | ✅ | ✅ | |
1 Categorical feature support using slightly modified internals, based on scikit-learn#12866.
2 These models differ only in training characteristics; the resulting model is of the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.
3 By one-hot encoding categorical features automatically.
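To make footnote 3 concrete, the sketch below shows what one-hot encoding does to a single categorical column, in plain Python. It only illustrates the idea and is not the library's internal code; the helper `one_hot` and the sample values are hypothetical.

```python
def one_hot(values):
    """Expand a list of category labels into 0/1 indicator columns.

    Hypothetical helper for illustration; sklearn-pmml-model performs its
    own encoding internally when it loads a PMML file.
    """
    # Sort the distinct labels so the column order is deterministic.
    categories = sorted(set(values))
    # One row per input value, one column per category.
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


categories, rows = one_hot(["red", "green", "red", "blue"])
print(categories)
for row in rows:
    print(row)
```

Each input value becomes a row with exactly one 1, in the column of its category.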
Example
A minimal working example (using this PMML file) is shown below:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier
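# Prepare the data: a DataFrame with named feature columns and string class labels
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)
# Load the exported PMML model; the estimator is ready to use without fitting
clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
# Predict on the held-out split and compute the accuracy
clf.predict(Xte)
clf.score(Xte, yte)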
More examples can be found in the following subpackages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.
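When it is unclear which subpackage fits an exported file, it can help to look at the model element the PMML document contains. The sketch below does this with the standard library only. The element names come from the PMML standard, but the mapping to subpackages is an assumption made for illustration, and `find_model_element` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from PMML model elements to the subpackages listed above.
# Illustrative only; check the package documentation for what is supported.
SUBPACKAGE_BY_ELEMENT = {
    "TreeModel": "tree",
    "MiningModel": "ensemble",
    "RegressionModel": "linear_model",
    "GeneralRegressionModel": "linear_model",
    "NaiveBayesModel": "naive_bayes",
    "SupportVectorMachineModel": "svm",
    "NearestNeighborModel": "neighbors",
    "NeuralNetwork": "neural_network",
}


def find_model_element(pmml_text):
    """Return the tag name of the first known model element, or None."""
    root = ET.fromstring(pmml_text)
    for child in root:
        # Drop the namespace part, e.g. "{http://www.dmg.org/PMML-4_3}TreeModel".
        tag = child.tag.rsplit("}", 1)[-1]
        if tag in SUBPACKAGE_BY_ELEMENT:
            return tag
    return None


# A stripped-down, made-up PMML document; real exports contain much more.
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <Header/>
  <DataDictionary/>
  <MiningModel functionName="classification"/>
</PMML>"""

element = find_model_element(SAMPLE)
print(element, "->", SUBPACKAGE_BY_ELEMENT[element])
```

For a file on disk, the same function can be called with the file's contents read as text.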
Benchmark
Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, by leveraging the optimization and industry-tested robustness of sklearn. Source code for this benchmark can be found in the corresponding Jupyter notebook.
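The benchmark notebook itself is not reproduced here. As a rough sketch of how a combined load + predict time could be measured, the hypothetical helper `time_load_and_predict` below wraps both steps in a single timer. The stand-in callables replace a real model so that the snippet runs without any PMML file.

```python
import time


def time_load_and_predict(load, predict, data):
    """Time model loading plus prediction as one measurement.

    Hypothetical helper, not taken from the benchmark notebook.
    """
    start = time.perf_counter()
    model = load()                      # e.g. parse a PMML file
    predictions = predict(model, data)  # e.g. call model.predict(data)
    elapsed = time.perf_counter() - start
    return elapsed, predictions


# Stand-ins for a real loader and predictor, used only to keep this runnable.
elapsed, predictions = time_load_and_predict(
    load=lambda: {"threshold": 2.5},
    predict=lambda model, rows: [row > model["threshold"] for row in rows],
    data=[1.0, 3.0, 2.0, 4.0],
)
print(f"{elapsed:.6f} s", predictions)
```

With this library, `load` could be `lambda: PMMLForestClassifier(pmml="models/randomForest.pmml")` and `predict` could be `lambda clf, X: clf.predict(X)`, following the example above.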
Running times (load + predict, in seconds)
Data set | Library | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | PyPMML | 0.773291 | 0.77384 | 0.777425 | 0.895204 | 0.902355 |
Wine | sklearn-pmml-model | 0.005813 | 0.006357 | 0.002693 | 0.108882 | 0.121823 |
Breast cancer | PyPMML | 3.849855 | 3.878448 | 3.83623 | 4.16358 | 4.13766 |
Breast cancer | sklearn-pmml-model | 0.015723 | 0.011278 | 0.002807 | 0.146234 | 0.044016 |
Improvement
Data set | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|
Wine | 133× | 122× | 289× | 8× | 7× |
Breast cancer | 245× | 344× | 1,367× | 28× | 94× |
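The improvement factors follow directly from the running times: each one is the PyPMML time divided by the sklearn-pmml-model time, rounded to the nearest integer. The short sketch below reproduces that arithmetic from the published numbers; the function name `improvement_factors` is chosen here for illustration.

```python
# Running times (load + predict, in seconds), copied from the table above.
# Column order: linear model, naive Bayes, decision tree, random forest,
# gradient boosting.
running_times = {
    "Wine": {
        "PyPMML": [0.773291, 0.77384, 0.777425, 0.895204, 0.902355],
        "sklearn-pmml-model": [0.005813, 0.006357, 0.002693, 0.108882, 0.121823],
    },
    "Breast cancer": {
        "PyPMML": [3.849855, 3.878448, 3.83623, 4.16358, 4.13766],
        "sklearn-pmml-model": [0.015723, 0.011278, 0.002807, 0.146234, 0.044016],
    },
}


def improvement_factors(times):
    """Divide each PyPMML time by the matching sklearn-pmml-model time."""
    return [round(slow / fast)
            for slow, fast in zip(times["PyPMML"], times["sklearn-pmml-model"])]


for data_set, times in running_times.items():
    print(data_set, improvement_factors(times))
```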
Development
Prerequisites
Tests can be run using Py.test. Grab a local copy of the source:
$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model
create a virtual environment and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
and install the dependencies:
$ pip install -r requirements.txt
The final step is to build the Cython extensions:
$ python setup.py build_ext --inplace
Testing
You can execute tests with py.test by running:
$ python setup.py pytest
Contributing
Feel free to make a contribution. Please read CONTRIBUTING.md for more details.
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.