A library to parse PMML models into Scikit-learn estimators.
sklearn-pmml-model
A library to effortlessly import models trained on other platforms or in other programming languages into scikit-learn in Python. First, export your model to PMML (a widely supported format). Next, load the exported PMML file with this library and use the resulting class like any other scikit-learn estimator.
Installation
The easiest way is to use pip:
$ pip install sklearn-pmml-model
Status
The library currently supports the following models:
Model | Classification | Regression | Categorical features |
---|---|---|---|
Decision Trees | ✅ | ✅ | ✅¹ |
Random Forests | ✅ | ✅ | ✅¹ |
Gradient Boosting | ✅ | ✅ | ✅¹ |
Linear Regression | ✅ | ✅ | ✅³ |
Ridge | ✅² | ✅ | ✅³ |
Lasso | ✅² | ✅ | ✅³ |
ElasticNet | ✅² | ✅ | ✅³ |
Gaussian Naive Bayes | ✅ | | ✅³ |
Support Vector Machines | ✅ | ✅ | ✅³ |
Nearest Neighbors | ✅ | ✅ | |
Neural Networks | ✅ | ✅ | |
¹ Categorical feature support using slightly modified internals, based on scikit-learn#12866.
² These models differ only in training characteristics; the resulting model is of the same form. Classification is supported using PMMLLogisticRegression for regression models and PMMLRidgeClassifier for general regression models.
³ By one-hot encoding categorical features automatically.
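To illustrate what one-hot encoding of a categorical feature means, here is a minimal sketch using pandas. It shows the general technique only and is not the library's internal implementation; the column names `size` and `color` are made up for the example.

```python
import pandas as pd

# Hypothetical toy data: one numeric and one categorical feature.
df = pd.DataFrame({
    "size": [1.0, 2.5, 3.0],
    "color": ["red", "green", "blue"],
})

# Replace the categorical column with one indicator column per category.
encoded = pd.get_dummies(df, columns=["color"], dtype=int)

print(list(encoded.columns))
```

Each row ends up with exactly one indicator column set to 1, which lets models that only accept numeric input work with categorical data.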
Example
A minimal working example (using this PMML file) is shown below:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier
# Prepare data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
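# Split into training and test sets
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)
# Load the model from the exported PMML file
clf = PMMLForestClassifier(pmml="models/randomForest.pmml")
# Predict class labels and compute accuracy on the test set
clf.predict(Xte)
clf.score(Xte, yte)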
More examples can be found in the following subpackages: tree, ensemble, linear_model, naive_bayes, svm, neighbors and neural_network.
Benchmark
Depending on the data set and model, sklearn-pmml-model is between 5 and 1,000 times faster than competing libraries, by leveraging the optimization and industry-tested robustness of sklearn. Source code for this benchmark can be found in the corresponding Jupyter notebook.
Running times (load + predict, in seconds)
Data set | Library | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|---|
Wine | PyPMML | 0.773291 | 0.77384 | 0.777425 | 0.895204 | 0.902355 |
Wine | sklearn-pmml-model | 0.005813 | 0.006357 | 0.002693 | 0.108882 | 0.121823 |
Breast cancer | PyPMML | 3.849855 | 3.878448 | 3.83623 | 4.16358 | 4.13766 |
Breast cancer | sklearn-pmml-model | 0.015723 | 0.011278 | 0.002807 | 0.146234 | 0.044016 |
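The running times above combine model loading and prediction. A minimal sketch of how such a measurement could be taken is shown below. This is not the code from the benchmark notebook; `time_load_and_predict` and `DummyModel` are hypothetical names, and the dummy model only stands in for a real estimator so the snippet runs without a PMML file.

```python
import time


def time_load_and_predict(load_model, X, repeats=5):
    """Return the best wall-clock time (seconds) to load a model and predict on X."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model = load_model()  # e.g. lambda: PMMLForestClassifier(pmml="models/randomForest.pmml")
        model.predict(X)
        best = min(best, time.perf_counter() - start)
    return best


class DummyModel:
    """Hypothetical stand-in estimator that predicts class 0 for every row."""

    def predict(self, X):
        return [0 for _ in X]


elapsed = time_load_and_predict(DummyModel, X=[[1.0, 2.0], [3.0, 4.0]])
print(f"load + predict: {elapsed:.6f} s")
```

Taking the minimum over several repeats reduces the influence of unrelated system load on the measurement.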
Improvement
Data set | Linear model | Naive Bayes | Decision tree | Random Forest | Gradient boosting |
---|---|---|---|---|---|
Wine | 133× | 122× | 289× | 8× | 7× |
Breast cancer | 245× | 344× | 1,367× | 28× | 94× |
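The improvement factors are the PyPMML running time divided by the sklearn-pmml-model running time, rounded to the nearest integer. The short sketch below recomputes them from the running times listed above; the variable names are chosen for illustration only.

```python
# Running times (load + predict, in seconds), in the order:
# linear model, naive Bayes, decision tree, random forest, gradient boosting.
pypmml = {
    "Wine": [0.773291, 0.77384, 0.777425, 0.895204, 0.902355],
    "Breast cancer": [3.849855, 3.878448, 3.83623, 4.16358, 4.13766],
}
sklearn_pmml_model = {
    "Wine": [0.005813, 0.006357, 0.002693, 0.108882, 0.121823],
    "Breast cancer": [0.015723, 0.011278, 0.002807, 0.146234, 0.044016],
}

# Speed-up factor = slower time / faster time, per data set and model.
improvement = {
    dataset: [
        round(slow / fast)
        for slow, fast in zip(pypmml[dataset], sklearn_pmml_model[dataset])
    ]
    for dataset in pypmml
}

for dataset, factors in improvement.items():
    print(dataset, factors)
```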
Development
Prerequisites
Tests can be run using Py.test. Grab a local copy of the source:
$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model
create a virtual environment and activate it:
$ python3 -m venv venv
$ source venv/bin/activate
and install the dependencies:
$ pip install -r requirements.txt
The final step is to build the Cython extensions:
$ python setup.py build_ext --inplace
Testing
You can execute tests with py.test by running:
$ python setup.py pytest
Contributing
Feel free to make a contribution. Please read CONTRIBUTING.md for more details.
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.