sklearn-pmml-model

A library to parse PMML models into Scikit-learn estimators.

Installation

The easiest way is to use pip:

$ pip install sklearn-pmml-model

Status

This library is in beta, and not all models are supported yet. The following models are currently supported:

Model Classification Regression Categorical features
Decision Trees 1
Random Forests 1
Gradient Boosting 1
Linear Regression
Ridge
Lasso
ElasticNet
Gaussian Naive Bayes

1 Categorical feature support using slightly modified internals, based on scikit-learn#12866.
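
To illustrate what categorical support means at the PMML level, the sketch below evaluates a set-membership split of the kind a SimpleSetPredicate describes, without one-hot encoding the feature first. The function name and sample values are hypothetical, and this is not the library's internal implementation.

```python
def matches_set_predicate(value, categories, operator="isIn"):
    """Evaluate a PMML-style set predicate on one categorical value.

    `operator` mirrors the SimpleSetPredicate booleanOperator attribute,
    which is either "isIn" or "isNotIn".
    """
    if operator == "isIn":
        return value in categories
    if operator == "isNotIn":
        return value not in categories
    raise ValueError(f"unsupported operator: {operator!r}")


# Hypothetical split: rows whose colour is red or blue go to the left child.
left_categories = {"red", "blue"}
rows = ["red", "green", "blue"]
goes_left = [matches_set_predicate(v, left_categories) for v in rows]
print(goes_left)  # [True, False, True]
```

A single test like this keeps the category values intact, whereas a one-hot encoded feature would need one binary split per category.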


The following parts of the PMML specification are covered:

  • DataDictionary
    • DataField (continuous, categorical, ordinal)
      • Value
      • Interval
  • TransformationDictionary / LocalTransformations
    • DerivedField
  • TreeModel
    • SimplePredicate
    • SimpleSetPredicate
  • Segmentation ('majorityVote' for Random Forests, 'modelChain' and 'sum' for Gradient Boosting)
  • Regression
    • RegressionTable
      • NumericPredictor
      • CategoricalPredictor
  • GeneralRegressionModel (only linear models)
    • PPMatrix
      • PPCell
    • ParamMatrix
      • PCell
  • NaiveBayesModel
    • BayesInputs
      • BayesInput
        • TargetValueStats
          • TargetValueStat
            • GaussianDistribution
        • PairCounts
          • TargetValueCounts
            • TargetValueCount
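
As a rough orientation for how some of the elements listed above appear in a file, the sketch below reads a heavily abbreviated PMML fragment with Python's standard library. The fragment is hypothetical and not schema-valid (a real MiningModel needs a MiningSchema and Segment children, among others), and the parsing shown here is an illustration only, not how this library reads models.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abbreviated PMML document (not a complete model).
PMML_TEXT = """
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary>
    <DataField name="sepal length (cm)" optype="continuous" dataType="double"/>
    <DataField name="Class" optype="categorical" dataType="string">
      <Value value="setosa"/>
      <Value value="versicolor"/>
      <Value value="virginica"/>
    </DataField>
  </DataDictionary>
  <MiningModel functionName="classification">
    <Segmentation multipleModelMethod="majorityVote"/>
  </MiningModel>
</PMML>
"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_3"}
root = ET.fromstring(PMML_TEXT.strip())

# Collect each field's operational type and any enumerated values.
fields = {}
for field in root.findall("pmml:DataDictionary/pmml:DataField", NS):
    values = [v.get("value") for v in field.findall("pmml:Value", NS)]
    fields[field.get("name")] = (field.get("optype"), values)

# The Segmentation element records how the segment models are combined.
segmentation = root.find(".//pmml:Segmentation", NS)
method = segmentation.get("multipleModelMethod")

print(fields)
print(method)
```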

Example

A minimal working example is shown below:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np
from sklearn_pmml_model.ensemble import PMMLForestClassifier

# Prepare data
iris = load_iris()
X = pd.DataFrame(iris.data)
X.columns = np.array(iris.feature_names)
y = pd.Series(np.array(iris.target_names)[iris.target])
y.name = "Class"
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=123)

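# Load the random forest from the PMML file
clf = PMMLForestClassifier(pmml="models/randomForest.pmml")

# Predict class labels for the test set, then compute the score on it
clf.predict(Xte)
clf.score(Xte, yte)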
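
For context on the last line: assuming PMMLForestClassifier follows the usual scikit-learn classifier convention, score returns the mean accuracy on the given data. A minimal sketch of that computation, using a hypothetical helper and made-up labels:

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only; three of four predictions match.
y_true = ["setosa", "versicolor", "virginica", "setosa"]
y_pred = ["setosa", "versicolor", "versicolor", "setosa"]
print(mean_accuracy(y_true, y_pred))  # 0.75
```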

More examples can be found in the following subpackages: tree, ensemble, linear_model and naive_bayes.

Development

Prerequisites

Tests can be run using Py.test. Grab a local copy of the source:

$ git clone http://github.com/iamDecode/sklearn-pmml-model
$ cd sklearn-pmml-model

create a virtual environment and activate it:

$ python3 -m venv venv
$ source venv/bin/activate

and install the dependencies:

$ pip install -r requirements.txt

The final step is to build the Cython extensions:

$ python setup.py build_ext --inplace

Testing

You can execute tests with py.test by running:

$ python setup.py pytest

Contributing

Feel free to make a contribution. Please read CONTRIBUTING.md for details on the code of conduct, and the process for submitting pull requests.

License

This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.
