Skip to main content

Transpile trained scikit-learn models to C, Java, JavaScript and others.

Project description

sklearn-porter

GitHub license Stack Overflow Join the chat at https://gitter.im/nok/sklearn-porter Twitter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
It's recommended for limited embedded systems and critical applications where performance matters most.

Important

We're hard working on the first major release of sklearn-porter.
Until that we will just release bugfixes to the stable version.

Estimators

Estimator Programming language
Classifier Java * JS C Go PHP Ruby
svm.SVC , ✓ ᴵ
svm.NuSVC , ✓ ᴵ
svm.LinearSVC , ✓ ᴵ
tree.DecisionTreeClassifier , ✓ ᴱ, ✓ ᴵ , ✓ ᴱ , ✓ ᴱ , ✓ ᴱ , ✓ ᴱ , ✓ ᴱ
ensemble.RandomForestClassifier ✓ ᴱ, ✓ ᴵ ✓ ᴱ ✓ ᴱ ✓ ᴱ ✓ ᴱ ✓ ᴱ
ensemble.ExtraTreesClassifier ✓ ᴱ, ✓ ᴵ ✓ ᴱ ✓ ᴱ ✓ ᴱ ✓ ᴱ
ensemble.AdaBoostClassifier ✓ ᴱ, ✓ ᴵ ✓ ᴱ, ✓ ᴵ ✓ ᴱ
neighbors.KNeighborsClassifier , ✓ ᴵ , ✓ ᴵ
naive_bayes.GaussianNB , ✓ ᴵ
naive_bayes.BernoulliNB , ✓ ᴵ
neural_network.MLPClassifier , ✓ ᴵ , ✓ ᴵ
Regressor Java * JS C Go PHP Ruby
neural_network.MLPRegressor

✓ = is full-featured, ᴱ = with embedded model data, ᴵ = with imported model data, * = default language

Installation

Stable

Build Status stable branch PyPI PyPI

$ pip install sklearn-porter

Development

Build Status master branch

If you want the latest changes, you can install this package from the master branch:

$ pip uninstall -y sklearn-porter
$ pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master

Usage

Export

The following example demonstrates how you can transpile a decision tree estimator to Java:

from sklearn.datasets import load_iris
from sklearn.tree import tree
from sklearn_porter import Porter

# Load data and train the classifier:
samples = load_iris()
X, y = samples.data, samples.target
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

# Export:
porter = Porter(clf, language='java')
output = porter.export(embed_data=True)
print(output)

The exported result matches the official human-readable version of the decision tree.

Integrity

You should always check and compute the integrity between the original and the transpiled estimator:

# ...
porter = Porter(clf, language='java')

# Compute integrity score:
integrity = porter.integrity_score(X)
print(integrity)  # 1.0

Prediction

You can compute the prediction(s) in the target programming language:

# ...
porter = Porter(clf, language='java')

# Prediction(s):
Y_java = porter.predict(X)
y_java = porter.predict(X[0])
y_java = porter.predict([1., 2., 3., 4.])

Notebooks

You can run and test all notebooks by starting a Jupyter notebook server locally:

$ make open.examples
$ make stop.examples

CLI

In general you can use the porter on the command line:

$ porter <pickle_file> [--to <directory>]
         [--class_name <class_name>] [--method_name <method_name>]
         [--export] [--checksum] [--data] [--pipe]
         [--c] [--java] [--js] [--go] [--php] [--ruby]
         [--version] [--help]

The following example shows how you can save a trained estimator to the pickle format:

# ...

# Extract estimator:
joblib.dump(clf, 'estimator.pkl', compress=0)

After that the estimator can be transpiled to JavaScript by using the following command:

$ porter estimator.pkl --js

The target programming language is changeable on the fly:

$ porter estimator.pkl --c
$ porter estimator.pkl --java
$ porter estimator.pkl --php
$ porter estimator.pkl --java
$ porter estimator.pkl --ruby

For further processing the argument --pipe can be used to pass the result:

$ porter estimator.pkl --js --pipe > estimator.js

For instance the result can be minified by using UglifyJS:

$ porter estimator.pkl --js --pipe | uglifyjs --compress -o estimator.min.js

Development

Environment

You have to install required modules for broader development:

$ make install.environment  # conda environment (optional)
$ make install.requirements.development  # pip requirements

Independently, the following compilers and intepreters are required to cover all tests:

Name Version Command
GCC >=4.2 gcc --version
Java >=1.6 java -version
PHP >=5.6 php --version
Ruby >=2.4.1 ruby --version
Go >=1.7.4 go version
Node.js >=6 node --version

Testing

The tests cover module functions as well as matching predictions of transpiled estimators. Start all tests with:

$ make test

The test files have a specific pattern: '[Algorithm][Language]Test.py':

$ pytest tests -v -o python_files='RandomForest*Test.py'
$ pytest tests -v -o python_files='*JavaTest.py'

While you are developing new features or fixes, you can reduce the test duration by changing the number of tests:

$ N_RANDOM_FEATURE_SETS=5 N_EXISTING_FEATURE_SETS=10 \
  pytest tests -v -o python_files='*JavaTest.py'

Quality

It's highly recommended to ensure the code quality. For that Pylint is used. Start the linter with:

$ make lint

Citation

If you use this implementation in you work, please add a reference/citation to the paper. You can use the following BibTeX entry:

@unpublished{skpodamo,
  author = {Darius Morawiec},
  title = {sklearn-porter},
  note = {Transpile trained scikit-learn estimators to C, Java, JavaScript and others},
  url = {https://github.com/nok/sklearn-porter}
}

License

The module is Open Source Software released under the MIT license.

Questions?

Don't be shy and feel free to contact me on Twitter or Gitter.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sklearn-porter-0.7.4.tar.gz (79.8 kB view details)

Uploaded Source

Built Distribution

sklearn_porter-0.7.4-py3-none-any.whl (144.4 kB view details)

Uploaded Python 3

File details

Details for the file sklearn-porter-0.7.4.tar.gz.

File metadata

  • Download URL: sklearn-porter-0.7.4.tar.gz
  • Upload date:
  • Size: 79.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.8

File hashes

Hashes for sklearn-porter-0.7.4.tar.gz
Algorithm Hash digest
SHA256 6b9ab9a5494da39b60c89f31cf0e9b0157878176075c747db21e805462d074ee
MD5 193da4f007401c98143733d7e4d45599
BLAKE2b-256 9c1d4ee3d245b85db0e095e16d9441131379fd66bacd89cfcd014c8fe4a0f3c8

See more details on using hashes here.

File details

Details for the file sklearn_porter-0.7.4-py3-none-any.whl.

File metadata

  • Download URL: sklearn_porter-0.7.4-py3-none-any.whl
  • Upload date:
  • Size: 144.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.8

File hashes

Hashes for sklearn_porter-0.7.4-py3-none-any.whl
Algorithm Hash digest
SHA256 25784db1500702b71e4b0336dec7b570127c8f2821e53da7e865ccbaeea40b5b
MD5 53987b51c2d560ca917086469398637b
BLAKE2b-256 4eb30d9f2b0800bc63b96dc9f707e3ec7565f49e4fba4fc0baf78cc61da0666f

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page