Code-generation for various ML models into native code.

m2cgen

m2cgen (Model 2 Code Generator) is a lightweight library that provides an easy way to transpile trained statistical models into native code (Python, C, Java, Go, JavaScript, Visual Basic, C#, PowerShell, R, PHP, Dart, Haskell, Ruby, F#, Rust, Elixir).

Installation
Supported Languages
Supported Models
Classification Output
Usage
CLI
FAQ

Installation

Supported Python version is >= 3.7.

pip install m2cgen

Supported Languages

C
C#
Dart
F#
Go
Haskell
Java
JavaScript
PHP
PowerShell
Python
R
Ruby
Rust
Visual Basic (VBA-compatible)
Elixir

Supported Models

Linear
  Classification
    scikit-learn: LogisticRegression, LogisticRegressionCV, PassiveAggressiveClassifier, Perceptron, RidgeClassifier, RidgeClassifierCV, SGDClassifier
    lightning: AdaGradClassifier, CDClassifier, FistaClassifier, SAGAClassifier, SAGClassifier, SDCAClassifier, SGDClassifier
  Regression
    scikit-learn: ARDRegression, BayesianRidge, ElasticNet, ElasticNetCV, GammaRegressor, HuberRegressor, Lars, LarsCV, Lasso, LassoCV, LassoLars, LassoLarsCV, LassoLarsIC, LinearRegression, OrthogonalMatchingPursuit, OrthogonalMatchingPursuitCV, PassiveAggressiveRegressor, PoissonRegressor, RANSACRegressor (only supported regression estimators can be used as a base estimator), Ridge, RidgeCV, SGDRegressor, TheilSenRegressor, TweedieRegressor
    StatsModels: Generalized Least Squares (GLS), Generalized Least Squares with AR Errors (GLSAR), Generalized Linear Models (GLM), Ordinary Least Squares (OLS), [Gaussian] Process Regression Using Maximum Likelihood-based Estimation (ProcessMLE), Quantile Regression (QuantReg), Weighted Least Squares (WLS)
    lightning: AdaGradRegressor, CDRegressor, FistaRegressor, SAGARegressor, SAGRegressor, SDCARegressor, SGDRegressor

SVM
  Classification
    scikit-learn: LinearSVC, NuSVC, OneClassSVM, SVC
    lightning: KernelSVC, LinearSVC
  Regression
    scikit-learn: LinearSVR, NuSVR, SVR
    lightning: LinearSVR

Tree
  Classification: DecisionTreeClassifier, ExtraTreeClassifier
  Regression: DecisionTreeRegressor, ExtraTreeRegressor

Random Forest
  Classification: ExtraTreesClassifier, LGBMClassifier (rf booster only), RandomForestClassifier, XGBRFClassifier
  Regression: ExtraTreesRegressor, LGBMRegressor (rf booster only), RandomForestRegressor, XGBRFRegressor

Boosting
  Classification: LGBMClassifier (gbdt/dart/goss booster only), XGBClassifier (gbtree (including boosted forests)/gblinear booster only)
  Regression: LGBMRegressor (gbdt/dart/goss booster only), XGBRegressor (gbtree (including boosted forests)/gblinear booster only)

The package versions whose compatibility is guaranteed by CI tests are listed in the project repository. Other versions may also work, but they are untested.

Classification Output

Linear / Linear SVM / Kernel SVM

Binary

Scalar value; signed distance of the sample to the hyperplane for the second class.

Multiclass

Vector value; signed distance of the sample to the hyperplane for each class.

Comment

The output is consistent with the output of LinearClassifierMixin.decision_function.
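A minimal sketch of how such scores might be turned into a class index. It assumes the scikit-learn convention that a positive binary score selects the second class and that the largest multiclass score wins; `label_from_scores` is a hypothetical helper, not part of m2cgen.

```python
from typing import Sequence, Union

Score = Union[float, Sequence[float]]


def label_from_scores(scores: Score) -> int:
    """Map raw decision scores to a class index.

    Assumption: a scalar (binary case) is the signed distance for the
    second class, and a vector holds one score per class.
    """
    if isinstance(scores, (int, float)):
        # Binary: a positive distance selects the second class (index 1).
        return 1 if scores > 0 else 0
    # Multiclass: the class with the largest score wins.
    return max(range(len(scores)), key=lambda i: scores[i])


print(label_from_scores(-0.7))              # binary score
print(label_from_scores([0.2, 1.5, -3.0]))  # multiclass scores
```

The returned index refers to a position in the estimator's class list, so mapping it back to the original label is left to the caller.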

SVM

Outlier detection

Scalar value; signed distance of the sample to the separating hyperplane: positive for an inlier and negative for an outlier.

Binary

Scalar value; signed distance of the sample to the hyperplane for the second class.

Multiclass

Vector value; one-vs-one score for each class, shape (n_samples, n_classes * (n_classes-1) / 2).

Comment

The output is consistent with the output of BaseSVC.decision_function when the decision_function_shape is set to ovo.
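One way to read a one-vs-one score vector is to count votes per class, sketched below. The pair ordering (0, 1), (0, 2), ..., (1, 2), ... and the rule that a positive score favours the first class of a pair are assumptions taken from scikit-learn's convention; `ovo_votes` is a hypothetical helper, and ties between classes are not resolved here.

```python
from itertools import combinations
from typing import List, Sequence


def ovo_votes(scores: Sequence[float], n_classes: int) -> List[int]:
    """Count one-vs-one votes per class.

    Assumption: scores follow the pair order (0, 1), (0, 2), ..., and a
    positive score favours the first class of the pair.
    """
    pairs = list(combinations(range(n_classes), 2))
    if len(scores) != len(pairs):
        raise ValueError("expected n_classes * (n_classes - 1) / 2 scores")
    votes = [0] * n_classes
    for (i, j), score in zip(pairs, scores):
        votes[i if score > 0 else j] += 1
    return votes


votes = ovo_votes([1.0, 0.5, -2.0], n_classes=3)
print(votes, "-> class", votes.index(max(votes)))
```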

Tree / Random Forest / Boosting

Binary

Vector value; class probabilities.

Multiclass

Vector value; class probabilities.

Comment

The output is consistent with the output of the predict_proba method of DecisionTreeClassifier / ExtraTreeClassifier / ExtraTreesClassifier / RandomForestClassifier / XGBRFClassifier / XGBClassifier / LGBMClassifier.

Usage

Here's a simple example of how a linear model trained in a Python environment can be represented in Java code:

from sklearn.datasets import load_diabetes
from sklearn import linear_model
import m2cgen as m2c

X, y = load_diabetes(return_X_y=True)

estimator = linear_model.LinearRegression()
estimator.fit(X, y)

code = m2c.export_to_java(estimator)

Generated Java code:

public class Model {
    public static double score(double[] input) {
        return ((((((((((152.1334841628965) + ((input[0]) * (-10.012197817470472))) + ((input[1]) * (-239.81908936565458))) + ((input[2]) * (519.8397867901342))) + ((input[3]) * (324.39042768937657))) + ((input[4]) * (-792.1841616283054))) + ((input[5]) * (476.74583782366153))) + ((input[6]) * (101.04457032134408))) + ((input[7]) * (177.06417623225025))) + ((input[8]) * (751.2793210873945))) + ((input[9]) * (67.62538639104406));
    }
}
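To make the structure of the generated expression easier to see, here is a hand-written Python equivalent of the Java method above: an intercept plus one product per feature. This is an illustrative transliteration using the coefficients shown, not output produced by m2cgen.

```python
# Intercept and coefficients copied from the generated Java code above.
INTERCEPT = 152.1334841628965
COEFFICIENTS = [
    -10.012197817470472, -239.81908936565458, 519.8397867901342,
    324.39042768937657, -792.1841616283054, 476.74583782366153,
    101.04457032134408, 177.06417623225025, 751.2793210873945,
    67.62538639104406,
]


def score(features):
    """Evaluate the linear model: intercept + sum(feature * coefficient)."""
    if len(features) != len(COEFFICIENTS):
        raise ValueError(f"expected {len(COEFFICIENTS)} features")
    total = INTERCEPT
    # Accumulate in the same left-to-right order as the Java expression.
    for value, coefficient in zip(features, COEFFICIENTS):
        total += value * coefficient
    return total


print(score([0.0] * 10))  # an all-zero input yields the intercept
```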

More examples of generated code for different models and languages are available in the project repository.

CLI

m2cgen can be used as a CLI tool to generate code using serialized model objects (pickle protocol):

$ m2cgen <pickle_file> --language <language> [--indent <indent>] [--function_name <function_name>]
         [--class_name <class_name>] [--module_name <module_name>] [--package_name <package_name>]
         [--namespace <namespace>] [--recursion-limit <recursion_limit>]

Note that unpickling a serialized model object requires its class to be defined at the top level of an importable module in the unpickling environment.

Piping is also supported:

$ cat <pickle_file> | m2cgen --language <language>

FAQ

Q: Generation fails with RecursionError: maximum recursion depth exceeded error.

A: If this error occurs while generating code for an ensemble model, try reducing the number of trained estimators within that model. Alternatively, you can increase the maximum recursion depth with sys.setrecursionlimit(<new_depth>) or, when using the CLI, with the --recursion-limit option.
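A minimal sketch of raising the limit only for the duration of one call and restoring it afterwards. The `recursion_limit` context manager and the nested-tuple stand-in for a deep model are hypothetical; in practice the m2cgen export call would go inside the `with` block.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(depth: int):
    """Temporarily raise the interpreter recursion limit, then restore it."""
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(max(previous, depth))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


def nesting_depth(expr) -> int:
    """Recursively measure the depth of a nested tuple (stand-in for a deep model)."""
    return 0 if expr is None else 1 + nesting_depth(expr[0])


# Build a chain nested 3000 levels deep, beyond CPython's usual default limit.
expr = None
for _ in range(3000):
    expr = (expr,)

with recursion_limit(10_000):
    print(nesting_depth(expr))
```

Restoring the previous limit keeps the rest of the program protected against runaway recursion.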

Q: Generation fails with ImportError: No module named <module_name_here> error while transpiling a model from a serialized model object.

A: This error indicates that pickle cannot deserialize the model object. Unpickling requires the object's class to be defined at the top level of an importable module in the unpickling environment, so installing the package that provides the model's class definition should solve the problem.
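The reason is that a pickle stores a reference to the class (module name and class name), not the class code itself. The sketch below illustrates this with a standard-library class standing in for a model class.

```python
import pickle
from fractions import Fraction

# Fraction stands in for a model class; any importable class behaves the same.
payload = pickle.dumps(Fraction(3, 4))

# The payload names the module and the class, but contains no class code.
print(b"fractions" in payload, b"Fraction" in payload)

# Loading works here only because the `fractions` module can be imported.
restored = pickle.loads(payload)
print(restored)
```

If the named module cannot be imported in the environment that calls `pickle.loads`, loading fails with an import error, which is the situation described in the question.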

Q: Code generated by m2cgen produces different results for some inputs compared to the original Python model from which the code was obtained.

A: Some models force input data to be of a particular type during the prediction phase in their native Python libraries. Currently, m2cgen works only with the float64 (double) data type. You can try casting your input data to another type manually and checking the results again. Also, small differences can arise from the specific implementation of floating-point arithmetic in the target language.
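A small sketch of why the input type matters, assuming a model that casts its inputs to float32 before comparing them with a split threshold. The threshold value and the `to_float32` helper are hypothetical and chosen only to make the effect visible.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


x = 0.100000001           # input held as float64
threshold = 0.1000000012  # hypothetical split threshold

print(x < threshold)              # comparison on the float64 input
print(to_float32(x) < threshold)  # same comparison after a float32 cast
```

The two comparisons disagree because the float32 cast rounds the input up past the threshold, so a tree-like model would take a different branch for the same sample.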


Release history

0.10.0 (this version): Apr 26, 2022
0.9.0: Sep 18, 2020
0.8.0: Jun 18, 2020
0.7.0: Apr 7, 2020
0.6.0: Feb 17, 2020
0.5.0: Dec 1, 2019
0.4.0: Sep 28, 2019
0.3.1: Aug 15, 2019
0.3.0: May 21, 2019
0.2.1: Apr 17, 2019
0.2.0: Mar 22, 2019
0.1.1: Mar 5, 2019
0.1.0: Feb 12, 2019

