
Transportation of ML models

Project description




Overview

PyMilo is an open-source Python package that provides a simple, efficient, and safe way to export pre-trained machine learning models in a transparent format. The exported model can then be used in other environments, transferred across platforms, and shared with others. PyMilo lets users export models trained with popular Python libraries such as scikit-learn, and then use them in deployment environments or share them without exposing the underlying code or dependencies. The transparency of the exported models improves reliability and safety for end users, since it avoids the risks associated with binary or pickle formats.


Installation

PyPI

Source code

Conda

Usage

Import/Export

Imagine you want to train a LinearRegression model representing this equation: $y = x_0 + 2x_1 + 3$. You will create data points (X, y) and train your model as follows.

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> y = np.dot(X, np.array([1, 2])) + 3
# y = 1 * x_0 + 2 * x_1 + 3
>>> model = LinearRegression().fit(X, y)
>>> pred = model.predict(np.array([[3, 5]]))
# pred = [16.] (=1 * 3 + 2 * 5 + 3)

Using PyMilo's Export class, you can serialize your trained model and export it to a JSON file.

>>> from pymilo import Export
>>> Export(model).save("model.json")

You can now inspect your model as a JSON file.

{
    "data": {
        "fit_intercept": true,
        "copy_X": true,
        "n_jobs": null,
        "positive": false,
        "n_features_in_": 2,
        "coef_": {
            "pymiloed-ndarray-list": [
                1.0000000000000002,
                1.9999999999999991
            ],
            "pymiloed-ndarray-dtype": "float64",
            "pymiloed-ndarray-shape": [
                2
            ],
            "pymiloed-data-structure": "numpy.ndarray"
        },
        "rank_": 2,
        "singular_": {
            "pymiloed-ndarray-list": [
                1.618033988749895,
                0.6180339887498948
            ],
            "pymiloed-ndarray-dtype": "float64",
            "pymiloed-ndarray-shape": [
                2
            ],
            "pymiloed-data-structure": "numpy.ndarray"
        },
        "intercept_": {
            "value": 3.0000000000000018,
            "np-type": "numpy.float64"
        }
    },
    "sklearn_version": "1.4.2",
    "pymilo_version": "0.8",
    "model_type": "LinearRegression"
}

You can see all the learned parameters of the model in this file and change them if you want. This JSON representation is a transparent version of your model.
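Because the export is plain JSON, it can be inspected with the standard library alone. The following is a minimal sketch, assuming the file layout shown above; key names such as "pymiloed-ndarray-list" are copied from that example and may differ between PyMilo versions. It reads the learned parameters from an abbreviated copy of the JSON and recomputes the prediction for [3, 5] by hand. The predict_by_hand helper is hypothetical and is not part of PyMilo.

```python
import json

# Abbreviated copy of the exported model.json shown above.
exported = """
{
    "data": {
        "coef_": {
            "pymiloed-ndarray-list": [1.0000000000000002, 1.9999999999999991],
            "pymiloed-ndarray-dtype": "float64",
            "pymiloed-ndarray-shape": [2],
            "pymiloed-data-structure": "numpy.ndarray"
        },
        "intercept_": {"value": 3.0000000000000018, "np-type": "numpy.float64"}
    },
    "model_type": "LinearRegression"
}
"""


def predict_by_hand(model_json, x):
    """Hypothetical helper: evaluate y = coef . x + intercept from the JSON."""
    data = model_json["data"]
    coef = data["coef_"]["pymiloed-ndarray-list"]
    intercept = data["intercept_"]["value"]
    return sum(c * xi for c, xi in zip(coef, x)) + intercept


model_json = json.loads(exported)
prediction = predict_by_hand(model_json, [3, 5])
print(model_json["model_type"], round(prediction, 6))
```

The hand-computed value agrees with the model's own prediction of 16 up to floating-point rounding.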

Now let's load it back. You can do this with PyMilo's Import class.

>>> from pymilo import Import
>>> model = Import("model.json").to_model()
>>> pred = model.predict(np.array([[3, 5]]))
# pred = [16.] (=1 * 3 + 2 * 5 + 3)

This loaded model is exactly the same as the original trained model.

ML streaming

You can serve your ML model from a remote server using the ML streaming feature of PyMilo.

⚠️ The ML streaming feature is available in versions >=1.0

⚠️ To use the ML streaming feature, make sure you have installed the streaming mode of PyMilo

Server

Let's assume you are on the remote server and want to import the exported JSON file and start serving your model.

>>> from pymilo import Import
>>> from pymilo.streaming import PymiloServer
>>> my_model = Import("model.json").to_model()
>>> communicator = PymiloServer(model=my_model, port=8000).communicator
>>> communicator.run()

Now PymiloServer runs on port 8000 and exposes a REST API to upload and download the model and to retrieve its attributes, either data attributes such as model.coef_ or method attributes such as model.predict(x_test).
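The difference between data attributes and method attributes can be illustrated without a server. The sketch below is not PyMilo's implementation; TinyModel and resolve_attribute are hypothetical names used only to show how a request for an attribute might either return a stored value or call a method with the supplied arguments.

```python
class TinyModel:
    """Hypothetical stand-in for a fitted model: y = 1*x0 + 2*x1 + 3."""

    def __init__(self):
        self.coef_ = [1.0, 2.0]
        self.intercept_ = 3.0

    def predict(self, rows):
        return [
            sum(c * v for c, v in zip(self.coef_, row)) + self.intercept_
            for row in rows
        ]


def resolve_attribute(model, name, args=(), kwargs=None):
    """Return a data attribute as-is, or call a method attribute."""
    attribute = getattr(model, name)
    if callable(attribute):
        return attribute(*args, **(kwargs or {}))
    return attribute


model = TinyModel()
coefficients = resolve_attribute(model, "coef_")
predictions = resolve_attribute(model, "predict", args=([[3, 5]],))
print(coefficients, predictions)
```

A real server would additionally serialize the result before sending it back, which this sketch leaves out.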

Client

Using PymiloClient, you can connect to the remote PymiloServer and execute any functionality that the given ML model provides. For example, to run the predict function on your remote ML model and get the result:

>>> from pymilo.streaming import PymiloClient
>>> pymilo_client = PymiloClient(mode=PymiloClient.Mode.LOCAL, server_url="SERVER_URL")
>>> pymilo_client.toggle_mode(PymiloClient.Mode.DELEGATE)
>>> result = pymilo_client.predict(x_test)

ℹ️ If you've deployed PymiloServer locally (on port 8000 for instance), then SERVER_URL would be http://127.0.0.1:8000

You can also download the remote ML model to your local machine and execute functions on it locally.

Calling the download function on PymiloClient syncs the local model that PymiloClient wraps with the remote ML model. It does not save the model directly to a file.

>>> pymilo_client.download()

If you want to save the ML model to a local file, you can use the Export class.

>>> from pymilo import Export
>>> Export(pymilo_client.model).save("model.json")

Now that you've synced the remote model with your local model, you can run functions.

>>> pymilo_client.toggle_mode(mode=PymiloClient.Mode.LOCAL)
>>> result = pymilo_client.predict(x_test)

PymiloClient wraps the ML model, either the local one or the remote one. You can work with PymiloClient exactly as you would with the ML model itself, calling the same functions with the same signatures.

ℹ️ With the toggle_mode function you can specify whether PymiloClient applies requests to the local ML model, pymilo_client.toggle_mode(mode=Mode.LOCAL), or delegates them to the remote server, pymilo_client.toggle_mode(mode=Mode.DELEGATE)
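The wrapping behavior described above can be sketched with a small proxy class. This is an illustration under stated assumptions, not PyMilo's code: ToggleClient, FakeRemote, and TinyModel are hypothetical, and the "remote" side is simulated in-process instead of going over the network.

```python
from enum import Enum


class Mode(Enum):
    LOCAL = 1
    DELEGATE = 2


class TinyModel:
    """Hypothetical model implementing y = x0 + 2*x1 + 3."""

    def predict(self, rows):
        return [row[0] + 2 * row[1] + 3 for row in rows]


class FakeRemote:
    """Hypothetical stand-in for a network call to the server."""

    def __init__(self, model):
        self._model = model
        self.calls = []

    def call(self, name, *args, **kwargs):
        self.calls.append(name)  # record what was delegated
        return getattr(self._model, name)(*args, **kwargs)


class ToggleClient:
    """Forwards unknown attributes to the local model or to the remote."""

    def __init__(self, local_model, remote, mode=Mode.LOCAL):
        self.model = local_model
        self.remote = remote
        self.mode = mode

    def toggle_mode(self, mode):
        self.mode = mode

    def __getattr__(self, name):
        # Only reached for names that are not defined on the client itself.
        if self.mode is Mode.LOCAL:
            return getattr(self.model, name)
        return lambda *args, **kwargs: self.remote.call(name, *args, **kwargs)


client = ToggleClient(TinyModel(), FakeRemote(TinyModel()))
local_result = client.predict([[3, 5]])
client.toggle_mode(Mode.DELEGATE)
remote_result = client.predict([[3, 5]])
print(local_result, remote_result, client.remote.calls)
```

The caller writes client.predict(...) in both modes; only the mode decides where the call is executed.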

Supported ML models

Model family                      scikit-learn   PyTorch
Linear Models                     ✅             -
Neural Networks                   ✅             -
Trees                             ✅             -
Clustering                        ✅             -
Naïve Bayes                       ✅             -
Support Vector Machines (SVMs)    ✅             -
Nearest Neighbors                 ✅             -
Ensemble Models                   ✅             -
Pipeline Model                    ✅             -
Preprocessing Models              ✅             -

Details are available in Supported Models.

Issues & bug reports

Just file an issue and describe it. We'll check it ASAP! Alternatively, send an email to pymilo@openscilab.com.

  • Please complete the issue template

You can also join our Discord server.

Acknowledgments

The Python Software Foundation (PSF) partially supported the PyMilo library with a grant for version 1.0. The PSF is the organization behind Python. Its mission is to promote, protect, and advance the Python programming language and to support and facilitate the growth of a diverse and international community of Python programmers.


Trelis Research partially supported the PyMilo library with a grant for version 1.0. Trelis Research provides tools and tutorials for businesses and developers looking to fine-tune and deploy large language models.


Show your support

Star this repo

Give a ⭐️ if this project helped you!

Donate to our project

If you like our project, and we hope that you do, please consider supporting us. Our project is not, and never will be, run for profit. We need the money just so we can continue doing what we do ;-) .


Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

Unreleased

1.0 - 2024-09-16

Added

  • Compression method test in ML Streaming RESTful testcases
  • CLI handler in tests/test_ml_streaming/run_server.py
  • Compression Enum in streaming.compressor.py
  • GZIPCompressor class in streaming.compressor.py
  • ZLIBCompressor class in streaming.compressor.py
  • LZMACompressor class in streaming.compressor.py
  • BZ2Compressor class in streaming.compressor.py
  • encrypt_compress function in PymiloClient
  • parse function in RESTServerCommunicator
  • is_callable_attribute function in PymiloServer
  • streaming.param.py
  • attribute_type function in RESTServerCommunicator
  • AttributeTypePayload class in RESTServerCommunicator
  • attribute_type function in RESTClientCommunicator
  • Mode Enum in PymiloClient
  • Import from url testcases
  • download_model function in utils.util.py
  • PymiloServer class in streaming.pymilo_server.py
  • PymiloClient class in PymiloClient
  • Communicator interface in streaming.interfaces.py
  • RESTClientCommunicator class in streaming.communicator.py
  • RESTServerCommunicator class in streaming.communicator.py
  • Compressor interface in streaming.interfaces.py
  • DummyCompressor class in streaming.compressor.py
  • Encryptor interface in streaming.interfaces.py
  • DummyEncryptor class in streaming.encryptor.py
  • ML Streaming RESTful testcases
  • streaming-requirements.txt

Changed

  • README.md updated
  • ML Streaming RESTful testcases
  • attribute_call function in RESTServerCommunicator
  • AttributeCallPayload class in RESTServerCommunicator
  • upload function in RESTClientCommunicator
  • download function in RESTClientCommunicator
  • __init__ function in RESTClientCommunicator
  • attribute_calls function in RESTClientCommunicator
  • requests added to requirements.txt
  • uvicorn, fastapi, requests and pydantic added to dev-requirements.txt
  • ML Streaming RESTful testcases
  • __init__ function in PymiloServer
  • __getattr__ function in PymiloClient
  • __init__ function in PymiloClient
  • toggle_mode function in PymiloClient
  • upload function in PymiloClient
  • download function in PymiloClient
  • __init__ function in PymiloServer
  • serialize_cfnode function in transporters.cfnode_transporter.py
  • __init__ function in Import class
  • serialize function in transporters.tree_transporter.py
  • deserialize function in transporters.tree_transporter.py
  • serialize function in transporters.sgdoptimizer_transporter.py
  • deserialize function in transporters.sgdoptimizer_transporter.py
  • serialize function in transporters.randomstate_transporter.py
  • deserialize function in transporters.randomstate_transporter.py
  • serialize function in transporters.bunch_transporter.py
  • deserialize function in transporters.bunch_transporter.py
  • serialize function in transporters.adamoptimizer_transporter.py
  • deserialize function in transporters.adamoptimizer_transporter.py
  • serialize_linear_model function in chains.linear_model_chain.py
  • serialize_ensemble function in chains.ensemble_chain.py
  • serialize function in GeneralDataStructureTransporter Transporter refactored
  • get_deserialized_list function in GeneralDataStructureTransporter Transporter refactored
  • Export class call by reference bug fixed

0.9 - 2024-07-01

Added

  • Anaconda workflow
  • prefix_list function in utils.util.py
  • KBinsDiscretizer preprocessing model
  • PowerTransformer preprocessing model
  • SplineTransformer preprocessing model
  • TargetEncoder preprocessing model
  • QuantileTransformer preprocessing model
  • RobustScaler preprocessing model
  • PolynomialFeatures preprocessing model
  • OrdinalEncoder preprocessing model
  • Normalizer preprocessing model
  • MaxAbsScaler preprocessing model
  • MultiLabelBinarizer preprocessing model
  • KernelCenterer preprocessing model
  • FunctionTransformer preprocessing model
  • Binarizer preprocessing model
  • Preprocessing models test runner

Changed

  • Command enum class in transporter.py
  • SerializationErrorTypes enum class in serialize_exception.py
  • DeserializationErrorTypes enum class in deserialize_exception.py
  • meta.yaml modified
  • NaN type in pymilo_param
  • NaN type transportation in GeneralDataStructureTransporter Transporter
  • BSpline Transportation in PreprocessingTransporter Transporter
  • one layer deeper transportation in PreprocessingTransporter Transporter
  • dictating outer ndarray dtype in GeneralDataStructureTransporter Transporter
  • preprocessing params fulfilled in pymilo_param
  • SUPPORTED_MODELS.md updated
  • README.md updated
  • serialize_possible_ml_model in the Ensemble chain

0.8 - 2024-05-06

Added

  • StandardScaler Transformer in pymilo_param.py
  • PreprocessingTransporter Transporter
  • ndarray shape config in GeneralDataStructure Transporter
  • util.py in chains
  • BinMapperTransporter Transporter
  • BunchTransporter Transporter
  • GeneratorTransporter Transporter
  • TreePredictorTransporter Transporter
  • AdaBoostClassifier model
  • AdaBoostRegressor model
  • BaggingClassifier model
  • BaggingRegressor model
  • ExtraTreesClassifier model
  • ExtraTreesRegressor model
  • GradientBoostingClassifier model
  • GradientBoostingRegressor model
  • HistGradientBoostingClassifier model
  • HistGradientBoostingRegressor model
  • RandomForestClassifier model
  • RandomForestRegressor model
  • IsolationForest model
  • RandomTreesEmbedding model
  • StackingClassifier model
  • StackingRegressor model
  • VotingClassifier model
  • VotingRegressor model
  • Pipeline model
  • Ensemble models test runner
  • Ensemble chain
  • SECURITY.md

Changed

  • Pipeline test updated
  • LabelBinarizer,LabelEncoder and OneHotEncoder got embedded in PreprocessingTransporter
  • Preprocessing support added to Ensemble chain
  • Preprocessing params initialized in pymilo_param
  • util.py in utils updated
  • test_pymilo.py updated
  • pymilo_func.py updated
  • linear_model_chain.py updated
  • neural_network_chain.py updated
  • decision_tree_chain.py updated
  • clustering_chain.py updated
  • naive_bayes_chain.py updated
  • neighbours_chain.py updated
  • svm_chain.py updated
  • GeneralDataStructure Transporter updated
  • LossFunction Transporter updated
  • AbstractTransporter updated
  • Tests config modified
  • Unequal sklearn version error added in pymilo_param.py
  • Ensemble params initialized in pymilo_param
  • Ensemble support added to pymilo_func.py
  • SUPPORTED_MODELS.md updated
  • README.md updated

0.7 - 2024-04-03

Added

  • pymilo_nearest_neighbor_test function added to test_pymilo.py
  • NeighborsTreeTransporter Transporter
  • LocalOutlierFactor model
  • RadiusNeighborsClassifier model
  • RadiusNeighborsRegressor model
  • NearestCentroid model
  • NearestNeighbors model
  • KNeighborsClassifier model
  • KNeighborsRegressor model
  • Neighbors models test runner
  • Neighbors chain

Changed

  • Tests config modified
  • Neighbors params initialized in pymilo_param
  • Neighbors support added to pymilo_func.py
  • SUPPORTED_MODELS.md updated
  • README.md updated

0.6 - 2024-03-27

Added

  • deserialize_primitive_type function in GeneralDataStructureTransporter
  • is_deserialized_ndarray function in GeneralDataStructureTransporter
  • deep_deserialize_ndarray function in GeneralDataStructureTransporter
  • deep_serialize_ndarray function in GeneralDataStructureTransporter
  • SVR model
  • SVC model
  • One Class SVM model
  • NuSVR model
  • NuSVC model
  • Linear SVR model
  • Linear SVC model
  • SVM models test runner
  • SVM chain

Changed

  • pymilo_param.py updated
  • pymilo_obj.py updated to use predefined strings
  • TreeTransporter updated
  • get_homogeneous_type function in util.py updated
  • GeneralDataStructureTransporter updated to use deep ndarray serializer & deserializer
  • check_str_in_iterable updated
  • Label Binarizer Transporter updated
  • Function Transporter updated
  • CFNode Transporter updated
  • Bisecting Tree Transporter updated
  • Tests config modified
  • SVM params initialized in pymilo_param
  • SVM support added to pymilo_func.py
  • SUPPORTED_MODELS.md updated
  • README.md updated

0.5 - 2024-01-31

Added

  • reset function in the Transport interface
  • reset function implementation in AbstractTransporter
  • Gaussian Naive Bayes declared as GaussianNB model
  • Multinomial Naive Bayes model declared as MultinomialNB model
  • Complement Naive Bayes model declared as ComplementNB model
  • Bernoulli Naive Bayes model declared as BernoulliNB model
  • Categorical Naive Bayes model declared as CategoricalNB model
  • Naive Bayes models test runner
  • Naive Bayes chain

Changed

  • Transport function of AbstractTransporter updated
  • fix the order of CFNode fields serialization in CFNodeTransporter
  • GeneralDataStructureTransporter support list of ndarray with different shapes
  • Tests config modified
  • Naive Bayes params initialized in pymilo_param
  • Naive Bayes support added to pymilo_func.py
  • SUPPORTED_MODELS.md updated
  • README.md updated

0.4 - 2024-01-22

Added

  • has_named_parameter method in util.py
  • CFSubcluster Transporter(inside CFNode Transporter)
  • CFNode Transporter
  • Birch model
  • SpectralBiclustering model
  • SpectralCoclustering model
  • MiniBatchKMeans model
  • feature_request.yml template
  • config.yml for issue template
  • BayesianGaussianMixture model
  • serialize_tuple method in GeneralDataStructureTransporter
  • import_function method in util.py
  • Function Transporter
  • FeatureAgglomeration model
  • HDBSCAN model
  • GaussianMixture model
  • OPTICS model
  • DBSCAN model
  • AgglomerativeClustering model
  • SpectralClustering model
  • MeanShift model
  • AffinityPropagation model
  • KMeans model
  • Clustering models test runner
  • Clustering chain

Changed

  • LossFunctionTransporter enhanced to handle scikit 1.4.0 _loss_function_ field
  • Codacy Static Code Analyzer's suggestions applied
  • Spectral Clustering test folder refactored
  • Bug report template modified
  • GeneralDataStructureTransporter updated
  • Tests config modified
  • Clustering data set preparation added to data_exporter.py
  • Clustering params initialized in pymilo_param
  • Clustering support added to pymilo_func.py
  • Python 3.12 added to test.yml
  • dev-requirements.txt updated
  • Code quality badges added to README.md
  • SUPPORTED_MODELS.md updated
  • README.md updated

0.3 - 2023-09-27

Added

  • scikit-learn decision tree models
  • ExtraTreeClassifier model
  • ExtraTreeRegressor model
  • DecisionTreeClassifier model
  • DecisionTreeRegressor model
  • Tree Transporter
  • Decision Tree chain

Changed

  • Tests config modified
  • DecisionTree params initialized in pymilo_param
  • Decision Tree support added to pymilo_func.py

0.2 - 2023-08-02

Added

  • scikit-learn neural network models
  • MLP Regressor model
  • MLP Classifier model
  • BernoulliRBM model
  • SGDOptimizer transporter
  • RandomState(MT19937) transporter
  • Adamoptimizer transporter
  • Neural Network chain
  • Neural Network exceptions
  • ndarray_to_list method in GeneralDataStructureTransporter
  • list_to_ndarray method in GeneralDataStructureTransporter
  • neural_network_chain.py chain

Changed

  • GeneralDataStructure Transporter updated
  • LabelBinarizer Transporter updated
  • linear model chain updated
  • GeneralDataStructure transporter enhanced
  • LabelBinarizer transporter updated
  • transporters' chain router added to pymilo func
  • NeuralNetwork params initialized in pymilo_param
  • pymilo_test updated to support multiple models
  • linear_model_chain refactored

0.1 - 2023-06-29

Added

  • scikit-learn linear models support
  • Export class
  • Import class
