
Standard framework for wrapping ML and other algorithms


Rho ML

Rho ML provides a thin, thoughtful, and proven interface for putting Data Science to work in production and enterprise-grade environments. Rho uses Rho ML for workloads as varied as NLP, Computer Vision, and Decision Modeling for professional racing. We see Rho ML as having a few key benefits.

#. Any Model (we won't dictate your libraries of choice!)

  • Any Model with a Python interface
    • PyTorch
    • TensorFlow
    • spaCy
    • Keras
    • [insert your preferred library here]
    • ... or some other custom Python code

#. Out-of-the-Box Versioning (yet customizable)

  • Versioning is a common blind spot in data science, compared to the de facto standard of Semver in much of software engineering and modern CI/CD workflows.
  • Rho ML provides this out of the box, no strings attached.
  • That said, we get that not all versioning is created equal, and provide easy access to customizing version patterns.

#. Default Serialization and Deserialization (yet customizable)

  • Storing models for production workloads is non-trivial.
  • Frequently, libraries (including those listed above) provide "hello world" and "quickstart" guides that expect you to be on a local development machine with a "save to disk" type interface. Rho ML provides instant access to easy, production-grade methods to store and retrieve models.
  • The default option may not work for every case, so Rho ML allows easy modification as necessary for advanced use cases.

#. Cloud and Cache (speed versus cost)

  • Not all models are created equal with respect to production workloads. Storing and retrieving from the cloud versus locally (cached locally) makes a tremendous difference in speed and cost when dealing with models that often run to tens of megabytes or even gigabytes.
  • Rho ML provides a sensible default for managing storage in both scenarios.

#. Shameless Plug (enterprise deployments)

  • Every Rho ML model has instant compatibility with Sermos for enterprise-scale deployments that need to handle anywhere from tens to tens of millions of transactions, scheduled tasks, models behind public APIs, or complex pipelines.

Rho ML is extremely easy to use and has only two external dependencies: attrs, and

Install

Install this software? Easy:

pip install rho-ml

Quickstart Guide

Here is a trivial example of a rules-based "model" implemented as a RhoModel, including serialization.

from rho_ml import RhoModel, ValidationFailedError, Version, LocalModelStorage

class MyModel(RhoModel):

    def predict_logic(self, prediction_data):
        """ Logic for running the model on some prediction data """
        return prediction_data * 5

    def validate_prediction_input(self, prediction_data):
        """ Ensure data has an appropriate type before prediction """
        if not isinstance(prediction_data, (int, float)):
            raise ValidationFailedError("Prediction data wasn't numeric!")

    def validate_prediction_output(self, data):
        """ Ensure the prediction result is between 0 and 5 """
        if not 0 <= data <= 5:
            raise ValidationFailedError(
                "Prediction result should always be between 0 and 5!")


some_instance = MyModel(name='some_name',
                        version=Version.from_string("0.0.1"))
result = some_instance.predict(0.5, run_validation=True)  # works!

try:
    # 10 * 5 = 50 falls outside the validated output range of 0 to 5
    result_2 = some_instance.predict(10, run_validation=True)  # fails!
except ValidationFailedError as err:
    print(err)

# Store the model locally, then look it up again by name and version pattern
local_storage = LocalModelStorage(base_path='./some-folder')
local_storage.store(some_instance)

stored_key = local_storage.get_key_from_pattern(model_name='some_name',
                                                version_pattern='0.*.*')
deserialized = local_storage.retrieve(key=stored_key)
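
The version pattern '0.*.*' used above reads as "any 0.x.y release". As a rough illustration of how such a pattern could be resolved against a set of stored versions, here is a minimal standalone sketch. It does not use Rho ML and is not its implementation: the helper names pattern_to_regex and highest_match are hypothetical, and picking the highest matching version is an assumption about what a sensible lookup would do.

```python
import re


def pattern_to_regex(version_pattern):
    """Compile a pattern such as '0.*.*' into a regex (hypothetical helper)."""
    parts = version_pattern.split(".")
    if len(parts) != 3:
        raise ValueError("expected a pattern of the form MAJOR.MINOR.PATCH")
    # '*' matches any non-negative integer; other parts must match literally.
    regex_parts = [r"\d+" if part == "*" else re.escape(part) for part in parts]
    return re.compile(r"\.".join(regex_parts))


def highest_match(versions, version_pattern):
    """Return the highest version string matching the pattern, or None."""
    regex = pattern_to_regex(version_pattern)
    matches = [v for v in versions if regex.fullmatch(v)]
    # Compare numerically so that '0.10.0' ranks above '0.9.0'.
    return max(matches,
               key=lambda v: tuple(int(n) for n in v.split(".")),
               default=None)


stored_versions = ["0.0.1", "0.9.0", "0.10.0", "1.0.0"]
print(highest_match(stored_versions, "0.*.*"))
```

Comparing versions as tuples of integers rather than as strings is the important detail here, since plain string comparison would rank "0.9.0" above "0.10.0".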

Core Concepts

Rho Model

The RhoModel base class is the central concept in Rho ML. A RhoModel is a thin wrapper that enforces what we believe are the central tasks a machine learning model should accomplish, provides a consistent interface to all models, and provides the scaffolding for writing models with validated input and output.
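
To make the "validated input and output" idea concrete, here is a minimal standalone sketch of the pattern: a base class whose predict method runs an input check, the model logic, and then an output check. This is an illustration only, not Rho ML's source. The class names ValidatedModel, TimesFive, and ValidationFailed are hypothetical, and the hook names simply mirror those in the Quickstart Guide.

```python
class ValidationFailed(Exception):
    """Hypothetical stand-in for a validation error."""


class ValidatedModel:
    """Illustrative base class: subclasses override the three hooks below."""

    def validate_prediction_input(self, prediction_data):
        """Default: accept any input."""

    def validate_prediction_output(self, data):
        """Default: accept any output."""

    def predict_logic(self, prediction_data):
        raise NotImplementedError

    def predict(self, prediction_data, run_validation=True):
        # Validate input, run the model, then validate the result.
        if run_validation:
            self.validate_prediction_input(prediction_data)
        result = self.predict_logic(prediction_data)
        if run_validation:
            self.validate_prediction_output(result)
        return result


class TimesFive(ValidatedModel):
    def predict_logic(self, prediction_data):
        return prediction_data * 5

    def validate_prediction_input(self, prediction_data):
        if not isinstance(prediction_data, (int, float)):
            raise ValidationFailed("Prediction data wasn't numeric!")

    def validate_prediction_output(self, data):
        if not 0 <= data <= 5:
            raise ValidationFailed("Prediction result should be between 0 and 5!")


model = TimesFive()
print(model.predict(0.5))                        # passes both checks
print(model.predict(10, run_validation=False))   # checks skipped
```

Keeping validation in the base class means callers interact with a single predict method regardless of which library sits underneath.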

TODO: Add additional detail on each component of a RhoModel and provide several examples.

Model Locator

A "model locator" in Rho ML is the combination of the model name, the model version, and a delimiter between them.

This is important for storage and retrieval of models as they evolve over time. Using the default settings is highly recommended, but each component is configurable.

By default:

  • Model names may contain any alphanumeric characters
  • Delimiter is "_" (the underscore character)
  • Model versions must adhere to semver versioning

e.g. MyModel_0.1.0
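
As a rough sketch of the default convention described above, the following standalone helpers build and parse a locator string. They are hypothetical (build_locator and parse_locator are not part of Rho ML's API), and the semver check assumes plain MAJOR.MINOR.PATCH versions without pre-release or build suffixes.

```python
import re

DELIMITER = "_"
# Plain MAJOR.MINOR.PATCH only; pre-release/build suffixes are not handled here.
SEMVER = r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"


def build_locator(model_name, version, delimiter=DELIMITER):
    """Join a model name and version into a locator (hypothetical helper)."""
    if not re.fullmatch(r"[A-Za-z0-9]+", model_name):
        raise ValueError("model name must be alphanumeric")
    if not re.fullmatch(SEMVER, version):
        raise ValueError("version must look like MAJOR.MINOR.PATCH")
    return f"{model_name}{delimiter}{version}"


def parse_locator(locator, delimiter=DELIMITER):
    """Split a locator back into (model_name, version)."""
    # Split on the last delimiter, since the version never contains one.
    name, sep, version = locator.rpartition(delimiter)
    if not sep or not name or not re.fullmatch(SEMVER, version):
        raise ValueError(f"not a valid locator: {locator!r}")
    return name, version


print(build_locator("MyModel", "0.1.0"))
print(parse_locator("MyModel_0.1.0"))
```

Restricting names to alphanumeric characters keeps the underscore delimiter unambiguous, which is one reason to stay with the defaults.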

Serialization

TODO: Describe concept of serializing/deserializing.

Testing

To run the tests you need pyenv installed on your system, along with all Python versions listed in tox.ini under envlist.

Install the required Python versions noted in tox.ini, e.g.

    pyenv install 3.7.4

Install the testing requirements locally.

pip install -e .[test]

Now, run the tests:

tox

