Standard framework for wrapping ML and other algorithms
Rho ML
Rho ML provides a thin, thoughtful, and proven interface for putting Data Science to work in production and enterprise-grade environments. Rho uses Rho ML for workloads as varied as NLP, Computer Vision, and Decision Modeling for professional racing. We see Rho ML as having a few key benefits.
#. Any Model (we won't dictate your libraries of choice!)
- Any model with a Python interface
- PyTorch
- Tensorflow
- spaCy
- Keras
- [insert your preferred library here]
- ... or some other custom Python code
#. Out-of-the-Box Versioning (yet customizable)
- Versioning is a common blind spot in data science, compared with the de facto standard of Semver in much of software engineering and modern CI/CD workflows.
- Rho ML provides this out of the box, no strings attached.
- That said, we get that not all versioning is created equal, so Rho ML makes it easy to customize version patterns.
#. Default Serialization and Deserialization (yet customizable)
- Storing models for production workloads is non-trivial.
- Frequently, libraries (including those listed above) write their "hello world" and "quickstart" guides assuming you're on a local development machine with a "save to disk" type interface. Rho ML provides instant access to easy, production-grade methods to store and retrieve models.
- The default option may not work for every case, so Rho ML allows easy modification for advanced use cases.
#. Cloud and Cache (speed versus cost)
- Not all models are created equal with respect to production workloads. Storing and retrieving from the cloud versus locally (cached locally) makes a tremendous difference in speed and cost when dealing with models that often range from tens of megabytes to gigabytes.
- Rho ML provides a sensible default for managing storage in both scenarios.
#. Shameless Plug (enterprise deployments)
- Every Rho ML model has instant compatibility with Sermos for enterprise-scale deployments that need tens to tens of millions of transactions, scheduled tasks, models behind public APIs, or complex pipelines.
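The cloud-versus-cache trade-off from the list above can be illustrated with a small, self-contained sketch. It does not use Rho ML's own storage classes: `CachedStore`, `fetch_remote`, and `fake_cloud` are hypothetical names invented for this example, and an in-memory dict stands in for cloud storage.

```python
import tempfile
from pathlib import Path
from typing import Callable


class CachedStore:
    """Hypothetical cache-first store; not part of Rho ML's API."""

    def __init__(self, cache_dir: Path, fetch_remote: Callable[[str], bytes]):
        self.cache_dir = cache_dir
        self.fetch_remote = fetch_remote  # stand-in for a slow, billable download
        self.remote_calls = 0

    def retrieve(self, key: str) -> bytes:
        cached = self.cache_dir / key
        if cached.exists():
            # Cache hit: no network round trip and no transfer cost.
            return cached.read_bytes()
        # Cache miss: download once, then keep a local copy for next time.
        self.remote_calls += 1
        payload = self.fetch_remote(key)
        cached.write_bytes(payload)
        return payload


# An in-memory dict plays the role of cloud storage in this sketch.
fake_cloud = {"MyModel_0.1.0": b"serialized-model-bytes"}
store = CachedStore(Path(tempfile.mkdtemp()), fake_cloud.__getitem__)
first = store.retrieve("MyModel_0.1.0")   # fetched from the stand-in cloud
second = store.retrieve("MyModel_0.1.0")  # served from the local cache
```

Only the first call touches the remote source. For multi-gigabyte models, that difference is what drives both latency and cost.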
Rho ML is extremely easy to use and has only two external dependencies attrs, and
Install
Install this software? Easy:
pip install rho-ml
Quickstart Guide
Here is a trivial example of a rules-based "model" implemented as a RhoModel, including serialization.
from rho_ml import RhoModel, ValidationFailedError, Version, LocalModelStorage


class MyModel(RhoModel):

    def predict_logic(self, prediction_data):
        """ Logic for running the model on some prediction data """
        return prediction_data * 5

    def validate_prediction_input(self, prediction_data):
        """ Ensure data has an appropriate type before prediction """
        if not isinstance(prediction_data, (int, float)):
            raise ValidationFailedError("Prediction data wasn't numeric!")

    def validate_prediction_output(self, data):
        """ Ensure the prediction result is between 0 and 5 """
        if not 0 <= data <= 5:
            raise ValidationFailedError(
                "Prediction result should always be between 0 and 5!")


some_instance = MyModel(name='some_name',
                        version=Version.from_string("0.0.1"))
result = some_instance.predict(0.5, run_validation=True)  # works! (0.5 * 5 = 2.5)
result_2 = some_instance.predict(10, run_validation=True)  # fails! (10 * 5 = 50)

# Store the model locally, then look it up again by name and version pattern
local_storage = LocalModelStorage(base_path='./some-folder')
local_storage.store(some_instance)
stored_key = local_storage.get_key_from_pattern(model_name='some_name',
                                                version_pattern='0.*.*')
deserialized = local_storage.retrieve(key=stored_key)
Core Concepts
Rho Model
The RhoModel base class is the central concept in Rho ML. A RhoModel is a basic wrapper that enforces what we believe are the central tasks a machine learning model should accomplish, provides a consistent interface to 'all models', and provides the scaffolding for writing models that have validated input and output.
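To make the wrapper idea concrete, here is a minimal sketch of the validate, predict, validate flow in plain Python. It is an illustration of the pattern only and not Rho ML's actual implementation: `ValidatedModel`, `InvalidData`, and `TimesFive` are hypothetical names, and the real RhoModel hooks are the ones shown in the Quickstart Guide.

```python
from abc import ABC, abstractmethod


class InvalidData(Exception):
    """Hypothetical error type used only in this sketch."""


class ValidatedModel(ABC):
    """Illustrative stand-in for the wrapper idea; not Rho ML's code."""

    @abstractmethod
    def predict_logic(self, data):
        """Subclasses put the actual model call here."""

    def validate_input(self, data):
        """Override to reject bad inputs; the default accepts everything."""

    def validate_output(self, result):
        """Override to reject bad results; the default accepts everything."""

    def predict(self, data, run_validation=True):
        # One public entry point gives every model the same interface.
        if run_validation:
            self.validate_input(data)
        result = self.predict_logic(data)
        if run_validation:
            self.validate_output(result)
        return result


class TimesFive(ValidatedModel):
    def predict_logic(self, data):
        return data * 5

    def validate_input(self, data):
        if not isinstance(data, (int, float)):
            raise InvalidData("input must be numeric")

    def validate_output(self, result):
        if not 0 <= result <= 5:
            raise InvalidData("result must be between 0 and 5")


model = TimesFive()
ok = model.predict(0.5)  # passes both checks
try:
    model.predict(10)    # 10 * 5 = 50 fails the output check
    rejected = False
except InvalidData:
    rejected = True
```

Keeping validation in overridable hooks lets callers rely on a single `predict` call, while each model decides what counts as valid input and output.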
TODO: Add additional detail on each component of a RhoModel and provide several examples.
Model Locator
A "model locator" in Rho ML is the combination of the model name, the model version, and a delimiter between them.
This is important for storage and retrieval of models as they evolve over time. Using the default settings is highly recommended but each component is configurable.
By default:
- Model names may contain any alphanumeric characters
- Delimiter is "_" (the underscore character)
- Model versions must adhere to semver versioning
e.g. MyModel_0.1.0
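The default locator format can be sketched with a few helper functions. This is a simplified illustration under stated assumptions, not Rho ML's implementation: `build_locator`, `parse_locator`, and `latest_match` are hypothetical names, only plain MAJOR.MINOR.PATCH versions are handled (no pre-release or build metadata), and the shell-style wildcard matching is an assumption about how a pattern such as '0.*.*' might be interpreted.

```python
import re
from fnmatch import fnmatchcase
from typing import Iterable, Optional, Tuple

DELIMITER = "_"  # the default delimiter described above
# Simplified: alphanumeric name, delimiter, plain MAJOR.MINOR.PATCH version.
LOCATOR_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9]+)" + re.escape(DELIMITER)
    + r"(?P<version>\d+\.\d+\.\d+)$"
)


def build_locator(name: str, version: str) -> str:
    """Join a model name and version into a locator string."""
    return f"{name}{DELIMITER}{version}"


def parse_locator(locator: str) -> Tuple[str, Tuple[int, int, int]]:
    """Split a locator into its name and a numeric version tuple."""
    match = LOCATOR_RE.match(locator)
    if match is None:
        raise ValueError(f"not a valid locator: {locator!r}")
    major, minor, patch = (int(part) for part in match["version"].split("."))
    return match["name"], (major, minor, patch)


def latest_match(locators: Iterable[str], name: str,
                 version_pattern: str) -> Optional[str]:
    """Return the highest-versioned locator matching the pattern, or None."""
    candidates = []
    for locator in locators:
        found_name, version = parse_locator(locator)
        version_text = ".".join(map(str, version))
        if found_name == name and fnmatchcase(version_text, version_pattern):
            candidates.append((version, locator))
    # Tuples compare numerically, so (0, 10, 0) ranks above (0, 2, 0).
    return max(candidates)[1] if candidates else None


stored = ["MyModel_0.2.0", "MyModel_0.10.0", "MyModel_1.0.0", "Other_0.3.0"]
newest_0x = latest_match(stored, "MyModel", "0.*.*")
```

Comparing versions as integer tuples rather than strings matters here: a string comparison would rank 0.2.0 above 0.10.0.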
Serialization
TODO: Describe concept of serializing/deserializing.
Testing
To run the tests you need to have pyenv running on your system, along with all Python versions listed in tox.ini under envlist.
- Install the required Python versions noted in tox.ini, e.g. pyenv install 3.7.4
- Install the testing requirements locally.
pip install -e .[test]
- Now, run the tests:
tox