Standard framework for wrapping ML and other algorithms
========================================================

Rho ML provides a thin, thoughtful, and proven interface for putting Data Science to work in production and enterprise-grade environments. Rho uses Rho ML for workloads as varied as NLP, Computer Vision, and Decision Modeling for professional racing. We see Rho ML as having a few key benefits.
#. Any Model (we won't dictate your libraries of choice!)

   - Any model with a Python interface can be wrapped.
   - Versioning is a common blind spot in data science, compared to the de facto standard of Semver in much of software engineering and modern CI/CD workflows.
   - Rho ML provides versioning out-of-the-box, no strings attached.
   - That said, we get that not all versioning is created equal, and provide easy access to customizing version patterns.

#. Default Serialization and Deserialization (yet customizable)

   - Storing models for production workloads is non-trivial.
   - Frequently, libraries write their "hello world" and "quickstart" guides expecting you're on a local development machine with a "save to disk" type interface. Rho ML provides instant access to easy, production-grade methods to store and retrieve models.
   - The default option may not fit every situation, so Rho ML makes it easy to customize for advanced use cases.

#. Cloud and Cache (speed versus cost)

   - Not all "models" are created equal with respect to production workloads. Storing and retrieving from the cloud versus a local cache makes a tremendous difference in speed and cost when dealing with models that often run to tens of megabytes or gigabytes.
   - Rho ML provides a sensible default for managing storage in both scenarios.

#. Shameless Plug (enterprise deployments)

   - Every Rho ML model has instant compatibility with Sermos for enterprise-scale deployments that need anything from tens to tens of millions of transactions, scheduled tasks, models behind public APIs, or complex pipelines.
Rho ML is extremely easy to use and has only two external dependencies, including ``attrs``.
Install this software? Easy:

.. code-block:: bash

   pip install rho-ml
Here is a trivial example of a rules-based "model" implemented as a ``RhoModel``:

.. code-block:: python

   from rho_ml import RhoModel, ValidationFailedError, Version, LocalModelStorage


   class MyModel(RhoModel):
       def predict_logic(self, prediction_data):
           """ Logic for running the model on some prediction data """
           return prediction_data * 5

       def validate_prediction_input(self, prediction_data):
           """ Ensure data has an appropriate type before prediction """
           if not isinstance(prediction_data, (int, float)):
               raise ValidationFailedError("Prediction data wasn't numeric!")

       def validate_prediction_output(self, data):
           """ Ensure the prediction result is between 0 and 5 """
           if not 0 <= data <= 5:
               raise ValidationFailedError(
                   "Prediction result should always be between 0 and 5!")


   some_instance = MyModel(name='some_name',
                           version=Version.from_string("0.0.1"))

   result = some_instance.predict(0.5, run_validation=True)    # works!
   result_2 = some_instance.predict(10, run_validation=True)   # fails!

   local_storage = LocalModelStorage(base_path='./some-folder')
   local_storage.store(some_instance)
   stored_key = local_storage.get_key_from_pattern(model_name='some_name',
                                                   version_pattern='0.*.*')
   deserialized = local_storage.retrieve(key=stored_key)
The ``RhoModel`` base class is the central concept in Rho ML. It is a basic
wrapper that enforces what we believe are the central tasks a machine
learning model should accomplish, provides a consistent interface to all
models, and provides the scaffolding for writing models with validated
input and output.
TODO: Add additional detail on each component of a RhoModel and provide several examples.
A "model locator" in Rho ML is the combination of the model name, the model version, and a delimiter between them.
This is important for storing and retrieving models as they evolve over time. Using the default settings is highly recommended, but each component is configurable:

- Model names may contain any alphanumeric characters
- The delimiter is "_" (the underscore character)
- Model versions must adhere to Semver versioning
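Conceptually, the default locator is just the name and version joined by the delimiter. A minimal sketch of the idea (the ``build_locator`` helper below is hypothetical and not part of the Rho ML API):

.. code-block:: python

   def build_locator(name: str, version: str, delimiter: str = "_") -> str:
       """Join an alphanumeric model name and a Semver version string."""
       return f"{name}{delimiter}{version}"

   # A model named "mymodel" at version 0.0.1 would be located as:
   print(build_locator("mymodel", "0.0.1"))  # mymodel_0.0.1

Keeping the name alphanumeric means the delimiter unambiguously separates the name from the version, which is what makes pattern-based lookups like ``'0.*.*'`` possible.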
TODO: Describe concept of serializing/deserializing.
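As a rough illustration of the concept (a generic ``pickle`` sketch, not Rho ML's actual serialization mechanism): serializing turns an in-memory model into bytes that can be stored anywhere, and deserializing reverses the process:

.. code-block:: python

   import pickle


   class ToyModel:
       """Stand-in for any Python model object."""
       def __init__(self, weight: float):
           self.weight = weight

       def predict(self, x: float) -> float:
           return self.weight * x


   model = ToyModel(weight=2.0)

   # Serialize: model -> bytes (suitable for disk, S3, Redis, etc.)
   blob = pickle.dumps(model)

   # Deserialize: bytes -> an equivalent model, ready to predict again
   restored = pickle.loads(blob)
   print(restored.predict(3.0))  # 6.0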
To run the tests you need to have ``pyenv`` running on your system, along
with all of the Python versions used by the test suite.

Install the required Python versions, for example:

.. code-block:: bash

   pyenv install 3.7.4

Install the testing requirements locally:

.. code-block:: bash

   pip install -e .[test]

Now, run the tests.