Machine learning model contracts with machine learning infrastructure
Twinn-ml-interface
Twinn-ml-interface is a Python package for data contracts between machine learning code and infrastructure.
Author: Royal HaskoningDHV
Installation
The easiest way to install the package is with pip:
pip install twinn-ml-interface
Model Interface
Purpose
The Model Interface defines the required methods and attributes that any ML model needs to have in order to run in our infrastructure.
Testing compliance of your model with the data contract
Instance of the Model Interface
Once all the attributes and methods of the Protocol ModelInterfaceV4 are implemented, including the correct type hints, a model is compliant with the interface if it passes the isinstance check with ModelInterfaceV4. You can find a base test in twinn_ml_interface/interface/model_test.py. The Darrow-Poc is an example of a model that follows ModelInterfaceV4.
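To illustrate how such an isinstance check behaves, here is a minimal sketch using a runtime-checkable Protocol from the standard library. ToyModelInterface, its model_name attribute, and the method signatures are hypothetical stand-ins chosen for this example; they are not the real ModelInterfaceV4 definition.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyModelInterface(Protocol):
    """Hypothetical stand-in for ModelInterfaceV4 (NOT the real protocol)."""

    model_name: str  # assumed attribute, for illustration only

    def train(self, data: dict) -> None: ...

    def predict(self, data: dict) -> list[float]: ...


class CompliantModel:
    """Implements every member of the toy protocol."""

    model_name = "toy"

    def train(self, data: dict) -> None:
        # Remember the mean of the target values.
        self.mean = sum(data["y"]) / len(data["y"])

    def predict(self, data: dict) -> list[float]:
        # Predict the stored mean for every input row.
        return [self.mean] * len(data["x"])


class IncompleteModel:
    """Lacks predict(), so it does not satisfy the toy protocol."""

    model_name = "toy"

    def train(self, data: dict) -> None:
        pass


is_compliant = isinstance(CompliantModel(), ToyModelInterface)
is_incomplete_compliant = isinstance(IncompleteModel(), ToyModelInterface)
print(is_compliant, is_incomplete_compliant)
```

Note that a runtime isinstance check against a Protocol only verifies that the members exist; it does not compare signatures or annotations. A static type checker is needed to catch mismatched type hints.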
Mock Executors
The executor class takes care of running the model in our infrastructure, either for training or for prediction. This package includes a mock executor that emulates that behaviour to some extent, which should make it clearer in what context the model class is used. Any model compliant with the Model Interface should be able to train and predict using the ExecutorMock, which can be found in twinn_ml_interface/mocks/mocks.py. The Darrow-Poc, for instance, follows ModelInterfaceV4 and can run using the ExecutorMock.
The steps and methods that the infrastructure and the mock executor run during training are:
- Read config: get_target_template(), get_train_window_finder_config_template()
- Initialize the model: initialize()
- Given the configuration for the train window finder in the previous steps, validate possible windows: validate_input_data()
- Read the data configuration to download all the needed data in a window selected by the previous step: get_data_config_template()
- Transform the input data as needed: preprocess()
- Train: train()
- Store the model: dump()
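The order of the training steps above can be sketched with a toy model and a toy runner. Only the method names come from the steps listed above. The signatures, return values, the min_rows setting, and the run_training helper are assumptions made for this example, and they do not reproduce the real ModelInterfaceV4 or ExecutorMock.

```python
import json
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical model: method names follow the steps above, signatures are assumed."""

    @staticmethod
    def get_target_template() -> str:
        return "flow"

    @staticmethod
    def get_train_window_finder_config_template() -> dict:
        return {"min_rows": 3}  # assumed setting, for illustration only

    @classmethod
    def initialize(cls) -> "ToyModel":
        return cls()

    def validate_input_data(self, data: list, min_rows: int) -> bool:
        return len(data) >= min_rows

    @staticmethod
    def get_data_config_template() -> list[str]:
        return ["flow"]

    def preprocess(self, data: list) -> list[float]:
        # Drop missing values.
        return [value for value in data if value is not None]

    def train(self, data: list[float]) -> None:
        self.mean = sum(data) / len(data)

    def dump(self, path: Path) -> None:
        path.write_text(json.dumps({"mean": self.mean}))


def run_training(model_cls, raw: list, folder: Path) -> dict:
    """Hypothetical runner that calls the training steps in the listed order."""
    target = model_cls.get_target_template()
    window_config = model_cls.get_train_window_finder_config_template()
    model = model_cls.initialize()
    if not model.validate_input_data(raw, window_config["min_rows"]):
        raise ValueError("no valid training window")
    features = model_cls.get_data_config_template()
    clean = model.preprocess(raw)
    model.train(clean)
    model.dump(folder / "model.json")
    return {"target": target, "features": features, "model": model}


with tempfile.TemporaryDirectory() as tmp:
    result = run_training(ToyModel, [1.0, None, 2.0, 3.0], Path(tmp))
    saved = json.loads((Path(tmp) / "model.json").read_text())

print(result["target"], saved)
```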
When the training is finished, the model can be used for predicting. The prediction steps are:
- Retrieve the model from storage and load it: load()
- Fetch the data needed for prediction, based on base_features if present and on get_data_config_template() otherwise
- Predict: predict()
- Load configuration to post predictions: get_result_template()
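The prediction steps above can be sketched the same way. As before, only the names load, base_features, get_data_config_template, predict, and get_result_template come from the steps listed above. Their signatures, the run_prediction helper, and the fetch callable are hypothetical, and this example assumes base_features is None when absent.

```python
import json
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical model: names follow the steps above, signatures are assumed."""

    base_features = None  # assumed: None means "not present"

    def __init__(self, mean: float) -> None:
        self.mean = mean

    @classmethod
    def load(cls, path: Path) -> "ToyModel":
        return cls(json.loads(path.read_text())["mean"])

    @staticmethod
    def get_data_config_template() -> list[str]:
        return ["flow"]

    def predict(self, data: list) -> list[float]:
        # Predict the stored mean for every input row.
        return [self.mean for _ in data]

    @staticmethod
    def get_result_template() -> str:
        return "flow_prediction"


def run_prediction(model_cls, path: Path, fetch) -> tuple:
    """Hypothetical runner that calls the prediction steps in the listed order."""
    model = model_cls.load(path)
    # Use base_features if present, otherwise fall back to the data config.
    features = model.base_features or model_cls.get_data_config_template()
    data = fetch(features)
    predictions = model.predict(data)
    return model_cls.get_result_template(), features, predictions


def fetch_rows(features: list[str]) -> list[dict]:
    # Stand-in for the infrastructure's data download: three rows per request.
    return [{name: 0.0 for name in features} for _ in range(3)]


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.json"
    model_path.write_text(json.dumps({"mean": 2.0}))
    result_name, used_features, predictions = run_prediction(ToyModel, model_path, fetch_rows)

print(result_name, used_features, predictions)
```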
Example of the Model Interface
Darrow Poc
The Darrow-Poc is an example of a model that follows ModelInterfaceV4.