datarobot-mlflow: client to synchronize an MLflow model with a DataRobot model
mlflow-integration
Provides a means of exporting a model from the MLflow model registry and pushing it to the DataRobot model registry.
Key-values are created from training parameters, metrics, tags, and artifacts in the MLflow model.
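To make the idea of key-values concrete, the sketch below flattens the parameters, metrics, and tags of a model into a list of named entries. It is not the library's code: the to_key_values helper, the category labels, and the record layout are assumptions made for illustration, and plain dictionaries stand in for MLflow objects. The prefix argument mirrors the --prefix option described under Options.

```python
from typing import Any, Dict, List


def to_key_values(
    params: Dict[str, Any],
    metrics: Dict[str, float],
    tags: Dict[str, str],
    prefix: str = "",
) -> List[Dict[str, Any]]:
    """Flatten MLflow-style dictionaries into key-value records.

    The record layout (name/category/value) is hypothetical and is not
    taken from the DataRobot API.
    """
    records: List[Dict[str, Any]] = []
    sources = (("parameter", params), ("metric", metrics), ("tag", tags))
    for category, values in sources:
        # Sort by name so the output order is deterministic.
        for name, value in sorted(values.items()):
            records.append(
                {"name": f"{prefix}{name}", "category": category, "value": value}
            )
    return records


if __name__ == "__main__":
    example = to_key_values(
        {"alpha": 0.5}, {"rmse": 0.79}, {"team": "pricing"}, prefix="mlflow."
    )
    for record in example:
        print(record)
```

Artifacts are left out of the sketch because they are files rather than scalar values.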
Setup
- Python 3.7 or later
- DataRobot 9.0 or later
pip install datarobot-mlflow
- if using Azure:
pip install "datarobot-mlflow[azure]"
Considerations
This integration library uses an API endpoint under Public Preview. The DataRobot user owning the API token used below must have:
- Enable Extended Compliance Documentation set
- Owner or User permission for the DataRobot model package
DataRobot information needed
- URL of DataRobot instance, example:
https://app.datarobot.com
- ID of the model package to receive key-values; example:
64227b4bf82db411c90c3209
- API token for DataRobot:
export MLOPS_API_TOKEN=<API token from DataRobot Developer Tools>
Local MLflow information needed
- MLflow tracking URI; example
"file:///Users/me/mlflow/examples/mlruns"
- Model name; example
"cost-model"
- Model version; example
"2"
Azure Databricks MLflow with Service Principal information needed
- MLflow tracking URI; example
"azureml://region.api.azureml.ms/mlflow/v1.0/subscriptions/subscription-id/resourceGroups/resource-group-name/providers/Microsoft.MachineLearningServices/workspaces/azure-ml-workspace-name"
- Model name; example
"cost-model"
- Model version; example
"2"
- Provide service principal details in environment:
export AZURE_TENANT_ID="<tenant-id>"
export AZURE_CLIENT_ID="<client-id>"
export AZURE_CLIENT_SECRET="<secret>"
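The Azure tracking URI shown above follows a fixed pattern, so it can be assembled from its parts. The sketch below assumes the pattern is exactly the one in the example; the azureml_tracking_uri helper and its parameter names are hypothetical and are not part of datarobot-mlflow.

```python
def azureml_tracking_uri(
    region: str,
    subscription_id: str,
    resource_group: str,
    workspace: str,
) -> str:
    """Build an azureml:// tracking URI following the pattern in the example above."""
    return (
        f"azureml://{region}.api.azureml.ms/mlflow/v1.0"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{workspace}"
    )


if __name__ == "__main__":
    # Placeholder values copied from the example URI; replace with real ones.
    print(
        azureml_tracking_uri(
            "region",
            "subscription-id",
            "resource-group-name",
            "azure-ml-workspace-name",
        )
    )
```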
Example: Import from MLflow
DR_MODEL_ID="<MODEL_PACKAGE_ID>"
env PYTHONPATH=./ \
python datarobot_mlflow/drflow_cli.py \
--mlflow-url http://localhost:8080 \
--mlflow-model cost-model \
--mlflow-model-version 2 \
--dr-model $DR_MODEL_ID \
--dr-url https://app.datarobot.com \
--with-artifacts \
--verbose \
--action sync
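When the sync is driven from a Python script rather than a shell, the same command line can be assembled as an argument list. The sketch below uses only the flags from the example above; the build_sync_command helper is hypothetical, and it only builds and prints the command without running it.

```python
import shlex
from typing import List


def build_sync_command(
    mlflow_url: str,
    mlflow_model: str,
    mlflow_model_version: str,
    dr_model: str,
    dr_url: str,
    with_artifacts: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Return the argv list for a sync run, mirroring the shell example."""
    cmd = [
        "python",
        "datarobot_mlflow/drflow_cli.py",
        "--mlflow-url", mlflow_url,
        "--mlflow-model", mlflow_model,
        "--mlflow-model-version", str(mlflow_model_version),
        "--dr-model", dr_model,
        "--dr-url", dr_url,
    ]
    if with_artifacts:
        cmd.append("--with-artifacts")
    if verbose:
        cmd.append("--verbose")
    cmd += ["--action", "sync"]
    return cmd


if __name__ == "__main__":
    argv = build_sync_command(
        "http://localhost:8080",
        "cost-model",
        "2",
        "<MODEL_PACKAGE_ID>",
        "https://app.datarobot.com",
        with_artifacts=True,
        verbose=True,
    )
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The returned list can be passed to subprocess.run, with MLOPS_API_TOKEN and PYTHONPATH set in the environment as in the shell example.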
Example: validate Azure credentials
export MLOPS_API_TOKEN="n/a" # not used for Azure auth check, but must be present
env PYTHONPATH=./ \
python datarobot_mlflow/drflow_cli.py \
--verbose \
--auth-type azure-service-principal \
--service-provider-type azure-databricks \
--action validate-auth
# example output for missing environment variables:
Required environment variable is not defined: AZURE_TENANT_ID
Required environment variable is not defined: AZURE_CLIENT_ID
Required environment variable is not defined: AZURE_CLIENT_SECRET
Azure AD Service Principal credentials are not valid; check environment variables
# example output for successful authentication:
Azure AD Service Principal credentials are valid for obtaining access token
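A script can check for the three service principal variables before invoking the CLI at all. The sketch below is a standalone pre-flight check, not the CLI's own implementation; the missing_azure_vars helper is hypothetical, and treating an empty value as missing is an assumption.

```python
import os
from typing import List, Mapping

# Variable names taken from the service principal section above.
REQUIRED_AZURE_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_azure_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    # Assumption: an empty string is treated the same as an unset variable.
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in missing_azure_vars():
        print(f"Required environment variable is not defined: {name}")
```

This only confirms that the variables are present; whether the credentials are valid still has to be checked with the validate-auth action.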
Actions
The following operations are available for --action:
- sync: import parameters, tags, metrics, and artifacts from MLflow model.
- list-mlflow-keys: list parameters, tags, metrics, and artifacts in an MLflow model. Requires --mlflow-url, --mlflow-model, and --mlflow-model-version.
- validate-auth: see "validate Azure credentials" example above.
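The option requirement stated for list-mlflow-keys can be checked before the CLI is called. The sketch below encodes only that one documented requirement; the missing_options helper and the REQUIRED_OPTIONS table are hypothetical, and the other actions have no entries because their requirements are not stated here, not because they need no options.

```python
from typing import Dict, List, Optional, Tuple

# Only the requirement documented above is encoded.
REQUIRED_OPTIONS: Dict[str, Tuple[str, ...]] = {
    "list-mlflow-keys": ("--mlflow-url", "--mlflow-model", "--mlflow-model-version"),
}


def missing_options(action: str, options: Dict[str, Optional[str]]) -> List[str]:
    """Return the documented required options that are absent or empty."""
    required = REQUIRED_OPTIONS.get(action, ())
    return [name for name in required if not options.get(name)]


if __name__ == "__main__":
    supplied = {"--mlflow-url": "http://localhost:8080", "--mlflow-model": "cost-model"}
    print(missing_options("list-mlflow-keys", supplied))
```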
Options
The following options can be added to the drflow_cli command line:
- --mlflow-url: MLflow Tracking URI
- --mlflow-model: MLflow model name
- --mlflow-model-version: MLflow model version
- --dr-url: Main URL of the DataRobot instance
- --dr-model: DataRobot Model Package ID. Registered Model Versions are also supported.
- --prefix: a string to prepend to the names of all key-values posted to DataRobot. Default is empty.
- --debug: set Python logging level to logging.DEBUG. Default level is logging.WARNING.
- --verbose: prints to stdout information about the following:
  - retrieving model from MLflow: prints model information
  - setting model data in DataRobot: prints each key-value posted
- --with-artifacts: download MLflow model artifacts to /tmp/model
- --service-provider-type: service provider to use for validate-auth. Supported values are:
  - azure-databricks: for Databricks MLflow within Azure
- --auth-type: authentication type for validate-auth. Supported values are:
  - azure-service-principal: for Azure Service Principal
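The --debug behaviour described above (logging.DEBUG when the flag is given, logging.WARNING otherwise) can be reproduced in a wrapper script. The sketch below shows one way to do that with argparse; it is an assumption about how such a flag might be handled, not the parser used by drflow_cli, and the parse_log_level helper is hypothetical.

```python
import argparse
import logging
from typing import List


def parse_log_level(argv: List[str]) -> int:
    """Return logging.DEBUG if --debug is present, else logging.WARNING."""
    # allow_abbrev=False stops argparse from matching shortened flag names.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--debug", action="store_true")
    # Other options are ignored here; parse_known_args leaves them untouched.
    args, _unknown = parser.parse_known_args(argv)
    return logging.DEBUG if args.debug else logging.WARNING


if __name__ == "__main__":
    level = parse_log_level(["--verbose", "--debug", "--action", "sync"])
    logging.basicConfig(level=level)
    print(logging.getLevelName(level))
```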