Azure Machine Learning Model Monitoring SDK V2
Azure Machine Learning Model Monitoring SDK
The azureml-ai-monitoring package provides an SDK that enables the Model Data Collector (MDC) for custom logging, which lets customers collect data at arbitrary points in their data pre-processing pipeline. Customers can use the SDK in score.py to log data to the desired sink before, during, and after any data transformations.
Start by importing the azureml-ai-monitoring package in score.py:
import pandas as pd
import json
from azureml.ai.monitoring import Collector

def init():
    global inputs_collector, outputs_collector

    # instantiate collectors with appropriate names; make sure they align with the deployment spec
    inputs_collector = Collector(name='model_inputs')
    outputs_collector = Collector(name='model_outputs')

def run(data):
    # json data: { "data" : { "col1": [1,2,3], "col2": [2,3,4] } }
    pdf_data = preprocess(json.loads(data))

    # tabular data: { "col1": [1,2,3], "col2": [2,3,4] }
    input_df = pd.DataFrame(pdf_data)

    # collect inputs data, store correlation_context
    context = inputs_collector.collect(input_df)

    # perform scoring with a pandas DataFrame; the return value is also a pandas DataFrame
    output_df = predict(input_df)

    # collect outputs data, pass in correlation_context so inputs and outputs data can be correlated later
    outputs_collector.collect(output_df, context)

    return output_df.to_dict()

def preprocess(json_data):
    # preprocess the payload to ensure it can be converted to a pandas DataFrame
    return json_data["data"]

def predict(input_df):
    # process input and return with outputs
    ...
    return output_df
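The preprocess step above only extracts the "data" field. As a minimal sketch of what "ensure it can be converted to a pandas DataFrame" might involve, the helper below checks that the payload is a mapping of equal-length column lists before it is handed to pd.DataFrame. The name preprocess_checked is hypothetical and is not part of azureml-ai-monitoring; it uses only the standard library.

```python
import json


def preprocess_checked(json_data):
    """Hypothetical helper (not part of azureml-ai-monitoring).

    Returns the column mapping stored under "data" after verifying that
    it looks tabular: a non-empty object whose values are lists of the
    same length.
    """
    columns = json_data["data"]
    if not isinstance(columns, dict) or not columns:
        raise ValueError('"data" must be a non-empty object of column lists')
    for name, values in columns.items():
        if not isinstance(values, list):
            raise ValueError(f"column {name!r} is not a list")
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"columns have unequal lengths: {lengths}")
    return columns


# Same payload shape as in the score.py comments above.
payload = '{ "data" : { "col1": [1,2,3], "col2": [2,3,4] } }'
columns = preprocess_checked(json.loads(payload))
```

Failing early with a descriptive message keeps malformed requests from reaching the collector or the model; whether that is preferable to letting pd.DataFrame raise its own error depends on the scenario.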
Create an environment with the base image mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04 and the following conda dependencies, then build the environment.
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy=1.23.5
  - pandas=1.5.2
  - pip=22.3.1
  - pip:
      - azureml-defaults==1.38.0
      - requests==2.28.1
      - azureml-ai-monitoring~=0.1.0b1
name: model-env
Create a deployment with custom logging enabled (the model_inputs and model_outputs collections are enabled) that uses the environment you just built. Update the YAML according to your scenario.
#source ../configs/model-data-collector/data-storage-basic-OnlineDeployment.YAML
$schema: http://azureml/sdk-2-0/OnlineDeployment.json

endpoint_name: my_endpoint #unchanged
name: blue #unchanged
model: azureml:my-model-m1:1 #azureml:models/<name>:<version> #unchanged
environment: azureml:custom-logging-env:1 #unchanged
data_collector:
  collections:
    model_inputs:
      enabled: true
    model_outputs:
      enabled: true
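Because the collector names in score.py have to match the collections enabled in the deployment spec, a small pre-deployment check can catch mismatches. The sketch below assumes the YAML has already been parsed into a plain dict (written here as a literal to stay within the standard library); enabled_collections and missing_collections are hypothetical helpers, not part of the SDK.

```python
def enabled_collections(deployment_spec):
    """Return the names of collections marked enabled in a parsed spec."""
    collections = deployment_spec.get("data_collector", {}).get("collections", {})
    return {name for name, cfg in collections.items() if cfg.get("enabled")}


def missing_collections(collector_names, deployment_spec):
    """Return collector names that are not enabled in the spec, sorted."""
    return sorted(set(collector_names) - enabled_collections(deployment_spec))


# Dict equivalent of the data_collector section of the deployment YAML.
spec = {
    "data_collector": {
        "collections": {
            "model_inputs": {"enabled": True},
            "model_outputs": {"enabled": True},
        }
    }
}

# Names passed to Collector(name=...) in score.py.
missing = missing_collections(["model_inputs", "model_outputs"], spec)
```

An empty result means every collector used in score.py has a matching enabled collection; any name in the list would be expected to trigger the error behavior of the collector at runtime.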
By default, the collector raises an exception on unexpected behavior (for example, custom logging is not enabled, the collection is not enabled, or the data type is not supported). To handle errors yourself, pass an on_error callback:
import logging
collector = Collector(name="inputs", on_error=lambda e: logging.info("ex:{}".format(e)))
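For anything beyond a one-line lambda, the callback can be an ordinary callable object. The sketch below is a hypothetical handler that logs each exception and counts failures; it assumes only what the example above shows, namely that on_error is called with the exception as its single argument. CollectionErrorHandler is an illustrative name, not an SDK class, and the Collector call appears only in a comment so the block runs without the package installed.

```python
import logging

logger = logging.getLogger("score")


class CollectionErrorHandler:
    """Hypothetical on_error callable: logs the exception and counts failures."""

    def __init__(self, log):
        self.log = log
        self.count = 0

    def __call__(self, exc):
        self.count += 1
        self.log.warning("data collection failed (%d so far): %s", self.count, exc)


handler = CollectionErrorHandler(logger)

# In score.py this would be passed as:
#   inputs_collector = Collector(name="model_inputs", on_error=handler)

# Simulate the collector reporting an error.
handler(ValueError("collection is not enabled"))
```

With a handler like this, a collection failure is logged instead of raised, so scoring can continue; the counter is one possible way to surface repeated failures in logs or metrics.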
Change Log
v0.1.0b1 (2023.4.25)
New Features
- Support model data collection for pandas DataFrame.
Hashes for azureml_ai_monitoring-0.1.0b1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 9e2895939895091551a4c9b22c60fe78db9b52af757f872658f74cf511309467
MD5 | 9c61a0b3d7fa691b105c9bd47eda0072
BLAKE2b-256 | 1b866e010fab1af538ca1f4e45a43d3405ce03af49f4a4c8356f3d93bda1ae7d