Acceldata ML Observability SDK
The SDK helps data organizations track the ML models and data that deliver business value.
Pre-requisites
- Registering yourself with Acceldata Data Observability Cloud platform
Driven through Acceldata Cloud Platform
- Enabling ML Observability toolkit
Driven through Acceldata Cloud Platform
- Generating API keys
Driven through Acceldata ML Observability UI
- Setting up env vars
export CLOUD_ACCESS_KEY=XXXX0000
export CLOUD_SECRET_KEY=XXXX0000
export ACCELO_API_ACCESS_KEY=XXXX0000
export ACCELO_API_SECRET_KEY=XXXX0000
export ACCELO_API_ENDPOINT=https://some_acceldata_endpoint
- Install the SDK
pip install accelo
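Before creating a client, it can help to fail fast when one of the variables above is missing. The following is a minimal sketch using only the Python standard library. The helper name `missing_env_vars` is hypothetical and not part of the SDK; the variable names are copied from the export lines above.

```python
import os

# Variable names copied from the export lines in the prerequisites.
REQUIRED_ENV_VARS = (
    "CLOUD_ACCESS_KEY",
    "CLOUD_SECRET_KEY",
    "ACCELO_API_ACCESS_KEY",
    "ACCELO_API_SECRET_KEY",
    "ACCELO_API_ENDPOINT",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper, not part of the SDK. Pass a dict to check
    something other than the current process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling `missing_env_vars()` at the top of a pipeline script and stopping when the returned list is non-empty gives a clearer failure than an authentication error later on.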
Set Go!
Sample Usage Patterns
Before delving into code, here is a typical pattern in which you can use the SDK.
Project Creation
Modes
- UI - Users will be able to create projects via the Catalog UI where they can either have a model view or a project view
- API - Users can create a project in their training pipeline. If a project already exists, the API raises a custom error that can be caught to avoid failures in the training pipeline
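The "create, or carry on if it already exists" behavior described for the API mode could be wrapped as shown below. This is only a sketch: the name of the SDK's custom error is not given here, so `ProjectAlreadyExists` is a hypothetical stand-in, and `FakeClient` exists only to demonstrate the control flow without contacting any service.

```python
class ProjectAlreadyExists(Exception):
    """Hypothetical stand-in for the SDK's custom 'project exists' error."""


def ensure_project(create_project, name, description,
                   exists_error=ProjectAlreadyExists):
    """Create the project, or return False instead of failing if it exists.

    create_project: callable accepting name= and description=
                    (for example client.create_project).
    exists_error:   exception type raised when the project already exists
                    (an assumption; substitute the SDK's real error class).
    """
    try:
        create_project(name=name, description=description)
    except exists_error:
        return False
    return True


class FakeClient:
    """Stand-in client used only for this demonstration."""

    def __init__(self):
        self.projects = set()

    def create_project(self, name, description):
        if name in self.projects:
            raise ProjectAlreadyExists(name)
        self.projects.add(name)


fake = FakeClient()
first = ensure_project(fake.create_project, "marketing-team", "Marketing models")
second = ensure_project(fake.create_project, "marketing-team", "Marketing models")
print(first, second)  # True False
```

Passing the exception type as a parameter keeps the helper usable once the actual error class is known from the API documentation.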
Model Registration and Baseline logging (training pipeline)
- User registers a model against a project
- The model registration API expects the project id, the model name, and a set of other metadata that can be used to track models on the catalog UI
Prediction logging (serving pipeline)
- The serving pipeline can be used to log the predictions to Acceldata datastore
- The API expects model id, model version, and predictions along with their id columns as mandatory params.
Actual logging (actuals pipeline)
The actuals for a set of predictions may arrive at a later point, and the API provides two ways to log them.
- UUIDs: generated by the API during the serving pipeline stage. Users are expected to keep track of them and map them to the appropriate actuals
- ID COLUMNS: If users specify certain columns to be considered as IDs, the API can automatically log the actuals against those IDs, and the backend services can compare the actuals to the predictions based on these ID COLUMNS
Note: Please refer to the API documentation for more information.
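To make the ID COLUMNS idea concrete, the sketch below pairs prediction rows with actual rows that share the same ID column values. It illustrates the matching concept only and is not the backend's implementation; the function name `match_on_id_columns` and the sample rows are made up for this example.

```python
def match_on_id_columns(predictions, actuals, id_cols):
    """Pair prediction rows with actual rows sharing the same ID values.

    predictions, actuals: lists of dicts, one dict per row.
    id_cols: column names whose combined values identify a row.
    Predictions whose actuals have not arrived yet are left out.
    """
    def key(row):
        return tuple(row[col] for col in id_cols)

    actuals_by_key = {key(row): row for row in actuals}
    return [
        (pred, actuals_by_key[key(pred)])
        for pred in predictions
        if key(pred) in actuals_by_key
    ]


predictions = [
    {"campaign_id": 101, "prediction": 1},
    {"campaign_id": 102, "prediction": 0},
    {"campaign_id": 103, "prediction": 1},
]
# Actuals arrive later, in a different order, and campaign 103 is still missing.
actuals = [
    {"campaign_id": 102, "actual": 0},
    {"campaign_id": 101, "actual": 0},
]

pairs = match_on_id_columns(predictions, actuals, id_cols=["campaign_id"])
for pred, act in pairs:
    print(pred["campaign_id"], pred["prediction"], act["actual"])
```

With ID columns, no extra bookkeeping is needed between the serving and actuals pipelines, provided the chosen columns uniquely identify each row.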
Basic APIs
Finally, let's see how you can integrate the SDK into your production code pipelines. Below are some examples of how a Data Scientist or ML Engineer can add the SDK to existing ML code and observe the resulting models using the Acceldata ML Observability platform.
Import the library
from accelo_mlops import AcceloClient
Creating a client with a workspace
The workspace is the top-level name that you want to associate your organization with. It can also be thought of as a tenant name.
client = AcceloClient(workspace='your_organization_name')
Creating a Project
Now, when it comes to code, the atomic unit is a Project. The project name can be a team name, a domain name within a company, or any other logical separation of Data Science groups.
client.create_project(name='marketing-team',
description='All models related to the marketing team reside here. '
)
Register a Model
Now, assume that you have developed a model that you want to observe using the Acceldata ML Observability platform. The model object is called classifier.
model_metadata = {
'frequency': 'DAILY',
'model_type': 'binary_classification',
'performance_metric': 'f1_score',
'model_obj': classifier
}
additional_params = {
'owner': 'research@preview.com',
'last_trained': '2021-08-01',
'training_job_name': 'click_prediction_ml_pipeline',
'label': 'flower_type',
'total_consumers': 2
}
client.register_model(project_id=12,
model_name='click_prediction_model',
model_version='v1',
model_metadata=model_metadata,
**additional_params
)
Let's see what the variables above mean.
- classifier: this is the model object
- model_metadata: this is a mandatory dictionary that users have to pass to the register model call to make the most of the ML Observability platform.
- additional_params: this is an optional dictionary that users can use to log any additional details about the model which might be useful when viewed in the ML Catalog.
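The difference between the two dictionaries shows up in how they are passed: model_metadata goes in as a single argument, while `**additional_params` is unpacked so that each key becomes its own keyword argument. The sketch below shows that Python behavior with a stand-in function. `register_model_stub` is hypothetical and only mirrors the call shape used in the example above; it does not reproduce what the SDK does internally.

```python
def register_model_stub(project_id, model_name, model_version,
                        model_metadata, **kwargs):
    """Stand-in with the same call shape as the example; not the SDK."""
    return {
        "required": (project_id, model_name, model_version),
        "metadata_keys": sorted(model_metadata),
        "extra": kwargs,  # everything that arrived via **additional_params
    }


additional_params = {"owner": "research@preview.com", "total_consumers": 2}

result = register_model_stub(
    project_id=12,
    model_name="click_prediction_model",
    model_version="v1",
    model_metadata={"frequency": "DAILY", "model_type": "binary_classification"},
    **additional_params,
)
print(result["extra"])
# {'owner': 'research@preview.com', 'total_consumers': 2}
```

One consequence of unpacking is that a key in additional_params must not reuse the name of an explicit argument such as model_name, otherwise Python raises a TypeError for the duplicate keyword.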
Now, it's time to log the data that was used to train the model.
Log baseline data
client.log_baseline(
model_id=client.model_id,
model_version='v1',
baseline_data=X_train,
labels=y_train,
label_name='click',
id_cols=['campaign_id'],
publish_date='2021-08-02'
)
This API call logs your baseline data to the Acceldata data store, where it will be used for the analyses that you sign up for.
Log predictions
ids = client.log_predictions(
model_id=client.model_id,
model_version='v1',
feature_data=feature_data,
predictions=preds,
publish_date='2021-06-02'
)
Note: As of now, only batch predictions are supported. Support for logging online predictions is planned.
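Because the actuals pipeline usually runs later and in a separate process, the identifiers returned by log_predictions need to be stored somewhere in the meantime. The sketch below writes them to a JSON file next to a row key of your choice. It assumes that the returned ids are a list of strings with one entry per prediction row in input order; that is an assumption, not something confirmed here. The helper names, file name, and placeholder values are hypothetical.

```python
import json
import os
import tempfile


def save_prediction_ids(path, ids, row_keys):
    """Write a {prediction_id: row_key} mapping to a JSON file."""
    if len(ids) != len(row_keys):
        raise ValueError("ids and row_keys must have the same length")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(dict(zip(ids, row_keys)), fh)


def load_prediction_ids(path):
    """Read back the mapping written by save_prediction_ids."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Placeholder values: in a real pipeline `ids` would be the value returned
# by log_predictions (assumed here to be one string per row).
ids = ["id-0001", "id-0002"]
row_keys = ["campaign-101", "campaign-102"]

path = os.path.join(tempfile.mkdtemp(), "prediction_ids.json")
save_prediction_ids(path, ids, row_keys)

restored = load_prediction_ids(path)
print(restored["id-0002"])  # campaign-102
```

In the actuals pipeline, the restored mapping tells you which identifier belongs to which row when the actual values become available.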
Log actuals
Later, when the actuals arrive, you can log them using the API below.
client.log_actuals(
model_id=client.model_id,
model_version='v1',
id_cols_df=id_columns_frame,
actuals=y_test,
publish_date='2021-06-03'
)
You are now done logging both metadata and the data itself.
Detailed activity logs can be viewed in the ad-mlops.log file in the directory where your code file exists. The location of the log file is configurable.
What happens after you create a project and register a model?
Metadata
The model and the other metadata are now part of Acceldata ML Catalog and can be viewed on the UI.
Data
The baseline, prediction, and actual data are logged into the Acceldata Store. This data will be used for further analysis.
Dashboard
You will be able to track model performance, data drift, etc. by visiting the dashboard.
Alerts
You can set alerts on charts, generate reports, etc. using the dashboard or the catalog.
Contact Us
Please get in touch with us at contact@acceldata.io for access to the Acceldata catalog and dashboard, and for assistance with bringing ML Observability into your organization.