Acceldata ML Observability SDK

The SDK helps data organizations track the ML models and data that deliver business value.

Pre-requisites

  • Registering yourself with the Acceldata Data Observability Cloud platform (driven through the Acceldata Cloud Platform)
  • Enabling the ML Observability toolkit (driven through the Acceldata Cloud Platform)
  • Generating API keys (driven through the Acceldata ML Observability UI)

  • Setting up env vars
export CLOUD_ACCESS_KEY=XXXX0000
export CLOUD_SECRET_KEY=XXXX0000
export ACCELO_API_ACCESS_KEY=XXXX0000
export ACCELO_API_SECRET_KEY=XXXX0000
export ACCELO_API_ENDPOINT=https://some_acceldata_endpoint
  • Install the SDK
pip install accelo
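
Because the SDK depends on the environment variables listed above, it can help to fail fast when one is missing. The following is a minimal sketch, not part of the SDK: the helper name missing_env_vars is hypothetical, and only the variable names come from the export lines above.

```python
import os

# Variable names are taken from the export lines in the prerequisites.
REQUIRED_ENV_VARS = (
    "CLOUD_ACCESS_KEY",
    "CLOUD_SECRET_KEY",
    "ACCELO_API_ACCESS_KEY",
    "ACCELO_API_SECRET_KEY",
    "ACCELO_API_ENDPOINT",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper (not an SDK function). Pass a dict to check
    something other than the real process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling missing_env_vars() at the top of a pipeline script and stopping when the returned list is non-empty gives a clearer error than a failed API call later on.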

Set Go!

Sample Usage Patterns

Before delving into code, here is an example of a pattern in which you can use the SDK.

Project Creation

Modes

  1. UI - Users can create projects via the Catalog UI, where they can have either a model view or a project view
  2. API - Users can create a project in their training pipeline. If a project already exists, the API raises a custom error that the pipeline can catch to avoid failing
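
One way to use that custom error in a training pipeline is sketched below. The name of the SDK's error class is not given here, so ProjectAlreadyExistsError is a hypothetical stand-in that you would replace with the real class from the API documentation. The helper ensure_project is also hypothetical; only the create_project(name=..., description=...) call shape comes from the examples in this document.

```python
class ProjectAlreadyExistsError(Exception):
    """Hypothetical stand-in for the SDK's custom 'project exists' error."""


def ensure_project(client, name, description,
                   exists_error=ProjectAlreadyExistsError):
    """Create the project, tolerating the case where it already exists.

    Returns True if the project was created by this call and False if the
    client raised `exists_error`. Any other exception propagates.
    """
    try:
        client.create_project(name=name, description=description)
        return True
    except exists_error:
        # The project is already there, so the pipeline can carry on.
        return False
```

Passing the error class as a parameter keeps the helper usable once the actual exception type is known, for example ensure_project(client, 'marketing-team', '...', exists_error=TheRealError).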

Model Registration and Baseline logging (training pipeline)

  • The user registers a model against a project
  • The model registration API expects the project id, the model name, and other metadata that can be used to track models on the Catalog UI

Prediction logging (serving pipeline)

  • The serving pipeline can be used to log predictions to the Acceldata data store
  • The API expects the model id, the model version, and the predictions along with their ID columns as mandatory parameters.

Actual logging (actuals pipeline)

Actuals may arrive at a later point, and the API provides two ways to log them.

  • UUIDs: generated by the API during the serving pipeline stage. Users are expected to keep track of them and map them to the appropriate actuals
  • ID COLUMNS: If users specify certain columns to be considered as IDs, the API can automatically log the actuals against those IDs, and the backend services can compare the actuals to predictions based on these ID COLUMNS
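
To illustrate the idea behind ID COLUMNS, the sketch below pairs predictions with late-arriving actuals by their shared ID values. It is a plain-Python illustration of the concept only and makes no claim about how the Acceldata backend performs the comparison. The function name match_actuals and the row keys 'prediction' and 'actual' are assumptions made for this example.

```python
def match_actuals(predictions, actuals, id_cols):
    """Pair each prediction with the actual that shares its ID column values.

    `predictions` and `actuals` are lists of dicts; `id_cols` names the
    columns that identify a row. Returns (matched, unmatched_keys), where
    matched holds (key, prediction, actual) tuples.
    """
    def key_of(row):
        return tuple(row[col] for col in id_cols)

    # Index the actuals by their ID tuple for constant-time lookup.
    actual_by_key = {key_of(row): row["actual"] for row in actuals}

    matched, unmatched = [], []
    for row in predictions:
        key = key_of(row)
        if key in actual_by_key:
            matched.append((key, row["prediction"], actual_by_key[key]))
        else:
            unmatched.append(key)  # actual has not arrived yet
    return matched, unmatched
```

With id_cols=['campaign_id'], a prediction for campaign 1 is compared with the actual for campaign 1, and predictions whose actuals are still outstanding are reported separately.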

Note: Please refer to the API documentation for more information.

Basic APIs

Finally, let's see how you can integrate the SDK into your production code pipelines. Below are some examples of how a Data Scientist or ML Engineer can add the SDK to existing ML code and observe models using the Acceldata ML Observability platform.

Import the library

from accelo_mlops import AcceloClient

Creating a client with a workspace

The workspace is the top-level name that you associate with your organization. It can also be thought of as a tenant name.

client = AcceloClient(workspace='your_organization_name')

Creating a Project

In code, the atomic unit is a Project. The project name can be a team name, a domain name within a company, or any other logical separation of data science groups.

client.create_project(name='marketing-team', 
                      description='All models related to the marketing team reside here. '
)

Register a Model

Assume you have developed a model that you want to observe using the Acceldata ML Observability platform, and that the model object is called classifier.

model_metadata = {
    'frequency': 'DAILY', 
    'model_type': 'binary_classification',
    'performance_metric': 'f1_score', 
    'model_obj': classifier
}
additional_params = {
    'owner': 'research@preview.com',
    'last_trained': '2021-08-01',
    'training_job_name': 'click_prediction_ml_pipeline',
    'label': 'flower_type',
    'total_consumers': 2
}

client.register_model(project_id=12, 
                      model_name='click_prediction_model', 
                      model_version='v1', 
                      model_metadata=model_metadata, 
                      **additional_params
)

Here is what the variables above mean.

  • classifier: the model object
  • model_metadata: a mandatory dictionary that users must pass to the register model call to make the most of the ML Observability platform.
  • additional_params: an optional dictionary that users can use to log any additional details about the model that might be useful when viewed in the ML Catalog.
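
Since the metadata dictionary is mandatory, a quick local check before calling register_model can catch omissions early. This is a sketch under an assumption: the four keys below are simply the ones shown in the example above, and whether the SDK requires all of them is not confirmed here. The helper name missing_metadata_keys is hypothetical.

```python
# Keys copied from the model_metadata example; treating all four as
# required is an assumption, not documented SDK behaviour.
EXPECTED_METADATA_KEYS = (
    "frequency",
    "model_type",
    "performance_metric",
    "model_obj",
)


def missing_metadata_keys(model_metadata):
    """Return the expected keys that are absent or set to None."""
    return [
        key for key in EXPECTED_METADATA_KEYS
        if model_metadata.get(key) is None
    ]
```

An empty list means the dictionary has every key used in the example; otherwise the returned names show what to fill in before registering the model.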

Now it's time to log the data that was used to train the model.

Log baseline data

client.log_baseline(
    model_id=client.model_id,
    model_version='v1',
    baseline_data=X_train,
    labels=y_train,
    label_name='click',
    id_cols=['campaign_id'],
    publish_date='2021-08-02'
)

This API call logs your baseline data to the Acceldata data store, where it is used for the analyses you have signed up for.

Log predictions

ids = client.log_predictions(
    model_id=client.model_id,
    model_version='v1',
    feature_data=feature_data,
    predictions=preds,
    publish_date='2021-06-02'
)

Note: Currently only batch predictions are supported; support for logging online predictions is planned.
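
If you rely on the returned ids rather than on ID columns, you need to keep them until the actuals arrive. The sketch below stores them in a CSV file next to a key of your own choosing. It assumes that ids is a sequence with one entry per logged row, in the same order as the rows, which is not confirmed by this document. The helper names and the CSV column names are hypothetical.

```python
import csv


def save_prediction_ids(path, row_keys, ids):
    """Write (row_key, prediction_id) pairs to a CSV file.

    Assumes `ids` has one entry per row, aligned with `row_keys`.
    """
    row_keys, ids = list(row_keys), list(ids)
    if len(row_keys) != len(ids):
        raise ValueError("row_keys and ids must have the same length")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["row_key", "prediction_id"])
        writer.writerows(zip(row_keys, ids))


def load_prediction_ids(path):
    """Read the file back as a {row_key: prediction_id} dict of strings."""
    with open(path, newline="") as handle:
        return {
            row["row_key"]: row["prediction_id"]
            for row in csv.DictReader(handle)
        }
```

The actuals pipeline can then load the mapping and look up the id for each row whose actual has arrived. A database table would serve the same purpose for larger volumes.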

Log actuals

Later, when the actuals arrive, you can log them using the API below.

client.log_actuals(
    model_id=client.model_id,
    model_version='v1',
    id_cols_df=id_columns_frame,
    actuals=y_test,
    publish_date='2021-06-03'
)

You are now done logging both metadata and the data itself.

Detailed activity logs are written to the ad-mlops.log file in the directory that contains your code file. The location of the log file is configurable.

What happens after you create a project and register a model?

Metadata

The model and the other metadata are now part of Acceldata ML Catalog and can be viewed on the UI.

Data

The baseline, prediction, and actual data are logged into the Acceldata Store. This data will be used for further analysis.

Dashboard

You can track model performance, data drift, and related metrics on the dashboard.

Alerts

You can set alerts on charts, generate reports, etc using the dashboard or the catalog.

Contact Us

Please get in touch with us at contact@acceldata.io for access to Acceldata catalog, dashboard, and assistance with bringing ML Observability into your organization.
