Python package for data logging and monitoring.

Raymon: analyse data & model health

What is Raymon?

Raymon helps Machine Learning teams analyse data, data health and model performance. Using Raymon, users can extract features describing data quality, data novelty, model confidence and prediction performance from model predictions. Then, they can use these features to validate production data and generate reports for data drift, data degradation and model degradation.

Raymon can support any data type. Currently, we offer extractors for structured data and vision data, but you can easily implement your own extractor, which means Raymon can handle any data type and any extractor you want.

Raymon’s focus is on simplicity, practicality and extendability. We offer a set of extractors that are cheap to compute and simple to understand.
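As an illustration of what a cheap, easy-to-understand extractor might look like, the sketch below reduces one structured record to a single number: the fraction of missing fields. The class name `MissingFractionExtractor` and its `extract` method are hypothetical and do not reproduce Raymon's actual base classes; check the docs for the real interface.

```python
import math


class MissingFractionExtractor:
    """Hypothetical extractor: maps one record (a dict) to a single float."""

    def extract(self, record):
        # An empty record has no fields, so report no missing values.
        if not record:
            return 0.0
        # Treat None and NaN as missing values.
        missing = sum(
            1
            for value in record.values()
            if value is None or (isinstance(value, float) and math.isnan(value))
        )
        return missing / len(record)


extractor = MissingFractionExtractor()
feature = extractor.extract(
    {"area": 120.0, "rooms": None, "year": float("nan"), "zip": "9000"}
)
print(feature)  # 0.5
```

A feature like this is a single scalar per data point, which keeps it inexpensive to compute and easy to track over time.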

Raymon is open source and can be used standalone but integrates nicely with the Raymon.ai ML Observability hub, for example to make predictions traceable and debuggable.

At a glance

Installation

pip install raymon

Building a model profile

Building a ModelProfile captures the data characteristics of your model's inputs, outputs, actuals and predictions.

profile = ModelProfile(
    name="HousePricesCheap",
    version="3.0.0",
    components=[
        InputComponent(
            name="outlier_score",
            extractor=SequenceSimpleExtractor(
                prep=coltf, extractor=KMeansOutlierScorer()),
        ),
        OutputComponent(name="prediction", extractor=ElementExtractor(element=0)),
        ActualComponent(name="actual", extractor=ElementExtractor(element=0)),
        EvalComponent(name="abs_error", extractor=AbsoluteRegressionError()),
    ] + generate_components(X_train[feature_selector].dtypes, 
                            complass=InputComponent), # Generates a component for every column in the DF
    scores=[
        MeanScore(
            name="MAE",
            inputs=["abs_error"],
            preference="low",
        ),
        MeanScore(
            name="mean_outlier_score",
            inputs=["outlier_score"],
            preference="low",
        ),
    ],
)
profile.build(input=X_val[feature_selector], 
              output=y_pred_val[:, None], 
              actual=y_val[:, None])
profile.view()

Validating production data

Profiles can then be used in production code to validate incoming data and to monitor model performance.

tags = profile.validate_input(request)
output_tags = profile.validate_output(request_pred)
actual_tags = profile.validate_actual(request_actual)
eval_tags = profile.validate_eval(output=request_pred, 
                                  actual=request_actual)
# or all at once:
all_tags = profile.validate_all(input=request, 
                                output=request_pred, 
                                actual=request_actual)
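The calls above return tags, but their contents are not shown here. As a rough mental model only, the sketch below checks one feature value against a reference range and emits tag-like dictionaries. The function `validate_feature`, the dictionary layout and the `"feature"` and `"error"` type names are assumptions made for illustration, not Raymon's implementation.

```python
def validate_feature(name, value, ref_min, ref_max):
    """Return tag-like dicts for one feature value (hypothetical format)."""
    # Always report the observed value itself.
    tags = [{"name": name, "value": value, "type": "feature"}]
    # Add an error tag when the value is missing or outside the reference range.
    if value is None:
        tags.append({"name": f"{name}-error", "value": "missing", "type": "error"})
    elif value < ref_min:
        tags.append({"name": f"{name}-error", "value": "below-range", "type": "error"})
    elif value > ref_max:
        tags.append({"name": f"{name}-error", "value": "above-range", "type": "error"})
    return tags


ok_tags = validate_feature("living_area", 150.0, ref_min=30.0, ref_max=400.0)
bad_tags = validate_feature("living_area", 900.0, ref_min=30.0, ref_max=400.0)
print(len(ok_tags), len(bad_tags))  # 1 2
```

Tags of this kind can later be filtered or aggregated to spot requests with suspicious inputs.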

Inspect and contrast model profiles

You can also contrast different model profiles against each other. For example, you can compare the profile built at model training time with a profile built on production data, or compare profiles of different subsets of production data.

profile.view_contrast(profile_exp)
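To give a feel for what contrasting two distributions of the same feature can involve, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, a common drift measure: the largest gap between the two empirical cumulative distribution functions. This is a generic illustration; the text here does not say which drift measure Raymon uses, and `ks_statistic` is a hypothetical helper.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


train_values = [1, 2, 3, 4]
production_values = [3, 4, 5, 6]
print(ks_statistic(train_values, production_values))  # 0.5
```

A value of 0 means the two samples have identical empirical distributions, while 1 means they do not overlap at all.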

Logging text, data and tags

Moreover, if you want to use the rest of the platform, Raymon makes model predictions traceable and debuggable. Raymon enables you to log text, data and tags from anywhere in your code. You can later use these tags and data objects to debug and improve your systems.

import pandas as pd
import numpy as np
from PIL import Image

import raymon.types as rt
from raymon import Trace, RaymonAPILogger, Tag


# project_id is the ID of your Raymon project and must be defined beforehand
logger = RaymonAPILogger(project_id=project_id)
trace = Trace(logger=logger, trace_id=None)

# Logging text messages
trace.info("You can log whatever you want here")

# Tagging traces
trace.tag([
        Tag(name="sdk_version", value="1.4.2", type="label"),
        Tag(name="prediction_time_ms", value="120", type="metric")
    ])

# Logging data
img = Image.open("./data_sample/castinginspection/def_front/cast_def_0_0.jpeg")
arr = np.random.rand(10, 2)  # placeholder data for illustration
df = pd.DataFrame(arr, columns=['a', 'b'])

trace.log(ref="pandas-ref", data=rt.DataFrame(df))
trace.log(ref="image-ref", data=rt.Image(img))

For more information, check out our docs & examples!
