
Profile and monitor your ML data pipeline end-to-end



The open standard for data logging



What is whylogs

whylogs is an open-source library for logging any kind of data. With whylogs, users can generate summaries of their datasets (called whylogs profiles), which they can use to:

  1. Track changes in their dataset
  2. Create data constraints to know whether their data looks the way it should
  3. Quickly visualize key summary statistics about their datasets
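To make the idea of a profile concrete, here is a toy, standard-library-only sketch of a per-column summary and the three uses listed above. It is not how whylogs is implemented and does not use the whylogs API; `ColumnSummary`, `summarize`, and the sample values are hypothetical names and data chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ColumnSummary:
    """Toy stand-in for a per-column profile; not the whylogs data model."""

    count: int = 0
    nulls: int = 0
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    total: float = 0.0

    def track(self, value: Optional[float]) -> None:
        """Fold one observation into the summary without storing the raw value."""
        self.count += 1
        if value is None:
            self.nulls += 1
            return
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)
        self.total += value

    @property
    def mean(self) -> Optional[float]:
        non_null = self.count - self.nulls
        return self.total / non_null if non_null else None


def summarize(values: Iterable[Optional[float]]) -> ColumnSummary:
    summary = ColumnSummary()
    for value in values:
        summary.track(value)
    return summary


# Hypothetical data: the same feature observed in two different batches.
baseline = summarize([1.0, 2.0, 3.0, None])
current = summarize([4.0, 5.0, 6.0, 7.0])

# 1. Track changes: compare the summaries instead of the raw rows.
mean_shift = current.mean - baseline.mean

# 2. Constraint: a rule the summary is expected to satisfy.
no_nulls = current.nulls == 0

# 3. Inspect key summary statistics.
print(f"baseline mean={baseline.mean}, current mean={current.mean}, shift={mean_shift}")
print(f"current batch has no nulls: {no_nulls}")
```

The point of the sketch is that all three uses operate on a small summary object rather than on the full dataset.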

These three capabilities support a variety of use cases for data scientists, machine learning engineers, and data engineers:

  • Detect data drift in model input features
  • Detect training-serving skew, concept drift, and model performance degradation
  • Validate data quality in model inputs or in a data pipeline
  • Perform exploratory data analysis of massive datasets
  • Track data distributions & data quality for ML experiments
  • Enable data auditing and governance across the organization
  • Standardize data documentation practices across the organization
  • And more

Quickstart

Install whylogs with pip by running the following in a terminal:

pip install whylogs

You can then log data in Python with just a few lines:

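import pandas as pd
import whylogs as why

# Load the data to profile (replace the placeholder path with your own file).
df = pd.read_csv("path/to/file.csv")

# Profile the DataFrame; `results` wraps the resulting whylogs profile.
results = why.log(df)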

And voilà, you now have a whylogs profile. To learn more about what a whylogs profile is and what you can do with it, check out our docs and our examples.

