A lightweight Python package for comparing datasets and detecting unexpected changes in machine learning systems.

mldebug

Why mldebug

Machine learning systems can fail silently when their input data changes.

Common production issues include:

  • feature distribution drift
  • increasing missing values
  • unseen categorical values
  • training vs production mismatch

mldebug provides a unified way to detect these issues before they become model failures.
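
Two of the issues listed above are simple enough to sketch by hand with NumPy. The helpers below (`missing_rate` and `unseen_categories`, both hypothetical names that are not part of mldebug) only illustrate what such checks measure; they are not how mldebug implements them.

```python
import numpy as np


def missing_rate(values):
    """Fraction of entries that are NaN (illustrative helper, not mldebug API)."""
    return float(np.mean(np.isnan(np.asarray(values, dtype=float))))


def unseen_categories(reference, current):
    """Categories present in `current` but absent from `reference`."""
    return sorted(np.setdiff1d(current, reference).tolist())


# Made-up example data: the current sample has gained missing values.
reference_income = np.array([1000.0, 1200.0, 1100.0, 1050.0])
current_income = np.array([900.0, np.nan, 850.0, np.nan])
print(missing_rate(reference_income), missing_rate(current_income))  # 0.0 0.5

# "DE" never appears in the reference sample.
print(unseen_categories(np.array(["ES", "ES", "FR"]), np.array(["ES", "DE", "DE"])))  # ['DE']
```

Writing and maintaining such checks per feature gets tedious quickly, which is the gap a unified report is meant to fill.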

What it does

mldebug compares:

  • a reference dataset (e.g. training data)
  • a current dataset (e.g. production data)

It runs a suite of checks and returns a structured report of detected issues.

Installation

pip install mldebug

Quick Start

from mldebug import run_checks
import numpy as np

reference = {
    "age": np.array([20, 21, 22]),
    "income": np.array([1000, 1200, 1100]),
    "country": np.array(["ES", "ES", "FR"]),
}

current = {
    "age": np.array([30, 35, 40]),
    "income": np.array([900, 800, 850]),
    "country": np.array(["ES", "DE", "DE"]),
}

schema = {
    "age": "numeric",
    "income": "numeric",
    "country": "categorical",
}

report = run_checks(reference=reference, current=current, schema=schema)
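
For wide datasets, writing the schema by hand can be error-prone. The sketch below guesses a schema from the NumPy dtype of each column, using the same "numeric" and "categorical" labels as the Quick Start. `infer_schema` is a hypothetical helper, not part of mldebug, and the dtype rule is an assumption that will misclassify integer-coded categories.

```python
import numpy as np


def infer_schema(dataset):
    """Guess "numeric" or "categorical" per column from the array dtype.

    Hypothetical helper, not part of mldebug.
    """
    schema = {}
    for name, values in dataset.items():
        is_number = np.issubdtype(np.asarray(values).dtype, np.number)
        schema[name] = "numeric" if is_number else "categorical"
    return schema


reference = {
    "age": np.array([20, 21, 22]),
    "income": np.array([1000, 1200, 1100]),
    "country": np.array(["ES", "ES", "FR"]),
}

print(infer_schema(reference))
```

Treat the result as a starting point and review it before use.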

Inspect detected issues

Human-readable output

for issue in report.issues:
    print(issue)

Output:

[WARNING] psi_drift - country: PSI drift detected (18.0152)

Summary

print(report.summary())

Output:

{
  "total": 1,
  "by_severity": {
    "info": 0,
    "warning": 1,
    "critical": 0
  },
  "status": "issues_detected"
}
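
A summary of this shape lends itself to gating a pipeline step. The sketch below works on a plain dict copied from the output shown here, so it runs without mldebug installed. `exit_code_for` and its `max_warnings` parameter are hypothetical names chosen for illustration.

```python
def exit_code_for(summary, max_warnings=0):
    """Return 1 if the summary should fail a pipeline step, else 0.

    Hypothetical helper, not part of mldebug. Assumes the summary has a
    "by_severity" mapping with "warning" and "critical" counts.
    """
    counts = summary["by_severity"]
    if counts["critical"] > 0 or counts["warning"] > max_warnings:
        return 1
    return 0


# Summary dict copied from the example output.
summary = {
    "total": 1,
    "by_severity": {"info": 0, "warning": 1, "critical": 0},
    "status": "issues_detected",
}

print(exit_code_for(summary))                  # 1: one warning exceeds max_warnings=0
print(exit_code_for(summary, max_warnings=1))  # 0: one warning is tolerated
```

In a script, the return value could be passed to `sys.exit` so that the job fails when the tolerance is exceeded.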

Structured output

print(report.to_dict())

Output:

{
  "issues": [
    {
      "name": "psi_drift",
      "metric": "psi",
      "severity": "warning",
      "message": "country: PSI drift detected (18.0152)",
      "feature": "country",
      "value": 18.01521528247136,
      "threshold": 0.2
    }
  ]
}
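
The `psi` metric is commonly read as the Population Stability Index. For readers unfamiliar with it, here is a minimal sketch of a categorical PSI: the sum over categories of (c - r) * ln(c / r), where r and c are each category's share in the reference and current data. The function name `categorical_psi` and the `eps` floor of 1e-8 for empty categories are assumptions made for illustration; mldebug's own implementation may bin, smooth, or normalize differently.

```python
import numpy as np


def categorical_psi(reference, current, eps=1e-8):
    """Population Stability Index over category shares.

    Illustrative sketch, not mldebug's implementation. `eps` is an assumed
    floor that keeps the logarithm finite when a category is absent.
    """
    reference = np.asarray(reference)
    current = np.asarray(current)
    categories = np.union1d(reference, current)

    # Share of each category in each dataset.
    ref_share = np.array([np.mean(reference == c) for c in categories])
    cur_share = np.array([np.mean(current == c) for c in categories])

    # Floor empty categories so that log(cur / ref) stays finite.
    ref_share = np.clip(ref_share, eps, None)
    cur_share = np.clip(cur_share, eps, None)

    return float(np.sum((cur_share - ref_share) * np.log(cur_share / ref_share)))


psi = categorical_psi(["ES", "ES", "FR"], ["ES", "DE", "DE"])
print(round(psi, 4))
```

With the assumed floor, the Quick Start `country` column gives a value of roughly 18.0152, far above the 0.2 threshold, because `DE` never appears in the reference data and `FR` never appears in the current data. The result is very sensitive to `eps`, which is why the floor is worth stating explicitly whenever PSI values are compared.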

Logs

for line in report.to_logs():
    print(line)

Output:

[WARNING] psi_drift - country: PSI drift detected (18.0152)
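
Lines in this format can be forwarded to Python's standard `logging` module. The sketch below parses the bracketed prefix and maps it to a logging level. `parse_log_line` is a hypothetical helper, and the `[INFO]` and `[CRITICAL]` prefixes are assumed from the severity names in the summary; only `[WARNING]` appears in the example output.

```python
import logging
import re

# Assumed prefixes, derived from the severity names in the summary.
LEVELS = {
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "CRITICAL": logging.CRITICAL,
}
LINE_PATTERN = re.compile(r"^\[(?P<level>[A-Z]+)\]\s+(?P<message>.*)$")


def parse_log_line(line):
    """Split "[LEVEL] message" into (logging level, message).

    Hypothetical helper, not part of mldebug. Lines without a recognized
    prefix fall back to INFO.
    """
    match = LINE_PATTERN.match(line)
    if match is None:
        return logging.INFO, line
    return LEVELS.get(match["level"], logging.INFO), match["message"]


logger = logging.getLogger("mldebug.report")
level, message = parse_log_line(
    "[WARNING] psi_drift - country: PSI drift detected (18.0152)"
)
logger.log(level, message)
```

Routing through `logging` lets existing handlers and filters decide where drift warnings end up.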

Supported Checks

mldebug runs a combination of:

Numeric features

Categorical features

Documentation

See documentation pages.

Status

Active development (v0.x). APIs may evolve before v1.0.0.

See CHANGELOG.md for version history and updates.

Development Setup

Requirements

Environment Setup

git clone https://github.com/anpenta/mldebug
cd mldebug
direnv allow

Development Workflow

Tasks are managed via poe (available in the project environment via direnv).

Run tests

poe test

Run linting

poe lint

Check linting

poe lint-check

Run full CI parity checks

poe test-all
poe lint-check-all

CI/CD

CI runs tests across multiple Python versions, plus linting. All pull requests must pass these checks before merging.

See CI workflow for details.

Contributing

We welcome contributions.

  1. Clone the repository
  2. Create a feature branch
  3. Make your changes
  4. Ensure all CI checks pass
  5. Open a pull request

Dependency Management

Dependencies are managed using uv and defined in pyproject.toml.

License

See LICENSE.
