A lightweight Python package for comparing datasets and detecting unexpected changes in machine learning systems.

mldebug

Why mldebug

Machine learning systems fail silently when data changes.

Common production issues include:

  • feature distribution drift
  • increasing missing values
  • unseen categorical values
  • training vs production mismatch

mldebug provides a unified way to detect these issues before they become model failures.
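
Two of the issues listed above can be illustrated with plain NumPy. This is a minimal sketch of the underlying idea, not mldebug's implementation; the helper names `missing_rate` and `unseen_categories` are hypothetical and are not part of the mldebug API.

```python
import numpy as np


def missing_rate(values):
    """Fraction of NaN entries in a numeric array (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    return float(np.mean(np.isnan(values)))


def unseen_categories(reference, current):
    """Categories present in `current` but absent from `reference` (hypothetical helper)."""
    # setdiff1d returns the sorted unique values of `current` not found in `reference`.
    return np.setdiff1d(current, reference).tolist()


reference_country = np.array(["ES", "ES", "FR"])
current_country = np.array(["ES", "DE", "DE"])

print(unseen_categories(reference_country, current_country))  # ['DE']
print(missing_rate([1.0, np.nan, 3.0, np.nan]))               # 0.5
```

Comparing these numbers between a reference and a current dataset, and flagging large differences, is the general pattern behind this kind of check.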

What it does

mldebug compares:

  • a reference dataset (e.g. training data)
  • a current dataset (e.g. production data)

It runs a suite of checks and returns a structured report of detected issues.

Installation

pip install mldebug

Quick Start

from mldebug import run_checks
import numpy as np

reference = {
    "age": np.array([20, 21, 22]),
    "income": np.array([1000, 1200, 1100]),
    "country": np.array(["ES", "ES", "FR"]),
}

current = {
    "age": np.array([30, 35, 40]),
    "income": np.array([900, 800, 850]),
    "country": np.array(["ES", "DE", "DE"]),
}

schema = {
    "age": "numeric",
    "income": "numeric",
    "country": "categorical",
}

report = run_checks(reference=reference, current=current, schema=schema)
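
For wider datasets, writing the schema by hand gets tedious. The sketch below derives a schema dictionary in the same shape as the one above from the NumPy dtypes. `infer_schema` is a hypothetical helper, not an mldebug function, and whether mldebug offers its own schema inference is not stated here. Integer-coded categories would be labelled "numeric" by this rule, so review the result before using it.

```python
import numpy as np


def infer_schema(dataset):
    """Map each column to "numeric" or "categorical" based on its NumPy dtype.

    Hypothetical helper: not part of mldebug.
    """
    schema = {}
    for name, values in dataset.items():
        dtype = np.asarray(values).dtype
        # Integer and float dtypes count as numeric; strings, objects and
        # booleans fall through to categorical.
        schema[name] = "numeric" if np.issubdtype(dtype, np.number) else "categorical"
    return schema


reference = {
    "age": np.array([20, 21, 22]),
    "income": np.array([1000, 1200, 1100]),
    "country": np.array(["ES", "ES", "FR"]),
}

print(infer_schema(reference))
# {'age': 'numeric', 'income': 'numeric', 'country': 'categorical'}
```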

Inspect detected issues

Human-readable output

for issue in report.issues:
    print(issue)

Output:

[WARNING] psi_drift - country: PSI drift detected (18.0152)

Summary

print(report.summary())

Output:

{
  "total": 1,
  "by_severity": {
    "info": 0,
    "warning": 1,
    "critical": 0
  },
  "status": "issues_detected"
}
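
A summary of this shape lends itself to a pass/fail gate in a pipeline. The sketch below works on the summary structure shown above and does not call mldebug itself. `has_blocking_issues` is a hypothetical helper. Because the return type of `report.summary()` is not stated here, it accepts either a dictionary or a JSON string.

```python
import json


def has_blocking_issues(summary, blocking=("critical",)):
    """Return True if any severity level listed in `blocking` has a non-zero count.

    Hypothetical helper: accepts the summary as a dict or as a JSON string.
    """
    if isinstance(summary, str):
        summary = json.loads(summary)
    counts = summary["by_severity"]
    return any(counts.get(level, 0) > 0 for level in blocking)


# Same values as the summary printed above.
example_summary = {
    "total": 1,
    "by_severity": {"info": 0, "warning": 1, "critical": 0},
    "status": "issues_detected",
}

print(has_blocking_issues(example_summary))                                    # False
print(has_blocking_issues(example_summary, blocking=("warning", "critical")))  # True
```

Treating only critical issues as blocking by default keeps warnings visible without stopping a deployment; pass a wider `blocking` tuple for a stricter gate.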

Structured output

print(report.to_dict())

Output:

{
  "issues": [
    {
      "name": "psi_drift",
      "metric": "psi",
      "severity": "warning",
      "message": "country: PSI drift detected (18.0152)",
      "feature": "country",
      "value": 18.01521528247136,
      "threshold": 0.2
    }
  ]
}
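
The `psi` metric in this report is presumably the Population Stability Index, which sums (current share - reference share) * ln(current share / reference share) over all categories; 0.2 is a commonly used cut-off for a significant shift. The sketch below computes it for a categorical feature with plain NumPy. It is an illustration, not mldebug's code. The function name `categorical_psi` and the `floor` of 1e-8 used to avoid division by zero for categories missing from one side are assumptions.

```python
import numpy as np


def categorical_psi(reference, current, floor=1e-8):
    """Population Stability Index between two categorical samples.

    Sketch only: `floor` replaces zero shares so the logarithm stays finite.
    The value 1e-8 is an assumption, not a documented mldebug setting.
    """
    reference = np.asarray(reference)
    current = np.asarray(current)
    categories = np.union1d(reference, current)
    # Share of each category in each sample.
    ref_share = np.array([np.mean(reference == c) for c in categories])
    cur_share = np.array([np.mean(current == c) for c in categories])
    ref_share = np.clip(ref_share, floor, None)
    cur_share = np.clip(cur_share, floor, None)
    return float(np.sum((cur_share - ref_share) * np.log(cur_share / ref_share)))


psi = categorical_psi(
    np.array(["ES", "ES", "FR"]),
    np.array(["ES", "DE", "DE"]),
)
print(round(psi, 4))  # 18.0152
print(psi > 0.2)      # True
```

With this floor the sketch lands on the value reported above, which is suggestive but does not confirm how mldebug treats empty categories. The large number comes almost entirely from "FR" disappearing and "DE" appearing: with only three rows per sample, a single unseen category dominates the index.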

Logs

for line in report.to_logs():
    print(line)

Output:

[WARNING] psi_drift - country: PSI drift detected (18.0152)

Supported Checks

mldebug runs a combination of:

Numeric features

Categorical features

Documentation

See documentation pages.

Status

Active development (v0.x). APIs may evolve before v1.0.0.

See CHANGELOG.md for version history and updates.

Development Setup

Requirements

Environment Setup

git clone https://github.com/anpenta/mldebug
cd mldebug
direnv allow

Development Workflow

Tasks are managed via poe (available in the project environment via direnv).

Run tests

poe test

Run linting

poe lint

Check linting

poe lint-check

Run full CI parity checks

poe test-all
poe lint-check-all

CI/CD

CI runs the test suite and linting across multiple Python versions. All pull requests must pass these checks before merging.

See CI workflow for details.

Contributing

We welcome contributions.

  1. Clone the repository
  2. Create a feature branch
  3. Make your changes
  4. Ensure all CI checks pass
  5. Open a pull request

Dependency Management

Dependencies are managed using uv and defined in pyproject.toml.

License

See LICENSE.

Download files

Source Distribution

mldebug-0.1.1.tar.gz (76.2 kB view details)

Uploaded Source

Built Distribution

mldebug-0.1.1-py3-none-any.whl (12.1 kB view details)

Uploaded Python 3

File details

Details for the file mldebug-0.1.1.tar.gz.

File metadata

  • Download URL: mldebug-0.1.1.tar.gz
  • Upload date:
  • Size: 76.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mldebug-0.1.1.tar.gz

Algorithm    Hash digest
SHA256       d09a67d60e9bf2d851922fcac4e17acd957d8f1ff98303cd1023e8ca3b06490f
MD5          8a184e06a763b531b0c2f35dd6ec8bc2
BLAKE2b-256  f64df991d0b36c92c471e367263a6a4ec6d690dce1bd417a040a33e5504e444d

See more details on using hashes here.
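
A downloaded file can be checked against the published SHA256 digest with the Python standard library. This is a minimal sketch: the helper names are hypothetical, and the file path in the usage comment assumes the source distribution was saved to the current directory.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# SHA256 digest listed above for mldebug-0.1.1.tar.gz.
EXPECTED_SHA256 = "d09a67d60e9bf2d851922fcac4e17acd957d8f1ff98303cd1023e8ca3b06490f"

# Usage, assuming the file is in the current directory:
# matches_published_digest("mldebug-0.1.1.tar.gz", EXPECTED_SHA256)
```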

Provenance

The following attestation bundles were made for mldebug-0.1.1.tar.gz:

Publisher: ci.yml on anpenta/mldebug

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mldebug-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: mldebug-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mldebug-0.1.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       1a9139a9f526a52108853f625f282cf0c4723289e709bd0bf1884839ec5c1739
MD5          21d0f8011f650b334a0dd54930c2fd90
BLAKE2b-256  ba5d531bac36ae23c2f0854545f8b3285aca4bf7bd17101e4b919fe49f3fd254

See more details on using hashes here.

Provenance

The following attestation bundles were made for mldebug-0.1.1-py3-none-any.whl:

Publisher: ci.yml on anpenta/mldebug

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
