Lint for ML training pipelines: catch silent bugs (leakage, drift, schema mismatch) before they ruin your model.

dash_mlguard

Lint for ML training pipelines. One import, one call, one PDF report — catch the silent bugs that ruin models in production before you ship them.

pip install dash-mlguard            # core (pandas + numpy)
pip install "dash-mlguard[pdf]"     # adds PDF report support (fpdf2); quotes keep zsh from globbing the brackets

import dash_mlguard

report = dash_mlguard.check(X_train, y_train, X_test=X_test, y_test=y_test)
print(report)

if not report.ok():
    raise SystemExit("Fix the critical issues before training.")

That's the core call. Pandas DataFrames, NumPy arrays, dicts, and lists all work as inputs. dash_mlguard does not train any model: it's deterministic, runs in seconds, and depends only on pandas + numpy (PDF output is an optional extra).

Why this exists

Every ML pipeline has small mistakes that go unnoticed: a column derived from the label sneaks in, a scaler was fit before the split was made, two columns are byte-identical, the same user appears in train and test. Each one looks fine in code review and silently inflates your accuracy. Then production happens.

dash_mlguard catches those mistakes before you train. It's a static-analysis layer for training data, the way ESLint is for JavaScript.

It's deliberately scoped: only training-data and pipeline integrity. It doesn't train models, tune hyperparameters, or visualize distributions — pandas, sklearn, and ydata-profiling already do those things well.

What it catches

Code	Severity	What it catches
`TL001`	critical / warning	Exact-duplicate rows leaking from train into test
`TL002`	warning	Near-duplicate rows (numeric round-off contamination)
`TL003`	critical / warning / info	Target leakage — feature ↔ label association, tiered (≥0.98 / ≥0.85 / ≥0.70)
`TL004`	warning	Constant or near-constant features
`TL005`	warning	Duplicate feature columns
`TL006`	warning	Train/test distribution drift (KS for numeric, PSI for categorical)
`TL007`	critical / warning	Severe class imbalance
`TL008`	warning	Missingness rate differs between train and test
`TL009`	critical	Schema mismatch (columns or dtypes differ)
`TL010`	warning	ID-like features (cardinality ≈ row count)
`TL011`	critical / warning	Temporal leakage — test rows at or before the latest train timestamp
`TL012`	critical / warning	Group leakage — same group ID (user / session / patient) in train and test
`TL013`	critical	Preprocessing leakage — pipeline state depends on data outside the train split
`TL014`	warning	Target-aware encoder without cross-validation wrapping

Each finding tells you the affected column(s), the severity, and how to fix it — not just that something is wrong.
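To make the TL006 row concrete, here is a minimal, standard-library sketch of the two statistics it names: the two-sample Kolmogorov-Smirnov statistic for numeric columns and the Population Stability Index for categorical ones. The function names, the `eps` floor, and the sample values are assumptions for illustration; this is not dash_mlguard's implementation, and the thresholds it applies are not shown here.

```python
import math
from bisect import bisect_right
from collections import Counter


def ks_statistic(train, test):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(train), sorted(test)
    gap = 0.0
    # The maximum gap is always attained at an observed value.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def psi(train, test, eps=1e-6):
    """Population Stability Index over the categories seen in either split."""
    p, q = Counter(train), Counter(test)
    total = 0.0
    for cat in set(p) | set(q):
        # eps keeps the log finite when a category is missing from one split.
        pi = max(p[cat] / len(train), eps)
        qi = max(q[cat] / len(test), eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])          # identical samples
shifted = ks_statistic([1, 2, 3, 4], [11, 12, 13, 14])   # disjoint ranges
stable = psi(["a", "a", "b", "b"], ["a", "b", "a", "b"])
drifted = psi(["a", "a", "a", "b"], ["a", "b", "b", "b"])
print(same, shifted, stable, drifted)
```

Both statistics are zero when train and test agree and grow as the splits diverge, which is what makes them usable as a warning signal.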

Why it actually helps

The big-deal bugs in production ML aren't algorithm bugs. They're data hygiene bugs that pass code review:

A feature derived from the label sneaks in. The model gets 99% accuracy. Production gets 60%.
The same user's rows end up in train and test. Cross-validation looks great. Production looks terrible.
A timestamp column is fed in as a feature. The model overfits to row identity.
The test set was shuffled across time. Your "evaluation" is measuring interpolation inside the training window, not the ability to predict later data.
StandardScaler.fit_transform(X) was called before the train/test split. Test statistics leaked into training.

dash_mlguard.check(...) is a single call that catches these before training, with concrete fixes.
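As a rough illustration of what two of these checks amount to, the sketch below tests for group leakage (TL012) and temporal leakage (TL011) with plain Python. The helper names and the sample data are hypothetical and are not part of the dash_mlguard API; the real checks may differ in detail.

```python
from datetime import date


def group_overlap(train_groups, test_groups):
    """Group IDs present in both splits (the TL012 condition)."""
    return sorted(set(train_groups) & set(test_groups))


def temporal_overlap(train_times, test_times):
    """Count test rows dated at or before the latest train timestamp (TL011)."""
    latest_train = max(train_times)
    return sum(1 for t in test_times if t <= latest_train)


train_users = ["u1", "u2", "u3"]
test_users = ["u3", "u4"]
train_days = [date(2024, 1, 5), date(2024, 1, 20)]
test_days = [date(2024, 1, 18), date(2024, 2, 1)]

shared_users = group_overlap(train_users, test_users)
early_rows = temporal_overlap(train_days, test_days)
print(shared_users, early_rows)
```

Here one user appears in both splits and one test row predates the end of the training window, so both conditions would be reported.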

Demo: with vs without dash_mlguard

The repo ships examples/demo.py — a synthetic fraud-detection dataset (8 000 transactions, 600 users, 90-day window) with three mistakes baked into the naive pipeline:

Shuffled split instead of chronological → temporal leakage
Row-level split that puts the same users in train and test → group leakage
StandardScaler.fit_transform(X) before splitting → preprocessing leakage
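One possible way to avoid the first two mistakes is to split on time and then drop test rows whose user already appears in train. The sketch below assumes each row is a dict with hypothetical "timestamp" and "user_id" keys; it is not taken from examples/demo.py, whose honest pipeline may do this differently.

```python
def chronological_group_split(rows, cutoff):
    """Split rows by time, then drop test rows whose user also appears in train."""
    train = [r for r in rows if r["timestamp"] <= cutoff]
    train_users = {r["user_id"] for r in train}
    test = [
        r for r in rows
        if r["timestamp"] > cutoff and r["user_id"] not in train_users
    ]
    return train, test


rows = [
    {"timestamp": 1, "user_id": "u1"},
    {"timestamp": 2, "user_id": "u2"},
    {"timestamp": 3, "user_id": "u1"},   # later row from a train user: dropped
    {"timestamp": 4, "user_id": "u3"},
]
train, test = chronological_group_split(rows, cutoff=2)
print(len(train), len(test))
```

Dropping rows costs test data; the trade-off is that every remaining test row is both later than training and from an unseen user.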

Run it:

cd examples
pip install -r requirements.txt
pip install "dash-mlguard[pdf]"
python demo.py

You get this verdict:

Metric	Naive (3 bugs)	Honest (dash_mlguard-cleaned)	Inflation
accuracy	0.8717	0.8495	+0.0222
f1	0.6805	0.6569	+0.0236
roc_auc	0.9065	0.8959	+0.0106

The naive numbers look fine. They're not: they're the scores of a model that's secretly cheating. dash_mlguard flags all three bugs as critical, so report.ok() returns False for the run.
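The Inflation column is simply the naive score minus the honest score. A short check of that arithmetic, using the numbers from the table (the variable names are illustrative):

```python
naive = {"accuracy": 0.8717, "f1": 0.6805, "roc_auc": 0.9065}
honest = {"accuracy": 0.8495, "f1": 0.6569, "roc_auc": 0.8959}

# Inflation = naive score minus honest score, rounded to four decimals.
inflation = {name: round(naive[name] - honest[name], 4) for name in naive}

for name, delta in inflation.items():
    print(f"{name:<9}{delta:+.4f}")
```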

The demo also writes a single audit document — see examples/sample_report.pdf and examples/sample_report.html for what the output looks like.

Generate a PDF / HTML audit report

report = dash_mlguard.check(
    X_train, y_train, X_test, y_test,
    time_col="timestamp",        # enables TL011 (temporal leakage)
    group_key="user_id",         # enables TL012 (group leakage)
)

report.to_pdf(
    "audit.pdf",
    title="dash_mlguard audit -- fraud model v3",
    dataset_name="transactions Q1 2024",
    metrics_before={"accuracy": 0.8717, "f1": 0.6805, "roc_auc": 0.9065},
    metrics_after ={"accuracy": 0.8495, "f1": 0.6569, "roc_auc": 0.8959},
)

# Or, for embedding in a notebook / dashboard:
html = report.to_html(title="...", metrics_before=..., metrics_after=...)

The report contains: pass/fail banner, summary cards, performance comparison with deltas, every finding with what / detail / fix / columns — designed to print or share with a stakeholder.

Audit a sklearn pipeline

dash_mlguard.check() looks at data. dash_mlguard.audit_pipeline() looks at the pipeline that processes it:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
import dash_mlguard

candidate = Pipeline([
    ("scale", StandardScaler()),
    ("clf",   GradientBoostingClassifier(random_state=42)),
])

report = dash_mlguard.audit_pipeline(candidate, X, y)   # raw, unsplit X, y
print(report)

It clones the pipeline twice, fits one on the train split and one on the full dataset, and compares transform(X_test) outputs. If they diverge, the pipeline has data-dependent state (scaler stats, imputer means, encoder maps) that would leak when fit on full data — flagged as TL013 critical.

It also flags target-aware encoders (TargetEncoder, CatBoostEncoder, etc.) as TL014 if they appear without explicit CV wrapping.
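The fit-twice comparison behind TL013 can be sketched without sklearn. The version below uses a toy mean-centering transformer and copy.deepcopy as stand-ins; with a real sklearn pipeline, sklearn.base.clone would be the natural way to copy it. The class, the function name, and the n_train argument are assumptions for illustration, not dash_mlguard internals.

```python
import copy


class MeanCenterer:
    """Toy stand-in for a data-dependent transformer such as a scaler."""

    def fit(self, rows):
        self.mean_ = sum(rows) / len(rows)
        return self

    def transform(self, rows):
        return [x - self.mean_ for x in rows]


def has_data_dependent_state(transformer, X, n_train, atol=1e-6):
    """Fit one copy on the train split and one on all of X, then compare
    the two transforms of the held-out rows."""
    X_train, X_test = X[:n_train], X[n_train:]
    on_train = copy.deepcopy(transformer).fit(X_train)
    on_full = copy.deepcopy(transformer).fit(X)
    a, b = on_train.transform(X_test), on_full.transform(X_test)
    return any(abs(u - v) > atol for u, v in zip(a, b))


X = [1.0, 2.0, 3.0, 10.0, 20.0]
leaks = has_data_dependent_state(MeanCenterer(), X, n_train=3)
print(leaks)
```

The two copies disagree on the held-out rows because the mean learned from all of X includes the test values, which is exactly the state that leaks when a transformer is fit before splitting.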

API reference

dash_mlguard.check(
    X_train, y_train,
    X_test=None, y_test=None,
    *,
    task="auto",                      # "auto" | "classification" | "regression"
    time_col=None,                    # column name in X_train/X_test for TL011
    group_key=None,                   # column name OR Series for TL012
    group_key_test=None,              # defaults to group_key when it's a string
) -> Report

dash_mlguard.audit_pipeline(
    pipeline, X, y,
    *,
    task="auto",
    test_size=0.30,
    random_state=42,
    atol=1e-6,
) -> Report

Report:

report.ok() — True if no critical findings.
report.findings, report.critical, report.warnings, report.infos — lists of Finding.
print(report) — human-readable terminal summary.
report.to_dict() — JSON-serializable dict (good for CI logs / artifacts).
report.to_html(...) — single-page self-contained HTML.
report.to_pdf(path, ...) — single audit document. Requires dash_mlguard[pdf].

Each Finding has: code, severity (critical / warning / info), message, fix, columns, details.
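For readers who want a mental model of these objects, here is a hypothetical stand-in built from the fields and the ok() rule listed above. The field types and defaults are guesses, and the real classes in dash_mlguard may be structured differently.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    code: str
    severity: str            # "critical" | "warning" | "info"
    message: str
    fix: str
    columns: list = field(default_factory=list)
    details: dict = field(default_factory=dict)


@dataclass
class Report:
    findings: list = field(default_factory=list)

    @property
    def critical(self):
        return [f for f in self.findings if f.severity == "critical"]

    def ok(self):
        # Mirrors the documented rule: True if no critical findings.
        return not self.critical


report = Report([
    Finding("TL004", "warning", "Constant feature", "Drop the column", ["flag"]),
])
passes_with_warning = report.ok()
report.findings.append(
    Finding("TL009", "critical", "Schema mismatch", "Align columns and dtypes")
)
passes_with_critical = report.ok()
print(passes_with_warning, passes_with_critical)
```

Warnings alone leave ok() true; a single critical finding flips it, which is the behavior a CI gate relies on.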

Use it in CI

import sys

import dash_mlguard

report = dash_mlguard.check(
    X_train, y_train, X_test, y_test,
    time_col="timestamp", group_key="user_id",
)
report.to_pdf("audit.pdf", title="CI audit")   # optional artifact
sys.exit(0 if report.ok() else 1)

When report.ok() is False the script exits non-zero and the CI job fails; if that job is a required check, the merge is blocked. The PDF / HTML can be uploaded as a CI artifact for review.

Scope, on purpose

dash_mlguard is only a linter for training-data and pipeline-integrity bugs. It doesn't:

train models (use sklearn / lightning / xgboost),
tune hyperparameters (use Optuna / Ray Tune),
track experiments (use MLflow / W&B),
profile data (use ydata-profiling / sweetviz),
explain predictions (use SHAP / lime).

Doing one thing well is the point. If dash_mlguard.check() comes back clean, none of the checks listed above found a problem in your data, and that's all it claims.

Development

git clone https://github.com/<your-username>/dash_mlguard
cd dash_mlguard
pip install -e ".[dev]"
pytest                        # 29 tests, ~3 seconds

License

MIT — see LICENSE.

Project details

This version

0.3.0

Jun 1, 2026

Download files

Source Distribution

dash_mlguard-0.3.0.tar.gz (28.1 kB)

Uploaded Jun 1, 2026 Source

Built Distribution

dash_mlguard-0.3.0-py3-none-any.whl (24.8 kB)

Uploaded Jun 1, 2026 Python 3

File details

Details for the file dash_mlguard-0.3.0.tar.gz.

File metadata

Download URL: dash_mlguard-0.3.0.tar.gz
Upload date: Jun 1, 2026
Size: 28.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for dash_mlguard-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`c82de1f46b45e597c4b98d26e8b360427fe937ff48533ac711ea139663f2a948`
MD5	`fe6473d9b9ac0b4786b9f700235e297d`
BLAKE2b-256	`f9a32c776da1f906d4a930b0cdc1303996291a6a11ce465dcb87e22e5051f045`

File details

Details for the file dash_mlguard-0.3.0-py3-none-any.whl.

File metadata

Download URL: dash_mlguard-0.3.0-py3-none-any.whl
Upload date: Jun 1, 2026
Size: 24.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for dash_mlguard-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f5e0b0ab35fb0d5051cd83c0d9d5c6b9043f99bb55f0eb3411f897eab8bbeb89`
MD5	`ef7daa707563cbca7ac587df7dee9c88`
BLAKE2b-256	`fe93ed7bf8e0453fcb5feede3f88c2c614aed8d1a44b94871781fdab247268bb`
