Audit ML models beyond accuracy — calibration, fairness, latent health, and deployment verdicts.

These details have not been verified by PyPI

Project links

Project description

TrustLens

Audit ML models beyond accuracy — calibration, fairness, latent health, and deployment verdicts.

Quickstart · How It Works · Demo Video · Docs · Project Showcase

Your model has 92% accuracy. It's still not safe for deployment.

Accuracy measures what went right. TrustLens measures what can go wrong — in production, on subgroups, and at high confidence.

Why TrustLens

Standard evaluation stops at accuracy. Silent failures happen when:

A model is overconfident — "90% sure" but right only 60% of the time
Performance collapses on subgroups — gender, age, or region hidden inside a good aggregate score
The model is confidently wrong — high-confidence errors that indicate systemic risk
Latent representations overlap — classes bleed together where the model can't tell them apart

TrustLens surfaces all four with a single audit, and outputs a machine-readable deployment verdict.

Supported Frameworks

TrustLens uses a Prediction Resolver Architecture to automatically handle different ML frameworks:

scikit-learn — Full support for all ClassifierMixin estimators.
XGBoost — Native support for XGBClassifier and raw Booster objects.
Planned — LightGBM, CatBoost, PyTorch, TensorFlow/Keras.

TrustLens automatically detects your model's framework. You don't need to change your code when switching from sklearn to XGBoost.

Quickstart

pip install trustlens
# Extended visualization support
pip install trustlens[full]

Run a one-line audit to see why 94% accuracy isn't the full story:

from trustlens import quick_analyze

quick_analyze(dataset="breast_cancer")

TRUST SCORE: 68/100 [D]
Assessment : Low Trust — Blocked by high diagnostic risk

  Base Score        : 76
  Penalties Applied : -7.7 (Failure Risk)
  Final Score       : 68

→ Model shows high failure risk and is NOT ready for deployment.

How It Works

TrustLens runs four diagnostic modules and combines them into a single Trust Score (0–100) with a CI/CD-ready deployment verdict.

Module	What It Catches
Calibration	Confidence vs. correctness mismatch, overconfidence, ECE
Fairness	Subgroup performance gaps, equalized-odds violations
Representation	Latent space health, class separation, overlap detection
Decision Engine	Composite Trust Score + `Ready` / `Blocked` verdict

Full Audit

Automatic Detection (Sklearn / XGBoost)

from trustlens import analyze
from xgboost import XGBClassifier

model = XGBClassifier().fit(X_train, y_train)

# TrustLens automatically detects XGBoost and resolves predictions
report = analyze(
    model=model,
    X=X_test,
    y_true=y_test,
    sensitive_features={"gender": gender_test}
)

report.show()

Manual Prediction Override

For external inference systems or unsupported frameworks, you can pass predictions directly:

report = analyze(
    model=None, # optional when passing y_pred/y_prob
    X=X_test,
    y_true=y_test,
    y_pred=external_preds,
    y_prob=external_probs
)

Audit Metadata & Provenance

Every report tracks its own backend provenance for auditability:

print(report.metadata["framework"])  # "xgboost"
print(report.metadata["backend"])    # {'resolver': 'xgboost', 'framework_version': '2.0.3', ...}

Save & Export

# Save as a unified JSON artifact (best for experiment trackers)
report.save("report.json")

# Save as a full directory bundle (best for human review)
report.save("trust_report/")

Output artifacts (Directory Bundle)

trust_report/
├── trust_score.json    ← deployment verdict & composite score
├── report.json         ← raw diagnostic metrics
├── metadata.json       ← environment, version, backend provenance
├── report.txt          ← human-readable summary
└── visuals/            ← per-module diagnostic plots (PNG)

CI/CD gating

Gate model promotion on trust_score.json — no custom scripting needed:

{
  "score": 68,
  "grade": "D",
  "verdict": "Low Trust — Blocked by high failure risk",
  "is_blocked": true
}

Diagnostics in Practice

Calibration _{Does confidence align with correctness?}	Fairness & Bias _{Are subgroups treated equally?}
Latent Space Health _{Is class separation clean?}	Deployment Verdict _{Is this model safe to ship?}

Demo

15-minute walkthrough: diagnostics, trust scoring, fairness analysis, and visual dashboards.

Want a deeper look at the architecture and design decisions? → Interactive Project Showcase

Run the Full Demo

python demo.py

Generates multi-model comparisons, fairness deep-dives, latent space projections, JSON audits, and visual dashboards across all modules.

Contributing

All contributions welcome — new metrics, diagnostic plugins, and visualizations.

→ Contributing Guide · Open an Issue · Docs

Citation

@software{trustlens2026,
  author = {Shahid Ul Islam},
  title  = {TrustLens: Audit ML models beyond accuracy},
  year   = {2026},
  url    = {https://github.com/Khanz9664/TrustLens}
}

Built by Shahid Ul Islam · Portfolio · LinkedIn

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.4.0

May 15, 2026

0.3.0

May 6, 2026

0.2.0

Apr 24, 2026

0.1.2

Apr 16, 2026

0.1.1

Apr 16, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

trustlens-0.4.0.tar.gz (88.6 kB view details)

Uploaded May 15, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

trustlens-0.4.0-py3-none-any.whl (78.6 kB view details)

Uploaded May 15, 2026 Python 3

File details

Details for the file trustlens-0.4.0.tar.gz.

File metadata

Download URL: trustlens-0.4.0.tar.gz
Upload date: May 15, 2026
Size: 88.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for trustlens-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`713b74c781fe7bd8b15578fb3ffa5b63d5cfc10999677742adbd81069b6c1f4a`
MD5	`c476b0413c1148893d5e1d2bc1b078a4`
BLAKE2b-256	`33d9c4bc0f8791ab39c58cfbec0cf30485d37deb55eb8ff466b0e4d8e7b634c5`

See more details on using hashes here.

File details

Details for the file trustlens-0.4.0-py3-none-any.whl.

File metadata

Download URL: trustlens-0.4.0-py3-none-any.whl
Upload date: May 15, 2026
Size: 78.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for trustlens-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`86eec860bacbcf86da0e3f4f2889f9752356655b14c3f8f0854ab13178027687`
MD5	`a6f819deb8a7f432df47e52de512e862`
BLAKE2b-256	`7e84cb0559eb1053f530a91403ef0f4bfb8d53afc39847e14e2074d0f6ead985`

See more details on using hashes here.

trustlens 0.4.0

Navigation

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Project description

TrustLens

Audit ML models beyond accuracy — calibration, fairness, latent health, and deployment verdicts.

Why TrustLens

Supported Frameworks

Quickstart

How It Works

Full Audit

Automatic Detection (Sklearn / XGBoost)

Manual Prediction Override

Audit Metadata & Provenance

Save & Export

Output artifacts (Directory Bundle)

CI/CD gating

Diagnostics in Practice

Demo

Run the Full Demo

Contributing

Citation

Project details

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes