Analyze machine learning model reliability beyond accuracy.

TrustLens

Your model has 92% accuracy. That's not enough.

The open-source Python library that answers the questions accuracy never does. Calibration · Failure Analysis · Bias Detection · Representation Analysis — in one function call.

⭐ Star the repo to support the project!

🛠 Actively looking for contributors - beginner-friendly issues available

The Problem Nobody Ships Around

You trained a model. It hits 92% accuracy on your validation set. You ship it.

Three months later:

A minority-class user gets consistently wrong predictions.
The model is 90% confident on its worst mistakes.
A regulator asks "why did it make that decision?" — and you have no answer.

Sound familiar? You're not alone.

Accuracy tells you how often your model is right. It tells you nothing about when it fails, why it fails, or who it fails.

TrustLens makes those failures visible — before they reach production.

🎯 Who This Is For

ML Engineers building mission-critical production systems.
Data Scientists who need to justify model decisions to stakeholders.
Researchers benchmarking the reliability of new architectures.
AI Teams focused on safety, fairness, and regulatory compliance.

One-Line Magic

👉 Full reliability analysis in one line.

from trustlens import quick_analyze

# Loads dataset, trains a baseline, and runs the full analysis.
quick_analyze(dataset="breast_cancer").show()

No setup. No boilerplate. Just insight. Output includes Trust Score, calibration curves, bias metrics, and failure analysis.

🚀 Quickstart

1. Install

pip install trustlens

2. Analyze Your Model

from trustlens import analyze

report = analyze(
    model,          # any sklearn-compatible model (including XGBoost and LightGBM)
    X_val,          # validation features
    y_val,          # ground truth labels
    y_prob=proba,   # predicted probabilities, e.g. from model.predict_proba(X_val)
)

print(report.trust_score)
report.show()

3. Save & Log

# Export to JSON (perfect for CI/CD pipelines and tracking)
report.save("report.json")

# Export to human-readable TXT for sharing
report.save("report.txt")
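
One way the JSON export could be used in a CI pipeline is to fail the build when the score drops below a threshold. The sketch below is only an illustration: the schema of report.json is not documented here, so the `trust_score` key, the `MIN_TRUST_SCORE` threshold, and the `check_report` helper are all hypothetical. Inspect a real export before relying on any key name.

```python
import json
import tempfile
from pathlib import Path

MIN_TRUST_SCORE = 70.0  # hypothetical, project-specific threshold


def check_report(path, minimum=MIN_TRUST_SCORE):
    """Return (passed, score) for a saved report.

    Assumes the JSON has a top-level numeric "trust_score" key. That key
    name is a guess, not a documented part of the TrustLens format.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    score = float(data["trust_score"])
    return score >= minimum, score


# Stand-in file so the sketch runs without TrustLens installed.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "report.json"
    sample.write_text(json.dumps({"trust_score": 64.5}), encoding="utf-8")
    passed, score = check_report(sample)

print(f"trust score {score:.1f}: {'pass' if passed else 'fail'}")
```

In a real pipeline the script would typically end with a non-zero exit code (for example `sys.exit(1)`) when `passed` is false, so the CI job fails.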

Example Dashboard

Everything your team needs to see, in one presentation-ready view.

[Image: TrustLens Dashboard]

report.summary_plot()

This is what "model trust" actually looks like.

Features & Output

TrustLens goes beyond pass/fail: it explains why your model should or shouldn't be trusted, across four dimensions of model trust.

The Trust Score

A single, actionable number from 0 to 100, computed from four independently interpretable dimensions:

Dimension	What it measures	Weight
Calibration	Do probabilities reflect reality?	35%
Failure	Does confidence correlate with accuracy?	30%
Bias	Are all groups treated equally?	25%
Representation	Is the embedding space well-structured?	10%

Weights are empirically chosen and will be configurable in future releases.
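
To make the weighting concrete, here is a minimal sketch of a weighted average over the four dimensions. It assumes each dimension is already scored on a 0 to 100 scale and that the scores are combined linearly; the README does not spell out the exact aggregation, so `weighted_trust_score` and the sample scores are illustrative, not the library's implementation.

```python
# Weights from the table above, in percent.
WEIGHTS = {
    "calibration": 35,
    "failure": 30,
    "bias": 25,
    "representation": 10,
}


def weighted_trust_score(dimension_scores, weights=WEIGHTS):
    """Combine per-dimension scores (each assumed to be 0-100) into one number."""
    missing = set(weights) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in weights)
    return weighted_sum / total_weight


# Made-up dimension scores, for illustration only.
scores = {"calibration": 80, "failure": 70, "bias": 90, "representation": 60}
print(weighted_trust_score(scores))  # 77.5
```

Because calibration carries the largest weight, a 10-point drop in calibration moves this sketch's score by 3.5 points, while the same drop in representation moves it by 1 point.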

Find Your Most Dangerous Mistakes

report.show_failures(top_k=5)

Surfaces high-confidence misclassifications — the "silent killers" of production ML. TrustLens identifies where the model is certain it's right, but is actually wrong.
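
The idea behind this view can be sketched in a few lines of plain Python: keep only the misclassified samples, then sort them by the model's confidence. The `top_confident_failures` helper and the sample data below are hypothetical and say nothing about how `show_failures` is implemented internally.

```python
def top_confident_failures(y_true, y_pred, confidence, top_k=5):
    """Return (index, true, predicted, confidence) for the most confident errors."""
    errors = [
        (i, t, p, c)
        for i, (t, p, c) in enumerate(zip(y_true, y_pred, confidence))
        if t != p
    ]
    # Highest confidence first: these are the mistakes the model is surest about.
    errors.sort(key=lambda row: row[3], reverse=True)
    return errors[:top_k]


# Made-up labels, predictions, and predicted-class confidences.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0]
confidence = [0.95, 0.91, 0.55, 0.80, 0.62, 0.99]

worst = top_confident_failures(y_true, y_pred, confidence, top_k=2)
for row in worst:
    print(row)
```

Sample 5 has the highest confidence (0.99) but is correct, so it is excluded; only errors are ranked.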

Real-World Use Cases

Medical AI: Identify overconfidence in edge cases before a diagnostic model reaches a patient. TrustLens flags high ECE (>0.15) early.

Fraud Detection: Quantify your false-negative problem. A low confidence gap means your model is about as confident on the fraud it misses as on the fraud it catches.

Hiring & Lending: Automated subgroup analysis reveals performance gaps across demographics before they become regulatory liabilities.

Enterprise MLOps: Log the Trust Score to an experiment tracker such as MLflow or W&B to track its decay across training runs and automated deployments. First-class integrations are on the roadmap.
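
For readers unfamiliar with the two numbers mentioned in these use cases, the sketch below computes them from scratch. It uses the common equal-width-bin definition of Expected Calibration Error and assumes that "confidence gap" means mean confidence on correct predictions minus mean confidence on incorrect ones. That second definition is an assumption, and both function names are hypothetical rather than TrustLens APIs.

```python
def expected_calibration_error(confidence, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidence, correct):
        # Clamp so a confidence of exactly 1.0 lands in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    total = len(confidence)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


def confidence_gap(confidence, correct):
    """Mean confidence when right minus mean confidence when wrong (assumed definition)."""
    right = [c for c, ok in zip(confidence, correct) if ok]
    wrong = [c for c, ok in zip(confidence, correct) if not ok]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect prediction")
    return sum(right) / len(right) - sum(wrong) / len(wrong)


# Made-up predictions: confident, but right only half of the time.
confidence = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
correct = [True, True, False, False, True, False]

ece = expected_calibration_error(confidence, correct)
gap = confidence_gap(confidence, correct)
print(f"ECE = {ece:.3f}, flagged = {ece > 0.15}")
print(f"confidence gap = {gap:.3f}")
```

In this toy data the model is exactly as confident when wrong as when right, so the gap is zero and confidence carries no signal about which predictions to trust.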

🏗 Architecture

TrustLens is built as a modular, extensible framework:

trustlens/
├── metrics/           # Brier, ECE, Confidence Gap, Subgroup Bias
├── visualization/     # Dashboarding, Reliability Curves, Embedding Plots
├── explainability/    # [Experimental] Grad-CAM, Faithfulness tests (requires PyTorch)
├── plugins/           # Extensible plugin system for custom metrics
├── api.py             # Zero-friction entry points
├── report.py          # Serialisation and human-readable exports
└── trust_score.py     # Weighted trust consensus

Modules marked [Experimental] are functional but not part of the core pipeline. See docs/EXPERIMENTAL.md for details.
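
As a rough illustration of the kind of function that lives in a metrics module, here is a dependency-free sketch of two of the metrics named in the tree: the binary Brier score and a simple subgroup accuracy gap. The function names, signatures, and the "max minus min accuracy" definition of subgroup bias are assumptions for illustration and are not taken from the TrustLens source.

```python
from collections import defaultdict


def brier_score(y_true, y_prob):
    """Mean squared error between predicted P(y=1) and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def subgroup_accuracy_gap(y_true, y_pred, groups):
    """Return (gap, per-group accuracy); gap is max minus min group accuracy."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        counts[g] += 1
        hits[g] += int(t == p)
    accuracy = {g: hits[g] / counts[g] for g in counts}
    return max(accuracy.values()) - min(accuracy.values()), accuracy


# Made-up binary labels, positive-class probabilities, and group membership.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]
y_pred = [int(p >= 0.5) for p in y_prob]
groups = ["a", "a", "b", "b"]

brier = brier_score(y_true, y_prob)
acc_gap, per_group = subgroup_accuracy_gap(y_true, y_pred, groups)
print(f"Brier score = {brier:.3f}")
print(f"accuracy by group = {per_group}, gap = {acc_gap:.2f}")
```

A lower Brier score is better (0 is perfect), and a gap of 0 means every group sees the same accuracy.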

🌟 Contributors

Want to see your name here? Check out the issues labeled "good first issue" 👇

🛠 Contributing

TrustLens is a production-grade tool, and our community is already exploring and contributing new features. We welcome developers of all levels!

🟢 Beginner

Issues #1, #7, #13: Implementing core metrics.
Issues #4, #5: Testing and documentation improvements.

🟡 Intermediate

Issue #19: HTML Report generation.
Issue #35: Weights & Biases (W&B) integration.
Issue #51: trustlens analyze CLI support.

Comment on an issue and I will guide you! I am committed to making this a welcoming home for first-time open-source contributors. 🚀

Read the full Contributing Guide →

🛣 Roadmap Teaser

We are actively building features that make TrustLens the standard for model reliability:

CLI Support: trustlens analyze --dataset iris
Integration: First-class support for MLflow and W&B.
Fairness: Implementation of Equalized Odds and Demographic Parity.

Check the Full Roadmap for more details.

Citation

If you use TrustLens in research or production, please cite it:

@software{trustlens2026,
  author = {Shahid Ul Islam},
  title  = {TrustLens: Debug your ML models beyond accuracy},
  year   = {2026},
  url    = {https://github.com/Khanz9664/TrustLens},
}

Author

Shahid Ul Islam, ML Engineer & Creator of TrustLens
GitHub · Portfolio · LinkedIn

If TrustLens helped you understand your model better, give it a ⭐ — it helps others discover it.

Release history

0.4.0: May 15, 2026
0.3.0: May 6, 2026
0.2.0: Apr 24, 2026 (this version)
0.1.2: Apr 16, 2026
0.1.1: Apr 16, 2026

Download files

Source Distribution

trustlens-0.2.0.tar.gz (56.9 kB)

Uploaded Apr 24, 2026 Source

Built Distribution

trustlens-0.2.0-py3-none-any.whl (54.6 kB)

Uploaded Apr 24, 2026 Python 3

File details

Details for the file trustlens-0.2.0.tar.gz.

File metadata

Download URL: trustlens-0.2.0.tar.gz
Upload date: Apr 24, 2026
Size: 56.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for trustlens-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`9549a0d7856ff867787b4c6087b814bdaa6db99271c6c052b8bb26b4e6990aec`
MD5	`1ebb83fe95afe7969e9f8bd953b8267d`
BLAKE2b-256	`4541af3e9df3cbd7a7be69e105813306cb5445f0d1892419fa4e40bb7f12c722`

File details

Details for the file trustlens-0.2.0-py3-none-any.whl.

File metadata

Download URL: trustlens-0.2.0-py3-none-any.whl
Upload date: Apr 24, 2026
Size: 54.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for trustlens-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`33002c030643479c8cc3e7d3b105d0c8fcad11fc4a25ece658c7364fc8b2dc22`
MD5	`75c3487b941a2a00897bd6326cd0d3ca`
BLAKE2b-256	`52fb07062e1cfe27d4799346d374593c8f8ff263c5e63fdf9a42139904e34d35`
