Fast AI evaluator for scikit-learn models

ai-critic 🧠

The Quality Gate for Machine Learning Models

ai-critic is an intelligent evaluation and decision system designed to determine whether a machine learning model is safe, reliable, and trustworthy enough to be deployed in real-world environments.

Unlike traditional ML evaluation tools that focus almost exclusively on performance metrics, ai-critic acts as a Quality Gate — a final checkpoint that actively probes models to uncover hidden risks that frequently cause silent failures in production.

ai-critic does not ask “How accurate is this model?” It asks “Can this model be trusted in the real world?”

🎯 Why ai-critic Exists

Most production ML failures are not accuracy problems.

They are caused by:

Data leakage hidden inside features
Overfitting disguised as strong validation scores
Models that collapse under small noise
Fragile dependency on a single feature
Structurally unsafe configurations

These failures usually appear after deployment, when they are already expensive — or dangerous — to fix.

ai-critic exists to detect these risks before deployment.

🚀 Installation

Install directly from PyPI:

pip install ai-critic

Python 3.8+ is recommended.

⚡ Quick Start (Fast Verdict)

If you want a clear, conservative deployment recommendation, this is all you need.

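from ai_critic import AICritic
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Synthetic binary classification dataset
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    random_state=42
)

# Shallow random forest; note that it is passed to AICritic without calling fit()
model = RandomForestClassifier(
    max_depth=5,
    random_state=42
)

critic = AICritic(model, X, y)

# The "executive" view returns the short verdict summary
report = critic.evaluate(view="executive")

print(report)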

Example Output

Verdict: ⚠️ Risk Detected
Risk Level: medium
Deploy Recommended: False
Main Reason: Structural or robustness risks detected

If ai-critic approves deployment, it means that none of its independent checks detected a meaningful risk.

The system is intentionally skeptical by design.

🧭 What Does the Verdict Mean?

Field	Meaning
`verdict`	Human-readable summary
`risk_level`	low / medium / high
`deploy_recommended`	Final quality gate decision
`main_reason`	Primary blocking factor

Clarity is prioritized over ambiguity.

🧠 How ai-critic Thinks (Core Concept)

ai-critic is not a metric calculator. It is a decision system.

Internally, it works in three layers:

Evaluators → Detect signals and risks
Critic Gate → Decide if intervention is needed
Deployment Policy → Decide if deployment is safe
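The three layers above can be pictured with a small, self-contained sketch. This is not ai-critic's internal code: the `Signal` class, the function names, the 0-to-1 severity scale, and both thresholds are hypothetical and exist only to illustrate how signals could flow from evaluators through a gate into a deployment decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Signal:
    """One finding emitted by an evaluator (hypothetical structure)."""
    pillar: str      # e.g. "robustness"
    severity: float  # assumed scale: 0.0 (no concern) to 1.0 (severe)
    message: str


def run_evaluators(evaluators: List[Callable[[], List[Signal]]]) -> List[Signal]:
    """Layer 1: collect signals from every evaluator."""
    signals: List[Signal] = []
    for evaluate in evaluators:
        signals.extend(evaluate())
    return signals


def critic_gate(signals: List[Signal], threshold: float = 0.3) -> List[Signal]:
    """Layer 2: keep only signals strong enough to warrant intervention."""
    return [s for s in signals if s.severity >= threshold]


def deployment_policy(flagged: List[Signal], block_at: float = 0.7) -> Dict[str, object]:
    """Layer 3: turn the flagged signals into a deploy / no-deploy decision."""
    worst = max((s.severity for s in flagged), default=0.0)
    if worst >= block_at:
        risk = "high"
    elif flagged:
        risk = "medium"
    else:
        risk = "low"
    main_reason = max(flagged, key=lambda s: s.severity).message if flagged else None
    return {
        "risk_level": risk,
        "deploy_recommended": risk == "low",
        "main_reason": main_reason,
    }


# Two toy evaluators standing in for real checks.
def robustness_check() -> List[Signal]:
    return [Signal("robustness", 0.5, "Accuracy drops sharply under input noise")]


def structure_check() -> List[Signal]:
    return [Signal("model_structure", 0.1, "Tree depth is moderate")]


signals = run_evaluators([robustness_check, structure_check])
decision = deployment_policy(critic_gate(signals))
print(decision)
```

In this toy run the weak structural signal is filtered out by the gate, while the robustness signal passes and is enough to withhold approval.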

🧱 The Four Pillars of the Audit

ai-critic evaluates models across four independent risk dimensions:

Pillar	Detects	Why It Matters
📊 Data Integrity	Leakage, shortcuts, correlations	Inflated metrics
🧠 Model Structure	Over-complexity, unsafe configs	Poor generalization
📈 Performance Sanity	Suspicious CV behavior	False confidence
🧪 Robustness	Noise sensitivity	Production collapse

Each pillar emits signals, not binary judgments.

Those signals are aggregated by the Critic Gate.
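As one illustration of what a Data Integrity signal might look like, the sketch below flags features that are almost perfectly correlated with the target, a common symptom of leakage. The function name `suspicious_features` and the 0.95 threshold are assumptions made for this example; ai-critic's actual checks may work differently.

```python
import numpy as np


def suspicious_features(X, y, threshold=0.95):
    """Return indices of features whose absolute Pearson correlation
    with the target is at least `threshold`."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    flagged = []
    for j in range(X.shape[1]):
        col = X[:, j]
        if col.std() == 0:  # constant column: correlation is undefined
            continue
        r = np.corrcoef(col, y)[0, 1]
        if abs(r) >= threshold:
            flagged.append(j)
    return flagged


rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
X = rng.normal(size=(500, 4))
X[:, 2] = y + rng.normal(scale=0.01, size=500)  # deliberately leaked column

print(suspicious_features(X, y))
```

A correlation check only catches linear, single-feature leakage, which is why it would be one signal among several rather than a verdict on its own.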

🧪 Robustness Testing (Noise Injection)

Production data is never clean.

ai-critic injects controlled noise into inputs and measures degradation:

robustness = report["details"]["robustness"]

print(robustness["performance_drop"])
print(robustness["verdict"])

Possible outcomes:

stable → acceptable degradation
fragile → high sensitivity
misleading → likely inflated performance
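A minimal sketch of the noise-injection idea, using only scikit-learn and NumPy, is shown below. It is an illustration under stated assumptions rather than ai-critic's implementation: the helper `noise_robustness`, the Gaussian noise model, the `noise_scale` parameter, and the 0.05 cut-off are all hypothetical, and only the stable and fragile outcomes are modelled.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def noise_robustness(model, X_test, y_test, noise_scale=0.1, seed=0):
    """Compare accuracy on clean inputs and on inputs with Gaussian noise added."""
    rng = np.random.default_rng(seed)
    # Scale the noise by each feature's standard deviation so that
    # every column is perturbed proportionally.
    noise = rng.normal(size=X_test.shape) * X_test.std(axis=0) * noise_scale
    clean = model.score(X_test, y_test)
    noisy = model.score(X_test + noise, y_test)
    return {
        "clean_score": clean,
        "noisy_score": noisy,
        "performance_drop": clean - noisy,
    }


X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = RandomForestClassifier(max_depth=5, random_state=42).fit(X_train, y_train)

result = noise_robustness(model, X_test, y_test)
# The 0.05 cut-off is an arbitrary threshold chosen for illustration.
result["verdict"] = "stable" if result["performance_drop"] <= 0.05 else "fragile"
print(result)
```

Evaluating on a held-out split matters here: measuring the drop on training data would mix noise sensitivity with memorization.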

🔍 Explainability & Feature Sensitivity

ai-critic performs feature sensitivity analysis to detect:

Feature-level leakage
Over-reliance on a single signal
Shortcut learning

How it works:

A feature is perturbed or permuted
The model is re-evaluated
Performance drop is measured

Large drops indicate critical dependency.

This approach is:

Model-agnostic
Lightweight
Interpretable
Framework-independent
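The perturb, re-evaluate, measure loop described above can be sketched in a few lines. The helper `permutation_drops` is a hypothetical name used for this example, and the code shows the general permutation technique, not ai-critic's own routine.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def permutation_drops(model, X, y, seed=0):
    """Return the accuracy drop caused by shuffling each feature in turn."""
    rng = np.random.default_rng(seed)
    baseline = model.score(X, y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        # Shuffling one column breaks its link to the target
        # while keeping its marginal distribution.
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        drops[j] = baseline - model.score(X_perm, y)
    return drops


# shuffle=False keeps the two informative features in columns 0 and 1.
X, y = make_classification(
    n_samples=1000, n_features=5, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, shuffle=False, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = LogisticRegression().fit(X_train, y_train)

drops = permutation_drops(model, X_test, y_test)
for j, drop in enumerate(drops):
    print(f"feature {j}: accuracy drop {drop:+.3f}")
```

Because the loop only calls `score`, it works with any estimator that follows the scikit-learn interface. scikit-learn also ships a maintained version of the same idea as `sklearn.inspection.permutation_importance`.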

🧠 Recommendations Engine

ai-critic does not stop at “deploy or not”.

It also reports findings and actionable recommendations, such as:

Reduce model complexity
Increase regularization
Possible data leakage detected
High noise sensitivity
Structural overfitting signals

These recommendations are rule-based and data-driven; no LLM is involved in generating them.
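To make "rule-based" concrete, here is a small sketch in which each recommendation is tied to a fixed condition on measured values. The signal names (`train_score`, `cv_score`, `max_feature_target_corr`, `performance_drop`) and every threshold are assumptions invented for this example and do not describe ai-critic's real rule set.

```python
def recommend(signals):
    """Map measured signals to recommendation strings using fixed rules."""
    rules = [
        # A large gap between training and cross-validation scores suggests overfitting.
        (lambda s: s.get("train_score", 0) - s.get("cv_score", 0) > 0.10,
         "Reduce model complexity or increase regularization"),
        # A feature that almost equals the target is a leakage suspect.
        (lambda s: s.get("max_feature_target_corr", 0) > 0.95,
         "Possible data leakage detected"),
        # A large accuracy drop under noise indicates fragility.
        (lambda s: s.get("performance_drop", 0) > 0.05,
         "High noise sensitivity"),
    ]
    return [message for condition, message in rules if condition(signals)]


signals = {
    "train_score": 0.99,
    "cv_score": 0.84,
    "max_feature_target_corr": 0.40,
    "performance_drop": 0.12,
}
print(recommend(signals))
```

Keeping the rules as plain data makes every recommendation traceable to the measurement that triggered it.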

🚦 Deployment Decision

The final decision is produced via:

decision = critic.deploy_decision()

print(decision)

Output includes:

Deployment approval or rejection
Risk level
ML confidence score
Blocking issues
Recommendations

🧠 Critic Gate (New)

The Critic Gate decides whether suggestions should even be made.

This prevents:

Over-criticism
Noise-based warnings
Fatigue from excessive suggestions

The gate considers:

Overall score
Dataset size
Verdict severity
Structural risk

This turns ai-critic into a judgment system, not a nagging tool.
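The gate logic can be imagined as a short function over the four inputs listed above. The sketch below is purely illustrative: `should_suggest`, its argument names, and the `min_samples` and `score_ok` defaults are hypothetical, and the real gate may weigh these inputs differently.

```python
def should_suggest(overall_score, n_samples, verdict_severity, structural_risk,
                   min_samples=200, score_ok=0.85):
    """Decide whether suggestions should be shown at all."""
    # Severe verdicts and structural risks always pass the gate.
    if verdict_severity == "high" or structural_risk:
        return True
    # With very little data, the remaining signals are too noisy to act on.
    if n_samples < min_samples:
        return False
    # A healthy score combined with a low-severity verdict: stay quiet.
    if verdict_severity == "low" and overall_score >= score_ok:
        return False
    return True


print(should_suggest(0.92, 5000, "low", structural_risk=False))    # healthy model
print(should_suggest(0.92, 5000, "medium", structural_risk=False)) # worth a comment
print(should_suggest(0.70, 50, "high", structural_risk=False))     # severe: always speak
```

Ordering the checks from most to least severe is what keeps small datasets from silencing a genuinely serious finding.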

🔄 Feedback Loop & Learning Critic

ai-critic can learn from outcomes.

You can optionally provide feedback:

ai-critic --feedback success

This enables:

Smarter future decisions
Better thresholds
Context-aware criticism

The critic improves without exposing your data.

🖥️ Command Line Interface (CLI)

ai-critic ships with a command-line interface:

ai-critic \
  --model model.pkl \
  --data dataset.csv \
  --target label

CLI output includes:

Gate decision
Deployment recommendation
Risk level
Suggestions

Use --json for automation and pipelines.
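In a CI pipeline, the JSON output could be turned into a pass/fail exit code along the following lines. The JSON schema is not documented in this README, so the sketch assumes the output is a single object that reuses the field names from the verdict table (`deploy_recommended`, `risk_level`); `gate_exit_code` is a hypothetical helper, and the sample string is hand-written, not captured from a real run.

```python
import json


def gate_exit_code(json_text: str) -> int:
    """Return 0 if deployment is recommended, 1 otherwise.

    Assumes the report is a JSON object with a boolean
    "deploy_recommended" field (an assumption, see the lead-in).
    """
    report = json.loads(json_text)
    # Anything other than an explicit True is treated as a failure,
    # so a missing or malformed field blocks the pipeline.
    return 0 if report.get("deploy_recommended") is True else 1


# Hand-written sample mirroring the fields shown in Example Output.
sample = '{"verdict": "Risk Detected", "risk_level": "medium", "deploy_recommended": false}'
print(gate_exit_code(sample))
```

Failing closed on a missing field matches the tool's skeptical stance: an unreadable report should never count as approval.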

🧩 Multi-Framework Support

Supported via adapters:

scikit-learn
PyTorch
TensorFlow

The API remains consistent.

🛡️ What ai-critic Is NOT

❌ A hyperparameter optimizer
❌ A leaderboard benchmarking tool
❌ A replacement for domain expertise
❌ A blind approval system

🧠 Design Philosophy

ai-critic assumes:

Metrics can lie
Data is imperfect
Models fail silently
Trust must be earned

That makes it ideal as a final quality gate, not a tuning toy.

🧠 Final Note

ai-critic is not here to make models look good. It exists to prevent unsafe models from looking good enough to deploy.

A failed audit does not mean your model is bad. It means your model is not yet safe to trust.

That distinction is everything.

ai-critic 2.1.0 (released Feb 9, 2026)
