Graph-based evaluation engine for machine learning models

🚀 AI Critic 3.5.0 (Production Readiness Engine)

pip install ai-critic

AI Critic is a graph-based evaluation engine for machine learning models, designed to go beyond isolated metrics.

It runs a structured evaluation pipeline that analyzes multiple dimensions — performance, robustness, explainability, data quality, and structure — delivering a unified, interpretable, and actionable report.

🔥 WHAT’S NEW IN 3.5.0

🧠 Production-First Design

One-line evaluation: evaluate()
Simplified API for fast adoption
Built for real-world deployment decisions

⚡ Standard Usage (NEW)

AI Critic is now designed to be used right after training:

import ai_critic

report = ai_critic.evaluate(model, X, y)

🚫 Quality Gate (NEW — CRITICAL)

Turn evaluation into a deployment decision:

from ai_critic import evaluate
from ai_critic.gate import enforce

report = evaluate(model, X, y)

enforce(report, threshold=75)

If the report does not meet the threshold, deployment is blocked.
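
To make the gating idea concrete, here is a minimal, self-contained sketch of what such a check could look like. It is not the library's implementation: the names `enforce_sketch` and `QualityGateError` are hypothetical, and it assumes that the score lives at report["risk"]["global_score"], that higher scores are better, and that a failed gate is signalled by raising an exception.

```python
class QualityGateError(RuntimeError):
    """Raised when a report does not meet the required score (hypothetical name)."""


def enforce_sketch(report: dict, threshold: float = 75.0) -> float:
    """Return the global score, or raise if it is below the threshold.

    Assumption: higher scores are better and the score is stored at
    report["risk"]["global_score"].
    """
    score = float(report["risk"]["global_score"])
    if score < threshold:
        raise QualityGateError(
            f"global score {score:.1f} is below the required {threshold:.1f}"
        )
    return score


# Illustrative reports; the numbers are made up for this sketch.
passing = {"risk": {"global_score": 78.5}}
failing = {"risk": {"global_score": 61.0}}

print(enforce_sketch(passing))  # passes the gate and returns the score

try:
    enforce_sketch(failing)
except QualityGateError as exc:
    print(f"deployment blocked: {exc}")
```

Raising an exception is one way to stop a deployment script, since any uncaught exception ends the process with a non-zero exit status.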

📦 Standardized Report (JSON-first)

All results follow the same schema:

report = {
    "scores": {},        # technical scores (0–1)
    "details": {},       # raw evaluator outputs
    "risk": {},          # interpretable score (0–100)
    "summary": {},       # human-readable insights
    "suggestions": []    # recommended actions
}

👉 This makes AI Critic:

API-ready
Easy to log and persist
Production-ready
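
As an illustration of why a plain-dict schema is easy to log and persist, the sketch below round-trips a report through the standard json module and checks that the five top-level keys are present. The values are made up, and `missing_keys` is a hypothetical helper, not part of AI Critic.

```python
import json

# Illustrative report; all values are invented for this sketch.
report = {
    "scores": {"performance": 0.91, "robustness": 0.74},
    "details": {},
    "risk": {"global_score": 78.5, "verdict": "usable_with_caution"},
    "summary": {"executive_summary": {"deploy_recommended": False}},
    "suggestions": ["Check for data leakage"],
}

REQUIRED_KEYS = {"scores", "details", "risk", "summary", "suggestions"}


def missing_keys(candidate: dict) -> set:
    """Return the top-level schema keys that the candidate report lacks."""
    return REQUIRED_KEYS - candidate.keys()


# Serialize for logging or storage, then load it back.
payload = json.dumps(report, indent=2, sort_keys=True)
restored = json.loads(payload)

print(restored == report)      # True: the report survives the round trip
print(missing_keys(restored))  # set()
```

If evaluator outputs contained NumPy arrays, they would need converting first (for example with `.tolist()`), because `json.dumps` only handles built-in types by default.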

⚡ Improved Graph Engine

Dependency-aware execution (topological sort)
Parallel execution support
Deterministic evaluation order
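
The three properties above can be sketched with the standard library alone. This is an assumption-laden illustration, not AI Critic's engine: the dependency map, `run_node`, and `run_graph` are hypothetical, and the evaluators are placeholders. Nodes whose dependencies are satisfied are run together as a batch, and each batch is sorted so the order is reproducible.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical dependency map: node -> set of nodes it depends on.
DEPENDENCIES = {
    "performance": set(),
    "data_quality": set(),
    "robustness": {"performance"},
    "explainability": {"robustness"},
}


def run_node(name: str) -> tuple:
    """Stand-in evaluator: returns the node name and a placeholder result."""
    return name, {"score": 1.0}


def run_graph(dependencies: dict) -> list:
    """Run nodes batch by batch and return the batches in execution order."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            # Sort the ready nodes so the reported order is deterministic.
            ready = sorted(sorter.get_ready())
            # Nodes in one batch have no unmet dependencies, so they can
            # safely run concurrently.
            for name, _result in pool.map(run_node, ready):
                sorter.done(name)
            batches.append(ready)
    return batches


print(run_graph(DEPENDENCIES))
# [['data_quality', 'performance'], ['robustness'], ['explainability']]
```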

🎯 Multi-layer Scoring System

Technical score (0–1) → aggregation layer
Risk score (0–100) → decision layer

💡 Integrated Suggestion Engine

Automatically generates recommendations based on model behavior

🧩 Plugin System

Clean evaluator interface
Dependency-aware plugins
Easily extensible evaluation pipeline

⚡ QUICK START

🧠 One-liner (recommended)

import ai_critic

report = ai_critic.evaluate(model, X, y)

print(report["risk"])
print(report["summary"])

🔐 Production usage (recommended)

from ai_critic import evaluate
from ai_critic.gate import enforce

report = evaluate(model, X, y)

# 🚫 blocks bad models
enforce(report, threshold=75)

🧪 Full control (advanced)

from api.client import AICritic
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

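# Load a small example dataset and train a model
data = load_iris()
X, y = data.data, data.target

# In practice, evaluate on held-out data rather than the training set
model = RandomForestClassifier().fit(X, y)

# Give robustness 1.5x the weight of performance
critic = AICritic(weights={
    "performance": 1.0,
    "robustness": 1.5
})

# Run the evaluation with parallel execution enabled
report = critic.evaluate(model, X, y, parallel=True)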

🧩 INTERNAL PIPELINE

evaluate()
   ↓
EvaluationGraph (nodes)
   ↓
raw_results
   ↓
ScoreAggregator (0–1)
   ↓
build_report()
   ↓
scoring.py (risk 0–100)
   ↓
summary.py (human-readable)
   ↓
SuggestionEngine

🧱 CORE COMPONENTS

1. Evaluation Graph

A DAG-based execution system:

Automatically resolves dependencies
Executes nodes in dependency order
Enables parallel execution

Example:

performance → robustness → explainability
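
The chain above can be resolved with a topological sort. The sketch below uses Python's standard graphlib module to order that chain and to show how a circular dependency would be detected; it assumes nothing about how AI Critic represents its graph internally.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to the nodes it depends on.
chain = {
    "performance": [],
    "robustness": ["performance"],
    "explainability": ["robustness"],
}

order = list(TopologicalSorter(chain).static_order())
print(order)  # ['performance', 'robustness', 'explainability']

# A graph with a cycle cannot be ordered; graphlib reports it.
cyclic = {"a": ["b"], "b": ["a"]}
try:
    list(TopologicalSorter(cyclic).static_order())
except CycleError as exc:
    print("cycle detected:", exc.args[1])
```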

2. Score Aggregator

Combines evaluator outputs into a single technical score (0–1) using per-evaluator weights:

critic = AICritic(weights={
    "performance": 1.0,
    "robustness": 2.0
})
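
One plausible way to combine weighted scores is a weighted mean. The sketch below assumes that formula for illustration only; the README does not state which aggregation AI Critic uses, and `aggregate` is a hypothetical function.

```python
def aggregate(scores: dict, weights: dict, default_weight: float = 1.0) -> float:
    """Weighted mean of evaluator scores (assumed formula, for illustration)."""
    weighted_sum = 0.0
    total_weight = 0.0
    for name, score in scores.items():
        # Evaluators without an explicit weight fall back to the default.
        weight = weights.get(name, default_weight)
        weighted_sum += weight * score
        total_weight += weight
    if total_weight == 0:
        raise ValueError("at least one evaluator needs a non-zero weight")
    return weighted_sum / total_weight


scores = {"performance": 0.9, "robustness": 0.6}
weights = {"performance": 1.0, "robustness": 2.0}

# (1.0 * 0.9 + 2.0 * 0.6) / (1.0 + 2.0) = 0.7
print(round(aggregate(scores, weights), 3))
```

With these weights, the weaker robustness score pulls the result down more than an unweighted mean (0.75) would.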

3. Evaluator Plugins

Fully extensible via plugins:

from ai_critic.plugins.base import EvaluatorPlugin
from ai_critic.plugins.registry import EvaluatorRegistry

class FairnessEvaluator(EvaluatorPlugin):
    name = "fairness"
    dependencies = ["performance"]
    weight = 1.0

    def evaluate(self, model, dataset, context=None):
        return {
            "score": 0.92,
            "verdict": "stable",
            "message": "Fairness is acceptable"
        }

EvaluatorRegistry.register(FairnessEvaluator())
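
To show how the pieces of a plugin interface could fit together, here is a self-contained sketch with stand-in classes. `EvaluatorPluginSketch` and `RegistrySketch` are hypothetical and only mirror the attributes used above (name, dependencies, weight, evaluate); the real base class and registry may behave differently.

```python
from abc import ABC, abstractmethod


class EvaluatorPluginSketch(ABC):
    """Stand-in base class mirroring the attributes shown above."""

    name: str = ""
    dependencies: list = []
    weight: float = 1.0

    @abstractmethod
    def evaluate(self, model, dataset, context=None) -> dict:
        """Return a dict with at least a "score" entry."""


class RegistrySketch:
    """Stand-in registry that stores plugins by name."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin: EvaluatorPluginSketch) -> None:
        if not plugin.name:
            raise ValueError("plugin needs a non-empty name")
        if plugin.name in self._plugins:
            raise ValueError(f"duplicate plugin: {plugin.name}")
        self._plugins[plugin.name] = plugin

    def missing_dependencies(self) -> dict:
        """Map each plugin to the dependencies that are not registered."""
        missing = {}
        for name, plugin in self._plugins.items():
            absent = [d for d in plugin.dependencies if d not in self._plugins]
            if absent:
                missing[name] = absent
        return missing


class FairnessSketch(EvaluatorPluginSketch):
    name = "fairness"
    dependencies = ["performance"]

    def evaluate(self, model, dataset, context=None):
        return {"score": 0.92, "verdict": "stable", "message": "Fairness is acceptable"}


registry = RegistrySketch()
registry.register(FairnessSketch())

# "performance" has not been registered in this sketch, so it is reported.
print(registry.missing_dependencies())  # {'fairness': ['performance']}
```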

4. Risk Scoring (0–100)

Transforms technical signals into decision-ready output:

report["risk"] = {
    "global_score": 78.5,
    "verdict": "usable_with_caution",
    "component_scores": {...},
    "penalties": [...]
}
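
A rough sketch of how a 0–1 technical score might be turned into a 0–100 score with a verdict follows. Everything here is assumed for illustration: the penalty format, the band boundaries (85 and 60), and the verdict names other than "usable_with_caution" are hypothetical and are not taken from AI Critic.

```python
def risk_report(technical_score: float, penalties: list) -> dict:
    """Scale a 0-1 score to 0-100, subtract penalty points, assign a verdict."""
    if not 0.0 <= technical_score <= 1.0:
        raise ValueError("technical_score must be within [0, 1]")
    raw = technical_score * 100.0 - sum(p["points"] for p in penalties)
    # Clamp to the 0-100 range and keep one decimal place.
    global_score = round(min(100.0, max(0.0, raw)), 1)
    # Hypothetical bands; the real thresholds are not documented here.
    if global_score >= 85:
        verdict = "ready"
    elif global_score >= 60:
        verdict = "usable_with_caution"
    else:
        verdict = "not_ready"
    return {"global_score": global_score, "verdict": verdict, "penalties": penalties}


print(risk_report(0.835, [{"reason": "possible data leakage", "points": 5.0}]))
```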

5. Human Summary

High-level interpretation:

report["summary"] = {
    "executive_summary": {
        "verdict": "⚠️ Risky",
        "deploy_recommended": False
    }
}

6. Suggestion Engine

Actionable insights:

[
    "Check for data leakage",
    "Improve robustness with regularization"
]
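
Suggestions like these could come from simple rules over the scores. The sketch below is one hypothetical way to do that; the rule thresholds (0.99 and 0.7) are invented, and the README does not describe how the actual suggestion engine decides.

```python
# Each rule pairs a predicate over the scores with a suggestion message.
# Thresholds are illustrative assumptions.
RULES = [
    (lambda s: s.get("performance", 0.0) > 0.99, "Check for data leakage"),
    (lambda s: s.get("robustness", 1.0) < 0.7, "Improve robustness with regularization"),
]


def suggest(scores: dict) -> list:
    """Return the messages of all rules whose predicate matches the scores."""
    return [message for predicate, message in RULES if predicate(scores)]


print(suggest({"performance": 0.995, "robustness": 0.55}))
# ['Check for data leakage', 'Improve robustness with regularization']
```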

🖥️ CLI

Run directly from the terminal:

ai-critic --model model.pkl --data dataset.csv --target label

🔥 CI/CD Mode (recommended)

ai-critic --model model.pkl --data dataset.csv --target label --fail-on-risk

👉 Fails automatically if model risk is too high.
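
In a Python-based pipeline step, the same command could be launched with subprocess. The sketch below only uses the flags shown above; `build_command` and `run_gate` are hypothetical helpers, and it assumes that a failed check is reported through a non-zero exit code, which the README implies but does not spell out.

```python
import subprocess


def build_command(model_path: str, data_path: str, target: str) -> list:
    """Assemble the CLI call using only the flags shown in this README."""
    return [
        "ai-critic",
        "--model", model_path,
        "--data", data_path,
        "--target", target,
        "--fail-on-risk",
    ]


def run_gate(model_path: str, data_path: str, target: str) -> int:
    """Run the CLI and return its exit code (assumed non-zero on failure)."""
    completed = subprocess.run(build_command(model_path, data_path, target))
    return completed.returncode


# Printing the command is safe even where ai-critic is not installed.
print(" ".join(build_command("model.pkl", "dataset.csv", "label")))
```

A CI step would then call `sys.exit(run_gate(...))` so the job status follows the result of the check.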

🧠 DESIGN PHILOSOPHY

1. Single Source of Truth

One unified data format → no inconsistencies

2. Graph-first Thinking

Evaluation is a dependency-driven pipeline, not isolated functions

3. JSON-native

Everything is ready for:

APIs
dashboards
logging
SaaS platforms

4. Actionable AI

Not just metrics — decisions:

Should you deploy?
Where is the risk?
What should be improved?

🔥 POSITIONING

AI Critic is not just a metrics library.

It is a:

🧠 Production gatekeeper for machine learning models

🚀 ROADMAP

REST API (/evaluate)
Visual dashboard
Model monitoring (post-deployment)
Continuous evaluation (CI/CD)
Global benchmarking between models

📄 LICENSE

MIT License

Release history

3.5.1 (May 6, 2026)
3.5.0 (Apr 18, 2026), this version
3.4.6 (Apr 14, 2026)
3.4.5 (Apr 5, 2026)
3.4.1 (Apr 5, 2026)
3.3.0 (Mar 22, 2026)
3.2.0 (Mar 16, 2026)
3.0.0 (Feb 15, 2026)
2.1.0 (Feb 9, 2026)
2.0.0 (Feb 4, 2026)
1.2.0 (Jan 29, 2026)
1.1.0 (Jan 27, 2026)
1.0.0 (Jan 25, 2026)
0.2.5 (Jan 25, 2026)
0.2.4 (Jan 23, 2026)
0.2.3 (Jan 23, 2026)
0.2.2 (Jan 22, 2026)
0.2.1 (Jan 19, 2026)
0.2.0 (Jan 18, 2026)
0.1.0 (Jan 18, 2026)

Download files

Source Distribution

ai_critic-3.5.0.tar.gz (22.2 kB)

Uploaded Apr 18, 2026 Source

Built Distribution

ai_critic-3.5.0-py3-none-any.whl (29.1 kB)

Uploaded Apr 18, 2026 Python 3

File details

Details for the file ai_critic-3.5.0.tar.gz.

File metadata

Download URL: ai_critic-3.5.0.tar.gz
Upload date: Apr 18, 2026
Size: 22.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for ai_critic-3.5.0.tar.gz
Algorithm	Hash digest
SHA256	`1c30985756fa416bd47c8fd0211a0429dd9e52e29c3e143dc90ba61c3abbfe38`
MD5	`bdc03a5c205f628f9251c1f5c56abf54`
BLAKE2b-256	`3be11297cecfca80d120a486fe3852bc359b4750de66f98abd17fdaaa5fe319a`

File details

Details for the file ai_critic-3.5.0-py3-none-any.whl.

File metadata

Download URL: ai_critic-3.5.0-py3-none-any.whl
Upload date: Apr 18, 2026
Size: 29.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for ai_critic-3.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e1ccdcd3b740c6a652a2417ab9de0a3f1dd75cefa5efa9e21eb770f532317664`
MD5	`44e234ced832dcc06df6255abc9779d4`
BLAKE2b-256	`ceb8793697695ff2255757446532d9a979b3f0aae9ba3773a5fcd3f3be9c9476`
