Graph-based evaluation engine for machine learning models

🚀 AI Critic 3.5.0 (Production Readiness Engine)

pip install ai-critic

AI Critic is a graph-based evaluation engine for machine learning models, designed to go beyond isolated metrics.

It runs a structured evaluation pipeline that analyzes multiple dimensions — performance, robustness, explainability, data quality, and structure — delivering a unified, interpretable, and actionable report.


🔥 WHAT’S NEW IN 3.5.0

🧠 Production-First Design

  • One-line evaluation: evaluate()
  • Simplified API for fast adoption
  • Built for real-world deployment decisions

⚡ Standard Usage (NEW)

AI Critic is now designed to be used right after training:

import ai_critic

report = ai_critic.evaluate(model, X, y)

🚫 Quality Gate (NEW — CRITICAL)

Turn evaluation into a deployment decision:

from ai_critic import evaluate
from ai_critic.gate import enforce

report = evaluate(model, X, y)

enforce(report, threshold=75)

If the report does not clear the threshold → deployment is blocked.
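To make the gate idea concrete, here is a minimal stand-alone sketch of what such a check could look like. It does not use the library's own `enforce`. The names `enforce_sketch` and `QualityGateError` are hypothetical, and it assumes that the gate compares `report["risk"]["global_score"]` with the threshold and that higher scores are better.

```python
class QualityGateError(RuntimeError):
    """Raised when a report does not clear the gate (hypothetical name)."""


def enforce_sketch(report: dict, threshold: float = 75) -> float:
    """Return the global score, or raise if it is below `threshold`.

    Assumption: higher scores are better, and the gate looks at
    report["risk"]["global_score"]. The real enforce() may differ.
    """
    score = float(report["risk"]["global_score"])
    if score < threshold:
        raise QualityGateError(
            f"global_score {score:.1f} is below threshold {threshold}"
        )
    return score


# Sample report fragment using the values shown in this README.
sample = {"risk": {"global_score": 78.5, "verdict": "usable_with_caution"}}

enforce_sketch(sample, threshold=75)  # clears the gate

try:
    enforce_sketch(sample, threshold=90)
except QualityGateError as exc:
    print(f"blocked: {exc}")
```

Raising an exception (rather than returning a boolean) means a deployment script stops by default unless the caller deliberately handles the failure.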


📦 Standardized Report (JSON-first)

All results follow the same schema:

report = {
    "scores": {},        # technical scores (0–1)
    "details": {},       # raw evaluator outputs
    "risk": {},          # interpretable score (0–100)
    "summary": {},       # human-readable insights
    "suggestions": []    # recommended actions
}

👉 This makes AI Critic:

  • API-ready
  • Easy to log and persist
  • Production-ready
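As an illustration of the logging and persistence point, the sketch below checks that a report-shaped dictionary carries the five top-level keys of the schema and round-trips it through JSON. The helper names (`missing_keys`, `dump_report`) and the sample values are hypothetical. It also assumes the report contains only JSON-serializable values, which may not hold if evaluators return NumPy arrays.

```python
import json

# Top-level keys of the report schema shown above.
REQUIRED_KEYS = ("scores", "details", "risk", "summary", "suggestions")


def missing_keys(report: dict) -> list[str]:
    """Return the schema keys that are absent from `report`."""
    return [key for key in REQUIRED_KEYS if key not in report]


def dump_report(report: dict) -> str:
    """Serialize a report to JSON after a basic schema check."""
    missing = missing_keys(report)
    if missing:
        raise ValueError(f"report is missing keys: {missing}")
    return json.dumps(report, indent=2, sort_keys=True, ensure_ascii=False)


# Illustrative values only; a real report comes from evaluate().
sample_report = {
    "scores": {"performance": 0.91},
    "details": {},
    "risk": {"global_score": 78.5, "verdict": "usable_with_caution"},
    "summary": {},
    "suggestions": ["Check for data leakage"],
}

serialized = dump_report(sample_report)
restored = json.loads(serialized)
print(restored["risk"]["global_score"])
```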

⚡ Improved Graph Engine

  • Dependency-aware execution (topological sort)
  • Parallel execution support
  • Deterministic evaluation order

🎯 Multi-layer Scoring System

  • Technical score (0–1) → aggregation layer
  • Risk score (0–100) → decision layer

💡 Integrated Suggestion Engine

  • Automatically generates recommendations based on model behavior

🧩 Plugin System

  • Clean evaluator interface
  • Dependency-aware plugins
  • Easily extensible evaluation pipeline

⚡ QUICK START

🧠 One-liner (recommended)

import ai_critic

report = ai_critic.evaluate(model, X, y)

print(report["risk"])
print(report["summary"])

🔐 Production usage (recommended)

from ai_critic import evaluate
from ai_critic.gate import enforce

report = evaluate(model, X, y)

# 🚫 blocks bad models
enforce(report, threshold=75)

🧪 Full control (advanced)

from api.client import AICritic
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

model = RandomForestClassifier().fit(X, y)

critic = AICritic(weights={
    "performance": 1.0,
    "robustness": 1.5
})

report = critic.evaluate(model, X, y, parallel=True)

🧩 INTERNAL PIPELINE

evaluate()
   ↓
EvaluationGraph (nodes)
   ↓
raw_results
   ↓
ScoreAggregator (0–1)
   ↓
build_report()
   ↓
scoring.py (risk 0–100)
   ↓
summary.py (human-readable)
   ↓
SuggestionEngine

🧱 CORE COMPONENTS

1. Evaluation Graph

A DAG-based execution system:

  • Automatically resolves dependencies
  • Executes nodes in correct order
  • Enables parallel execution

Example:

performance → robustness → explainability
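The ordering behaviour described above can be sketched with Python's standard `graphlib` module. This is not the library's `EvaluationGraph`, only an illustration of dependency-aware scheduling. The `data_quality` node and its lack of dependencies are assumptions added to show that independent nodes can share a batch.

```python
from graphlib import CycleError, TopologicalSorter

# Each node maps to the set of nodes it depends on.
dependencies = {
    "performance": set(),
    "robustness": {"performance"},
    "explainability": {"robustness"},
    "data_quality": set(),  # hypothetical independent node
}


def execution_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into batches; nodes in one batch have no pending
    dependencies and could be run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for determinism
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(dependencies)
for index, batch in enumerate(batches):
    print(index, batch)
```

Sorting each batch keeps the order deterministic across runs, and a cyclic dependency is rejected up front instead of stalling the pipeline.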

2. Score Aggregator

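Combines evaluator outputs into a single technical score (0–1). Per-evaluator weights control how much each evaluator contributes: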

critic = AICritic(weights={
    "performance": 1.0,
    "robustness": 2.0
})

3. Evaluator Plugins

Fully extensible via plugins:

from ai_critic.plugins.base import EvaluatorPlugin
from ai_critic.plugins.registry import EvaluatorRegistry

class FairnessEvaluator(EvaluatorPlugin):
    name = "fairness"
    dependencies = ["performance"]
    weight = 1.0

    def evaluate(self, model, dataset, context=None):
        return {
            "score": 0.92,
            "verdict": "stable",
            "message": "Fairness is acceptable"
        }

EvaluatorRegistry.register(FairnessEvaluator())
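Before registering a plugin, it can help to sanity-check what its `evaluate` method returns. The sketch below runs without the library installed: `StubFairnessEvaluator` is a stand-in for the plugin above, and `validate_result` is a hypothetical helper. It assumes that `score`, `verdict`, and `message` are the expected keys and that `score` lies in the 0–1 range used for technical scores.

```python
EXPECTED_KEYS = ("score", "verdict", "message")  # assumed result shape


def validate_result(result: dict) -> dict:
    """Check an evaluator result against the assumed shape."""
    for key in EXPECTED_KEYS:
        if key not in result:
            raise KeyError(f"result is missing '{key}'")
    score = result["score"]
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise TypeError("score must be a number")
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside [0, 1]")
    return result


class StubFairnessEvaluator:
    """Stand-in with the same attributes as the plugin above."""
    name = "fairness"
    dependencies = ["performance"]
    weight = 1.0

    def evaluate(self, model, dataset, context=None):
        return {
            "score": 0.92,
            "verdict": "stable",
            "message": "Fairness is acceptable",
        }


checked = validate_result(StubFairnessEvaluator().evaluate(None, None))
print(checked["verdict"])
```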

4. Risk Scoring (0–100)

Transforms technical signals into decision-ready output:

report["risk"] = {
    "global_score": 78.5,
    "verdict": "usable_with_caution",
    "component_scores": {...},
    "penalties": [...]
}

5. Human Summary

High-level interpretation:

report["summary"] = {
    "executive_summary": {
        "verdict": "⚠️ Risky",
        "deploy_recommended": False
    }
}

6. Suggestion Engine

Actionable insights:

[
    "Check for data leakage",
    "Improve robustness with regularization"
]

🖥️ CLI

Run directly from terminal:

ai-critic --model model.pkl --data dataset.csv --target label

🔥 CI/CD Mode (recommended)

ai-critic --model model.pkl --data dataset.csv --target label --fail-on-risk

👉 Fails automatically if model risk is too high.


🧠 DESIGN PHILOSOPHY

1. Single Source of Truth

One unified data format → no inconsistencies


2. Graph-first Thinking

Evaluation is a dependency-driven pipeline, not isolated functions


3. JSON-native

Everything is ready for:

  • APIs
  • dashboards
  • logging
  • SaaS platforms

4. Actionable AI

Not just metrics — decisions:

  • Should you deploy?
  • Where is the risk?
  • What should be improved?

🔥 POSITIONING

AI Critic is not just a metrics library.

It is a:

🧠 Production gatekeeper for machine learning models


🚀 ROADMAP

  • REST API (/evaluate)
  • Visual dashboard
  • Model monitoring (post-deployment)
  • Continuous evaluation (CI/CD)
  • Global benchmarking between models

📄 LICENSE

MIT License
