
Fast AI evaluator for scikit-learn models

Project description

ai-critic 3.0.0

pip install ai-critic

Latest version Released: 2026

AI Critic — Evaluation Graph Engine for ML models.



Verified details

Maintainer Luiz Filipe Seabra de Marco


Unverified details

  • License: MIT License (MIT)
  • Author: Luiz Filipe Seabra de Marco
  • Tags: machine learning, model evaluation, ml validation, robustness, explainability, cross validation, ai audit, ml scoring, evaluation engine
  • Requires: Python >=3.8
  • Provides-Extra: dev

Classifiers

  • Development Status: 5 - Production/Stable
  • Intended Audience: Developers, Science/Research
  • License: OSI Approved :: MIT License
  • Operating System: OS Independent
  • Programming Language: Python :: 3, Python :: 3.8, Python :: 3.9, Python :: 3.10, Python :: 3.11
  • Topic: Software Development :: Libraries, Scientific/Engineering :: Artificial Intelligence



AI Critic: The Evaluation Graph Engine for Machine Learning

AI Critic is a modular, graph-based evaluation engine designed to analyze machine learning models in a structured, extensible, and deterministic way.

Instead of providing isolated metrics, AI Critic executes an Evaluation Graph composed of independent evaluation nodes. Each node analyzes one dimension of model quality — such as performance, robustness, or explainability — and produces standardized outputs.

The final result is an aggregated score with a clear verdict.

In summary:

You provide a model, data (X, y), and AI Critic executes a structured evaluation pipeline that produces:

  • Cross-validation diagnostics
  • Robustness under noise
  • Feature sensitivity analysis
  • Overall quality score
  • Clear deployment verdict

No telemetry. No black-box ML meta-model. No overengineering.

Just deterministic evaluation architecture.


🧠 Evaluation Graph Architecture

AI Critic 3.0 introduces the Evaluation Graph Engine.

Each evaluator is a node:

  • PerformanceEvaluator
  • RobustnessEvaluator
  • ExplainabilityEvaluator

Nodes:

  • Are independent
  • Can declare dependencies
  • Produce standardized output
  • Return a normalized score

The graph executes them sequentially and aggregates results.

Because each node is self-contained, this architecture leaves room for:

  • Future plugin system
  • Custom evaluation nodes
  • Parallel execution
  • Enterprise-level extensibility
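
As a rough illustration of the node and graph idea described above, the following standalone sketch runs nodes one at a time, each after the nodes it depends on. It does not use the ai_critic package: `Node`, `Performance`, `Robustness`, `run_graph`, and the `context["results"]` key are all hypothetical names chosen for this example, and the real engine may order and pass results differently.

```python
from typing import Any, Dict, List


class Node:
    """Hypothetical stand-in for an evaluation node (not the ai_critic class)."""

    name: str = "node"
    dependencies: List[str] = []

    def evaluate(self, context: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError


class Performance(Node):
    name = "performance"

    def evaluate(self, context):
        # Fixed placeholder result; a real node would measure the model.
        return {"score": 0.9, "verdict": "good"}


class Robustness(Node):
    name = "robustness"
    dependencies = ["performance"]  # needs the baseline score first

    def evaluate(self, context):
        baseline = context["results"]["performance"]["score"]
        return {"score": round(baseline - 0.1, 2), "verdict": "stable"}


def run_graph(nodes: List[Node], context: Dict[str, Any]) -> Dict[str, Any]:
    """Run nodes sequentially, each one after all of its dependencies."""
    by_name = {n.name: n for n in nodes}
    results: Dict[str, Any] = {}
    context = {**context, "results": results}
    visiting: set = set()

    def visit(name: str) -> None:
        if name in results:
            return  # already evaluated
        if name in visiting:
            raise ValueError(f"dependency cycle at {name!r}")
        visiting.add(name)
        for dep in by_name[name].dependencies:
            visit(dep)
        results[name] = by_name[name].evaluate(context)
        visiting.discard(name)

    for node in nodes:
        visit(node.name)
    return results


# Nodes are listed out of order on purpose; dependencies decide the order.
results = run_graph([Robustness(), Performance()], context={})
print(list(results))
```

Resolving order from declared dependencies, rather than from list position, is one way a graph can stay correct when nodes are added or rearranged.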

🚀 Key Features

📊 Cross-Validation Intelligence

Automatically detects whether the task is classification or regression and selects the matching CV strategy.

  • StratifiedKFold for classification
  • KFold for regression
  • Detects suspiciously perfect scores
  • Reports validation strategy used
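
The strategy selection listed above can be approximated directly with scikit-learn. This is a minimal sketch, not AI Critic's implementation: it assumes the task type is inferred from the estimator via `sklearn.base.is_classifier`, and the function name `cross_validate_model`, the five folds, and the `perfect_threshold` of 0.999 are illustrative choices.

```python
from sklearn.base import is_classifier
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score


def cross_validate_model(model, X, y, n_splits=5, perfect_threshold=0.999):
    """Pick a CV splitter from the estimator type and summarize the scores."""
    if is_classifier(model):
        # Stratification keeps class proportions similar in every fold.
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    else:
        cv = KFold(n_splits=n_splits, shuffle=True, random_state=0)

    scores = cross_val_score(model, X, y, cv=cv)
    return {
        "strategy": type(cv).__name__,
        "mean": float(scores.mean()),
        "std": float(scores.std()),
        # A near-perfect mean is flagged for review (assumed threshold).
        "suspiciously_perfect": bool(scores.mean() >= perfect_threshold),
    }


X, y = load_iris(return_X_y=True)
clf_report = cross_validate_model(LogisticRegression(max_iter=1000), X, y)
reg_report = cross_validate_model(LinearRegression(), X, y.astype(float))
print(clf_report["strategy"], reg_report["strategy"])
```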

🛡 Robustness Under Noise

Tests model stability by injecting controlled Gaussian noise.

  • Measures performance degradation
  • Classifies model as stable or fragile
  • Converts robustness drop into normalized score
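
A noise-injection check of this kind could look like the sketch below. It is written under stated assumptions rather than taken from the package: the noise is scaled per feature by that feature's standard deviation, the `stable_drop` cutoff of 0.05 is arbitrary, and the normalized score is simply `1 - drop` clipped at zero. The function name `noise_robustness` is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


def noise_robustness(model, X, y, noise_level=0.1, stable_drop=0.05, seed=0):
    """Compare a fitted model's score on clean vs. noise-perturbed inputs."""
    rng = np.random.default_rng(seed)  # fixed seed keeps the test repeatable
    clean = model.score(X, y)

    # Gaussian noise, scaled per feature by that feature's standard deviation.
    noise = rng.normal(0.0, noise_level * X.std(axis=0), size=X.shape)
    noisy = model.score(X + noise, y)

    drop = max(0.0, clean - noisy)
    return {
        "clean_score": float(clean),
        "noisy_score": float(noisy),
        "drop": float(drop),
        "verdict": "stable" if drop <= stable_drop else "fragile",
        "score": float(max(0.0, 1.0 - drop)),  # assumed normalization
    }


X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)
robustness = noise_robustness(model, X, y)
print(robustness["verdict"], robustness["score"])
```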

🔍 Feature Sensitivity (Explainability Proxy)

Model-agnostic permutation analysis:

  • Measures performance drop per feature
  • Detects shortcut learning
  • Flags potential leakage risk
  • Produces explainability score
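
Permutation analysis can be sketched with NumPy and any fitted estimator that exposes `score`. The example below is illustrative only: the name `feature_sensitivity` and the `dominance_threshold` of 0.8 (the share of the total drop above which a single feature is treated as a possible shortcut) are assumptions, not AI Critic's actual rule.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def feature_sensitivity(model, X, y, dominance_threshold=0.8, seed=0):
    """Shuffle one feature at a time and record how much the score drops."""
    rng = np.random.default_rng(seed)
    baseline = model.score(X, y)

    drops = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        # Shuffling column j breaks its relationship with the target.
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        drops.append(max(0.0, baseline - model.score(X_perm, y)))

    drops = np.asarray(drops)
    total = drops.sum()
    share = drops / total if total > 0 else np.zeros_like(drops)
    return {
        "drops": drops.tolist(),
        "dominant_feature": int(share.argmax()),
        # One feature carrying most of the drop hints at a shortcut or leak.
        "shortcut_suspected": bool(share.max() > dominance_threshold),
    }


X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)
sensitivity = feature_sensitivity(model, X, y)
print(sensitivity["dominant_feature"], sensitivity["shortcut_suspected"])
```

scikit-learn also ships `sklearn.inspection.permutation_importance`, which repeats the shuffle several times per feature and reports the mean and spread.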

🎯 Unified Scoring System

All evaluators produce:

{
  "score": float,
  "verdict": str,
  ...
}

The ScoreAggregator computes:

  • Overall score (0–1)

  • Final verdict:

    • excellent
    • good
    • moderate
    • poor
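
How the overall score and verdict are derived is not spelled out here, so the following is only a sketch of one plausible aggregation: an unweighted mean of the per-evaluator scores mapped to a verdict through cutoffs. The function `aggregate` and the cutoffs 0.9, 0.75, and 0.5 are assumptions made for illustration.

```python
from statistics import fmean


def aggregate(details, thresholds=((0.9, "excellent"), (0.75, "good"), (0.5, "moderate"))):
    """Average per-evaluator scores and map the result to a verdict."""
    overall = fmean(d["score"] for d in details.values())

    verdict = "poor"  # fallback when no cutoff is reached
    for cutoff, label in thresholds:  # cutoffs are checked from highest to lowest
        if overall >= cutoff:
            verdict = label
            break
    return {"overall": round(overall, 3), "verdict": verdict}


summary = aggregate({
    "performance": {"score": 0.9},
    "robustness": {"score": 0.8},
    "explainability": {"score": 0.7},
})
print(summary)
```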

🧩 Modular Graph Engine

The core engine allows:

  • Adding custom evaluation nodes
  • Replacing scoring strategies
  • Integrating into CI pipelines
  • Embedding inside ML platforms

💡 Quick Start

Basic Usage

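from ai_critic import AICritic
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Load a small example dataset
data = load_iris()
X, y = data.data, data.target

# Train the model to be evaluated; a fixed seed keeps runs reproducible
model = RandomForestClassifier(random_state=42)
model.fit(X, y)

# Run the evaluation graph on the fitted model
critic = AICritic()
report = critic.evaluate(model, X, y)

# Overall score and verdict
print(report["scores"])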

Example output (exact values depend on the model and data):

{
  "overall": 0.87,
  "verdict": "good"
}

📈 Detailed Output Structure

{
  "scores": {
      "overall": 0.83,
      "verdict": "good"
  },
  "details": {
      "performance": {...},
      "robustness": {...},
      "explainability": {...}
  }
}

Each evaluator returns structured diagnostic metadata.
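
Because the report is a plain nested dictionary, it is straightforward to turn into a pass/fail gate, for example in a CI job. The sketch below assumes only the `scores` / `overall` layout shown above; the `gate` function and the `min_score` of 0.75 are hypothetical, and the report is built from a JSON string here so the example runs without the package installed.

```python
import json


def gate(report, min_score=0.75):
    """Return a process exit code: 0 to pass the pipeline, 1 to fail it."""
    scores = report["scores"]
    return 0 if scores["overall"] >= min_score else 1


# Stand-in for a real report, using the structure shown above.
report = json.loads(
    '{"scores": {"overall": 0.83, "verdict": "good"}, "details": {}}'
)
exit_code = gate(report)
print(exit_code)
```

In a pipeline, the returned value could be passed to `sys.exit` so that a low score fails the build.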


🖥 CLI Usage

ai-critic --model model.pkl --data dataset.csv --target label

Output:

=== AI CRITIC REPORT ===

Overall score: 0.812
Verdict: good

JSON mode:

ai-critic --model model.pkl --data dataset.csv --target label --json

🧪 Evaluation Dimensions

1️⃣ Performance

  • Cross-validation mean score
  • Standard deviation
  • Suspiciously perfect detection

2️⃣ Robustness

  • Noise injection test
  • Performance drop calculation
  • Stability classification

3️⃣ Explainability

  • Feature permutation sensitivity
  • Shortcut detection
  • Leakage risk signal

⚙️ Installation

pip install ai-critic

Dependencies:

  • scikit-learn
  • numpy
  • matplotlib (optional for visualization)

🏗 Extending AI Critic

You can create custom nodes:

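from ai_critic.core.node import EvaluationNode

class FairnessEvaluator(EvaluationNode):

    name = "fairness"      # key under which this node's result is reported
    dependencies = []      # this node does not depend on any other node

    def evaluate(self, context):
        # Placeholder result; a real node would compute these from `context`
        return {
            "score": 0.9,
            "verdict": "acceptable"
        }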

Then inject into the graph:

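# Assumes EvaluationGraph and the built-in evaluators have been imported
# from ai_critic, and that `critic` is the AICritic instance from Quick Start.
critic.graph = EvaluationGraph([
    PerformanceEvaluator(),
    RobustnessEvaluator(),
    ExplainabilityEvaluator(),
    FairnessEvaluator()
])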

🎯 Design Philosophy

AI Critic is built on three principles:

  1. Deterministic evaluation
  2. Structural modularity
  3. No hidden learning layer

It is not an AutoML system. It is not a model trainer.

It is an evaluation engine.


📄 License

Distributed under the MIT License. See LICENSE for more information.


