Fast AI evaluator for scikit-learn models

Project description

ai-critic 🧠: The Quality Gate for Machine Learning Models

ai-critic is a specialized decision-making tool designed to audit the reliability and readiness for deployment of scikit-learn compatible Machine Learning models.

Instead of just measuring performance (accuracy, F1 score), ai-critic acts as a "Quality Gate," probing the model for hidden risks that can lead to production failures, such as data leakage, structural overfitting, and vulnerability to noise.

🚀 1. Getting Started (The Basics)

This section is ideal for beginners who need a quick verdict on the health of their model.

1.1. Installation

Install the library directly from PyPI:

pip install ai-critic

1.2. The Quick Verdict

With just a few lines, you can get an executive evaluation and a deployment recommendation.

```python
from ai_critic import AICritic
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# 1. Prepare your data and model
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
model = RandomForestClassifier(max_depth=5, random_state=42)

# 2. Initialize the critic
# AICritic performs all audits internally
critic = AICritic(model, X, y)

# 3. Obtain the Executive Summary
report = critic.evaluate(view="executive")

print(f"Verdict: {report['verdict']}")
print(f"Risk: {report['risk_level']}")
print(f"Main Reason: {report['main_reason']}")

# Expected Output:
# Verdict: ✅ Acceptable
# Risk: Low
# Main Reason: No critic risks detected.
```

💡 2. Understanding the Critique (The Intermediate)

This section is for data scientists who need to understand why the model received its verdict and what the next steps are.

2.1. The Four Pillars of the Audit

ai-critic evaluates your model across four critical dimensions.

| Category | Main Risk | Code Module |
|---|---|---|
| 📈 Validation | Suspicious CV Scores | `ai_critic.performance` |
| 🧪 Robustness | Noise Vulnerability | `ai_critic.robustness` |

2.2. Visual and Technical Analysis

The `evaluate` method lets you plot the results and access the complete technical report.

```python
# Continuing the previous example...

# 1. Generate the full report and visualizations
# plot=True generates Correlation, Learning Curve, and Robustness graphs
full_report = critic.evaluate(view="all", plot=True)

# 2. Access the Technical Summary for Recommendations
technical_summary = full_report["technical"]

print("\n--- Technical Recommendations ---")
for i, risk in enumerate(technical_summary["key_risks"]):
    print(f"Risk {i+1}: {risk}")
    print(f"Recommendation: {technical_summary['recommendations'][i]}")

# Example of Risk (if there were one):
# Risk 1: The depth of the tree may be too high for the size of the dataset.
# Recommendation: Reduce model complexity or adjust hyperparameters.
```
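A learning curve like the one `plot=True` is said to generate can also be computed directly with scikit-learn's `learning_curve`. The sketch below only illustrates the idea and is not how ai-critic builds its plot. The helper name `train_validation_gap` and the choice of five training sizes are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve


def train_validation_gap(model, X, y, cv=5):
    """Return train/validation accuracy and their gap at the largest training size."""
    sizes, train_scores, val_scores = learning_curve(
        model, X, y, cv=cv, train_sizes=np.linspace(0.2, 1.0, 5)
    )
    # Scores have shape (n_sizes, n_folds); average over the folds.
    train_mean = train_scores.mean(axis=1)
    val_mean = val_scores.mean(axis=1)
    return {
        "train_sizes": sizes.tolist(),
        "train_score": float(train_mean[-1]),
        "validation_score": float(val_mean[-1]),
        "gap": float(train_mean[-1] - val_mean[-1]),
    }


X, y = make_classification(n_samples=500, n_features=20, random_state=42)
model = RandomForestClassifier(max_depth=5, random_state=42)
result = train_validation_gap(model, X, y)

print(f"Train score: {result['train_score']:.4f}")
print(f"Validation score: {result['validation_score']:.4f}")
print(f"Gap: {result['gap']:.4f}")
```

A large gap that does not shrink as the training size grows is a common sign of overfitting, which is the kind of risk the recommendation to reduce model complexity addresses.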


2.3. Robustness Test

A robust model should maintain its performance even with small disturbances in the data. The `ai-critic` test assesses this by injecting noise into the input data.
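The noise-injection idea can be sketched with scikit-learn alone. This is a minimal illustration, not ai-critic's implementation. The function name `noise_robustness`, the Gaussian noise model, and the `noise_scale` default are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def noise_robustness(model, X, y, noise_scale=0.1, cv=5, seed=0):
    """Compare mean CV accuracy on clean inputs and on inputs with Gaussian noise."""
    rng = np.random.default_rng(seed)
    # Scale the noise by each feature's standard deviation so that
    # every feature is perturbed proportionally (an assumed design choice).
    noise = rng.normal(0.0, noise_scale, size=X.shape) * X.std(axis=0)
    clean = float(cross_val_score(model, X, y, cv=cv).mean())
    noisy = float(cross_val_score(model, X + noise, y, cv=cv).mean())
    return {"clean": clean, "noisy": noisy, "drop": clean - noisy}


X, y = make_classification(n_samples=500, n_features=20, random_state=42)
model = RandomForestClassifier(max_depth=5, random_state=42)
result = noise_robustness(model, X, y)

print(f"Original CV Score: {result['clean']:.4f}")
print(f"CV Score with Noise: {result['noisy']:.4f}")
print(f"Performance Drop: {result['drop']:.4f}")
```

Note that this sketch perturbs both the training and validation folds. A stricter variant would add noise only to the validation data.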

```python
# Accessing the specific result of the Robustness module
robustness_result = full_report["details"]["robustness"]

print("\n--- Robustness Test ---")
print(f"Original CV Score: {robustness_result['cv_score_original']:.4f}")
print(f"CV Score with Noise: {robustness_result['cv_score_noisy']:.4f}")
print(f"Performance Drop: {robustness_result['performance_drop']:.4f}")
print(f"Robustness Verdict: {robustness_result['verdict']}")

# Possible Verdicts:
# - Stable: Acceptable drop.
# - Fragile: Significant drop (risk).
# - Misleading: Original performance inflated by leakage.
```

⚙️ 3. Integration and Governance (The Advanced)

This section is for MLOps engineers and architects looking to integrate ai-critic into automated pipelines and create custom deployment logic.

3.1. The Deployment Gate (deploy_decision)

The deploy_decision() method is the final control point. It returns a structured object that classifies problems into Hard Blockers (prevent deployment) and Soft Blockers (require attention, but can be accepted with reservations).

```python
# Example of use in a CI/CD pipeline
decision = critic.deploy_decision()

if decision["deploy"]:
    print("✅ Deployment Approved. Risk Level: Low.")
else:
    print(f"❌ Deployment Blocked. Risk Level: {decision['risk_level'].upper()}")
    print("Blocking Issues:")
    for issue in decision["blocking_issues"]:
        print(f"- {issue}")

# The decision object also includes a heuristic confidence score (0.0 to 1.0)
print(f"Heuristic Confidence in Model: {decision['confidence']:.2f}")
```


3.2. Access

For custom *governance* rules or logic, you can access the raw data of each module through the `"details"` view.

```python
# Fetch the raw per-module details once and reuse them below
details = critic.evaluate(view="details")

# Accessing Data Leakage Details
data_details = details["data"]

if data_details["data_leakage"]["suspected"]:
    print("\n--- Data Leak Alert ---")
    for detail in data_details["data_leakage"]["details"]:
        print(f"Feature {detail['feature_index']} with correlation of {detail['correlation']:.4f}")

# Accessing Structural Overfitting Details
config_details = details["config"]

if config_details["structural_warnings"]:
    print("\n--- Structural Alert ---")
    for warning in config_details["structural_warnings"]:
        print(f"Warning: {warning['message']} (Max Depth: {warning['max_depth']}, Recommended: {warning['recommended_max_depth']})")
```
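The `feature_index` and `correlation` fields in the leakage details suggest a correlation-based check. As a rough sketch of that idea (an assumption about the approach, not ai-critic's actual logic), the following flags features whose absolute Pearson correlation with the target is suspiciously high. The function name `suspicious_features` and the `0.95` threshold are hypothetical.

```python
import numpy as np


def suspicious_features(X, y, threshold=0.95):
    """List features whose absolute Pearson correlation with y exceeds threshold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    findings = []
    if y.std() == 0.0:
        return findings  # correlation is undefined for a constant target
    for index in range(X.shape[1]):
        column = X[:, index]
        if column.std() == 0.0:
            continue  # correlation is undefined for a constant column
        correlation = float(np.corrcoef(column, y)[0, 1])
        if abs(correlation) > threshold:
            findings.append({"feature_index": index, "correlation": correlation})
    return findings


# Synthetic demo data: column 1 is a near copy of the target, so it "leaks" it.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 3))
X[:, 1] = y + rng.normal(scale=0.01, size=200)

findings = suspicious_features(X, y)
for finding in findings:
    print(f"Feature {finding['feature_index']} with correlation of {finding['correlation']:.4f}")
```

Only the deliberately leaked column should be reported here. On real data, a flagged feature is a prompt to check how that column was produced, not proof of leakage.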

3.3. Best Practices and Use Cases

| Use Case | Recommended Action |
|---|---|
| CI/CD | Use `deploy_decision()` as an automated quality gate. |
| Tuning | Use the technical view to guide hyperparameter optimization. |
| Governance | Log the details view for auditing and compliance. |
| Communication | Use the executive view to report risks to non-technical stakeholders. |
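For the CI/CD use case, one way to wire the gate into a pipeline is to map the decision to a process exit code. The sketch below assumes the decision is a dict with the `deploy`, `risk_level`, `blocking_issues`, and `confidence` keys shown in section 3.1. The helper `gate_exit_code`, its `min_confidence` parameter, and the sample values are hypothetical and not part of ai-critic.

```python
def gate_exit_code(decision, min_confidence=0.0):
    """Map a deploy_decision()-style dict to an exit code (0 = pass, 1 = fail)."""
    approved = bool(decision.get("deploy", False))
    confidence = float(decision.get("confidence", 0.0))
    confident = confidence >= min_confidence
    if approved and confident:
        print("Deployment approved.")
        return 0
    risk = str(decision.get("risk_level", "unknown")).upper()
    print(f"Deployment blocked. Risk level: {risk}")
    for issue in decision.get("blocking_issues", []):
        print(f"- {issue}")
    if approved and not confident:
        print(f"- Confidence {confidence:.2f} is below the required {min_confidence:.2f}")
    return 1


# Stand-in for critic.deploy_decision(); the values are made up for this demo.
sample_decision = {
    "deploy": False,
    "risk_level": "high",
    "blocking_issues": ["Suspected data leakage"],
    "confidence": 0.35,
}

exit_code = gate_exit_code(sample_decision)
print(f"Exit code: {exit_code}")
```

In a real pipeline script, the result would be passed to `sys.exit()` so that a non-zero code fails the job.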

📄 License

Distributed under the MIT License.

🧠 Final Note

ai-critic is not a benchmarking tool. It's a decision-making tool.

If a model fails here, it doesn't mean it's "bad," but rather that it shouldn't be trusted yet. The goal is to inject the necessary skepticism to build truly robust AI systems.

Project details

Release history

| Version | Release date |
|---|---|
| 3.5.1 | May 6, 2026 |
| 3.5.0 | Apr 18, 2026 |
| 3.4.6 | Apr 14, 2026 |
| 3.4.5 | Apr 5, 2026 |
| 3.4.1 | Apr 5, 2026 |
| 3.3.0 | Mar 22, 2026 |
| 3.2.0 | Mar 16, 2026 |
| 3.0.0 | Feb 15, 2026 |
| 2.1.0 | Feb 9, 2026 |
| 2.0.0 | Feb 4, 2026 |
| 1.2.0 | Jan 29, 2026 |
| 1.1.0 | Jan 27, 2026 |
| 1.0.0 | Jan 25, 2026 |
| 0.2.5 (this version) | Jan 25, 2026 |
| 0.2.4 | Jan 23, 2026 |
| 0.2.3 | Jan 23, 2026 |
| 0.2.2 | Jan 22, 2026 |
| 0.2.1 | Jan 19, 2026 |
| 0.2.0 | Jan 18, 2026 |
| 0.1.0 | Jan 18, 2026 |

Download files

Source Distribution

ai_critic-0.2.5.tar.gz (12.3 kB)

Uploaded Jan 25, 2026 Source

Built Distribution

ai_critic-0.2.5-py3-none-any.whl (11.7 kB)

Uploaded Jan 25, 2026 Python 3

File details

Details for the file ai_critic-0.2.5.tar.gz.

File metadata

Download URL: ai_critic-0.2.5.tar.gz
Upload date: Jan 25, 2026
Size: 12.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for ai_critic-0.2.5.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2bebb9fcb951d325aaa882592c733e2d19f7c4ff412b578be3a1f5aeefff6626` |
| MD5 | `f91a6d140f06060c2b430365fb932670` |
| BLAKE2b-256 | `ad05d2ce1a562539af2b5d420a2b99256d44c2cbbb6b7a2118aaca46bc6650d4` |

File details

Details for the file ai_critic-0.2.5-py3-none-any.whl.

File metadata

Download URL: ai_critic-0.2.5-py3-none-any.whl
Upload date: Jan 25, 2026
Size: 11.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for ai_critic-0.2.5-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | `35a5b49ed7a683b29d442232e41b5c2a1251800032522b5d31b032d5e80b89d4` |
| MD5 | `969f2a478e13232733c4564130df5cdd` |
| BLAKE2b-256 | `d6c101fc421fd7c7c9303c532a7853cc3541ccf2f1fc62347108c07a911e5b28` |
