Universal auditor model that suppresses unreliable AI predictions in human-AI systems. Works with sklearn, PyTorch, HuggingFace, OpenAI, Anthropic, and any custom model.

AuditorAI

Stop your AI model from showing wrong predictions to users.

The problem: Your AI model gets predictions wrong sometimes, and those wrong predictions reach your users — causing bad decisions, lost trust, and real harm.

The solution: AuditorAI adds a second model (the "auditor") that learns when your primary model is likely wrong and suppresses those predictions before they reach the user. The suppressed cases get routed to a human instead.

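The result: Predictions the auditor scores as unreliable are held back and handled by humans, so users see only the predictions it scores as reliable. When the auditor separates errors from correct predictions well, this improves overall system accuracy.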

Quick numbers

Dataset	Primary model	System	Auditor AUROC	Flag rate
Breast Cancer	RandomForest	+auditor	0.93	2%
Wine	GradientBoosting	+auditor	0.75	3%
Digits	LogisticRegression	+auditor	0.93	4%

In these runs the auditor separates wrong predictions from correct ones with an AUROC between 0.75 and 0.93, while flagging only 2–4% of cases for human review.

Install

pip install auditorai

What you need	Command
Base (sklearn models)	`pip install auditorai`
PyTorch models	`pip install "auditorai[pytorch]"`
HuggingFace models	`pip install "auditorai[hf]"`
OpenAI models	`pip install "auditorai[openai]"`
Anthropic models	`pip install "auditorai[anthropic]"`
Everything	`pip install "auditorai[all]"`

Quickstart

With a sklearn model

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from auditorai import AuditorSystem, wrap

# 1. Load data and train your model as usual
X, y = load_breast_cancer(return_X_y=True)
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.25, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# 2. Wrap your model and train the auditor (3 lines)
system = AuditorSystem(wrap(model))
system.train(X_val, y_val)       # uses held-out data
system.auto_tune(X_val, y_val)   # finds best threshold

# 3. Get audited predictions
result = system.predict(X_test)
print(result["show_mask"])       # True = safe to show
print(result["suppress_mask"])   # True = let human decide
print(result["p_wrong"])         # estimated probability that each prediction is wrong

With a PyTorch model

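# Requires: pip install "auditorai[pytorch]"
from auditorai import AuditorSystem, wrap

# your_torch_model is your trained nn.Module; X_val, y_val, X_test are your own held-out data
adapter = wrap(your_torch_model,
               adapter_type="pytorch",
               n_classes=3)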
system = AuditorSystem(adapter)
system.train(X_val, y_val)
result = system.predict(X_test)
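
An `nn.Module` classifier usually emits raw logits, while the auditor is described as working on probability outputs. How `PyTorchAdapter` converts them is not stated here; the sketch below shows the standard softmax conversion in plain Python, as an assumption about what such an adapter has to do. The `softmax` helper is illustrative and not part of AuditorAI.

```python
import math

def softmax(logits):
    """Turn one row of raw logits into probabilities that sum to 1."""
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 3-class model (n_classes=3, as in the snippet above).
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
```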

With an OpenAI model

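# Requires: pip install "auditorai[openai]"
from auditorai import AuditorSystem, wrap

def parse_response(text):
    # Parse the model's text output into (class_index, confidence).
    # 0.85 is a placeholder; replace it with a confidence derived from the response.
    return int(text.strip()), 0.85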

adapter = wrap("gpt-4o-mini",
               adapter_type="openai",
               parse_response=parse_response,
               n_classes=2)
system = AuditorSystem(adapter)
system.train(texts_val, y_val)
result = system.predict(texts_test)
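
A `parse_response` that returns a fixed confidence gives the auditor no signal to learn from. The sketch below reads both values from the reply, assuming you prompt the model to answer in a form like `label=1 confidence=0.92`. That reply format, the regular expression, and the fallback confidence are all hypothetical choices, not something AuditorAI prescribes.

```python
import re

# Assumed reply format (hypothetical): "label=<int> confidence=<float in [0, 1]>"
_REPLY = re.compile(r"label\s*=\s*(\d+).*?confidence\s*=\s*([01](?:\.\d+)?)", re.S | re.I)

def parse_response(text, default_confidence=0.5):
    """Return (class_index, confidence) parsed from a model reply."""
    match = _REPLY.search(text)
    if match:
        confidence = min(max(float(match.group(2)), 0.0), 1.0)  # clamp to [0, 1]
        return int(match.group(1)), confidence
    # Fall back to a bare integer reply with a neutral, made-up confidence.
    return int(text.strip()), default_confidence

print(parse_response("label=1 confidence=0.92"))  # (1, 0.92)
print(parse_response(" 0 "))                      # (0, 0.5)
```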

How it works

Step 1:  Your model makes a prediction
            ↓
Step 2:  AuditorAI scores it (0 = confident, 1 = likely wrong)
            ↓
Step 3:  Score ≥ threshold? → Suppress (human decides)
         Score < threshold? → Show prediction to user
            ↓
Step 4:  Only confident predictions reach your users
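
The routing rule in Step 3 can be sketched in a few lines. This is a minimal illustration of the threshold comparison described above, not AuditorAI's actual router; the `route` function and the scores are made up.

```python
def route(p_wrong, threshold):
    """Split predictions into shown vs. suppressed by auditor score."""
    suppress_mask = [score >= threshold for score in p_wrong]
    show_mask = [not s for s in suppress_mask]
    return show_mask, suppress_mask

# Made-up auditor scores for four predictions.
show, suppress = route([0.05, 0.40, 0.72, 0.10], threshold=0.40)
print(show)      # [True, False, False, True]
print(suppress)  # [False, True, True, False]
```

A score exactly equal to the threshold is suppressed, matching the "Score ≥ threshold" rule.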

Key insight: The auditor trains on held-out validation data — data your primary model has never seen during training. This means it learns your model's real-world failure patterns, not memorized ones. The auditor uses uncertainty signals like confidence, entropy, and margin between top predictions to detect when your model is likely wrong.
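
To make the three signals concrete, here is a sketch of how they could be computed from one row of class probabilities and paired with a "was the primary model wrong?" target. The function names and the exact feature set are assumptions for illustration; the source does not show the auditor's real feature pipeline.

```python
import math

def uncertainty_signals(probs):
    """Confidence, entropy and top-two margin for one probability row."""
    ranked = sorted(probs, reverse=True)
    confidence = ranked[0]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    margin = ranked[0] - ranked[1]
    return confidence, entropy, margin

def auditor_training_rows(proba_rows, predictions, y_true):
    """Pair each row's signals with a 0/1 target: 1 if the primary model was wrong."""
    features = [uncertainty_signals(row) for row in proba_rows]
    targets = [int(pred != truth) for pred, truth in zip(predictions, y_true)]
    return features, targets

# Made-up validation outputs: the second prediction is unsure and wrong.
proba = [[0.95, 0.05], [0.55, 0.45]]
features, targets = auditor_training_rows(proba, predictions=[0, 0], y_true=[0, 1])
print(targets)  # [0, 1]
```

In this toy data the wrong prediction has lower confidence, higher entropy, and a smaller margin, which is the pattern an auditor would learn to pick up.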

CLI

AuditorAI also runs from the command line, with no Python code required:

# Run on a built-in dataset
auditorai run --data breast_cancer --report

# Run on your own CSV (last column = label)
auditorai run --data mydata.csv --model-type gradient_boosting

# Sweep thresholds to find the best one
auditorai sweep --data breast_cancer --steps 20

# Validate a saved auditor on new data
auditorai validate --adapter-path outputs/models --data breast_cancer

# All options
auditorai run --help
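
The "last column = label" convention for CSV input can be illustrated as follows. This sketch assumes a header row and numeric feature columns; whether the CLI itself expects a header is not stated in this README, and `split_features_and_label` is a hypothetical helper, not the library's `load_any()`.

```python
import csv
import io

def split_features_and_label(csv_text, has_header=True):
    """Parse CSV text into feature rows and labels, taking the last column as the label."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if has_header:
        rows = rows[1:]
    X = [[float(v) for v in row[:-1]] for row in rows]
    y = [row[-1] for row in rows]
    return X, y

sample = "f1,f2,label\n1.0,2.5,0\n0.3,4.1,1\n"
X, y = split_features_and_label(sample)
print(X)  # [[1.0, 2.5], [0.3, 4.1]]
print(y)  # ['0', '1']
```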

What you get

1. Evaluation report (printed to terminal)

==================================================
  AUDITOR SYSTEM - EVALUATION REPORT
==================================================
  AI-only accuracy:         94.7%
  Joint system accuracy:    94.2%
  Auditor AUROC:            0.512
  Suppression rate:         1.8%
  Cases shown:              112
  Cases suppressed:         2
  Auditor precision:        50.0%
  Auditor recall:           16.7%
==================================================
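
The sketch below shows how four of the report's figures relate to the underlying masks. The counts are inferred from the percentages in the sample report (114 cases, 6 primary-model errors, 2 suppressed, 1 of them a real error) and are an assumption, as is the `auditor_metrics` helper. Joint accuracy is left out because it depends on how the human's decisions are scored, which the report does not show.

```python
def auditor_metrics(ai_correct, suppress_mask):
    """Compute AI-only accuracy, suppression rate, and auditor precision/recall."""
    n = len(ai_correct)
    suppressed = sum(suppress_mask)
    errors = sum(not c for c in ai_correct)
    caught = sum(s and not c for c, s in zip(ai_correct, suppress_mask))
    return {
        "ai_accuracy": sum(ai_correct) / n,
        "suppression_rate": suppressed / n,
        "precision": caught / suppressed if suppressed else 0.0,
        "recall": caught / errors if errors else 0.0,
    }

# Inferred counts (assumption): 6 errors then 108 correct predictions;
# one error and one correct prediction are suppressed.
ai_correct = [False] * 6 + [True] * 108
suppress_mask = [True] + [False] * 5 + [True] + [False] * 107
m = auditor_metrics(ai_correct, suppress_mask)
print({k: round(100 * v, 1) for k, v in m.items()})
# {'ai_accuracy': 94.7, 'suppression_rate': 1.8, 'precision': 50.0, 'recall': 16.7}
```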

2. Prediction dict (in Python)

result = system.predict(X_test)

# For each sample in X_test:
result["show_mask"]       # bool array — True means show this prediction
result["suppress_mask"]   # bool array — True means suppress this prediction
result["p_wrong"]         # float array — probability this prediction is wrong
result["ai_predictions"]  # the actual predicted class labels
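
One way to consume this dict in an application is sketched below. It uses a stand-in dict with plain lists and made-up values so it runs without a trained system; `apply_audit` and `review_queue` are hypothetical names, and what you do with suppressed cases is your own design choice.

```python
def apply_audit(result):
    """Return one entry per sample: the AI label if shown, None if it needs human review."""
    return [
        label if show else None
        for label, show in zip(result["ai_predictions"], result["show_mask"])
    ]

# Stand-in for the output of system.predict(X_test), with made-up values.
result = {
    "show_mask": [True, False, True],
    "suppress_mask": [False, True, False],
    "p_wrong": [0.03, 0.81, 0.12],
    "ai_predictions": [1, 0, 1],
}
final = apply_audit(result)
review_queue = [i for i, s in enumerate(result["suppress_mask"]) if s]
print(final)         # [1, None, 1]
print(review_queue)  # [1]
```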

3. Plots saved to `outputs/`

File	What it shows
`score_dist.png`	Auditor score distribution (correct vs. wrong predictions)
`threshold_sweep.png`	Accuracy gain vs. suppression threshold curve
`breakdown.png`	Shown vs. suppressed × correct vs. error breakdown

Supported models

Model type	Adapter	Extra install
Any sklearn model	`SklearnAdapter`	(included)
XGBoost / LightGBM	`SklearnAdapter`	(included)
PyTorch `nn.Module`	`PyTorchAdapter`	`pip install "auditorai[pytorch]"`
HuggingFace pipeline	`HuggingFaceAdapter`	`pip install "auditorai[hf]"`
OpenAI (GPT-4o etc)	`APIAdapter`	`pip install "auditorai[openai]"`
Anthropic (Claude)	`APIAdapter`	`pip install "auditorai[anthropic]"`
Any custom model	Subclass `ModelAdapter`	(included)

Custom adapter pattern

from auditorai import ModelAdapter, AuditorSystem
import numpy as np

class MyAdapter(ModelAdapter):
    def __init__(self, model):
        self.model = model

    def predict(self, X) -> np.ndarray:
        return self.model.my_predict(X)

    def predict_proba(self, X) -> np.ndarray:
        scores = self.model.my_scores(X)
        return np.column_stack([1 - scores, scores])  # must sum to 1.0

system = AuditorSystem(MyAdapter(my_model))
system.train(X_val, y_val)
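
Because a custom adapter's `predict_proba` must return rows that sum to 1.0, it can help to check its output before training the auditor. The helper below is a hypothetical sanity check in plain Python, not a function provided by AuditorAI.

```python
import math

def check_proba(rows, tol=1e-6):
    """Raise ValueError if any row has a negative entry or does not sum to 1."""
    for i, row in enumerate(rows):
        if any(p < 0 for p in row):
            raise ValueError(f"row {i} has a negative probability")
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            raise ValueError(f"row {i} sums to {sum(row):.6f}, expected 1.0")
    return True

print(check_proba([[0.2, 0.8], [0.5, 0.5]]))  # True
```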

FAQ

Q: Does my model need to support predict_proba? No. sklearn models without predict_proba (like SVC) are automatically wrapped with CalibratedClassifierCV to produce calibrated probabilities. No extra code needed.

Q: What data do I pass to system.train()? Validation data — data your primary model has NOT trained on. This is critical. Passing training data produces an unreliable auditor that can't detect real errors.
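
For reference, the two-stage split in the sklearn quickstart (`test_size=0.2`, then `test_size=0.25` on the remainder) works out to roughly 60/20/20 for train, validation, and test. The arithmetic is sketched below; rounding the held-out share up is our understanding of how scikit-learn treats a fractional `test_size`, so treat the exact counts as approximate.

```python
import math

def split_sizes(n_samples, test_size=0.2, val_size_of_rest=0.25):
    """Sizes of (train, val, test) after two successive hold-out splits."""
    n_test = math.ceil(n_samples * test_size)       # held-out share is rounded up
    n_rest = n_samples - n_test
    n_val = math.ceil(n_rest * val_size_of_rest)
    return n_rest - n_val, n_val, n_test

# load_breast_cancer has 569 samples.
print(split_sizes(569))  # (341, 114, 114)
```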

Q: What does suppress_mask=True mean for my application? It means AuditorAI is not confident in that prediction. What you do with it is up to you — show a warning, route to a human reviewer, or request more information from the user.

Q: How do I choose the threshold? Use system.auto_tune(X_val, y_val) and it picks the threshold that maximizes joint accuracy automatically. Or use auditorai sweep from the CLI to see the full tradeoff curve and pick manually.
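
The idea behind a threshold sweep can be sketched as follows. This is not the implementation of `auto_tune` or `auditorai sweep`; it assumes joint accuracy counts a suppressed case as correct when the human gets it right and a shown case as correct when the AI gets it right, and it assumes per-case human correctness is available. All names and data are made up.

```python
def joint_accuracy(p_wrong, ai_correct, human_correct, threshold):
    """Accuracy when suppressed cases are decided by the human, the rest by the AI."""
    hits = 0
    for score, ai_ok, human_ok in zip(p_wrong, ai_correct, human_correct):
        hits += human_ok if score >= threshold else ai_ok
    return hits / len(p_wrong)

def sweep(p_wrong, ai_correct, human_correct, steps=20):
    """Try evenly spaced thresholds in (0, 1] and return (best_threshold, best_accuracy)."""
    candidates = [(i + 1) / steps for i in range(steps)]
    scored = [(joint_accuracy(p_wrong, ai_correct, human_correct, t), t) for t in candidates]
    # On ties, prefer the higher threshold, which suppresses fewer cases.
    best_acc, best_t = max(scored, key=lambda pair: (pair[0], pair[1]))
    return best_t, best_acc

# Made-up validation data: the AI is wrong on the two highest-scored cases.
p_wrong = [0.10, 0.20, 0.65, 0.90, 0.30]
ai_correct = [True, True, False, False, True]
human_correct = [True, False, True, True, True]
print(sweep(p_wrong, ai_correct, human_correct, steps=10))  # (0.6, 1.0)
```

Breaking ties toward the higher threshold keeps the human workload as low as possible for the same joint accuracy.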

Q: Will this work on text / images / tabular data? Yes. The auditor works on your model's probability outputs, not the raw inputs. As long as your adapter returns valid probabilities (rows sum to 1.0), the data type does not matter.

Project structure

auditorai/
├── adapters/
│   ├── base.py                ← ModelAdapter ABC + wrap() function
│   ├── sklearn_adapter.py     ← wraps any sklearn model
│   ├── pytorch_adapter.py     ← wraps PyTorch nn.Module
│   ├── huggingface_adapter.py ← wraps HF pipelines and models
│   └── api_adapter.py         ← wraps OpenAI / Anthropic / custom HTTP
├── core/
│   ├── auditor.py             ← AuditorModel: trains on primary errors
│   ├── router.py              ← threshold sweep and routing logic
│   ├── system.py              ← AuditorSystem: main entry point
│   └── evaluate.py            ← reports and plots
├── cli/
│   └── main.py                ← auditorai run / sweep / validate
└── utils/
    ├── data.py                ← load_any() smart data loader
    └── logging.py             ← shared logger

Contributing

git clone https://github.com/Apurva0614/Auditorai.git
cd Auditorai
pip install -e ".[dev]"
pytest tests/ -v

See CONTRIBUTING.md for full guidelines.

Research

This implementation is based on the auditor model framework for human-AI decision systems. The core idea — training a second model to predict when the primary AI is wrong, then suppressing those predictions to let a human decide — was formalized in Auditor Models for Efficient Human-AI Collaboration (De-Arteaga, M. et al., 2025, medRxiv). AuditorAI makes this research practical by providing a drop-in library that works with any ML framework.

License

MIT — use it freely, commercially or otherwise.

This version

0.2.0

Jun 3, 2026
