AI Risk Governance Framework — model registry, audit logs, risk dashboards, anomaly detection, regulatory reports, and human review workflows.

These details have not been verified by PyPI

Project description

airiskguard

AI Risk Governance Framework for LLM applications, AI agents, and ML systems. Provides risk checkers, audit logs, model registry, dashboards, anomaly detection, regulatory reports, and human review workflows.

Installation

pip install airiskguard

With optional extras:

pip install airiskguard[fastapi]         # FastAPI integration
pip install airiskguard[flask]           # Flask integration
pip install airiskguard[transformers]    # ML-based hallucination detection (NLI)
pip install airiskguard[dev]             # Development tools

Quick Start

Guard an LLM call in three lines:

from airiskguard import RiskGuard

guard = RiskGuard()

# Check user prompt before sending to LLM
pre = await guard.evaluate(
    input_data=user_message,
    output_data="",
    model_id="gpt-4",
    checks=["security", "compliance"],
)
if pre.blocked:
    return "Sorry, I can't process that request."

# ... call your LLM ...

# Check LLM response before returning to user
post = await guard.evaluate(
    input_data=user_message,
    output_data=llm_response,
    model_id="gpt-4",
    checks=["hallucination", "compliance"],
)
if post.blocked:
    return "Response filtered for safety."

For synchronous code, use guard.evaluate_sync(...) instead.

Usage Guide

Guarding LLM Calls

Wrap any LLM API call (OpenAI, Anthropic, etc.) with pre- and post-evaluation:

from airiskguard import RiskGuard

guard = RiskGuard(config={
    "enabled_checkers": ["security", "compliance", "hallucination"],
    "block_threshold": "high",
})

async def chat(user_message: str) -> str:
    # Pre-check: block prompt injection, PII leakage, jailbreaks
    pre = await guard.evaluate(
        input_data=user_message,
        output_data="",
        model_id="chatbot-v1",
        checks=["security", "compliance"],
    )
    if pre.blocked:
        return "Your message was flagged for safety reasons."

    # Call your LLM
    llm_response = await call_openai(user_message)

    # Post-check: catch hallucinations, compliance violations
    post = await guard.evaluate(
        input_data=user_message,
        output_data=llm_response,
        model_id="chatbot-v1",
        checks=["hallucination", "compliance"],
    )
    if post.blocked:
        return "I'm unable to provide that response."

    return llm_response

See examples/llm_openai_chat.py for a complete example.

RAG Pipeline Safety

Check both retrieved context and generated responses:

# Check retrieved documents for compliance (PII, prohibited content)
doc_check = await guard.evaluate(
    input_data=query,
    output_data="\n".join(retrieved_docs),
    model_id="rag-pipeline",
    checks=["compliance"],
)

# Check generated answer for hallucination with source URLs
answer_check = await guard.evaluate(
    input_data=query,
    output_data=generated_answer,
    model_id="rag-pipeline",
    checks=["hallucination"],
    context={"known_urls": source_urls},
)

The hallucination checker uses known_urls in the context to distinguish real source URLs from fabricated ones. See examples/rag_pipeline.py.

Multi-Agent Systems

Use a shared RiskGuard instance across agents for unified audit trails and dashboards:

guard = RiskGuard()

# Each agent uses its own model_id for tracking
planner_report = await guard.evaluate(
    input_data=task, output_data=plan,
    model_id="planner-agent",
)

coder_report = await guard.evaluate(
    input_data=plan, output_data=code,
    model_id="coder-agent",
)

# Per-agent dashboards
planner_stats = await guard.dashboard.get_summary(model_id="planner-agent")
coder_stats = await guard.dashboard.get_summary(model_id="coder-agent")

Escalate when accumulated risk across an agent chain is too high. See examples/multi_agent.py.

Tool-Calling Agents

Validate tool inputs before execution and check outputs before returning to the LLM:

from airiskguard import RiskGuard
from airiskguard.checkers.base import BaseChecker
from airiskguard.checkers.registry import register_checker
from airiskguard.types import CheckResult, RiskLevel

# Custom checker for dangerous tool patterns
class ToolSafetyChecker(BaseChecker):
    name = "tool_safety"

    async def check(self, input_data, output_data, context=None):
        tool_name = input_data.get("tool", "") if isinstance(input_data, dict) else ""
        flags = []
        score = 0.0
        blocked_tools = {"rm", "delete_file", "drop_table", "exec_raw_sql"}
        if tool_name in blocked_tools:
            flags.append(f"blocked_tool: {tool_name}")
            score = 0.95
        risk = RiskLevel.CRITICAL if score >= 0.8 else RiskLevel.LOW
        return CheckResult(
            checker_name=self.name, risk_level=risk,
            passed=score < 0.5, score=score, details={"flags": flags},
        )

register_checker("tool_safety", ToolSafetyChecker)

guard = RiskGuard(config={"enabled_checkers": ["tool_safety", "security", "compliance"]})

See examples/tool_calling_agent.py for a complete agent loop.

Chatbot Middleware (FastAPI)

Add risk governance to a chat API with one-line middleware or explicit evaluation:

from fastapi import FastAPI
from airiskguard import RiskGuard
from airiskguard.integrations.fastapi import add_risk_guard

app = FastAPI()
guard = RiskGuard()

# Option 1: automatic middleware (adds x-risk-score, x-risk-level headers)
add_risk_guard(app, config={"enabled_checkers": ["security", "compliance"]})

# Option 2: explicit evaluation in endpoints
@app.post("/chat")
async def chat(request: dict):
    report = await guard.evaluate(
        input_data=request["message"],
        output_data="",
        model_id="chatbot",
        checks=["security"],
    )
    if report.blocked:
        return {"error": "Message blocked", "risk_level": report.overall_risk.value}
    # ... generate response ...

See examples/fastapi_app.py for a full chat API with streaming.

Streaming Responses

For streaming LLM responses, accumulate chunks and check after generation completes:

chunks = []
async for chunk in llm_stream(user_message):
    chunks.append(chunk)
    yield chunk  # stream to user

full_response = "".join(chunks)

# Post-check the complete response
report = await guard.evaluate(
    input_data=user_message,
    output_data=full_response,
    model_id="chatbot-v1",
    checks=["hallucination", "compliance"],
)
if report.blocked:
    # Log for review; response already streamed
    await guard.review.flag_for_review("chatbot-v1", report)

Custom Checkers

Write domain-specific checkers by extending BaseChecker:

from airiskguard.checkers.base import BaseChecker
from airiskguard.checkers.registry import register_checker
from airiskguard.types import CheckResult, RiskLevel

class ToxicityChecker(BaseChecker):
    name = "toxicity"

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold

    async def check(self, input_data, output_data, context=None):
        # Your detection logic here (call an API, run a model, etc.)
        toxicity_score = await detect_toxicity(output_data)

        if toxicity_score >= self.threshold:
            risk = RiskLevel.HIGH
            passed = False
        else:
            risk = RiskLevel.LOW
            passed = True

        return CheckResult(
            checker_name=self.name,
            risk_level=risk,
            passed=passed,
            score=toxicity_score,
            details={"toxicity_score": toxicity_score},
        )

# Register so RiskGuard can load it by name
register_checker("toxicity", ToxicityChecker)

# Use it
guard = RiskGuard(config={
    "enabled_checkers": ["toxicity", "security"],
    "checker_configs": {"toxicity": {"threshold": 0.6}},
})

Configuration

YAML Configuration

# airiskguard.yaml
storage_backend: sqlite          # memory | sqlite | json
storage_path: ./airiskguard.db

block_threshold: high            # low | medium | high | critical
review_threshold: medium
score_block_threshold: 0.85

enabled_checkers:
  - security
  - compliance
  - hallucination
  - bias

checker_configs:
  compliance:
    detect_pii: true
    detect_prohibited: true
    custom_rules:
      - name: api_key_pattern
        pattern: '(?:sk|pk)[-_][a-zA-Z0-9]{32,}'
  hallucination:
    use_nli: false               # true requires transformers extra
  security:
    check_encoding: true

audit_enabled: true
review_enabled: true
review_auto_escalate: true       # auto-escalate CRITICAL to review
dashboard_enabled: true

Load via path or dict:

guard = RiskGuard(config="airiskguard.yaml")
# or
guard = RiskGuard(config={"block_threshold": "high", "enabled_checkers": ["security"]})

Configuration Reference

Key	Type	Default	Description
`storage_backend`	str	`"memory"`	`"memory"`, `"sqlite"`, or `"json"`
`storage_path`	str	`""`	Path for sqlite/json backends
`block_threshold`	str	`"critical"`	Auto-block if risk >= this level
`review_threshold`	str	`"high"`	Flag for human review if risk >= this
`score_block_threshold`	float	`0.9`	Block if numeric score >= this
`enabled_checkers`	list	all five	Which checkers to load
`checker_configs`	dict	`{}`	Per-checker configuration
`audit_enabled`	bool	`true`	Enable immutable audit trail
`review_enabled`	bool	`true`	Enable human review workflow
`review_auto_escalate`	bool	`true`	Auto-escalate CRITICAL items
`dashboard_enabled`	bool	`true`	Record evaluation metrics
`anomaly_contamination`	float	`0.1`	IsolationForest contamination param
`drift_significance`	float	`0.05`	KS test p-value threshold

Risk Checkers

Checker	Detects
`security`	Prompt injection (~30 patterns), jailbreak (~20 patterns), encoding attacks, system prompt leakage
`compliance`	PII (SSN, email, credit card, phone), prohibited content, custom regex rules
`hallucination`	Fabricated URLs, unverifiable citations, contradictions, overconfident language, NLI-based contradiction
`bias`	Disparate impact (4/5ths rule), demographic parity, equalized odds, biased language
`fraud`	Amount anomaly (z-score), velocity abuse, suspicious patterns (round amounts, currency mismatch)

Checker Details

Security — Detects prompt injection attempts ("ignore previous instructions", system prompt markers, roleplay attacks), jailbreak patterns ("DAN mode", "unrestricted mode"), and encoding attacks (base64-encoded injections, homoglyphs). Also checks LLM output for system prompt leakage.

Compliance — Scans both input and output for PII (SSN: weight 0.9, credit card: 0.9, email: 0.4, phone: 0.5, IP: 0.3). Detects prohibited content (violence instructions, illegal activity, self-harm: score 0.95). Supports custom regex rules.

Hallucination — Heuristic mode detects fabricated URLs (not in context["known_urls"]), suspicious citations ("According to Author (YYYY)"), overconfident language ("100%", "guaranteed"), and internal contradictions (always/never pairs). Optional NLI mode uses cross-encoder/nli-deberta-v3-small for semantic contradiction detection.

Bias — Computes disparate impact ratio against the 4/5ths rule threshold using context["group_outcomes"]. Checks demographic parity gap, equalized odds (TPR/FPR differences using context["predictions"] and context["labels"]), and biased language patterns.

Fraud — Transaction-focused: z-score anomaly on amounts, per-user velocity tracking, pattern rules (round large amounts, currency/country mismatch).

Features

Model Registry — register, version, and manage model lifecycles (draft, validation, production, deprecated, retired)
Audit Log — immutable SHA-256 hash-chain audit trail with tamper verification
Risk Dashboard — aggregate metrics, trends, per-checker breakdowns, JSON export
Anomaly Detection — IsolationForest for anomalies, Kolmogorov-Smirnov test for drift
Regulatory Reports — GDPR, SOX, EU AI Act compliance reports (JSON + HTML)
Human Review — threshold-based flagging with approve/reject/escalate workflows and async callbacks
Framework Integration — FastAPI, Flask, ASGI, WSGI middleware with automatic risk headers
Decorator Pattern — @risk_guard() for wrapping any sync/async function
Custom Checkers — extend BaseChecker and register for domain-specific risk detection

Architecture

RiskGuard (orchestrator)
├── Checkers: security, compliance, hallucination, bias, fraud, [custom]
├── AuditLog: immutable hash-chain (SHA-256) per decision
├── RiskDashboard: per-model metrics, trends, checker breakdowns
├── ModelRegistry: lifecycle management (DRAFT → PRODUCTION → RETIRED)
├── ReviewWorkflow: flag → approve/reject/escalate with callbacks
├── AnomalyDetector: IsolationForest + KS drift
├── ReportGenerator: GDPR, SOX, EU AI Act
└── Storage: MemoryStorage | SQLiteStorage | JSONFileStorage

Each evaluate() call runs selected checkers, aggregates risk, logs to audit trail, records dashboard metrics, and optionally flags for human review — all in a single async call.

Examples

Example	Description
`llm_openai_chat.py`	Wrapping OpenAI chat completions with pre/post risk checks
`rag_pipeline.py`	RAG pipeline with document compliance + hallucination checking
`multi_agent.py`	Multi-agent orchestrator with per-agent tracking and escalation
`tool_calling_agent.py`	Tool-calling agent with input/output validation and custom checker
`fastapi_app.py`	FastAPI chat API with streaming and risk headers
`flask_app.py`	Flask integration with synchronous evaluation
`standalone_usage.py`	Direct API usage with all core features

License

MIT

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.4.0

May 4, 2026

This version

0.3.0

Mar 28, 2026

0.2.0

Mar 15, 2026

0.1.0

Mar 15, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airiskguard-0.3.0.tar.gz (470.1 kB view details)

Uploaded Mar 28, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

airiskguard-0.3.0-py3-none-any.whl (70.8 kB view details)

Uploaded Mar 28, 2026 Python 3

File details

Details for the file airiskguard-0.3.0.tar.gz.

File metadata

Download URL: airiskguard-0.3.0.tar.gz
Upload date: Mar 28, 2026
Size: 470.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for airiskguard-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`748ad45ec313e6d8e2d8c731bd7152e52cf0526b28a47e10339f11f5d9e77baf`
MD5	`b4bc5e681b5037e5568ec5ef2c31126e`
BLAKE2b-256	`f95f1628347fae637000124a87ac3278046af97a004d9a1de1e76f8b72686a1b`

See more details on using hashes here.

File details

Details for the file airiskguard-0.3.0-py3-none-any.whl.

File metadata

Download URL: airiskguard-0.3.0-py3-none-any.whl
Upload date: Mar 28, 2026
Size: 70.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for airiskguard-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cb87d1f287549a336e0d123fcba8b2473d7d4b2ceac74e34cf7559229b8f232f`
MD5	`1f4ab1dfec836e521efaa30cdfc55506`
BLAKE2b-256	`81f81ff60c3f85f299b1fc561c2e54f9bf5c9f2fd88f48529f3b92e193a459b5`

See more details on using hashes here.

airiskguard 0.3.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

airiskguard

Installation

Quick Start

Usage Guide

Guarding LLM Calls

RAG Pipeline Safety

Multi-Agent Systems

Tool-Calling Agents

Chatbot Middleware (FastAPI)

Streaming Responses

Custom Checkers

Configuration

YAML Configuration

Configuration Reference

Risk Checkers

Checker Details

Features

Architecture

Examples

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes