Skip to main content

AI Risk Governance Framework — model registry, audit logs, risk dashboards, anomaly detection, regulatory reports, and human review workflows.

Project description

airiskguard

AI Risk Governance Framework for LLM applications, AI agents, and ML systems. Provides risk checkers, audit logs, model registry, dashboards, anomaly detection, regulatory reports, and human review workflows.

Installation

pip install airiskguard

With optional extras:

pip install airiskguard[fastapi]         # FastAPI integration
pip install airiskguard[flask]           # Flask integration
pip install airiskguard[transformers]    # ML-based hallucination detection (NLI)
pip install airiskguard[dev]             # Development tools

Quick Start

Guard an LLM call in three lines:

from airiskguard import RiskGuard

guard = RiskGuard()

# Check user prompt before sending to LLM
pre = await guard.evaluate(
    input_data=user_message,
    output_data="",
    model_id="gpt-4",
    checks=["security", "compliance"],
)
if pre.blocked:
    return "Sorry, I can't process that request."

# ... call your LLM ...

# Check LLM response before returning to user
post = await guard.evaluate(
    input_data=user_message,
    output_data=llm_response,
    model_id="gpt-4",
    checks=["hallucination", "compliance"],
)
if post.blocked:
    return "Response filtered for safety."

For synchronous code, use guard.evaluate_sync(...) instead.

Usage Guide

Guarding LLM Calls

Wrap any LLM API call (OpenAI, Anthropic, etc.) with pre- and post-evaluation:

from airiskguard import RiskGuard

guard = RiskGuard(config={
    "enabled_checkers": ["security", "compliance", "hallucination"],
    "block_threshold": "high",
})

async def chat(user_message: str) -> str:
    # Pre-check: block prompt injection, PII leakage, jailbreaks
    pre = await guard.evaluate(
        input_data=user_message,
        output_data="",
        model_id="chatbot-v1",
        checks=["security", "compliance"],
    )
    if pre.blocked:
        return "Your message was flagged for safety reasons."

    # Call your LLM
    llm_response = await call_openai(user_message)

    # Post-check: catch hallucinations, compliance violations
    post = await guard.evaluate(
        input_data=user_message,
        output_data=llm_response,
        model_id="chatbot-v1",
        checks=["hallucination", "compliance"],
    )
    if post.blocked:
        return "I'm unable to provide that response."

    return llm_response

See examples/llm_openai_chat.py for a complete example.

RAG Pipeline Safety

Check both retrieved context and generated responses:

# Check retrieved documents for compliance (PII, prohibited content)
doc_check = await guard.evaluate(
    input_data=query,
    output_data="\n".join(retrieved_docs),
    model_id="rag-pipeline",
    checks=["compliance"],
)

# Check generated answer for hallucination with source URLs
answer_check = await guard.evaluate(
    input_data=query,
    output_data=generated_answer,
    model_id="rag-pipeline",
    checks=["hallucination"],
    context={"known_urls": source_urls},
)

The hallucination checker uses known_urls in the context to distinguish real source URLs from fabricated ones. See examples/rag_pipeline.py.

Multi-Agent Systems

Use a shared RiskGuard instance across agents for unified audit trails and dashboards:

guard = RiskGuard()

# Each agent uses its own model_id for tracking
planner_report = await guard.evaluate(
    input_data=task, output_data=plan,
    model_id="planner-agent",
)

coder_report = await guard.evaluate(
    input_data=plan, output_data=code,
    model_id="coder-agent",
)

# Per-agent dashboards
planner_stats = await guard.dashboard.get_summary(model_id="planner-agent")
coder_stats = await guard.dashboard.get_summary(model_id="coder-agent")

Escalate when accumulated risk across an agent chain is too high. See examples/multi_agent.py.

Tool-Calling Agents

Validate tool inputs before execution and check outputs before returning to the LLM:

from airiskguard import RiskGuard
from airiskguard.checkers.base import BaseChecker
from airiskguard.checkers.registry import register_checker
from airiskguard.types import CheckResult, RiskLevel

# Custom checker for dangerous tool patterns
class ToolSafetyChecker(BaseChecker):
    name = "tool_safety"

    async def check(self, input_data, output_data, context=None):
        tool_name = input_data.get("tool", "") if isinstance(input_data, dict) else ""
        flags = []
        score = 0.0
        blocked_tools = {"rm", "delete_file", "drop_table", "exec_raw_sql"}
        if tool_name in blocked_tools:
            flags.append(f"blocked_tool: {tool_name}")
            score = 0.95
        risk = RiskLevel.CRITICAL if score >= 0.8 else RiskLevel.LOW
        return CheckResult(
            checker_name=self.name, risk_level=risk,
            passed=score < 0.5, score=score, details={"flags": flags},
        )

register_checker("tool_safety", ToolSafetyChecker)

guard = RiskGuard(config={"enabled_checkers": ["tool_safety", "security", "compliance"]})

See examples/tool_calling_agent.py for a complete agent loop.

Chatbot Middleware (FastAPI)

Add risk governance to a chat API with one-line middleware or explicit evaluation:

from fastapi import FastAPI
from airiskguard import RiskGuard
from airiskguard.integrations.fastapi import add_risk_guard

app = FastAPI()
guard = RiskGuard()

# Option 1: automatic middleware (adds x-risk-score, x-risk-level headers)
add_risk_guard(app, config={"enabled_checkers": ["security", "compliance"]})

# Option 2: explicit evaluation in endpoints
@app.post("/chat")
async def chat(request: dict):
    report = await guard.evaluate(
        input_data=request["message"],
        output_data="",
        model_id="chatbot",
        checks=["security"],
    )
    if report.blocked:
        return {"error": "Message blocked", "risk_level": report.overall_risk.value}
    # ... generate response ...

See examples/fastapi_app.py for a full chat API with streaming.

Streaming Responses

For streaming LLM responses, accumulate chunks and check after generation completes:

chunks = []
async for chunk in llm_stream(user_message):
    chunks.append(chunk)
    yield chunk  # stream to user

full_response = "".join(chunks)

# Post-check the complete response
report = await guard.evaluate(
    input_data=user_message,
    output_data=full_response,
    model_id="chatbot-v1",
    checks=["hallucination", "compliance"],
)
if report.blocked:
    # Log for review; response already streamed
    await guard.review.flag_for_review("chatbot-v1", report)

Custom Checkers

Write domain-specific checkers by extending BaseChecker:

from airiskguard.checkers.base import BaseChecker
from airiskguard.checkers.registry import register_checker
from airiskguard.types import CheckResult, RiskLevel

class ToxicityChecker(BaseChecker):
    name = "toxicity"

    def __init__(self, threshold: float = 0.7):
        self.threshold = threshold

    async def check(self, input_data, output_data, context=None):
        # Your detection logic here (call an API, run a model, etc.)
        toxicity_score = await detect_toxicity(output_data)

        if toxicity_score >= self.threshold:
            risk = RiskLevel.HIGH
            passed = False
        else:
            risk = RiskLevel.LOW
            passed = True

        return CheckResult(
            checker_name=self.name,
            risk_level=risk,
            passed=passed,
            score=toxicity_score,
            details={"toxicity_score": toxicity_score},
        )

# Register so RiskGuard can load it by name
register_checker("toxicity", ToxicityChecker)

# Use it
guard = RiskGuard(config={
    "enabled_checkers": ["toxicity", "security"],
    "checker_configs": {"toxicity": {"threshold": 0.6}},
})

Configuration

YAML Configuration

# airiskguard.yaml
storage_backend: sqlite          # memory | sqlite | json
storage_path: ./airiskguard.db

block_threshold: high            # low | medium | high | critical
review_threshold: medium
score_block_threshold: 0.85

enabled_checkers:
  - security
  - compliance
  - hallucination
  - bias

checker_configs:
  compliance:
    detect_pii: true
    detect_prohibited: true
    custom_rules:
      - name: api_key_pattern
        pattern: '(?:sk|pk)[-_][a-zA-Z0-9]{32,}'
  hallucination:
    use_nli: false               # true requires transformers extra
  security:
    check_encoding: true

audit_enabled: true
review_enabled: true
review_auto_escalate: true       # auto-escalate CRITICAL to review
dashboard_enabled: true

Load via path or dict:

guard = RiskGuard(config="airiskguard.yaml")
# or
guard = RiskGuard(config={"block_threshold": "high", "enabled_checkers": ["security"]})

Configuration Reference

Key Type Default Description
storage_backend str "memory" "memory", "sqlite", or "json"
storage_path str "" Path for sqlite/json backends
block_threshold str "critical" Auto-block if risk >= this level
review_threshold str "high" Flag for human review if risk >= this
score_block_threshold float 0.9 Block if numeric score >= this
enabled_checkers list all five Which checkers to load
checker_configs dict {} Per-checker configuration
audit_enabled bool true Enable immutable audit trail
review_enabled bool true Enable human review workflow
review_auto_escalate bool true Auto-escalate CRITICAL items
dashboard_enabled bool true Record evaluation metrics
anomaly_contamination float 0.1 IsolationForest contamination param
drift_significance float 0.05 KS test p-value threshold

Risk Checkers

Checker Detects
security Prompt injection (~30 patterns), jailbreak (~20 patterns), encoding attacks, system prompt leakage
compliance PII (SSN, email, credit card, phone), prohibited content, custom regex rules
hallucination Fabricated URLs, unverifiable citations, contradictions, overconfident language, NLI-based contradiction
bias Disparate impact (4/5ths rule), demographic parity, equalized odds, biased language
fraud Amount anomaly (z-score), velocity abuse, suspicious patterns (round amounts, currency mismatch)

Checker Details

Security — Detects prompt injection attempts ("ignore previous instructions", system prompt markers, roleplay attacks), jailbreak patterns ("DAN mode", "unrestricted mode"), and encoding attacks (base64-encoded injections, homoglyphs). Also checks LLM output for system prompt leakage.

Compliance — Scans both input and output for PII (SSN: weight 0.9, credit card: 0.9, email: 0.4, phone: 0.5, IP: 0.3). Detects prohibited content (violence instructions, illegal activity, self-harm: score 0.95). Supports custom regex rules.

Hallucination — Heuristic mode detects fabricated URLs (not in context["known_urls"]), suspicious citations ("According to Author (YYYY)"), overconfident language ("100%", "guaranteed"), and internal contradictions (always/never pairs). Optional NLI mode uses cross-encoder/nli-deberta-v3-small for semantic contradiction detection.

Bias — Computes disparate impact ratio against the 4/5ths rule threshold using context["group_outcomes"]. Checks demographic parity gap, equalized odds (TPR/FPR differences using context["predictions"] and context["labels"]), and biased language patterns.

Fraud — Transaction-focused: z-score anomaly on amounts, per-user velocity tracking, pattern rules (round large amounts, currency/country mismatch).

Features

  • Model Registry — register, version, and manage model lifecycles (draft, validation, production, deprecated, retired)
  • Audit Log — immutable SHA-256 hash-chain audit trail with tamper verification
  • Risk Dashboard — aggregate metrics, trends, per-checker breakdowns, JSON export
  • Anomaly Detection — IsolationForest for anomalies, Kolmogorov-Smirnov test for drift
  • Regulatory Reports — GDPR, SOX, EU AI Act compliance reports (JSON + HTML)
  • Human Review — threshold-based flagging with approve/reject/escalate workflows and async callbacks
  • Framework Integration — FastAPI, Flask, ASGI, WSGI middleware with automatic risk headers
  • Decorator Pattern@risk_guard() for wrapping any sync/async function
  • Custom Checkers — extend BaseChecker and register for domain-specific risk detection

Architecture

RiskGuard (orchestrator)
├── Checkers: security, compliance, hallucination, bias, fraud, [custom]
├── AuditLog: immutable hash-chain (SHA-256) per decision
├── RiskDashboard: per-model metrics, trends, checker breakdowns
├── ModelRegistry: lifecycle management (DRAFT → PRODUCTION → RETIRED)
├── ReviewWorkflow: flag → approve/reject/escalate with callbacks
├── AnomalyDetector: IsolationForest + KS drift
├── ReportGenerator: GDPR, SOX, EU AI Act
└── Storage: MemoryStorage | SQLiteStorage | JSONFileStorage

Each evaluate() call runs selected checkers, aggregates risk, logs to audit trail, records dashboard metrics, and optionally flags for human review — all in a single async call.

Examples

Example Description
llm_openai_chat.py Wrapping OpenAI chat completions with pre/post risk checks
rag_pipeline.py RAG pipeline with document compliance + hallucination checking
multi_agent.py Multi-agent orchestrator with per-agent tracking and escalation
tool_calling_agent.py Tool-calling agent with input/output validation and custom checker
fastapi_app.py FastAPI chat API with streaming and risk headers
flask_app.py Flask integration with synchronous evaluation
standalone_usage.py Direct API usage with all core features

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airiskguard-0.3.0.tar.gz (470.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

airiskguard-0.3.0-py3-none-any.whl (70.8 kB view details)

Uploaded Python 3

File details

Details for the file airiskguard-0.3.0.tar.gz.

File metadata

  • Download URL: airiskguard-0.3.0.tar.gz
  • Upload date:
  • Size: 470.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for airiskguard-0.3.0.tar.gz
Algorithm Hash digest
SHA256 748ad45ec313e6d8e2d8c731bd7152e52cf0526b28a47e10339f11f5d9e77baf
MD5 b4bc5e681b5037e5568ec5ef2c31126e
BLAKE2b-256 f95f1628347fae637000124a87ac3278046af97a004d9a1de1e76f8b72686a1b

See more details on using hashes here.

File details

Details for the file airiskguard-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: airiskguard-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 70.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for airiskguard-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cb87d1f287549a336e0d123fcba8b2473d7d4b2ceac74e34cf7559229b8f232f
MD5 1f4ab1dfec836e521efaa30cdfc55506
BLAKE2b-256 81f81ff60c3f85f299b1fc561c2e54f9bf5c9f2fd88f48529f3b92e193a459b5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page