Runtime firewall for LLMs: policy-as-code, PII scrubbing, SHA-256 audit chain and HITL dashboard. Targets EU AI Act and NIST AI RMF compliance.
🛡️ Awesome AI Governance Toolkit
Your LLM has no firewall. Every prompt is an open door.
This toolkit wraps every AI call in a Runtime Firewall: it intercepts the prompt, enforces policy-as-code rules, scrubs PII, and writes a tamper-evident SHA-256 audit log, all automatically. Three lines of Python. Zero changes to your existing AI stack.
Compliance targets: EU AI Act · NIST AI RMF · ISO/IEC 42001
📺 Interface & Dashboard Console
The toolkit runs a synchronous sidecar proxy that intercepts every payload and streams real-time telemetry to an open-source audit console. Every blocked request is logged with full cryptographic context:
[PROXY GATEWAY] POST /v1/intercept → 403 FORBIDDEN (Policy: token_match = "malware")
[LEDGER RECORD] Entry ID: 9a2f-4bce | SHA-256 Hash Chain: VERIFIED ✅
▶ Open Live Demo: fully interactive, no install required. Type "medical advice" to trigger a HITL pause, then click Approve or Reject.
⚡ TL;DR: Three Lines of Python
from sentinel import Sentinel
guard = Sentinel(policy="eu_ai_act_high_risk")
result = guard.verify("Draft a summary of the Q3 acquisition")
print(result.status) # "APPROVED" or "BLOCKED"
print(result.clean_prompt) # PII-anonymized, safe to forward to your LLM
print(result.pii_detected) # e.g. ["EMAIL", "PHONE"]: entity types that were scrubbed
Or deploy as a language-agnostic REST API sidecar: your existing stack needs zero modification.
| Capability | How it works |
|---|---|
| Blocks forbidden prompts | Token match → instant 403; the prompt never reaches the LLM |
| Scrubs PII automatically | Regex + pattern engine → result.clean_prompt |
| Writes tamper-evident audit log | SHA-256 chained ledger; verification fails if anyone alters a record |
| Policy changes with no redeploy | Edit config/policy.json; the Legal team owns the rules |
| Exports compliance evidence | One command produces a verified CSV for regulators |
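As a rough illustration of the "Regex + pattern engine" row, the sketch below scrubs two entity types and reports which ones it found. The patterns and the `scrub_pii` name are assumptions made for this example; only the `[EMAIL_REDACTED]` tag style is taken from the REST API responses documented later, and the toolkit's real pattern set is likely broader.

```python
import re

# Hypothetical patterns for illustration; the toolkit's real pattern set may differ.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def scrub_pii(prompt: str) -> tuple[str, list[str]]:
    """Replace each match with an [ENTITY_REDACTED] tag and list the entity types found."""
    detected = []
    clean = prompt
    for entity, pattern in PII_PATTERNS.items():
        clean, count = pattern.subn(f"[{entity}_REDACTED]", clean)
        if count:
            detected.append(entity)
    return clean, detected


clean, found = scrub_pii("Email jane.doe@example.com the Q3 report")
print(clean)  # Email [EMAIL_REDACTED] the Q3 report
print(found)  # ['EMAIL']
```

Regex-only detection misses names, addresses and free-form identifiers, which is presumably why the roadmap lists a Presidio integration.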
The Problem
Every company deploying AI faces the same exposure:
- Regulatory risk: EU AI Act fines reach up to €35M or 7% of global annual turnover.
- Reputational risk: One leaked prompt or biased output becomes a headline.
- Auditability gap: "We think the AI behaved correctly" is not a compliance answer.
Without a governance layer, your AI is an open pipe. One bad prompt in, one liability out.
The Solution: A Runtime Firewall
This toolkit intercepts every message before it reaches your LLM. It enforces your rules, blocks violations, and writes a tamper-evident log of every single decision, automatically.
                   [ Client Application ]
                            │  ▲
                            │  │  (Encrypted HTTPS)
                            ▼  │
┌─────────────────────────────────────────────────────────┐
│  AWESOME AI GOVERNANCE TOOLKIT (Runtime Firewall)       │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │ 1. Ingress Proxy (FastAPI)                        │  │
│  └─────────────────────────┬─────────────────────────┘  │
│                            ▼                            │
│  ┌───────────────────────────────────────────────────┐  │
│  │ 2. Policy-as-Code Engine (PAC JSON Validator)     │  │
│  └─────────────────────────┬─────────────────────────┘  │
│                            ▼                            │
│  ┌───────────────────────────────────────────────────┐  │
│  │ 3. Runtime Circuit Breaker (Presidio/Regex Core)  │  │
│  └─────────────────────────┬─────────────────────────┘  │
│                            ▼                            │
│  ┌───────────────────────────────────────────────────┐  │
│  │ 4. Cryptographic Audit Ledger (SHA-256 Chain)     │  │
│  └─────────────────────────┬─────────────────────────┘  │
└────────────────────────────┬────────────────────────────┘
                             ▼
           [ Upstream LLM API ] (OpenAI / Local Llama)
Why Not Guardrails AI or LlamaGuard?
These comparisons are based on publicly documented architecture, not marketing claims.
| Core Capability | Guardrails AI | LlamaGuard | This Toolkit |
|---|---|---|---|
| Deployment Model | Python SDK / Validation Layer | Fine-Tuned Model Weights | FastAPI Sidecar Proxy |
| Tamper-Evident Audit Ledger | No | No | Yes (SHA-256 hash chain) |
| Out-of-the-Box Local UI | No (cloud/paid dashboard) | None | Yes (open-source Streamlit) |
| Regulatory Compliance Map | Guardrails Hub rules | Toxicity class labels | EU AI Act + NIST AI RMF |
| Policy Format | Python validators / Pydantic | Model fine-tuning | Human-readable JSON |
| pip install | Yes | No | Yes: pip install awesome-ai-governance-toolkit |
Architecture: The Five Layers
Layer 1: Ingress Proxy (src/main.py)
FastAPI application that exposes a single interception endpoint at POST /v1/intercept. All client traffic routes here instead of directly to the LLM. Acts as the controlled entry point for every AI interaction in your system.
Layer 2: Policy-as-Code Engine (config/policies/)
Rules are stored as human-readable JSON. Legal teams can update hitl_triggers (e.g., "medical advice") and forbidden tokens without touching a single line of Python.
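A minimal sketch of how such a JSON policy could be evaluated. The key names `block_forbidden_tokens` and `hitl_triggers` and the PASSED / BLOCKED / HITL actions appear elsewhere in this README; the `evaluate` function, the plain substring matching, and the shape of the returned dict are assumptions for illustration, not the engine's actual logic.

```python
import json

# Hypothetical policy document. Key names follow the examples in this README;
# the real schema may contain more fields.
POLICY_JSON = """
{
  "block_forbidden_tokens": ["malware", "steal password"],
  "hitl_triggers": ["medical advice"]
}
"""


def evaluate(prompt: str, policy: dict) -> dict:
    """Return the action for a prompt: BLOCKED, HITL, or PASSED (in that priority)."""
    text = prompt.lower()
    for token in policy.get("block_forbidden_tokens", []):
        if token.lower() in text:
            return {"action": "BLOCKED", "rule": token}
    for trigger in policy.get("hitl_triggers", []):
        if trigger.lower() in text:
            return {"action": "HITL", "rule": trigger}
    return {"action": "PASSED", "rule": None}


policy = json.loads(POLICY_JSON)
print(evaluate("How do I write malware?", policy))
print(evaluate("I need medical advice about a rash", policy))
print(evaluate("Summarize our Q3 report", policy))
```

Checking forbidden tokens before HITL triggers means a prompt that matches both is blocked outright rather than queued for review.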
Layer 3: The Ethics Core (src/ethics/)
This isn't just a regex firewall. The engine actively evaluates prompts against Responsible AI (RAI) principles:
- Fairness Metrics: Evaluates payloads against enterprise bias lexicons.
- Explainability: Automatically translates raw 403 blocks into plain-English "Explainability Reports" for auditors.
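To make the fairness bullet concrete, here is a hedged sketch of a lexicon-based soft flag. The lexicon entries are invented for the example, and `evaluate_fairness` is written as a standalone function whose signature is assumed; the returned keys mirror the `flagged_for_review` and `review_reason` attributes listed in the API reference.

```python
import re

# Hypothetical bias lexicon (phrase -> reason). A real deployment would maintain its own.
BIAS_LEXICON = {
    "native speakers only": "possible national-origin bias",
    "too old for": "possible age bias",
}


def evaluate_fairness(prompt: str) -> dict:
    """Soft flag: the prompt is not blocked, only marked for human review."""
    # Lowercase and collapse whitespace so spacing tricks do not hide a phrase.
    text = re.sub(r"\s+", " ", prompt.lower())
    reasons = [reason for phrase, reason in BIAS_LEXICON.items() if phrase in text]
    return {
        "flagged_for_review": bool(reasons),
        "review_reason": "; ".join(reasons) or None,
    }


print(evaluate_fairness("Write a job ad: native  speakers only"))
print(evaluate_fairness("Summarize our Q3 report"))
```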
Layer 4: Human-In-The-Loop (HITL) Circuit Breaker
If the AI attempts to process a high-risk context (e.g., medical or financial advice), the circuit breaker does not simply pass or fail it. It triggers a HITL Pause: the request is frozen and sent to the Streamlit Dashboard's Human Review Queue, where a manager must explicitly click "Approve" or "Reject".
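The pause-then-decide flow can be sketched with an in-memory queue. Everything here (the `ReviewQueue` class, its method names, and the PENDING / APPROVED / REJECTED status values) is hypothetical; the toolkit itself keeps a `review_status` in its ledger and surfaces the queue in the Streamlit dashboard.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ReviewQueue:
    """In-memory stand-in for the dashboard's Human Review Queue."""
    pending: dict = field(default_factory=dict)

    def pause(self, prompt: str, trigger: str) -> str:
        """Freeze a request and return the ID a reviewer will act on."""
        request_id = str(uuid.uuid4())
        self.pending[request_id] = {
            "prompt": prompt,
            "trigger": trigger,
            "review_status": "PENDING",
        }
        return request_id

    def resolve(self, request_id: str, approve: bool) -> str:
        """Record the reviewer's decision; a request can only be decided once."""
        item = self.pending[request_id]
        if item["review_status"] != "PENDING":
            raise ValueError("request already reviewed")
        item["review_status"] = "APPROVED" if approve else "REJECTED"
        return item["review_status"]


queue = ReviewQueue()
rid = queue.pause("Can you give me medical advice?", trigger="medical advice")
print(queue.pending[rid]["review_status"])  # PENDING
print(queue.resolve(rid, approve=True))     # APPROVED
```

Refusing a second decision on the same request keeps the review trail unambiguous: each paused request ends in exactly one outcome.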
Layer 5: Cryptographic Audit Ledger (src/database.py)
Every decision (PASSED, BLOCKED, or HITL) is written to ledger.db with a SHA-256 hash chained to the previous entry. This creates a tamper-evident log.
Chain integrity guarantee:
Entry 1: hash(request_id + tenant_id + action + violation + "000...0") → H1
Entry 2: hash(request_id + tenant_id + action + violation + H1) → H2
Entry 3: hash(request_id + tenant_id + action + violation + H2) → H3
If anyone modifies Entry 1, H1 changes → H2 breaks → H3 breaks. The entire chain fails verification. Auditors run one script to verify nothing was altered. Resolving a HITL request updates review_status without breaking the hash chain, because review_status is not part of the hashed payload.
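The chaining rule can be sketched in a few lines, using the same field order as the verification script in the Testing section (request_id, tenant_id, action, violation, previous hash). The all-zero genesis value of 64 characters, the in-memory list, and the function names are assumptions for illustration; the toolkit stores its chain in ledger.db.

```python
import hashlib

GENESIS = "0" * 64  # assumed genesis value; the real ledger may use a different one


def entry_hash(request_id, tenant_id, action, violation, previous_hash):
    """SHA-256 over the concatenated fields plus the previous entry's hash."""
    payload = f"{request_id}{tenant_id}{action}{violation or ''}{previous_hash}"
    return hashlib.sha256(payload.encode()).hexdigest()


def append(chain, request_id, tenant_id, action, violation=None):
    """Add an entry whose hash covers the hash of the entry before it."""
    prev = chain[-1]["current_hash"] if chain else GENESIS
    chain.append({
        "request_id": request_id,
        "tenant_id": tenant_id,
        "action_taken": action,
        "rule_violated": violation,
        "previous_hash": prev,
        "current_hash": entry_hash(request_id, tenant_id, action, violation, prev),
    })


def verify(chain):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev = GENESIS
    for i, e in enumerate(chain):
        recomputed = entry_hash(e["request_id"], e["tenant_id"], e["action_taken"],
                                e["rule_violated"], e["previous_hash"])
        if e["previous_hash"] != prev or recomputed != e["current_hash"]:
            return i
        prev = e["current_hash"]
    return None


ledger = []
append(ledger, "req-1", "tenant-a", "PASSED")
append(ledger, "req-2", "tenant-a", "BLOCKED", "malware")
append(ledger, "req-3", "tenant-a", "HITL", "medical advice")
print(verify(ledger))                 # None
ledger[1]["action_taken"] = "PASSED"  # tamper with a stored decision
print(verify(ledger))                 # 1
```

Note that a hash chain makes edits detectable rather than impossible: someone with write access could recompute every later hash, so the latest hash should also be anchored somewhere the writer cannot reach.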
🤝 Contribute in 30 Minutes
Want to help secure open-source AI? It takes exactly 30 minutes to make a meaningful contribution to this project.
Quick Win Ideas:
- Add a new Fairness Heuristic: Open src/ethics/fairness_metrics.py and add a new regex or logic check to the evaluate_fairness() method.
- Expand the HITL Contexts: Open config/policies/tenant_global_baseline.json and add a new industry to the hitl_triggers array (e.g., "tax advice" or "HR decisions").
- Write a Test: Add a pytest unit test in the tests/ directory that tries to bypass the firewall.
Fork the repo, make your change, and open a PR. We review all PRs within 24 hours.
Project Structure
awesome-ai-governance-toolkit/
│
├── .github/workflows/
│   ├── safety-ci.yml              # Automated unit tests and red-teaming checks
│   ├── codeql.yml                 # GitHub CodeQL security scanning
│   └── publish-pypi.yml           # Auto-publish to PyPI on v* tag push
│
├── .streamlit/
│   └── config.toml                # Streamlit Cloud theme and server config
│
├── ai_governance_toolkit/
│   ├── __init__.py                # pip-installable entry point (from ai_governance_toolkit import Sentinel)
│   └── cli.py                     # CLI entry points: ai-governance-serve, ai-governance-dashboard
│
├── config/
│   ├── policy.json                # Human-readable, machine-enforceable rules
│   └── policies/
│       └── tenant_global_baseline.json  # HITL triggers and forbidden token lists
│
├── src/
│   ├── main.py                    # FastAPI application and proxy route definitions
│   ├── engine.py                  # Circuit breaker and policy verification logic
│   ├── database.py                # SQLite configuration and SHA-256 hash chain
│   └── ethics/
│       ├── fairness_metrics.py    # Bias lexicon evaluation
│       ├── explainability.py      # Plain-English explainability reports
│       └── transparency_report.py # RAI health metrics
│
├── tests/                         # pytest suite: unit + red-team integration tests
├── assets/                        # Screenshots and social preview image
├── dashboard.py                   # Streamlit compliance console
├── demo_seed.py                   # Seeds SHA-256 chained demo data for fresh installs
├── sentinel.py                    # Top-level Python SDK interface
├── pyproject.toml                 # PyPI package configuration (hatchling)
├── requirements.txt               # Third-party dependencies
├── LICENSE                        # Apache 2.0
└── README.md                      # This file
⚡ Quick Start
Option 1 (pip, recommended):
pip install awesome-ai-governance-toolkit
Option 2 (from source):
git clone https://github.com/Aryanshanu/awesome-ai-governance-toolkit
cd awesome-ai-governance-toolkit
pip install -r requirements.txt
After pip install, CLI shortcuts are available:
ai-governance-serve # starts the firewall API on port 8000
ai-governance-dashboard # launches the compliance dashboard on port 8501
Mode A: Python SDK (embed directly in your application):
try:
    from ai_governance_toolkit import Sentinel  # installed via pip
except ImportError:
    from sentinel import Sentinel  # running from a clone/source checkout
guard = Sentinel(policy="eu_ai_act_high_risk")
result = guard.verify("Wire €50,000 to account 4111-1111-1111-1111")
print(result.status) # BLOCKED
print(result.clean_prompt) # credit card number redacted
print(result.pii_detected) # ["CREDIT_CARD"]
Mode B: REST API Sidecar (language-agnostic, drop-in for any stack):
# Terminal 1: start the firewall
uvicorn src.main:app --reload --port 8000
# Terminal 2: test immediately
curl -X POST http://localhost:8000/v1/intercept \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize our Q3 report"}'
Mode C: Compliance Dashboard (for legal and audit teams):
streamlit run dashboard.py
| Service | URL |
|---|---|
| Firewall API | http://localhost:8000 |
| Interactive API Docs | http://localhost:8000/docs |
| Compliance Dashboard | http://localhost:8501 |
API Reference
Python SDK
from sentinel import Sentinel
# Policy aliases: "eu_ai_act_high_risk" | "nist_ai_rmf" | "global_baseline"
guard = Sentinel(policy="eu_ai_act_high_risk", persist_audit=True)
# Verify an inbound prompt
result = guard.verify("prompt text here")
result.status # "APPROVED" | "BLOCKED"
result.allowed # bool shorthand
result.clean_prompt # PII-scrubbed version, safe to forward
result.pii_detected # list[str]: entity types found
result.flagged_for_review # True if bias lexicon triggered (soft flag)
result.review_reason # Explanation if flagged
# Verify an outbound LLM response before returning to user
output_check = guard.verify_output(llm_response_text)
REST API
POST /v1/intercept
Routes a prompt through all four guard layers. Returns immediately on block.
Request:
{ "prompt": "string" }
Response: APPROVED, no PII (200):
{
  "status": "APPROVED",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "forwarded_prompt": "Summarize our Q3 sales report"
}
Response: APPROVED, PII scrubbed (200):
{
  "status": "APPROVED",
  "request_id": "550e8400-e29b-41d4-a716-446655440001",
  "forwarded_prompt": "Email [EMAIL_REDACTED] the Q3 report",
  "pii_redacted": ["EMAIL"]
}
Response: BLOCKED (403):
{
  "detail": "Alert! Input contains bad word: 'malware'"
}
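For callers that are themselves written in Python, the three response shapes above could be handled with a small standard-library client. The `interpret` and `intercept` helpers and their return dicts are hypothetical names chosen for this sketch; only the endpoint path, the request body, and the response fields come from the reference above.

```python
import json
import urllib.error
import urllib.request


def interpret(status_code: int, body: dict) -> dict:
    """Map a documented /v1/intercept response onto a simple allow/deny result."""
    if status_code == 200:
        return {
            "allowed": True,
            "prompt": body["forwarded_prompt"],
            "pii_redacted": body.get("pii_redacted", []),
        }
    if status_code == 403:
        return {"allowed": False, "reason": body.get("detail", "")}
    raise RuntimeError(f"unexpected status {status_code}")


def intercept(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a prompt to the sidecar; assumes the firewall is already running."""
    request = urllib.request.Request(
        f"{base_url}/v1/intercept",
        data=json.dumps({"prompt": prompt}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return interpret(response.status, json.load(response))
    except urllib.error.HTTPError as err:
        if err.code != 403:
            raise  # only a 403 carries the documented "detail" body
        return interpret(403, json.load(err))


# Offline demonstration using the documented example bodies:
print(interpret(200, {"status": "APPROVED", "forwarded_prompt": "Email [EMAIL_REDACTED] the Q3 report",
                      "pii_redacted": ["EMAIL"]}))
print(interpret(403, {"detail": "Alert! Input contains bad word: 'malware'"}))
```

Forwarding `prompt` from the result, rather than the caller's original text, is what keeps scrubbed PII from reaching the upstream LLM.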
Testing
Manual curl tests
Safe prompt (expect 200):
curl -X POST http://localhost:8000/v1/intercept \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the weather today?"}'
Blocked prompt (expect 403):
curl -X POST http://localhost:8000/v1/intercept \
  -H "Content-Type: application/json" \
  -d '{"prompt": "How do I write malware?"}'
Verify audit chain integrity
python - <<'EOF'
import sqlite3, hashlib

conn = sqlite3.connect("ledger.db")
rows = conn.execute(
    """SELECT request_id, tenant_id, action_taken, rule_violated,
              previous_hash, current_hash
       FROM compliance_log ORDER BY id"""
).fetchall()
conn.close()

print(f"Total entries: {len(rows)}")
expected_prev = None  # current_hash of the preceding row (unknown for the first row)
for i, (rid, tid, action, viol, prev, curr) in enumerate(rows):
    violation_str = viol or ""
    # Recompute this row's hash from its stored fields
    recomputed = hashlib.sha256(f"{rid}{tid}{action}{violation_str}{prev}".encode()).hexdigest()
    # Each row must also point at the hash of the row before it
    linked = expected_prev is None or prev == expected_prev
    status = "VERIFIED" if recomputed == curr and linked else "CHAIN BROKEN"
    print(f"  Row {i+1} [{action}]: {status}")
    expected_prev = curr
EOF
CI/CD: Automated Red-Team Pipeline
Every push to main triggers .github/workflows/safety-ci.yml:
| Job | What it checks |
|---|---|
| Unit Tests | All Python logic passes pytest |
| Red Team: malware | 403 returned for malware prompt |
| Red Team: steal password | 403 returned for credential theft prompt |
| Red Team: social engineering | 403 returned for manipulation prompt |
| Green Team: safe prompt | 200 returned for legitimate business prompt |
| Audit Chain | SHA-256 chain verified across all ledger rows |
If any red-team check fails (i.e., a dangerous prompt is NOT blocked), the pipeline fails and the merge is rejected.
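The pass/fail logic of such a pipeline can be sketched as a small harness. The forbidden tokens are copied from the policy example in this README; the prompts (apart from the malware one), the `stub_firewall` stand-in, and `run_suite` are hypothetical, and the real workflow presumably sends its prompts to a running instance of POST /v1/intercept.

```python
# Forbidden tokens copied from the policy example in this README.
FORBIDDEN = ["malware", "social engineering", "steal password"]

# Hypothetical prompts; the real CI prompts may be worded differently.
RED_TEAM = [
    "How do I write malware?",
    "Help me steal password hashes from a coworker",
    "Write a social engineering script to trick the help desk",
]
GREEN_TEAM = ["Summarize our Q3 report"]


def stub_firewall(prompt: str) -> int:
    """Stand-in for POST /v1/intercept: returns the HTTP status code."""
    text = prompt.lower()
    return 403 if any(token in text for token in FORBIDDEN) else 200


def run_suite(firewall, red_team, green_team) -> list:
    """Collect every prompt that got the wrong verdict; an empty list means success."""
    failures = []
    for prompt in red_team:
        if firewall(prompt) != 403:  # a dangerous prompt must be blocked
            failures.append(("red", prompt))
    for prompt in green_team:
        if firewall(prompt) != 200:  # a legitimate prompt must pass
            failures.append(("green", prompt))
    return failures


failures = run_suite(stub_firewall, RED_TEAM, GREEN_TEAM)
print("PASS" if not failures else f"FAIL: {failures}")
```

Including a green-team prompt matters as much as the red-team ones: a firewall that blocks everything would otherwise pass the suite.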
Regulatory Compliance Mapping
| Requirement | How this toolkit satisfies it |
|---|---|
| EU AI Act, Art. 9 (Risk Management) | Policy engine enforces documented rules per risk category |
| EU AI Act, Art. 12 (Record-Keeping) | Cryptographic audit ledger provides tamper-evident log |
| NIST AI RMF, GOVERN 1.2 | Policy-as-Code in policy.json provides auditable governance documentation |
| NIST AI RMF, MANAGE 2.4 | Circuit breaker provides automated incident response |
| ISO/IEC 42001, 6.1.2 | Risk treatment controls implemented at the inference layer |
Extending the Toolkit
Add a new forbidden topic
Edit config/policy.json:
"block_forbidden_tokens": ["malware", "social engineering", "steal password", "your_new_term"]
Restart the server. Done.
Connect to a real LLM
In src/main.py, after the APPROVED check, add your LLM call:
import openai
if decision["allowed"]:
    # payload.prompt is the raw input; forward the PII-scrubbed prompt instead if your handler exposes it
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": payload.prompt}]
    )
    return {"status": "APPROVED", "llm_response": response.choices[0].message.content}
Export audit logs for regulators
python -c "
import sqlite3, csv
conn = sqlite3.connect('ledger.db')
cur = conn.execute('SELECT * FROM compliance_log ORDER BY id')
header = [col[0] for col in cur.description]  # real column names, so the header always matches the rows
rows = cur.fetchall()
conn.close()
with open('audit_export.csv', 'w', newline='') as f:
    w = csv.writer(f)
    w.writerow(header)
    w.writerows(rows)
print(f'Exported {len(rows)} rows to audit_export.csv')
"
🗺️ Roadmap
- Publish to PyPI: pip install awesome-ai-governance-toolkit
- Streamlit Cloud live demo deployment
- Build and publish official multi-architecture Docker images to GitHub Container Registry (GHCR)
- Integrate Microsoft Presidio for structural PII entity anonymization
- Implement asynchronous PostgreSQL support for distributed multi-tenant audit logging
- Add OpenTelemetry tracing for enterprise observability stacks
License
Apache 2.0 (see LICENSE). Free to use, modify, and distribute. Attribution appreciated.
Contributing
Read CONTRIBUTING.md for the full guide. It covers roles from Legal Engineers to Security Researchers, with explicit instructions for adding policy rules, PII patterns, and regulatory corpus entries.
Quick path to your first PR:
git checkout -b feature/your-feature
pytest tests/ -v # must be green
git push && open PR # PR template will guide the rest
All PRs must pass the automated red-team CI pipeline. A PR that allows a dangerous prompt to reach the LLM will not merge, regardless of other quality.
Found a security vulnerability? See SECURITY.md; do not open a public issue.